Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission. https://blog.csdn.net/hongjunliu1989/article/details/83348288
Floating-Point Numbers: Study Notes
Code snippet
import java.text.DecimalFormat;

DecimalFormat df = new DecimalFormat("#0.00");
// prints 0.12
System.out.println(df.format(0.1240));
// prints 0.12
System.out.println(df.format(0.1250));
// prints 0.13
System.out.println(df.format(0.1251));
// prints 0.14
System.out.println(df.format(0.1350));
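Overview
A floating-point number represents a value of the form Num = M * 2^E, with the sign stored in a separate bit
A 32-bit single-precision float is laid out in binary as:
0 - 0000 0000 - 0000 0000 0000 0000 0000 000
1 bit for the sign
8 bits for the exponent
23 bits for the fraction (mantissa)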
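The layout above can be inspected directly in Java. The following is a minimal sketch, not part of the original article; the class name FloatFields and its helper methods are made up for illustration, while Float.floatToIntBits and Integer.toBinaryString are standard JDK calls.

```java
final class FloatFields {
    /** Returns "sign - exponent - fraction" as binary strings for a 32-bit float. */
    static String fields(float value) {
        int bits = Float.floatToIntBits(value);
        int sign = bits >>> 31;               // 1 sign bit
        int exponent = (bits >>> 23) & 0xFF;  // 8 exponent bits
        int fraction = bits & 0x7FFFFF;       // 23 fraction bits
        return sign + " - " + pad(exponent, 8) + " - " + pad(fraction, 23);
    }

    /** Left-pads the binary form of v with zeros to the given width. */
    private static String pad(int v, int width) {
        String s = Integer.toBinaryString(v);
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        // 1.0f: sign 0, exponent field 01111111 (127), fraction all zeros
        System.out.println(fields(1.0f));
        // -0.125f: sign 1, exponent field 01111100 (124), fraction all zeros
        System.out.println(fields(-0.125f));
    }
}
```

The three groups in each printed line correspond to the sign, exponent and fraction fields in the same order as the diagram.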
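Definitions
k  number of exponent bits
n  number of fraction bits
e  the unsigned integer formed by the k exponent bits
bias = 2^(k-1) - 1  the bias
f = 0.XXXX  the unsigned binary fraction formed by the n fraction bits
Depending on the exponent bits, floating-point numbers fall into three categories
1. Normalized numbers
The exponent bits are neither all 0 nor all 1
E = e - bias
M = 1 + f
2. Denormalized numbers
The exponent bits are all 0
E = 1 - bias
M = f
3. Special values
The exponent bits are all 1; there are two cases depending on the fraction bits
1) Fraction bits all 0: positive infinity when the sign bit is 0, negative infinity when the sign bit is 1
2) Fraction bits not all 0: NaN (Not a Number)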
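The three categories and the formulas for E and M can be checked against Java's own float type. This is a rough sketch under the definitions above, with k = 8 and n = 23; the class FloatDecode and its method names are hypothetical, and zero is reported under the all-zero-exponent category, as in the classification above.

```java
final class FloatDecode {
    static final int K = 8;                      // exponent bits of a float
    static final int N = 23;                     // fraction bits of a float
    static final int BIAS = (1 << (K - 1)) - 1;  // 2^(k-1) - 1 = 127

    /** Names the category of a float from its exponent and fraction bits. */
    static String classify(float value) {
        int bits = Float.floatToIntBits(value);
        int e = (bits >>> N) & ((1 << K) - 1);
        int frac = bits & ((1 << N) - 1);
        if (e == 0) {
            return "denormalized";               // exponent bits all 0 (includes zero)
        }
        if (e == (1 << K) - 1) {
            return frac == 0 ? "infinity" : "NaN";
        }
        return "normalized";
    }

    /** Rebuilds a finite float as (+/-) M * 2^E from its three fields. */
    static double decode(float value) {
        int bits = Float.floatToIntBits(value);
        int s = bits >>> 31;
        int e = (bits >>> N) & ((1 << K) - 1);
        double f = (bits & ((1 << N) - 1)) / (double) (1 << N); // f = 0.XXXX
        int bigE = (e == 0) ? 1 - BIAS : e - BIAS;
        double m = (e == 0) ? f : 1 + f;
        // Math.scalb multiplies by an exact power of two
        return (s == 0 ? 1 : -1) * Math.scalb(m, bigE);
    }

    public static void main(String[] args) {
        System.out.println(classify(1.0f));
        System.out.println(classify(Float.MIN_VALUE));
        System.out.println(classify(Float.NEGATIVE_INFINITY));
        System.out.println(classify(Float.NaN));
        System.out.println(decode(0.1f) == 0.1f);
    }
}
```

decode is only meaningful for finite inputs; infinities and NaN are handled by classify alone.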
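Numeric examples
Take a 6-bit floating-point format as an example, with k = 3 and n = 2, laid out as follows:
0 - 000 - 00
where:
bias = 2^(k-1) - 1
= 2^(3-1) - 1
= 4 - 1
= 3
1. Denormalized numbers
Here
E = 1 - bias = -2;
The representable values are: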
0 - 000 - 00 M = 0.00 = 0 -> Num = M * 2^E = 0 * 2^(-2) =0
0 - 000 - 01 M = 0.01 = 1/2^2 = 1/4 -> Num = (1/4) * 2^(-2) = (1/4)*(1/4) = 1/16
0 - 000 - 10 .... Num = (1/2) * (1/4) = 2/16
0 - 000 - 11 .... Num = (3/4) * (1/4) = 3/16
2. Normalized numbers
With e = 1, E = e - bias = 1 - 3 = -2, the representable values are:
0 - 001 - 00 M = 1 + 0.00 = 1 -> Num = M * 2^E = 1 * 2^(-2) = 1/4 = 4/16
0 - 001 - 01 M = 1 + 0.01 = 5/4 -> Num = (5/4) * (1/4) = 5/16
0 - 001 - 10 M = 1 + 0.10 = 6/4 -> Num = (3/2) * (1/4) = 3/8 = 6/16
0 - 001 - 11 M = 1 + 0.11 = 7/4 -> Num = 7/16
With e = 2, E = -1, the representable values are:
0 - 010 - 00 M = 1 -> Num = 1 * (1/2) = 8/16
0 - 010 - 01 M = 5/4 -> Num = (5/4)* (1/2) = 5/8 = 10/16
0 - 010 - 10 M = 6/4 -> Num = 6/8 = 12/16
0 - 010 - 11 M = 7/4 -> Num = 7/8 = 14/16
With e = 3, E = 0, the representable values are:
0 - 011 - 00 M = 1 -> Num = 1 * 1 = 1 = 16/16
0 - 011 - 01 M = 5/4 -> Num = 5/4 = 20/16
0 - 011 - 10 M = 6/4 -> Num = 6/4 = 24/16
0 - 011 - 11 M = 7/4 -> Num = 7/4 = 28/16
With e = 4, E = 1, the representable values are:
0 - 100 - 00 M = 1 -> Num = 1 * 2 = 2
0 - 100 - 01 M = 5/4 -> Num = 5/2
0 - 100 - 10 M = 6/4 -> Num = 6/2
0 - 100 - 11 M = 7/4 -> Num = 7/2
With e = 5, E = 2, the representable values are:
0 - 101 - 00 M = 1 -> Num = 1 * 4 = 4
0 - 101 - 01 M = 5/4 -> Num = 5
0 - 101 - 10 M = 6/4 -> Num = 6
0 - 101 - 11 M = 7/4 -> Num = 7
With e = 6, E = 3, the representable values are:
0 - 110 - 00 M = 1 -> Num = 1 * 8 = 8
0 - 110 - 01 M = 5/4 -> Num = 10
0 - 110 - 10 M = 6/4 -> Num = 12
0 - 110 - 11 M = 7/4 -> Num = 14
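-inf <-14------------------0------------------1-2-3-4-5-6-7-8--10--12--+14> +inf
The sketch above shows how the 6-bit floating-point values are distributed on the number line. The values are not evenly spaced: they are dense near 0 and grow sparser as the magnitude increases, and infinitely many fractional values in between cannot be represented exactly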
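The full list of values for the 6-bit format can be reproduced with a few lines of Java. This is a small sketch, assuming the same k = 3, n = 2 and bias = 3 as in the worked example; the class TinyFloat and its methods are invented for illustration.

```java
final class TinyFloat {
    static final int K = 3;                      // exponent bits
    static final int N = 2;                      // fraction bits
    static final int BIAS = (1 << (K - 1)) - 1;  // 2^(k-1) - 1 = 3

    /** Decodes a non-negative pattern; e must not be the all-ones exponent. */
    static double value(int e, int frac) {
        double f = frac / (double) (1 << N);     // f = 0.XX in binary
        return e == 0
                ? Math.scalb(f, 1 - BIAS)        // denormalized: M = f, E = 1 - bias
                : Math.scalb(1 + f, e - BIAS);   // normalized: M = 1 + f, E = e - bias
    }

    /** All finite non-negative values, in increasing order. */
    static double[] finiteValues() {
        int exponents = (1 << K) - 1;            // skip the all-ones exponent
        int fractions = 1 << N;
        double[] out = new double[exponents * fractions];
        for (int e = 0; e < exponents; e++) {
            for (int frac = 0; frac < fractions; frac++) {
                out[e * fractions + frac] = value(e, frac);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (double v : finiteValues()) {
            // every value is a whole number of sixteenths
            System.out.println((int) (v * 16) + "/16");
        }
    }
}
```

Printing everything in sixteenths makes the uneven spacing visible: the step is 1/16 up to 8/16, then 2/16, 4/16, and so on, doubling with each exponent.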
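It follows that 32-bit single-precision and 64-bit double-precision floating-point numbers share these properties, so whenever a value cannot be represented exactly, a rounding strategy is needed
The IEEE 754 standard rounds to nearest by default
The round to nearest (to nearest, ties to even) strategy works as follows:
- Round the value to the closest exactly representable floating-point number
- When the nearest representable numbers on the left and on the right are equally distant from the original value, round to the even one (the one whose least significant bit is even; in binary, 0 is even)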
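The two rules can be observed in binary by narrowing a double to a float, which Java performs with round to nearest. The sketch below is illustrative only; the class TiesToEvenFloat and its narrow method are hypothetical names, and the inputs are chosen to sit exactly halfway between two adjacent floats.

```java
final class TiesToEvenFloat {
    /** Narrows a double to a float; the cast rounds to nearest, ties to even. */
    static float narrow(double d) {
        return (float) d;
    }

    public static void main(String[] args) {
        double ulp = Math.ulp(1.0f);         // gap between 1.0f and the next float (2^-23)

        // Halfway between 1.0f (last fraction bit 0) and 1.0f + ulp (last bit 1):
        // the tie goes to the even neighbour, 1.0f.
        System.out.println(narrow(1.0 + ulp / 2) == 1.0f);

        // Halfway between 1.0f + ulp (last bit 1) and 1.0f + 2*ulp (last bit 0):
        // the tie goes up this time, again to the even neighbour.
        System.out.println(narrow(1.0 + 3 * ulp / 2) == (float) (1.0 + 2 * ulp));

        // Not a tie: 0.75 ulp above 1.0f is simply closer to 1.0f + ulp.
        System.out.println(narrow(1.0 + 0.75 * ulp) == (float) (1.0 + ulp));
    }
}
```

A tie can round down or up; what is fixed is that the result has an even last bit.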
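As a decimal illustration, rounding to an integer:
Original value | Round-to-nearest result | Description |
---|---|---|
1.4 | 1 | Closest to 1 |
1.5 | 2 | 1 (left) - 1.5 - 2 (right): 1 and 2 are equally distant from 1.5, so round to even, giving 2 |
1.6 | 2 | Closest to 2 |
2.5 | 2 | 2 (left) - 2.5 - 3 (right): 2 and 3 are equally distant from 2.5, so round to even, giving 2 |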
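The rows of the table can be reproduced with BigDecimal and RoundingMode.HALF_EVEN, which applies the same ties-to-even rule in decimal. This is a minimal sketch; the class HalfEvenDecimal and the method roundToInteger are made-up names, and the inputs are passed as strings so that they are exact decimal values.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class HalfEvenDecimal {
    /** Rounds a decimal string to an integer, sending ties to the even neighbour. */
    static String roundToInteger(String decimal) {
        return new BigDecimal(decimal)
                .setScale(0, RoundingMode.HALF_EVEN)
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(roundToInteger("1.4")); // 1
        System.out.println(roundToInteger("1.5")); // 2
        System.out.println(roundToInteger("1.6")); // 2
        System.out.println(roundToInteger("2.5")); // 2
    }
}
```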
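Back to the code snippet
DecimalFormat uses RoundingMode.HALF_EVEN by default, which is the same ties-to-even rule applied to the decimal digits being formatted
System.out.println(df.format(0.1250));
----0.12-----0.1250---0.13---->
0.12 and 0.13 are equally distant from 0.125, so the tie is rounded to even, giving 0.12
System.out.println(df.format(0.1251));
-----0.12 ---- 0.1251 ------0.13----->
0.1251 is closer to 0.13, so it is rounded directly to 0.13
System.out.println(df.format(0.1350));
-----0.13-----0.1350-----0.14----->
0.13 and 0.14 are equally distant from 0.135, so the tie is rounded to even, giving 0.14. (Strictly speaking, 0.135 has no exact binary representation and the stored double is very slightly above 0.135, so the nearest-value rule gives 0.14 as well)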
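Whether a literal is a true tie depends on the value actually stored in the double. The sketch below, which is an illustration and not the article's own code, compares the stored binary value with the written decimal value; the class ExactDoubleValues and its method are hypothetical, while new BigDecimal(double) and DecimalFormat.getRoundingMode() are standard JDK calls.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;

final class ExactDoubleValues {
    /**
     * Compares the double nearest to a literal with the literal's exact decimal value.
     * Returns 0 if the double stores it exactly, 1 if the stored value is larger,
     * -1 if it is smaller.
     */
    static int compareStoredToWritten(String literal) {
        BigDecimal stored = new BigDecimal(Double.parseDouble(literal)); // exact binary value
        BigDecimal written = new BigDecimal(literal);                    // exact decimal value
        return stored.compareTo(written);
    }

    public static void main(String[] args) {
        // Default rounding mode of DecimalFormat
        System.out.println(new DecimalFormat("#0.00").getRoundingMode()); // HALF_EVEN

        // 0.125 = 2^-3 is stored exactly, so it is a genuine tie
        System.out.println(compareStoredToWritten("0.1250")); // 0

        // 0.135 is not stored exactly; the double is slightly above it
        System.out.println(compareStoredToWritten("0.1350")); // 1

        // Full decimal expansion of the stored value
        System.out.println(new BigDecimal(0.1350));
    }
}
```

The last line prints the complete decimal expansion of the double nearest to 0.135, which makes the small excess over 0.135 visible.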
Summary
round to nearest, ties to even