Thursday, February 4, 2010

3. Floating-Point Arithmetic





















This section concerns arithmetic on the floating-point types: float and double.





3.1. Floating-point arithmetic is inexact



Prescription: Don't use floating-point where exact results are required; instead, use an integral type or BigDecimal.



Avoid floating-point loop indices.



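Avoid using the ++ and -- operators on floating-point variables, as these operators have no effect on floating-point values of sufficiently large magnitude: adding 1 to a float of 16,777,216 (2^24) or to a double of 9,007,199,254,740,992 (2^53) leaves the value unchanged.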



Avoid testing floating-point values for equality.



Prefer double to float.



References: Puzzles 2, 28, and 34; [JLS 4.2.3], [EJ Item 31], and [IEEE-754].
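The prescriptions above can be illustrated with a minimal sketch. The class name, the method names, and the tolerance of 1e-9 are assumptions made for this example only; they do not come from the puzzles themselves.

```java
import java.math.BigDecimal;

class InexactArithmetic {
    // Adds 0.1 ten times using double; the loop index is an int, not a double.
    static double sumTenthsAsDouble() {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += 0.1;
        }
        return sum;
    }

    // The same sum using BigDecimal built from a String, which is exact.
    static BigDecimal sumTenthsAsBigDecimal() {
        BigDecimal tenth = new BigDecimal("0.1");
        BigDecimal sum = BigDecimal.ZERO;
        for (int i = 0; i < 10; i++) {
            sum = sum.add(tenth);
        }
        return sum;
    }

    // At 2^24 a float can no longer represent the next integer, so ++ changes nothing.
    static boolean incrementHasNoEffect() {
        float f = 16_777_216f; // 2^24
        float before = f;
        f++;
        return f == before;
    }

    // Compares with a tolerance instead of ==; the tolerance is an assumed, application-specific choice.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        System.out.println(sumTenthsAsDouble());                        // 0.9999999999999999
        System.out.println(sumTenthsAsBigDecimal());                    // 1.0
        System.out.println(incrementHasNoEffect());                     // true
        System.out.println(sumTenthsAsDouble() == 1.0);                 // false
        System.out.println(nearlyEqual(sumTenthsAsDouble(), 1.0, 1e-9)); // true
    }
}
```

A suitable tolerance depends on the magnitude of the values being compared, so a fixed 1e-9 is only a placeholder here.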







3.2. NaN is not equal to any floating-point value, including itself



Prescription: Avoid testing floating-point values for equality. This is not always sufficient to avoid problems, but it's a good start.



References: Puzzle 29; [JLS 15.21.1] and [IEEE-754].
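As a small illustration of this rule, the sketch below compares NaN with itself and then tests for it with Double.isNaN, which is the standard library's way of detecting NaN. The class and method names are hypothetical.

```java
class NaNComparison {
    // Returns whether a value compares equal to itself using ==.
    static boolean equalsItself(double d) {
        return d == d;
    }

    public static void main(String[] args) {
        double nan = 0.0 / 0.0; // floating-point division of zero by zero yields NaN

        System.out.println(equalsItself(nan));          // false
        System.out.println(equalsItself(1.5));          // true
        System.out.println(Double.isNaN(nan));          // true
        // Double.compare, unlike ==, treats NaN as equal to itself.
        System.out.println(Double.compare(nan, nan));   // 0
    }
}
```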







3.3. Conversions from int to float, long to float, and long to double are lossy



Prescription: Avoid computations that mix integral and floating-point types. Prefer integral arithmetic to floating-point.



References: Puzzles 34 and 87; [JLS 5.1.2].
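The loss can be made visible by converting an integral value to a floating-point type and back. This is only a sketch; the class and method names are invented for illustration.

```java
class LossyWidening {
    // int -> float -> int; float has only 24 bits of precision.
    static int roundTripThroughFloat(int i) {
        float f = i; // implicit widening conversion, no cast required
        return (int) f;
    }

    // long -> double -> long; double has only 53 bits of precision.
    static long roundTripThroughDouble(long l) {
        double d = l; // implicit widening conversion, no cast required
        return (long) d;
    }

    public static void main(String[] args) {
        System.out.println(roundTripThroughFloat(16_777_217));              // 16777216
        System.out.println(roundTripThroughDouble(9_007_199_254_740_993L)); // 9007199254740992

        // Two distinct int values become the same float.
        System.out.println((float) 2_000_000_000 == (float) 2_000_000_050); // true
    }
}
```

The compiler accepts these conversions silently because they are widening conversions, which is why mixed-type computations are easy to get wrong.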







3.4. The BigDecimal(double) constructor returns the exact value of its floating-point argument



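Prescription: Always use the BigDecimal(String) constructor; never use BigDecimal(double). The decimal 0.1 cannot be represented exactly as a double, so new BigDecimal(0.1) yields the exact value of the nearest double, which is slightly greater than 0.1, whereas new BigDecimal("0.1") yields exactly 0.1.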



References: Puzzle 2.
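A brief sketch of the difference between the two constructors follows. The class and method names are hypothetical, and the subtraction is modeled loosely on the kind of monetary calculation where exact results are required.

```java
import java.math.BigDecimal;

class BigDecimalConstructors {
    // Built from a double: captures the exact binary value nearest to 0.1.
    static BigDecimal fromDouble() {
        return new BigDecimal(0.1);
    }

    // Built from a String: captures exactly the decimal value written.
    static BigDecimal fromString() {
        return new BigDecimal("0.1");
    }

    // Exact decimal subtraction using String-constructed operands.
    static BigDecimal subtract(String a, String b) {
        return new BigDecimal(a).subtract(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(fromDouble());
        // 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(fromString());             // 0.1
        System.out.println(subtract("2.00", "1.10")); // 0.90
    }
}
```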
















