Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


cuda - IEEE-754 standard on NVIDIA GPU (sm_13)

If I perform the same float (single-precision) operation on the host and on a device (GPU architecture sm_13), will the results differ?


He who fights dragons for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss gazes back…

1 Answer


A good discussion of this is available in a whitepaper from NVIDIA. In short:

  • IEEE-754 is implemented by almost all current hardware.
  • Even between faithful implementations of the standard, you can still see differences in results (famously, Intel using 80-bit precision internally for double-precision arithmetic), and high compiler optimization settings can also change results.
  • NVIDIA cards of compute capability 2.0 and later support IEEE-754 in both single and double precision, with only very small caveats:
    • Some rounding modes aren't supported for some operations. This only matters if you explicitly change rounding modes in your code.
    • There are some subtleties involving fused multiply-adds (FMAs).
    • CUDA also provides (slightly) lower-precision but faster implementations of several operations. If you use those explicitly, or implicitly through compiler options, you naturally won't get full IEEE-754 results.
  • Compute capability 1.3 cards support IEEE-754 as above in double precision but not in single precision: single precision doesn't support denormal (i.e. very small) numbers, has no FMA, and its square root and division aren't fully accurate.
  • Compute capability 1.2 cards have only single precision, and it isn't fully IEEE-754 compliant, as described above.
