Introduction to the floating-point storage mode of C++

2025-11-22 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article explains how C++ stores floating-point numbers. The material is simple and practical, so interested readers may want to follow along as we work through it.

Catalogue

Floating point type and its storage mode

1. IEEE floating point standard

2. Storage mode

IEEE 754 rules for the significand M and the exponent E

Key points:

The three cases, depending on the value of the exponent field

Floating point type and its storage mode

Sometimes a variable needs to store a number with a fractional part, or a number that is very large or very small. Such numbers can be stored in floating-point format (so named because the position of the point "floats"). C and C++ provide three floating-point types, float, double, and long double, which may correspond to three different floating-point formats.

When the precision requirement is modest (about six significant decimal digits), float is a suitable type. double provides higher precision, which is sufficient for most programs. long double supports extremely demanding precision requirements and is rarely used.

The C standard does not specify the precision provided by the float, double, and long double types, because different computers can store floating-point numbers in different ways. Most modern computers follow the IEEE 754 standard (also published as IEC 60559), so it is used as the example here.

1. IEEE floating point standard

The IEEE 754 standard provides two main floating-point formats: single precision (32 bits) and double precision (64 bits). Values are stored in a form of scientific notation, and each number consists of three parts: a sign, an exponent, and a fraction. The number of bits in the exponent determines how large (or small) a value can be, while the number of bits in the fraction determines the precision. In the single-precision format, the exponent is 8 bits long and the fraction occupies 23 bits. As a result, the largest value a single-precision number can represent is about 3.40 × 10^38, with a precision of about 6 decimal digits.

The IEEE standard also describes two other formats: single extended precision and double extended precision. The standard does not fix the number of bits in these formats, but it requires a single extended type to have at least 43 bits and a double extended type to have at least 79 bits.

Type      Smallest positive value   Largest value      Precision    Remarks
float     1.17549 × 10^-38          3.40282 × 10^38    6 digits     single precision, 32 bits
double    2.22507 × 10^-308         1.79769 × 10^308   15 digits    double precision, 64 bits

The table above shows the characteristics of the floating-point types when they are implemented according to the IEEE standard. (The table gives the smallest positive normalized values; subnormal numbers can be even smaller.) The long double type is not shown because its length varies from machine to machine; the most common sizes are 80 and 128 bits.

2. Storage mode

The first thing to understand about floating-point data is that floating-point numbers and integers are encoded differently. The IEEE floating-point standard represents a floating-point number V in the following form.

V = (-1)^S × M × 2^E

(-1)^S is the sign: V is positive when S = 0 and negative when S = 1.

M is the significand, a binary fraction with a value greater than or equal to 1 and less than 2.

2^E is the exponent part, where E is the exponent.

Next, I'll use float as an example. double works on the same principle; only the number of bits differs.

For example, the decimal number 88.8125 is 1011000.1101 in binary.

Next, 1011000.1101 is rewritten to match M in the formula above, whose range is [1, 2). Moving the binary point 6 places to the left gives 1.0110001101 × 2^6. (If this is unclear, compare it with decimal notation: each time the decimal point moves one place to the left, the number is multiplied by 10 to compensate; in binary, it is multiplied by 2.)

In the end, we get S = 0, M = 1.0110001101, and E = 6. The stored form is not quite that simple, though, so let's move on.

IEEE 754 rules for the significand M and the exponent E

1. Significand M:
