Floating Register Compressed Decimal

IBM has recently developed a decimal floating-point format, which it is including on its new z9 computers. These computers replace the z990, the previous top-of-the-line z/Architecture machine from IBM; z/Architecture is the 64-bit extension of the architecture that began with System/360 and continued through System/370 and ESA/390.

This section refers to instructions which implement operations on numbers in that format and in related formats.

This format is also described on this page.

The basic characteristics of this data type are as follows:

Three data types are defined. All three data types feature a five-bit field which contains both the first decimal digit of the mantissa (or coefficient) of the floating-point number, and the first two bits of the exponent (which is in binary form), those two bits being allowed to take only the values 00, 01, or 10, but not 11.

This provides an efficient means of coding decimal floating-point numbers, as in each case, the remaining digits of the mantissa are all contained within 10-bit fields. Had there been no extra decimal digit left over, of course, a simple binary exponent field would have been just as efficient, and simpler, but as it happened, the coding scheme used allowed efficient coding to be retained while providing an exponent field which was neither too large nor too small, particularly for 32-bit and 64-bit floating-point values, and it also ensured that the size of the exponent field would monotonically increase as the length of the number increased.
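As a rough illustration, the five-bit field can be unpacked as sketched below. The text above does not spell out the bit assignment, so this sketch assumes the arrangement used by the IEEE 754 decimal interchange formats: codes beginning 00, 01, or 10 carry a digit from 0 to 7 in the last three bits, and codes beginning 11 carry the two exponent bits next, followed by one bit selecting 8 or 9. The function name decode_combination is hypothetical.

```python
def decode_combination(field):
    """Split a 5-bit combination field into (exponent_high_bits, leading_digit).

    Assumes the IEEE 754 decimal bit assignment. Returns the strings
    "infinity" or "nan" for the two reserved codes 11110 and 11111.
    """
    if not 0 <= field <= 0b11111:
        raise ValueError("combination field must fit in 5 bits")
    if field == 0b11110:
        return "infinity"
    if field == 0b11111:
        return "nan"
    if field >> 3 != 0b11:
        # Codes 00ddd, 01ddd, 10ddd: two exponent bits, then a digit 0-7.
        return field >> 3, field & 0b111
    # Codes 11eed with ee not equal to 11: the digit is 8 or 9.
    return (field >> 1) & 0b11, 8 + (field & 1)


print(decode_combination(0b01101))  # (1, 5)
print(decode_combination(0b11101))  # (2, 9)
```

The 30 numeric codes are exactly the 3 permitted exponent prefixes times the 10 possible leading digits, which is why the exponent prefix can never be 11.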

Since this data type permits unnormalized values to be represented, not only are instructions provided which follow the "ideal exponent" rules described in the standard, which are the humanized floating-point instructions given below, but instructions are also provided for conventional unnormalized operation, for the purpose of carrying out significance arithmetic, and for conventional normalized arithmetic.

In addition, another type is provided that allows only normalized numbers to be represented, and which may include a partial decimal digit appended at the end of the number depending on the value of the first digit. This type is called numeric floating register compressed decimal. The coding of the first digit is shown in the table below:

0000   1 ... -       1 ... 0
0001   1 ... 1/5     1 ... 2
0010   1 ... 2/5     1 ... 4
0011   1 ... 3/5     1 ... 6
0100   1 ... 4/5     1 ... 8
----------------------------
0101   2 ... -       2 ... 0
0110   2 ... 1/2     2 ... 5

0111   3 ... -       3 ... 0
1000   3 ... 1/2     3 ... 5

1001   4 ... -       4 ... 0
1010   4 ... 1/2     4 ... 5
----------------------------
1011   5             5 ... 0

1100   6             6 ... 0

1101   7             7 ... 0

1110   8             8 ... 0

1111   9             9 ... 0

The four-bit field containing the first and last digits of the mantissa is referred to as the compound field. In the case of 32, 64, and 128-bit numbers, it replaces the combination field, and so the length of the exponent field is increased by one bit, leading to the range of exponents being first divided by three and then multiplied by two as compared to that in the standard format.
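The coding table above can be written as a small lookup, sketched here. The pairs are (first digit, appended final digit), taken from the third column of the table; the names build_compound_table and COMPOUND are illustrative only.

```python
def build_compound_table():
    """Return the 16 (first_digit, final_digit) pairs, indexed by the 4-bit code."""
    table = []
    # First digit 1: the appended digit advances in fifths (0, 2, 4, 6, 8).
    for last in (0, 2, 4, 6, 8):
        table.append((1, last))
    # First digits 2 to 4: the appended digit advances in halves (0 or 5).
    for first in (2, 3, 4):
        for last in (0, 5):
            table.append((first, last))
    # First digits 5 to 9: no appended partial digit.
    for first in range(5, 10):
        table.append((first, 0))
    return table


COMPOUND = build_compound_table()

print(COMPOUND[0b0011])  # (1, 6)
print(COMPOUND[0b1010])  # (4, 5)
```

All sixteen codes are used, so the compound field itself leaves no spare values.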

In alternate precisions, the general rule is that the compound field encodes the most significant digit of the number, and then the remaining digits are encoded as appropriate, with a combination field if the number of remaining digits is of the form 3n+1.

For the regular floating register compressed decimal type, when a combination field is present, the values 11110 and 11111 are used, as provided for by the revised IEEE 754 standard, to encode infinity and NaN. When one is not present, inadmissible codes for the 7-bit or 10-bit field including the most significant digit of the number will be used.

For the numeric floating register compressed decimal type, gradual underflow is provided for by replacing the compound field with a four-bit field containing a single BCD digit when the exponent is at its minimum value; thus, in that case, the most significant digit may be zero. For this type, whether or not a combination field is present, the values E and F in the compound field, when the exponent is at its minimum value, encode infinity and NaN respectively.

The instructions which deal with these numbers have the opcodes shown below:

001000 100xxx  SWFRC   Swap Floating Register Compressed
001000 101xxx  CFRC    Compare Floating Register Compressed
001000 102xxx  LFRC    Load Floating Register Compressed
001000 103xxx  STFRC   Store Floating Register Compressed
001000 104xxx  AFRC    Add Floating Register Compressed
001000 105xxx  SFRC    Subtract Floating Register Compressed
001000 106xxx  MFRC    Multiply Floating Register Compressed
001000 107xxx  DFRC    Divide Floating Register Compressed

001000 112xxx  LUFRC   Load Unnormalized Floating Register Compressed
001000 113xxx  STUFRC  Store Unnormalized Floating Register Compressed
001000 114xxx  AUFRC   Add Unnormalized Floating Register Compressed
001000 115xxx  SUFRC   Subtract Unnormalized Floating Register Compressed
001000 116xxx  MUFRC   Multiply Unnormalized Floating Register Compressed
001000 117xxx  DUFRC   Divide Unnormalized Floating Register Compressed

001000 124xxx  AFRCH   Add Floating Register Compressed Humanized
001000 125xxx  SFRCH   Subtract Floating Register Compressed Humanized
001000 126xxx  MFRCH   Multiply Floating Register Compressed Humanized
001000 127xxx  DFRCH   Divide Floating Register Compressed Humanized

001100 100xxx  SWDRC   Swap Double Register Compressed
001100 101xxx  CDRC    Compare Double Register Compressed
001100 102xxx  LDRC    Load Double Register Compressed
001100 103xxx  STDRC   Store Double Register Compressed
001100 104xxx  ADRC    Add Double Register Compressed
001100 105xxx  SDRC    Subtract Double Register Compressed
001100 106xxx  MDRC    Multiply Double Register Compressed
001100 107xxx  DDRC    Divide Double Register Compressed

001100 112xxx  LUDRC   Load Unnormalized Double Register Compressed
001100 113xxx  STUDRC  Store Unnormalized Double Register Compressed
001100 114xxx  AUDRC   Add Unnormalized Double Register Compressed
001100 115xxx  SUDRC   Subtract Unnormalized Double Register Compressed
001100 116xxx  MUDRC   Multiply Unnormalized Double Register Compressed
001100 117xxx  DUDRC   Divide Unnormalized Double Register Compressed

001100 124xxx  ADRCH   Add Double Register Compressed Humanized
001100 125xxx  SDRCH   Subtract Double Register Compressed Humanized
001100 126xxx  MDRCH   Multiply Double Register Compressed Humanized
001100 127xxx  DDRCH   Divide Double Register Compressed Humanized

001200 100xxx  SWQRC   Swap Quad Register Compressed
001200 101xxx  CQRC    Compare Quad Register Compressed
001200 102xxx  LQRC    Load Quad Register Compressed
001200 103xxx  STQRC   Store Quad Register Compressed
001200 104xxx  AQRC    Add Quad Register Compressed
001200 105xxx  SQRC    Subtract Quad Register Compressed
001200 106xxx  MQRC    Multiply Quad Register Compressed
001200 107xxx  DQRC    Divide Quad Register Compressed

001200 112xxx  LUQRC   Load Unnormalized Quad Register Compressed
001200 113xxx  STUQRC  Store Unnormalized Quad Register Compressed
001200 114xxx  AUQRC   Add Unnormalized Quad Register Compressed
001200 115xxx  SUQRC   Subtract Unnormalized Quad Register Compressed
001200 116xxx  MUQRC   Multiply Unnormalized Quad Register Compressed
001200 117xxx  DUQRC   Divide Unnormalized Quad Register Compressed

001200 124xxx  AQRCH   Add Quad Register Compressed Humanized
001200 125xxx  SQRCH   Subtract Quad Register Compressed Humanized
001200 126xxx  MQRCH   Multiply Quad Register Compressed Humanized
001200 127xxx  DQRCH   Divide Quad Register Compressed Humanized

001300 100xxx  SWNFRC  Swap Numerical Floating Register Compressed
001300 101xxx  CNFRC   Compare Numerical Floating Register Compressed
001300 102xxx  LNFRC   Load Numerical Floating Register Compressed
001300 103xxx  STNFRC  Store Numerical Floating Register Compressed
001300 104xxx  ANFRC   Add Numerical Floating Register Compressed
001300 105xxx  SNFRC   Subtract Numerical Floating Register Compressed
001300 106xxx  MNFRC   Multiply Numerical Floating Register Compressed
001300 107xxx  DNFRC   Divide Numerical Floating Register Compressed

001300 110xxx  SWNDRC  Swap Numerical Double Register Compressed
001300 111xxx  CNDRC   Compare Numerical Double Register Compressed
001300 112xxx  LNDRC   Load Numerical Double Register Compressed
001300 113xxx  STNDRC  Store Numerical Double Register Compressed
001300 114xxx  ANDRC   Add Numerical Double Register Compressed
001300 115xxx  SNDRC   Subtract Numerical Double Register Compressed
001300 116xxx  MNDRC   Multiply Numerical Double Register Compressed
001300 117xxx  DNDRC   Divide Numerical Double Register Compressed

001300 120xxx  SWNQRC  Swap Numerical Quad Register Compressed
001300 121xxx  CNQRC   Compare Numerical Quad Register Compressed
001300 122xxx  LNQRC   Load Numerical Quad Register Compressed
001300 123xxx  STNQRC  Store Numerical Quad Register Compressed
001300 124xxx  ANQRC   Add Numerical Quad Register Compressed
001300 125xxx  SNQRC   Subtract Numerical Quad Register Compressed
001300 126xxx  MNQRC   Multiply Numerical Quad Register Compressed
001300 127xxx  DNQRC   Divide Numerical Quad Register Compressed

Targeted Arithmetic

In addition, a few instructions providing targeted arithmetic are defined for the regular register compressed formats. So that they can retain the same standard format as other instructions with a register type, in which a zero base register indicates a register-to-register instruction, these instructions use the 32-bit prefix form, which leaves room in the instruction for the target exponent.

141100 00nnnn 114xxx  ATFRC   Add Targeted Floating Register Compressed
141100 00nnnn 115xxx  STFRC   Subtract Targeted Floating Register Compressed
141100 00nnnn 116xxx  MTFRC   Multiply Targeted Floating Register Compressed
141100 00nnnn 117xxx  DTFRC   Divide Targeted Floating Register Compressed

141100 00nnnn 124xxx  AETFRC  Add Extensibly Targeted Floating Register Compressed
141100 00nnnn 125xxx  SETFRC  Subtract Extensibly Targeted Floating Register Compressed
141100 00nnnn 126xxx  METFRC  Multiply Extensibly Targeted Floating Register Compressed
141100 00nnnn 127xxx  DETFRC  Divide Extensibly Targeted Floating Register Compressed

141110 00nnnn 114xxx  ATDRC   Add Targeted Double Register Compressed
141110 00nnnn 115xxx  STDRC   Subtract Targeted Double Register Compressed
141110 00nnnn 116xxx  MTDRC   Multiply Targeted Double Register Compressed
141110 00nnnn 117xxx  DTDRC   Divide Targeted Double Register Compressed

141110 00nnnn 124xxx  AETDRC  Add Extensibly Targeted Double Register Compressed
141110 00nnnn 125xxx  SETDRC  Subtract Extensibly Targeted Double Register Compressed
141110 00nnnn 126xxx  METDRC  Multiply Extensibly Targeted Double Register Compressed
141110 00nnnn 127xxx  DETDRC  Divide Extensibly Targeted Double Register Compressed

141120 00nnnn 114xxx  ATQRC   Add Targeted Quad Register Compressed
141120 00nnnn 115xxx  STQRC   Subtract Targeted Quad Register Compressed
141120 00nnnn 116xxx  MTQRC   Multiply Targeted Quad Register Compressed
141120 00nnnn 117xxx  DTQRC   Divide Targeted Quad Register Compressed

141120 00nnnn 124xxx  AETQRC  Add Extensibly Targeted Quad Register Compressed
141120 00nnnn 125xxx  SETQRC  Subtract Extensibly Targeted Quad Register Compressed
141120 00nnnn 126xxx  METQRC  Multiply Extensibly Targeted Quad Register Compressed
141120 00nnnn 127xxx  DETQRC  Divide Extensibly Targeted Quad Register Compressed

In these instructions, the field marked xxx contains the destination register, the index register or source register, and the base register in the usual manner for memory-reference instructions. The field marked nnnn contains a twelve-bit target exponent value in excess-6,176 format, matching the exponent in the largest size of register compressed decimal numbers.

For decimal fixed-point arithmetic where all the numbers involved have the same exponent value, only a small range of exponent values is useful, since otherwise multiplication and division cannot produce a usable result. However, the inputs to a targeted instruction may have any exponent, and so the target exponent of the result can be one applicable to holding the result of an operation on two operands whose exponents are themselves determined through previous targeted operations, but which differ from that which is specified for the result.

A targeted arithmetic operation has its result aligned so that its exponent has the value specified as the target. This permits fixed-point arithmetic to be carried out automatically, without separate instructions for alignment, and in addition it has the benefit that since the fixed-point quantities are valid floating-point quantities, they are tagged with an indication of their magnitude. Normally, fixed-point arithmetic depends on adjustment steps being carried out after multiplies and divides, and the fixed-point quantities, being no different from the patterns of bits that represent integers, can easily be used incorrectly in calculations that assume a different location of the radix point.
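The effect of an ordinary targeted operation can be imitated in software with Python's decimal module, whose quantize method re-expresses a value at a chosen exponent. This is only a sketch of the idea: the helper name targeted is hypothetical, and the rounding mode is an assumption, since the text does not say how a targeted instruction rounds.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def targeted(result, target_exponent, rounding=ROUND_HALF_EVEN):
    """Re-express result so that its exponent equals target_exponent.

    The rounding mode is an assumption made for this sketch.
    """
    unit = Decimal((0, (1,), target_exponent))  # the value 1 x 10**target_exponent
    return result.quantize(unit, rounding=rounding)


price = Decimal("19.99")  # exponent -2
rate = Decimal("0.0825")  # exponent -4

# The operands have different exponents; the target fixes the result at -2.
tax = targeted(price * rate, -2)
total = targeted(price + tax, -2)

print(tax)    # 1.65
print(total)  # 21.64
```

No separate alignment step appears after the multiplication, and each result still carries its own exponent, so it cannot be mistaken for a quantity with a different radix point.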

Extensibly targeted arithmetic operations are carried out without rounding, and overflows from the most significant part of the mantissa will be ignored unless integer overflows are trapped, so they behave like integer operations in this respect as well. Ordinary targeted arithmetic operations, on the other hand, do not do this, so as to produce valid numerical results that can be incorporated into floating-point calculations.

This is inspired by a capability provided by the NORC computer.

Humanized Arithmetic

For this type, add, subtract, divide, and multiply humanized instructions are defined.

These operations accept both normalized and unnormalized numbers as operands, and may produce unnormalized results, but they do so based on a different criterion from the conventional unnormalized instructions previously described.

The divide instructions always produce a normalized result, but they explicitly accept unnormalized inputs without creating an exception.

The multiply instructions produce a result having the same number of significant digits (where possible) as there would be digits in the product of the mantissas of the two operands, where these mantissas are treated as decimal integers.

The add and subtract instructions produce a result that includes, as its least significant digit, a digit having the same place value as the lesser of the place values of the least significant digits of the two operands.

When two numbers are added together using an unnormalized operation, they are brought into alignment, and digit positions that were absent from at least one of the operands before alignment are omitted from the result; with a humanized operation, only digit positions that were absent from both operands before alignment are omitted from the result.

These rules correspond to the "ideal exponent" rules used with the new Decimal Floating Point architecture to be specified in the revised version of the IEEE 754 standard, and implemented in the IBM z9 computer.
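The difference between the two kinds of addition, and the multiplication rule, can be sketched with numbers held as (coefficient, exponent) pairs of integers. The sketch ignores the limited width of the mantissa, and it assumes that an unnormalized add simply truncates the digits it drops; the text does not specify the rounding, and all function names here are hypothetical.

```python
def shift_down(coefficient, places):
    """Drop `places` low-order decimal digits, truncating toward zero."""
    sign = -1 if coefficient < 0 else 1
    return sign * (abs(coefficient) // 10 ** places)


def humanized_add(a, b):
    """Keep every digit position present in at least one operand."""
    (ca, ea), (cb, eb) = a, b
    e = min(ea, eb)
    return ca * 10 ** (ea - e) + cb * 10 ** (eb - e), e


def unnormalized_add(a, b):
    """Keep only digit positions present in both operands (truncation assumed)."""
    (ca, ea), (cb, eb) = a, b
    e = max(ea, eb)
    return shift_down(ca, e - ea) + shift_down(cb, e - eb), e


def humanized_multiply(a, b):
    """Multiply the mantissas as integers and add the exponents."""
    (ca, ea), (cb, eb) = a, b
    return ca * cb, ea + eb


x = (120, -2)  # 1.20
y = (34, -1)   # 3.4

print(humanized_add(x, y))       # (460, -2), that is 4.60
print(unnormalized_add(x, y))    # (46, -1), that is 4.6
print(humanized_multiply(x, y))  # (4080, -3), that is 4.080
```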

A Note on Alternate Precisions

Note that the use of a combination field, while it is appropriate with floating-point sizes of 32, 64, and 128 bits, may not necessarily work well with floating-point sizes of 48 and 96 bits, 36 and 72 bits, 30, 60, and 120 bits, or 40 and 80 bits.

This is because the overall length of the field in memory allocated to a floating-point number determines the number of decimal digits of precision it may have. Given that the compressed decimal format involves placing three digits at a time in a 10-bit long field, and the design of the combination field was predicated on there being one digit left over after a number of such fields for each of the three formats defined, we can conclude that there are three possible cases:

3n decimal digits: The mantissa consists entirely of 10-bit fields, and the exponent is a plain binary field.
3n+1 decimal digits: Adding one bit to the four bits required for the first digit produces a field with 32 possible values, so a digit with three values can be prepended to the exponent field, following the ingenious scheme developed at IBM.
3n+2 decimal digits: Two decimal digits have 100 possible values, while seven bits have 128 possible values. This discrepancy does not appear to be worth exploiting, and so, again, the exponent would be a plain binary field.
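The three cases can be turned into a small field-width calculation, sketched below. The function name regular_layout is hypothetical; its second argument counts only the exponent bits stored outside the combination field.

```python
def regular_layout(digits, exponent_bits):
    """Return (total_bits, exponent_values) for a given precision.

    digits        -- decimal digits of precision
    exponent_bits -- binary exponent bits stored outside any combination field
    """
    groups, extra = divmod(digits, 3)
    # 3n digits: nothing extra; 3n+1: 5-bit combination field; 3n+2: 7-bit field.
    lead_bits = {0: 0, 1: 5, 2: 7}[extra]
    # A combination field prepends a three-valued digit to the exponent.
    exponent_values = (3 if extra == 1 else 1) * 2 ** exponent_bits
    total_bits = 1 + exponent_bits + lead_bits + 10 * groups  # 1 bit for the sign
    return total_bits, exponent_values


print(regular_layout(7, 6))    # (32, 192)
print(regular_layout(16, 8))   # (64, 768)
print(regular_layout(34, 12))  # (128, 12288)
```

The three results agree with the 32-bit, 64-bit, and 128-bit formats defined by IBM.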

Given these three choices of format, decimal floating-point implemented across varying word sizes, if it is desired to maintain a relatively close correspondence with the exponent sizes provided by the existing IBM formats, and to follow the same rule as they do in the choice of exponent bias, might lead to the following formats:

Value size:   Exponent Values      Exponent Bias       Precision in Digits Sign Exponent Coefficient  Conventional
                                                                                                      Exponent Bias

 32 bits      3 *     64 =    192     101 (    96+ 5)   2 * 3 + 1 =  7      1    6+(2-)   20+(3+)         94
 64 bits      3 *    256 =    768     398 (   384+14)   5 * 3 + 1 = 16      1    8+(2-)   50+(3+)        382
128 bits      3 *  4,096 = 12,288   6,176 ( 6,144+32)  11 * 3 + 1 = 34      1   12+(2-)  110+(3+)      6,142

 36 bits                      256     134 (   128+ 6)   2 * 3 + 2 =  8      1    8        20+7           126
 72 bits                    2,048   1,040 ( 1,024+16)   6 * 3     = 18      1   11        60           1,022

 48 bits                    1,024     521 (   512+ 9)   3 * 3 + 2 = 11      1   10        30+7           510
 96 bits      3 *  1,024 =  3,072   1,559 ( 1,536+23)   8 * 3 + 1 = 25      1   10+(2-)   80+(3+)      1,534

 30 bits                      512     260 (   256+ 4)   2 * 3     =  6      1    9        20             254
 60 bits                      512     269 (   256+13)   5 * 3     = 15      1    9        50             254
120 bits                    4,096   2,078 ( 2,048+30)  10 * 3 + 2 = 32      1   12       100+7         2,046

 40 bits                      512     263 (   256+ 7)   3 * 3     =  9      1    9        30             254
 80 bits                    4,096   2,066 ( 2,048+18)   6 * 3 + 2 = 20      1   12        60+7         2,046

The notations (2-) and (3+) above refer to components of the 5-bit field which combines a value from 0 to 2 for the beginning of the exponent with a value from 0 to 9 for the beginning of the mantissa included in IBM's decimal floating point format.

The final column, Conventional Exponent Bias, shows what the exponent bias would be, if the radix point of the coefficient (or mantissa) were regarded, as has been the more common convention, as being at the beginning of the field rather than at the end of the field. This is derived by subtracting the precision of the number in digits from the exponent bias value normally given for the format, which has that number of digits, less two, added to half the exponent range.

An exponent in excess-n notation has n subtracted from the exponent to determine the power of the radix by which the mantissa is to be multiplied, and so, if we regard the mantissa as a fraction instead of an integer, we are making it smaller, and that power needs to be increased. Therefore, n, which is subtracted from it, is decreased. Thus, the difference between this floating-point format and conventional formats, which place the radix point in front of the mantissa and simply choose an exponent bias which is half the exponent range without adjustment, is not as large as it seems at first.
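The two bias values follow from the rule just described with a line of arithmetic each; a short sketch, with the hypothetical name exponent_biases:

```python
def exponent_biases(exponent_values, digits):
    """Return (bias, conventional_bias) for a format.

    bias              -- half the exponent range, plus the precision less two
    conventional_bias -- the bias with the radix point moved to the front
                         of the mantissa
    """
    bias = exponent_values // 2 + digits - 2
    return bias, bias - digits


print(exponent_biases(192, 7))      # (101, 94)
print(exponent_biases(768, 16))     # (398, 382)
print(exponent_biases(12288, 34))   # (6176, 6142)
```

In every case the conventional bias comes out as half the exponent range less two, which is the sense in which the difference from conventional formats is small.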

Since it is felt that each of the series of word sizes would normally be used independently, strict monotonicity between series is not treated as an overriding goal. In one case, the series of 30, 60, and 120 bits, even monotonicity in the growth of the exponent field within a series had to be set aside in order to achieve a reasonably large exponent field for the 30 bit size without this leading to excessively-large exponent fields for the other sizes.

Note that, in the absence of a 5-bit field combining the start of the exponent and mantissa, it is assumed that no limitation is placed on the range of the exponent field in order to indicate infinity and NaN values. Thus, either the 7 bit field giving the first two digits of the mantissa, or the 10 bit field giving the first three digits of the mantissa, would presumably be used for that purpose, two of the 28 or 24 unused combinations of bits serving this purpose.

In the case of the numerical register compressed decimal floating-point data type, for the 32, 64, and 128 bit-long data types, the five-bit combination field representing the first two bits of the exponent and the first digit of the number is replaced by a four-bit field representing the first digit of the number, an extra partial digit appended to the end of the number, and, in effect, the last two bits of the exponent if it is thought of as applying to a mixed-radix system with radices alternating between 2, 2.5, and 2 in a cycle of three.

For other lengths, this four bit field representing the first digit of the number must be retained, and therefore the presence of a seven-bit field containing the next two digits of the number, or a combination field, following the form used in the previous numerical format, but in this case containing the second most significant digit of the number, is determined by the number of digits represented (ignoring the final appended partial digit) less one.

The resulting numerical formats are:

Value size:   Exponent Values         Precision in Digits Sign Exponent Compound Mantissa

 32 bits                      128        2 * 3 + 1 =  7      1    7         4       20
 64 bits                      512        5 * 3 + 1 = 16      1    9         4       50
128 bits                    8,192       11 * 3 + 1 = 34      1   13         4      110

 36 bits      3 *     64 =    192        2 * 3 + 2 =  8      1    6+(2-)    4       20+(3+)
 72 bits                    1,024        6 * 3     = 18      1   10         4       50+7

 48 bits      3 *    256 =    768        3 * 3 + 2 = 11      1    8+(2-)    4       30+(3+)
 96 bits                    2,048        8 * 3 + 1 = 25      1   11         4       80

 30 bits                      256        2 * 3     =  6      1    8         4       10+7
 60 bits                      256        5 * 3     = 15      1    8         4       40+7
120 bits      3 *   1,024 = 3,072       10 * 3 + 2 = 32      1   10+(2-)    4      100+(3+)

 40 bits                      256        3 * 3     =  9      1    8         4       20+7
 80 bits      3 *   1,024 = 3,072        6 * 3 + 2 = 20      1   10+(2-)    4       60+(3+)
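The field widths in the table above can be cross-checked with a short calculation, sketched here under the rule that the compound field always holds the first digit and the form of the following field depends on the number of remaining digits. The function name numeric_layout is hypothetical.

```python
def numeric_layout(digits, exponent_bits):
    """Return (total_bits, exponent_values) for the numerical type.

    digits        -- decimal digits, ignoring the appended partial digit
    exponent_bits -- binary exponent bits stored outside any combination field
    """
    groups, extra = divmod(digits - 1, 3)  # digits left after the compound field
    # 3n left: nothing extra; 3n+1: 5-bit combination field; 3n+2: 7-bit field.
    follow_bits = {0: 0, 1: 5, 2: 7}[extra]
    exponent_values = (3 if extra == 1 else 1) * 2 ** exponent_bits
    # 1 sign bit, the exponent, the 4-bit compound field, then the mantissa.
    total_bits = 1 + exponent_bits + 4 + follow_bits + 10 * groups
    return total_bits, exponent_values


print(numeric_layout(7, 7))    # (32, 128)
print(numeric_layout(8, 6))    # (36, 192)
print(numeric_layout(32, 10))  # (120, 3072)
```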

In this format, unlike the one supporting unnormalized operation, the decimal point of the mantissa field lies before the most significant digit, and the exponent bias is always one-half the number of possible values for the exponent.

When the exponent is at its minimum value, the four bit compound field instead contains a single BCD digit, which may be zero, to allow gradual underflow as with the standard floating-point type.

The layout of the formats in the different sizes for these two types of floating-point number is illustrated below: