FP4 / FP8 / BF16 / FP16 / TF32 / FP32 / FP64 / MPFR-style high-precision demo
An educational page for exploring the low-precision arithmetic used in AI: mixed-precision accumulation, scaling, and comparison against high-precision reference values.
In this single-file version, the reference is computed with JavaScript double precision. For true MPFR table generation, use the "Legacy MPFR graph" tab.
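Softmax converts the input vector \(\mathbf{x}=(x_1,\ldots,x_n)\) into a probability distribution, \(\mathrm{softmax}(\mathbf{x})_i=\exp(x_i)/\sum_j \exp(x_j)\).
When \(x_i\) is large, \(\exp(x_i)\) overflows, so implementations subtract the maximum \(m=\max_j x_j\) and evaluate \(\exp(x_i-m)/\sum_j \exp(x_j-m)\) instead.
This transformation is mathematically identical, because the common factor \(\exp(-m)\) cancels between numerator and denominator, but it is far more stable numerically: every exponent is at most 0, so no term can overflow.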
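The effect of subtracting the maximum can be sketched in a few lines of JavaScript. This is a minimal illustration in double precision, not the page's actual implementation; the function names `softmaxNaive` and `softmaxStable` are hypothetical.

```javascript
// Naive softmax: Math.exp(x_i) overflows to Infinity for large x_i,
// and Infinity / Infinity then yields NaN.
function softmaxNaive(x) {
  const e = x.map((v) => Math.exp(v));
  const sum = e.reduce((a, b) => a + b, 0);
  return e.map((v) => v / sum);
}

// Stabilized softmax: subtract m = max_j x_j before exponentiating,
// so every exponent is <= 0 and every term lies in (0, 1].
function softmaxStable(x) {
  const m = Math.max(...x);
  const e = x.map((v) => Math.exp(v - m));
  const sum = e.reduce((a, b) => a + b, 0);
  return e.map((v) => v / sum);
}

const logits = [1000, 1001, 1002];
console.log(softmaxNaive(logits));  // every entry is NaN
console.log(softmaxStable(logits)); // finite probabilities that sum to 1
```

The two functions agree whenever the naive version does not overflow; only the stabilized one remains usable for large inputs.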
The dot product is a basic operation in matrix multiplication and neural networks.
On AI accelerators, the inputs \(x_i,y_i\) are often kept in low precision such as FP8 or FP16, while the accumulated sum \(s=\sum_i x_i y_i\) is held in a wider format such as FP32.
The goal is to reduce memory bandwidth and compute cost by using low-precision inputs, while suppressing accumulated rounding error with a high-precision accumulator.
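The benefit of a wide accumulator can be sketched as follows. This is a rough emulation under stated assumptions, not the page's code: BF16 is emulated by rounding the upper 16 bits of the FP32 encoding (round-to-nearest-even, NaN and Infinity ignored), FP32 is emulated with `Math.fround`, and the helper names `toBf16` and `dot` are hypothetical.

```javascript
const f32 = new Float32Array(1);
const u32 = new Uint32Array(f32.buffer);

// Round a finite number to the nearest BF16 value by keeping the upper
// 16 bits of its FP32 encoding (round-to-nearest-even).
function toBf16(v) {
  f32[0] = v;
  const bits = u32[0];
  const lsb = (bits >>> 16) & 1;
  u32[0] = (bits + 0x7fff + lsb) & 0xffff0000;
  return f32[0];
}

// Dot product whose running sum is rounded by `roundAcc` after every add.
// Products of two BF16 inputs are exact in double precision.
function dot(x, y, roundAcc) {
  let s = 0;
  for (let i = 0; i < x.length; i++) {
    s = roundAcc(s + x[i] * y[i]);
  }
  return s;
}

const n = 4096;
const x = Array.from({ length: n }, () => toBf16(0.01)); // low-precision inputs
const y = Array.from({ length: n }, () => toBf16(1));

const reference = dot(x, y, (v) => v);  // FP64 accumulator
const wideAcc = dot(x, y, Math.fround); // FP32 accumulator
const narrowAcc = dot(x, y, toBf16);    // BF16 accumulator
console.log({ reference, wideAcc, narrowAcc });
```

In this example the BF16 accumulator stops growing once each addend is smaller than half the spacing between neighboring BF16 values, whereas the FP32 accumulator tracks the reference.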
Plots the positive values representable in small floating-point formats on a number line. Useful for explaining the trade-off between range and significant digits.
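The values on such a number line can be generated by enumerating every bit pattern. The sketch below assumes an IEEE-style bias of \(2^{e-1}-1\), subnormals at exponent code 0, and no codes reserved for Infinity or NaN (which matches the common E2M1 layout for FP4 but not every format); `positiveValues` is a hypothetical helper name.

```javascript
// Enumerate the positive values of a tiny float with `expBits` exponent bits
// and `manBits` mantissa bits.
// Assumptions: bias = 2^(expBits-1) - 1, subnormals at exponent code 0,
// and no exponent code reserved for Infinity/NaN.
function positiveValues(expBits, manBits) {
  const bias = 2 ** (expBits - 1) - 1;
  const values = [];
  for (let e = 0; e < 2 ** expBits; e++) {
    for (let m = 0; m < 2 ** manBits; m++) {
      const frac = m / 2 ** manBits;
      const v = e === 0
        ? frac * 2 ** (1 - bias)        // subnormal: no implicit leading 1
        : (1 + frac) * 2 ** (e - bias); // normal: implicit leading 1
      if (v > 0) values.push(v);
    }
  }
  return values;
}

console.log(positiveValues(2, 1)); // FP4 E2M1: [0.5, 1, 1.5, 2, 3, 4, 6]
```

The gaps between neighbors double with each exponent step, which is the range-versus-resolution trade-off the plot makes visible.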
This page introduces part of the work of the High Performance Computing Laboratory.
The keywords are AI, high-performance computing, and high-precision computing. AI centers on massive low-precision computation of roughly 4–16 bits, whereas scientific computing is dominated by high-precision computation of 64 bits or more. To bridge this gap, the High Performance Computing Laboratory pursues techniques to accelerate high-precision scientific computing on AI-oriented hardware.
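Floating-point numbers represent a real number with three fields: a sign, an exponent, and a mantissa. For a normalized number with sign bit \(S\), biased exponent field \(E\), and a \(p\)-bit mantissa field \(M\), the conceptual form is \((-1)^{S}\times\left(1+M/2^{p}\right)\times 2^{E-\mathrm{bias}}\).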
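To make the sign/exponent/mantissa split concrete, the sketch below takes apart a JavaScript number, which is an IEEE 754 binary64 (FP64) value with 11 exponent bits, 52 mantissa bits, and a bias of 1023. The helper names `decodeFloat64` and `encodeNormal` are hypothetical, and `encodeNormal` only handles normalized values.

```javascript
// Split a JavaScript number (IEEE 754 binary64) into its three fields.
function decodeFloat64(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);                          // big-endian by default
  const hi = view.getUint32(0);
  const lo = view.getUint32(4);
  const sign = hi >>> 31;                         // 1 bit
  const exponent = (hi >>> 20) & 0x7ff;           // 11-bit biased exponent
  const mantissa = (hi & 0xfffff) * 2 ** 32 + lo; // 52-bit fraction field
  return { sign, exponent, mantissa };
}

// Rebuild a normalized value: (-1)^sign * (1 + mantissa / 2^52) * 2^(exponent - 1023).
function encodeNormal({ sign, exponent, mantissa }) {
  return (-1) ** sign * (1 + mantissa / 2 ** 52) * 2 ** (exponent - 1023);
}

const fields = decodeFloat64(-6.5); // -6.5 = -1.625 * 2^2
console.log(fields);                // sign 1, exponent 1025, mantissa 0.625 * 2^52
console.log(encodeNormal(fields));  // -6.5
```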
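In low-precision formats, how the bits are split between exponent and mantissa matters. For example, FP8 E4M3 (4 exponent bits, 3 mantissa bits) gives more bits to the mantissa and so favors precision, while FP8 E5M2 (5 exponent bits, 2 mantissa bits) gives more bits to the exponent and so favors range.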
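The trade-off can be quantified with a short sketch. It assumes the usual bias of \(2^{e-1}-1\) and deliberately leaves out the largest finite value, because that depends on how each format reserves codes for Infinity and NaN; `formatTraits` and `roundNearOne` are hypothetical helper names.

```javascript
// Basic traits of a format with `expBits` exponent bits and `manBits`
// mantissa bits, assuming bias = 2^(expBits-1) - 1.
function formatTraits(expBits, manBits) {
  const bias = 2 ** (expBits - 1) - 1;
  return {
    bias,
    epsilon: 2 ** -manBits,     // spacing of representable values in [1, 2)
    minNormal: 2 ** (1 - bias), // smallest positive normalized value
  };
}

// Round a value in [1, 2) to the nearest representable value.
function roundNearOne(x, manBits) {
  const scale = 2 ** manBits;
  return Math.round(x * scale) / scale;
}

console.log("E4M3", formatTraits(4, 3), roundNearOne(1.1, 3)); // 1.1 rounds to 1.125
console.log("E5M2", formatTraits(5, 2), roundNearOne(1.1, 2)); // 1.1 rounds to 1
```

E4M3 resolves values near 1 twice as finely, while E5M2 reaches much smaller normalized magnitudes thanks to its larger bias.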
The figure below lines up the floating-point formats used on this page with lengths proportional to their actual bit widths, taking MPFR 128-bit as the maximum width. You can compare the lengths of FP4/FP8 against FP64/MPFR 128-bit at a glance, without horizontal scrolling.
Bar length represents total bit width. Within each bar, red is the sign, blue the exponent, and green the mantissa. A longer exponent means a wider range; a longer mantissa resolves nearby values more finely. MPFR is shown not as a fixed-length IEEE format but as an arbitrary-precision format with the specified mantissa precision.
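A text-only approximation of such bars might look like the sketch below, with one character per bit (`S` for sign, `E` for exponent, `M` for mantissa). The field widths are the standard ones for each format; FP4 is assumed to be the E2M1 layout, and MPFR is omitted because its storage of sign and exponent is not specified here. The names `formats` and `bar` are hypothetical.

```javascript
// Field widths in bits: sign, exponent, explicitly stored mantissa bits.
const formats = [
  { name: "FP4 E2M1", sign: 1, exp: 2, man: 1 }, // assumed FP4 layout
  { name: "FP8 E4M3", sign: 1, exp: 4, man: 3 },
  { name: "FP8 E5M2", sign: 1, exp: 5, man: 2 },
  { name: "BF16", sign: 1, exp: 8, man: 7 },
  { name: "FP16", sign: 1, exp: 5, man: 10 },
  { name: "TF32", sign: 1, exp: 8, man: 10 },    // 19 significant bits
  { name: "FP32", sign: 1, exp: 8, man: 23 },
  { name: "FP64", sign: 1, exp: 11, man: 52 },
];

// Draw one character per bit so bar length is proportional to bit width.
function bar(f) {
  return "S".repeat(f.sign) + "E".repeat(f.exp) + "M".repeat(f.man);
}

for (const f of formats) {
  const total = f.sign + f.exp + f.man;
  console.log(`${f.name.padEnd(9)} ${String(total).padStart(2)} bits ${bar(f)}`);
}
```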
| Topic | Tab to try | What you can observe |
|---|---|---|
| 4–16 bit low-precision AI computation | Function quantization, Softmax stabilization | Rounding error, saturation, and the effect of scaling |
| Mixed-precision computation | Dot product & accumulation | Effect of low-precision input + high-precision accumulator |
| High-precision computation | Legacy MPFR graph | Function tables/graphs at a specified MPFR bit width |
| Understanding formats | Format visualization & overview | Exponent/mantissa allocation and density of representable values |
In addition to the original input form, the evaluation functions from the uploaded mathfunc.php and mathfunc_mpfr.php have been integrated into this script. Where mpfr_gexpr is available, it evaluates at the specified MPFR bit width; otherwise, it falls back to PHP's standard double precision.