From 1c619f1eadc0f936f66965d27b6437c0e95ce81f Mon Sep 17 00:00:00 2001
From: "Desmond A. Kirkpatrick"
Date: Fri, 31 Jan 2025 21:03:53 -0800
Subject: [PATCH] put back converter doc

---
 doc/components/floating_point.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/doc/components/floating_point.md b/doc/components/floating_point.md
index 641df2312..c11fbbc1e 100644
--- a/doc/components/floating_point.md
+++ b/doc/components/floating_point.md
@@ -77,3 +77,21 @@ It has options to control its performance:
 - `adderGen`: used to specify the kind of [Adder] used for key functions like the mantissa addition. Defaults to [NativeAdder], but you can select a [ParallelPrefixAdder] of your choice.
 - `seGen`: type of sign extension routine used, base class is [PartialProductSignExtension].
 - `ppTree`: used to specify the type of ['ParallelPrefix'](https://intel.github.io/rohd-hcl/rohd_hcl/ParallelPrefix-class.html) used in the other critical functions like leading-one detect.
+
+## FloatingPointConverter
+
+A [FloatingPointConverter] component translates arbitrary-width floating-point logic structures from one size to another, handling subnormals and infinities and performing round-to-nearest-even (RNE) rounding.
+
+Here is an example using the converter to translate from the 32-bit single-precision floating-point format to the 16-bit brain floating-point (bfloat16) format.
+
+```dart
+  final fp32 = FloatingPoint32();
+  final bf16 = FloatingPointBF16();
+
+  final one = FloatingPoint32Value.getFloatingPointConstant(
+      FloatingPointConstants.one);
+
+  fp32.put(one);
+  FloatingPointConverter(fp32, bf16);
+  expect(bf16.floatingPointValue.toDouble(), equals(1.0));
+```
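
For readers of the added section, the following is a minimal software sketch of what narrowing a 32-bit single-precision value to bfloat16 with round-to-nearest-even looks like at the bit level. It is not the `FloatingPointConverter` implementation and does not use ROHD-HCL; the helper names `floatToBits`, `fp32BitsToBf16Bits`, and `bf16BitsToDouble` are hypothetical and exist only for this illustration. Because bfloat16 keeps the 8-bit exponent of binary32, the narrowing only has to round away the low 16 mantissa bits, and subnormals and infinities fall out of the same arithmetic.

```dart
import 'dart:typed_data';

/// Returns the IEEE 754 binary32 bit pattern of [x] (hypothetical helper).
int floatToBits(double x) {
  final data = ByteData(4)..setFloat32(0, x);
  return data.getUint32(0);
}

/// Narrows a binary32 bit pattern to a bfloat16 bit pattern using
/// round-to-nearest-even (hypothetical helper).
int fp32BitsToBf16Bits(int bits) {
  final exponent = (bits >> 23) & 0xFF;
  final mantissa = bits & 0x7FFFFF;
  if (exponent == 0xFF && mantissa != 0) {
    // NaN: truncate, and set a mantissa bit so the result stays a NaN.
    return ((bits >> 16) | 0x0040) & 0xFFFF;
  }
  // Adding 0x7FFF plus the lowest kept bit rounds ties to the even value.
  final lsb = (bits >> 16) & 1;
  final rounded = bits + 0x7FFF + lsb;
  return (rounded >> 16) & 0xFFFF;
}

/// Widens a bfloat16 bit pattern back to a double (hypothetical helper).
double bf16BitsToDouble(int bf16) {
  final data = ByteData(4)..setUint32(0, (bf16 & 0xFFFF) << 16);
  return data.getFloat32(0);
}

void main() {
  // 1.0 is exactly representable in bfloat16, so it survives the round trip.
  final oneBits = fp32BitsToBf16Bits(floatToBits(1.0));
  print(bf16BitsToDouble(oneBits) == 1.0);

  // An exact tie between two bfloat16 neighbours rounds to the even one.
  print(fp32BitsToBf16Bits(0x3F808000).toRadixString(16));
  print(fp32BitsToBf16Bits(0x3F818000).toRadixString(16));
}
```

The largest finite binary32 value (`0x7F7FFFFF`) rounds up to the bfloat16 infinity pattern (`0x7F80`) under this scheme, which is the usual RNE overflow behaviour.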