Commit

Fix markdown in float8 example (keras-team#1871)
* Fix markdown in float8 example

* Fix markdown in float8 example

* Fix KerasNLP's PROJECT_URL
james77777778 authored May 28, 2024
1 parent 59daf9f commit 6aa9bf4
Showing 3 changed files with 22 additions and 19 deletions.
Original file line number Diff line number Diff line change
@@ -17,11 +17,13 @@
In detail, there are two distinct types of FP8: E4M3 and E5M2, useful in
different parts of training.
- E4M3: It consists of 1 sign bit, 4 exponent bits and 3 bits of mantissa. It
can store values up to +/-448 and nan.
can store values up to +/-448 and nan.
- E5M2: It consists of 1 sign bit, 5 exponent bits and 2 bits of mantissa. It
can store values up to +/-57344, +/-inf and nan. The tradeoff of the
increased dynamic range is lower precision of the stored values.
can store values up to +/-57344, +/-inf and nan. The tradeoff of the
increased dynamic range is lower precision of the stored values.
Typically, E4M3 is best used during the forward pass because activations and
weights require more precision. In the backward pass, however, E5M2 is utilized
because gradients are less susceptible to the loss of precision but require
@@ -51,7 +53,7 @@
"""

"""shell
pip install -q --upgrade git+https://github.com/keras-team/keras-nlp.git # Get the latest version of KerasNLP
pip install -q --upgrade keras-nlp
pip install -q --upgrade keras # Upgrade to Keras 3.
"""

Original file line number Diff line number Diff line change
@@ -29,11 +29,13 @@
"\n",
"In detail, there are two distinct types of FP8: E4M3 and E5M2, useful in\n",
"different parts of training.\n",
"\n",
"- E4M3: It consists of 1 sign bit, 4 exponent bits and 3 bits of mantissa. It\n",
" can store values up to +/-448 and nan.\n",
"can store values up to +/-448 and nan.\n",
"- E5M2: It consists of 1 sign bit, 5 exponent bits and 2 bits of mantissa. It\n",
" can store values up to +/-57344, +/-inf and nan. The tradeoff of the\n",
" increased dynamic range is lower precision of the stored values.\n",
"can store values up to +/-57344, +/-inf and nan. The tradeoff of the\n",
"increased dynamic range is lower precision of the stored values.\n",
"\n",
"Typically, E4M3 is best used during the forward pass because activations and\n",
"weights require more precision. In the backward pass, however, E5M2 is utilized\n",
"because gradients are less susceptible to the loss of precision but require\n",
@@ -75,7 +77,7 @@
},
"outputs": [],
"source": [
"!pip install -q --upgrade git+https://github.com/keras-team/keras-nlp.git # Get the latest version of KerasNLP\n",
"!pip install -q --upgrade keras-nlp\n",
"!pip install -q --upgrade keras # Upgrade to Keras 3."
]
},
@@ -278,8 +280,7 @@
" vocabulary_size=vocab_size,\n",
" reserved_tokens=reserved_tokens,\n",
" )\n",
" return vocab\n",
""
" return vocab\n"
]
},
{
@@ -416,8 +417,7 @@
"\n",
"train_ds = make_dataset(train_ds)\n",
"val_ds = make_dataset(val_ds)\n",
"test_ds = make_dataset(test_ds)\n",
""
"test_ds = make_dataset(test_ds)\n"
]
},
{
@@ -470,8 +470,7 @@
" x = keras.layers.Dense(intermediate_dim, activation=\"relu\")(x)\n",
" x = keras.layers.Dropout(dropout)(x)\n",
" outputs = keras.layers.Dense(1, activation=\"sigmoid\")(x)\n",
" return keras.Model(inputs=token_id_input, outputs=outputs)\n",
""
" return keras.Model(inputs=token_id_input, outputs=outputs)\n"
]
},
{
@@ -665,4 +664,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
Original file line number Diff line number Diff line change
@@ -20,11 +20,13 @@ floating point with nearly no degradation in accuracy.

In detail, there are two distinct types of FP8: E4M3 and E5M2, useful in
different parts of training.

- E4M3: It consists of 1 sign bit, 4 exponent bits and 3 bits of mantissa. It
can store values up to +/-448 and nan.
can store values up to +/-448 and nan.
- E5M2: It consists of 1 sign bit, 5 exponent bits and 2 bits of mantissa. It
can store values up to +/-57344, +/-inf and nan. The tradeoff of the
increased dynamic range is lower precision of the stored values.
can store values up to +/-57344, +/-inf and nan. The tradeoff of the increased
dynamic range is lower precision of the stored values.

Typically, E4M3 is best used during the forward pass because activations and
weights require more precision. In the backward pass, however, E5M2 is utilized
because gradients are less susceptible to the loss of precision but require
@@ -53,7 +55,7 @@ Note: The dependency on TensorFlow is only required for data processing.


```python
!pip install -q --upgrade git+https://github.com/keras-team/keras-nlp.git # Get the latest version of KerasNLP
!pip install -q --upgrade keras-nlp
!pip install -q --upgrade keras # Upgrade to Keras 3.
```
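
The limits quoted in the changed bullet lists (+/-448 for E4M3, +/-57344 for E5M2) can be checked from the bit layouts alone. The sketch below is not part of this commit. It assumes the usual FP8 encoding conventions, which the diff does not state: an exponent bias of `2**(exp_bits - 1) - 1`; for E5M2, the all-ones exponent code reserved for inf/nan as in IEEE 754; for E4M3, only the single pattern with all-ones exponent and all-ones mantissa reserved for nan. The helper name `fp8_max` is hypothetical.

```python
def fp8_max(exp_bits: int, man_bits: int, ieee_like: bool) -> float:
    """Largest finite value of an FP8 format (illustrative helper, assumed conventions)."""
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_like:
        # E5M2-style: the all-ones exponent code is reserved for inf/nan,
        # so the largest usable exponent code is one below it.
        max_exp_code = 2 ** exp_bits - 2
        max_man_code = 2 ** man_bits - 1
    else:
        # E4M3-style: the all-ones exponent is usable, but the all-ones
        # mantissa at that exponent encodes nan, so step the mantissa down by one.
        max_exp_code = 2 ** exp_bits - 1
        max_man_code = 2 ** man_bits - 2
    return 2.0 ** (max_exp_code - bias) * (1 + max_man_code / 2 ** man_bits)


E4M3_MAX = fp8_max(exp_bits=4, man_bits=3, ieee_like=False)
E5M2_MAX = fp8_max(exp_bits=5, man_bits=2, ieee_like=True)

# Relative spacing between neighbouring normal values: 2**-man_bits.
E4M3_STEP = 2.0 ** -3
E5M2_STEP = 2.0 ** -2

print(E4M3_MAX, E5M2_MAX)    # 448.0 57344.0
print(E4M3_STEP, E5M2_STEP)  # 0.125 0.25
```

The second pair of numbers illustrates the tradeoff the text describes: E5M2 reaches a much larger maximum, but its neighbouring values are spaced twice as coarsely as in E4M3.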

