From 4bbd1e919e3f11a28749290a4a458db4f606ee1a Mon Sep 17 00:00:00 2001
From: Siavash Golkar <35383824+golkar@users.noreply.github.com>
Date: Sun, 16 Jun 2024 14:17:35 -0400
Subject: [PATCH] Update 2024-05-30-counting.md

---
 _posts/2024-05-30-counting.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/_posts/2024-05-30-counting.md b/_posts/2024-05-30-counting.md
index b45aec8..68f76fd 100644
--- a/_posts/2024-05-30-counting.md
+++ b/_posts/2024-05-30-counting.md
@@ -50,8 +50,8 @@ This information helps disambiguate the different regions based on context.
 
 #### Key Propositions
 
-1. **Proposition 1:** If the regional contextual position information is available in the latent representation of the tokens at some layer of a Transformer, the contextual counting task can be solved with a single additional layer.
-2. **Proposition 2:** A causal Transformer with a single layer and no position encoding (NoPE) can infer the regional contextual position.
+- **Proposition 1:** If the regional contextual position information is available in the latent representation of the tokens at some layer of a Transformer, the contextual counting task can be solved with a single additional layer.
+- **Proposition 2:** A causal Transformer with a single layer and no position encoding (NoPE) can infer the regional contextual position.
 
 These propositions imply that a two-layer causal Transformer with NoPE can solve the contextual counting task.