- Model Info:
  - Tied embeddings: False
  - LM head uses bias: True
  - Embeddings shape: [51200, 2560]
- Tokenizer Info:
  - Vocab Size: 50295
  - Tokenizer Class: CodeGenTokenizer
  - Tokenizer Type: BPE
  - Bytes handling: Byte Input
  - Token for verification prompt building: BuyableInstoreAndOnline
  - Token id for verification prompt building: 40242
- Indicator summary:
  - Indicator for under-trained tokens: E_{in} L2 Norm
  - Overall distribution: 1.184 +/- 0.242
- Detected Token Counts:
  - Number of tested under-trained tokens: 999, 969 non-special, 101 below p = 0.01 threshold, 82 below soft indicator threshold
  - Number of single byte tokens: 256, of which 13 below indicator threshold
  - Number of special tokens: 0, of which 0 below indicator threshold
  - Number of non-single-byte UTF-fragment tokens: 216, of which 3 below soft indicator threshold
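The indicator summary above can be illustrated with a minimal sketch. It assumes the indicator is a plain per-row L2 norm of the input embedding matrix, which is how the label "E_{in} L2 Norm" reads; the exact definition used to produce this report is not stated here. The toy matrix, the helper names `l2_norms` and `below_threshold`, and the reuse of the 0.123 cutoff on toy data are all illustrative assumptions.

```python
import math
import statistics

def l2_norms(embeddings):
    """Return the L2 norm of each row (one row per token id)."""
    return [math.sqrt(sum(x * x for x in row)) for row in embeddings]

def below_threshold(norms, threshold):
    """Return the token ids whose norm falls below the threshold."""
    return [token_id for token_id, n in enumerate(norms) if n < threshold]

# Hypothetical 5 x 3 embedding matrix; the real one is 51200 x 2560.
toy_embeddings = [
    [0.6, 0.8, 0.0],        # norm 1.0
    [1.2, 0.0, 0.5],        # norm 1.3
    [0.0003, 0.0004, 0.0],  # norm 0.0005, looks under-trained
    [0.0, 0.9, 1.2],        # norm 1.5
    [0.3, 0.4, 0.0],        # norm 0.5
]

norms = l2_norms(toy_embeddings)
# Same "mean +/- std" shape as the "Overall distribution" line.
summary = f"{statistics.mean(norms):.3f} +/- {statistics.pstdev(norms):.3f}"
flagged = below_threshold(norms, threshold=0.123)
print(summary, flagged)
```

Only the third toy row is flagged, mirroring how the listed tokens sit several standard deviations below the overall mean of 1.184.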
82 entries below soft indicator threshold of 0.123 (first 20 listed here)
token_id | token | indicator | max_prob | in_other_tokens |
---|---|---|---|---|
42424 | DragonMagazine | 0.000763806 | 3.8e-07 | |
17900 | ▁Dragonbound | 0.000779675 | 3.4e-07 | |
36174 | ▁RandomRedditorWithNo | 0.000779685 | 3.3e-07 | |
30213 | ▁externalToEVAOnly | 0.000785526 | 2.1e-07 | |
42202 | GoldMagikarp | 0.000786877 | 3.4e-07 | ▁SolidGoldMagikarp |
37579 | TPPStreamerBot | 0.000787823 | 2.2e-07 | |
41551 | Downloadha | 0.000791465 | 2.1e-07 | |
43177 | EStreamFrame | 0.000793959 | 2.3e-07 | |
31666 | ?????-?????- | 0.000794586 | 2e-07 | |
42089 | ▁TheNitrome | 0.000801782 | 2.2e-07 | ▁TheNitromeFan |
39755 | isSpecialOrderable | 0.000801835 | 3.6e-07 | |
25992 | ▁裏覚醒 | 0.000806573 | 2.7e-07 | |
36938 | ▁sqor | 0.000807593 | 3.5e-07 | |
30209 | ▁unfocusedRange | 0.000809125 | 2.3e-07 | |
29372 | ▁guiActiveUn | 0.000809322 | 2e-07 | ▁guiActiveUnfocused |
39811 | soDeliveryDate | 0.000809846 | 3.8e-07 | |
39753 | quickShipAvailable | 0.000810512 | 2.8e-07 | |
35207 | ▁attRot | 0.000816364 | 2.2e-07 | |
36173 | ▁RandomRedditor | 0.000816671 | 1.9e-07 | ▁RandomRedditorWithNo |
43361 | ゼウス | 0.000817697 | 2.6e-07 | |
62 additional entries below threshold
token_id | token | indicator | max_prob | in_other_tokens |
---|---|---|---|---|
18472 | ▁guiActive | 0.000821784 | 2.3e-07 | ▁guiActiveUn , ▁guiActiveUnfocused |
30210 | ▁guiActiveUnfocused | 0.000824333 | 3.8e-07 | |
31765 | MpServer | 0.000826598 | 2.1e-07 | |
47571 | ▁DevOnline | 0.000826712 | 3.2e-07 | |
30906 | rawdownloadcloneembedreportprint | 0.000827612 | 3.9e-07 | |
40241 | InstoreAndOnline | 0.000827931 | 3.9e-07 | BuyableInstoreAndOnline |
42090 | ▁TheNitromeFan | 0.000828417 | 3.5e-07 | |
39821 | 龍契士 | 0.00082932 | 2.5e-07 | |
43065 | ▁srfAttach | 0.00083027 | 3.7e-07 | |
39177 | ItemThumbnailImage | 0.000831932 | 2.2e-07 | |
40240 | oreAndOnline | 0.000833177 | 2.2e-07 | InstoreAndOnline , BuyableInstoreAndOnline |
30898 | embedreportprint | 0.000834069 | 3e-07 | cloneembedreportprint , rawdownloadcloneembedreportprint |
37631 | FactoryReloaded | 0.000834167 | 2.2e-07 | |
43453 | ▁SolidGoldMagikarp | 0.000837289 | 2.1e-07 | |
30899 | cloneembedreportprint | 0.00083965 | 1.8e-07 | rawdownloadcloneembedreportprint |
45544 | ▁サーティ | 0.000840979 | 2.5e-07 | ▁サーティワン |
42586 | ▁srfN | 0.000841239 | 3.6e-07 | |
30905 | rawdownload | 0.000846193 | 2.6e-07 | rawdownloadcloneembedreportprint |
28666 | PsyNetMessage | 0.000846675 | 2.7e-07 | |
24934 | ForgeModLoader | 0.00084798 | 3.3e-07 | |
30211 | ▁guiIcon | 0.000848594 | 3e-07 | |
30212 | ▁externalToEVA | 0.000848985 | 2.9e-07 | ▁externalToEVAOnly |
39757 | channelAvailability | 0.000854599 | 2.1e-07 | |
50009 | ▁strutConnector | 0.000861435 | 2.2e-07 | |
40242 | BuyableInstoreAndOnline | 0.000864522 | 3.2e-07 | |
33454 | 龍喚士 | 0.000865251 | 2.6e-07 | |
37574 | StreamerBot | 0.000870607 | 2.3e-07 | TPPStreamerBot |
35579 | ▁Mechdragon | 0.000883004 | 2.9e-07 | |
35496 | ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ | 0.000893283 | 2.1e-07 | |
39446 | ▁SetFontSize | 0.0232393 | 8.9e-08 | |
14827 | ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ | 0.0321166 | 3.5e-07 | ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ , ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ |
41383 | assetsadobe | 0.0339458 | 1.1e-06 | |
45545 | ▁サーティワン | 0.0359049 | 1.7e-07 | |
23282 | ▁davidjl | 0.0382696 | 2e-07 | |
12781 | wcsstore | 0.0422431 | 3e-07 | |
31957 | cffffcc | 0.0449799 | 6e-07 | |
30208 | ▁externalTo | 0.0496358 | 1.2e-05 | ▁externalToEVA , ▁externalToEVAOnly |
39253 | ▁UCHIJ | 0.0501214 | 6.4e-08 | |
38370 | iHUD | 0.0533025 | 1.6e-06 | |
39756 | inventoryQuantity | 0.0555391 | 7.6e-06 | |
34448 | ▁ItemLevel | 0.061663 | 2.2e-07 | |
49781 | EngineDebug | 0.0664824 | 6.6e-08 | |
46600 | ▁Adinida | 0.0681257 | 1.3e-07 | |
39752 | quickShip | 0.0722629 | 2.5e-06 | quickShipAvailable |
43038 | ▁Okawaru | 0.0767039 | 3.7e-07 | |
30897 | reportprint | 0.0827692 | 7e-07 | embedreportprint , cloneembedreportprint , rawdownloadcloneembedreportprint |
39165 | catentry | 0.0843905 | 2.4e-05 | |
37444 | ▁petertodd | 0.0861925 | 4.9e-05 | |
38250 | ▁Skydragon | 0.0874714 | 7.1e-07 | |
34027 | ▁actionGroup | 0.0886497 | 1.1e-05 | |
30202 | ▁guiName | 0.0922088 | 2.4e-07 | |
31886 | ▁gmaxwell | 0.0924347 | 7.6e-07 | |
36935 | ▁dstg | 0.0945689 | 0.041 | |
31576 | externalActionCode | 0.0960827 | 7e-05 | |
48396 | ÛÛ | 0.0963896 | 0.007 | |
9364 | ÃÂÃÂÃÂÃÂ | 0.0977885 | 0.00033 | ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ , ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ , ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ |
36130 | ▁PsyNet | 0.100781 | 6e-06 | |
32047 | ▁"$:/ | 0.105251 | 3.1e-05 | |
39803 | soType | 0.109921 | 3.6e-07 | |
31032 | SpaceEngineers | 0.111531 | 6.1e-05 | |
23090 | ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ | 0.114465 | 4.3e-06 | ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ |
34473 | ヘラ | 0.114489 | 0.00018 | |
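The `in_other_tokens` column appears to list other flagged tokens that contain the row's token as a substring (for example, `reportprint` is listed against three longer tokens). The following is a minimal sketch under that assumption; `containing_tokens` is a hypothetical helper, not the tool's own function, and it runs on a handful of token strings copied from the table.

```python
def containing_tokens(tokens):
    """Map each token to the other tokens in the list that contain it."""
    return {
        tok: [other for other in tokens if other != tok and tok in other]
        for tok in tokens
    }

# A few token strings taken from the table above.
listed = [
    "reportprint",
    "embedreportprint",
    "cloneembedreportprint",
    "rawdownloadcloneembedreportprint",
    "rawdownload",
    "StreamerBot",
    "TPPStreamerBot",
]

overlaps = containing_tokens(listed)
for tok, others in overlaps.items():
    print(tok, "|", " , ".join(others))
```

On this subset the sketch reproduces the table's entries for `reportprint`, `rawdownload`, and `StreamerBot`, which suggests that nested tokens such as these share a common origin in the tokenizer's training data.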
3 non-single-byte UTF-fragment entries below soft indicator threshold of 0.123
token_id | token | indicator | in_other_tokens |
---|---|---|---|
34504 | ▁裏<0xE8> | 0.000796033 | |
33434 | <0x96><0x9A>士 | 0.000804 | 龍喚士 |
20174 | ▁裏<0xE7> | 0.0726416 | |
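The `<0xNN>` notation marks bytes that do not form a complete UTF-8 character on their own. A short sketch of how such a fragment can be rendered, assuming the report prints decodable characters as-is and every stray byte as `<0xNN>`; `render_fragment` is a hypothetical helper written for illustration, and the leading `▁` space marker is left out.

```python
def render_fragment(data: bytes) -> str:
    """Render bytes as text, showing undecodable bytes as <0xNN>."""
    out = []
    i = 0
    while i < len(data):
        # A UTF-8 character is 1 to 4 bytes long; try each width in turn.
        for width in (1, 2, 3, 4):
            chunk = data[i:i + width]
            try:
                out.append(chunk.decode("utf-8"))
                i += width
                break
            except UnicodeDecodeError:
                continue
        else:
            # No width decodes, so this byte is a stray fragment.
            out.append(f"<0x{data[i]:02X}>")
            i += 1
    return "".join(out)

# 龍喚士 encodes to E9 BE 8D | E5 96 9A | E5 A3 AB; dropping the first
# four bytes leaves the tail of 喚 followed by a complete 士.
tail = render_fragment("龍喚士".encode("utf-8")[4:])

# 裏覚醒 encodes to E8 A3 8F | E8 A6 9A | E9 86 92; keeping four bytes
# leaves a complete 裏 followed by the first byte of 覚.
head = render_fragment("裏覚醒".encode("utf-8")[:4])
print(tail, head)
```

This shows why `<0x96><0x9A>士` is listed as contained in `龍喚士`: the fragment is a byte-level suffix of that token.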
13 single-byte entries below indicator threshold of 0.032
token_id | token | indicator | ord | hex | byte_type |
---|---|---|---|---|---|
185 | <0xFD> | 0.000792164 | 253 | 0xFD | unused_utf8 |
125 | <0xC1> | 0.000807113 | 193 | 0xC1 | unused_utf8 |
178 | <0xF6> | 0.000807633 | 246 | 0xF6 | unused_utf8 |
177 | <0xF5> | 0.000812672 | 245 | 0xF5 | unused_utf8 |
124 | <0xC0> | 0.000813216 | 192 | 0xC0 | unused_utf8 |
180 | <0xF8> | 0.000816539 | 248 | 0xF8 | unused_utf8 |
183 | <0xFB> | 0.000835288 | 251 | 0xFB | unused_utf8 |
182 | <0xFA> | 0.000835342 | 250 | 0xFA | unused_utf8 |
179 | <0xF7> | 0.000835879 | 247 | 0xF7 | unused_utf8 |
181 | <0xF9> | 0.000838596 | 249 | 0xF9 | unused_utf8 |
184 | <0xFC> | 0.000842648 | 252 | 0xFC | unused_utf8 |
187 | <0xFF> | 0.000846493 | 255 | 0xFF | unused_utf8 |
186 | <0xFE> | 0.000855782 | 254 | 0xFE | unused_utf8 |
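All 13 flagged single-byte tokens are bytes that can never occur in well-formed UTF-8 (0xC0, 0xC1, and 0xF5 through 0xFF), so they cannot be produced from valid text input. The sketch below checks that count and also reconstructs the token ids, assuming the tokenizer follows the GPT-2-style byte ordering (printable byte ranges first, remaining bytes afterwards). Both function names are hypothetical, and the ordering is an assumption that happens to reproduce the ids in the table.

```python
def is_unused_utf8(byte: int) -> bool:
    """True for byte values that never appear in well-formed UTF-8."""
    return byte in (0xC0, 0xC1) or byte >= 0xF5

def byte_token_ids():
    """Assumed GPT-2-style order: printable bytes first, then the rest."""
    printable = (
        list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    )
    rest = [b for b in range(256) if b not in printable]
    return {b: token_id for token_id, b in enumerate(printable + rest)}

unused = [b for b in range(256) if is_unused_utf8(b)]
ids = byte_token_ids()

# Print rows in the same shape as the table: token_id | token | ord | hex
for b in unused:
    print(f"{ids[b]} | <0x{b:02X}> | {b} | 0x{b:02X}")
```

Under these assumptions the loop prints 13 rows whose ids match the table, for example id 185 for `<0xFD>` and id 124 for `<0xC0>`.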
0 entries below threshold of 0.032