Glitch Tokens
Tag · Last edit: 18 Apr 2023 5:31 UTC by CronoDAS
Glitch tokens are tokens in a language model's vocabulary that cause anomalous output when they appear in the input. A well-known example is the token SolidGoldMagikarp.
The ‘ petertodd’ phenomenon
mwatkins · 15 Apr 2023 0:59 UTC · 192 points · 49 comments · 38 min read
SolidGoldMagikarp (plus, prompt generation)
Jessica Rumbelow and mwatkins · 5 Feb 2023 22:02 UTC · 677 points · 205 comments · 12 min read
SolidGoldMagikarp III: Glitch token archaeology
mwatkins and Jessica Rumbelow · 14 Feb 2023 10:17 UTC · 91 points · 32 comments · 16 min read
‘ petertodd’’s last stand: The final days of open GPT-3 research
mwatkins · 22 Jan 2024 18:47 UTC · 109 points · 16 comments · 45 min read
Anomalous tokens reveal the original identities of Instruct models
janus and jdp · 9 Feb 2023 1:30 UTC · 139 points · 16 comments · 9 min read · (generative.ink)
SolidGoldMagikarp II: technical details and more recent findings
mwatkins and Jessica Rumbelow · 6 Feb 2023 19:09 UTC · 111 points · 45 comments · 13 min read
Mapping the semantic void: Strange goings-on in GPT embedding spaces
mwatkins · 14 Dec 2023 13:10 UTC · 114 points · 31 comments · 14 min read
What’s up with all the non-Mormons? Weirdly specific universalities across LLMs
mwatkins · 19 Apr 2024 13:43 UTC · 40 points · 13 comments · 27 min read
A New Class of Glitch Tokens—BPE Subtoken Artifacts (BSA)
Lao Mein · 20 Sep 2024 13:13 UTC · 37 points · 7 comments · 5 min read
Glitch Token Catalog - (Almost) a Full Clear
Lao Mein · 21 Sep 2024 12:22 UTC · 38 points · 3 comments · 37 min read
SmartyHeaderCode: anomalous tokens for GPT3.5 and GPT-4
AdamYedidia · 15 Apr 2023 22:35 UTC · 71 points · 18 comments · 6 min read
Linear encoding of character-level information in GPT-J token embeddings
mwatkins and Joseph Bloom · 10 Nov 2023 22:19 UTC · 34 points · 4 comments · 28 min read
The “spelling miracle”: GPT-3 spelling abilities and glitch tokens revisited
mwatkins · 31 Jul 2023 19:47 UTC · 85 points · 29 comments · 20 min read
Nokens: A potential method of investigating glitch tokens
Hoagy · 15 Mar 2023 16:23 UTC · 21 points · 0 comments · 4 min read
A Search for More ChatGPT / GPT-3.5 / GPT-4 “Unspeakable” Glitch Tokens
Martin Fell · 9 May 2023 14:36 UTC · 26 points · 9 comments · 6 min read
LLMs Universally Learn a Feature Representing Token Frequency / Rarity
Sean Osier · 30 Jun 2024 2:48 UTC · 12 points · 5 comments · 6 min read · (github.com)
(redacted) Anomalous tokens might disproportionately affect complex language tasks
nikola · 15 Jul 2023 0:48 UTC · 4 points · 0 comments · 7 min read
An examination of GPT-2’s boring yet effective glitch
MiguelDev · 18 Apr 2024 5:26 UTC · 5 points · 3 comments · 3 min read