
Language Models

Last edit: 24 Sep 2021 14:16 UTC by plex

Language models are a class of AI systems trained on text, usually to predict the next word or a word that has been masked out. Given an initial prompt, they can generate novel prose or code, which gives rise to a kind of natural-language programming called prompt engineering. The most popular architecture for very large language models is the transformer, which follows consistent scaling laws: as model size and training compute increase, performance improves by a predictable amount, as measured by ‘perplexity’, i.e. how surprised the model is by a held-out set of human-generated text.
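The perplexity metric mentioned above can be sketched in a few lines. This is an illustrative toy, not any particular library's API: the function name and the example probabilities are made up, and real evaluations work with log-probabilities over large held-out corpora rather than a short hand-written list.

```python
import math

def perplexity(token_probs):
    """Perplexity of a language model over a token sequence.

    token_probs: the probability the model assigned to each actual
    next token in the text. Perplexity is exp(mean negative
    log-likelihood); lower means the model is less "surprised".
    """
    assert token_probs and all(0 < p <= 1 for p in token_probs)
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that assigns probability 0.5 to every correct token has
# perplexity 2: on average, it is as surprised as a fair coin flip.
print(perplexity([0.5, 0.5, 0.5, 0.5]))  # → 2.0
```

A perplexity of 1 would mean the model predicted every token with certainty; the scaling-law claim is that this number falls smoothly and predictably as models grow.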


Cognitive Biases in Large Language Models

Jan · 25 Sep 2021 20:59 UTC
17 points
3 comments · 12 min read · LW link
(universalprior.substack.com)

NVIDIA and Microsoft release 530B parameter transformer model, Megatron-Turing NLG

Ozyrus · 11 Oct 2021 15:28 UTC
48 points
31 comments · 1 min read · LW link
(developer.nvidia.com)

NLP Position Paper: When Combatting Hype, Proceed with Caution

sbowman · 15 Oct 2021 20:57 UTC
31 points
9 comments · 1 min read · LW link

Thoughts on the Alignment Implications of Scaling Language Models

leogao · 2 Jun 2021 21:32 UTC
77 points
11 comments · 17 min read · LW link

[AN #144]: How language models can also be finetuned for non-language tasks

rohinmshah · 2 Apr 2021 17:20 UTC
19 points
0 comments · 6 min read · LW link
(mailchi.mp)

How truthful is GPT-3? A benchmark for language models

Owain_Evans · 16 Sep 2021 10:09 UTC
54 points
24 comments · 6 min read · LW link

[Question] How does OpenAI’s language model affect our AI timeline estimates?

jimrandomh · 15 Feb 2019 3:11 UTC
50 points
7 comments · 1 min read · LW link

Building AGI Using Language Models

leogao · 9 Nov 2020 16:33 UTC
11 points
1 comment · 1 min read · LW link
(leogao.dev)

Sufficiently Advanced Language Models Can Do Reinforcement Learning

Zachary Robertson · 2 Aug 2020 15:32 UTC
23 points
7 comments · 7 min read · LW link

The case for aligning narrowly superhuman models

Ajeya Cotra · 5 Mar 2021 22:29 UTC
175 points
74 comments · 38 min read · LW link

The Codex Skeptic FAQ

Michaël Trazzi · 24 Aug 2021 16:01 UTC
47 points
24 comments · 2 min read · LW link

On language modeling and future abstract reasoning research

alexlyzhov · 25 Mar 2021 17:43 UTC
3 points
1 comment · 1 min read · LW link
(docs.google.com)

Agentic Language Model Memes

FactorialCode · 1 Aug 2020 18:03 UTC
16 points
1 comment · 2 min read · LW link

Structured Tasks for Language Models

Zachary Robertson · 29 Jul 2020 14:17 UTC
5 points
0 comments · 1 min read · LW link

[AN #164]: How well can language models write code?

rohinmshah · 15 Sep 2021 17:20 UTC
13 points
7 comments · 9 min read · LW link
(mailchi.mp)

[AN #113]: Checking the ethical intuitions of large language models

rohinmshah · 19 Aug 2020 17:10 UTC
23 points
0 comments · 9 min read · LW link
(mailchi.mp)

New GPT-3 competitor

Quintin Pope · 12 Aug 2021 7:05 UTC
32 points
10 comments · 1 min read · LW link

OpenAI Codex: First Impressions

Rishit Vora · 13 Aug 2021 16:52 UTC
49 points
8 comments · 4 min read · LW link
(sixeleven.in)