Truthful AI
Tag
Last edit:
Apr 7, 2022, 4:40 PM
by
Ruby
Gaming TruthfulQA: Simple Heuristics Exposed Dataset Weaknesses
TurnTrout
Jan 16, 2025, 2:14 AM
64
points
3
comments
1
min read
LW
link
(turntrout.com)
A tension between two prosaic alignment subgoals
Alex Lawsen
Mar 19, 2023, 2:07 PM
31
points
8
comments
1
min read
LW
link
How do LLMs give truthful answers? A discussion of LLM vs. human reasoning, ensembles & parrots
Owain_Evans
Mar 28, 2024, 2:34 AM
27
points
0
comments
9
min read
LW
link
New, improved multiple-choice TruthfulQA
Owain_Evans, James Chua and Steph Lin
Jan 15, 2025, 11:32 PM
72
points
0
comments
3
min read
LW
link
Truthfulness, standards and credibility
Joe Collman
Apr 7, 2022, 10:31 AM
12
points
2
comments
32
min read
LW
link
Fact-Based AI and The Dangers of False Truths in AI Development
CLBrogan
Aug 5, 2024, 3:17 AM
1
point
0
comments
5
min read
LW
link
(1drv.ms)
Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models
Felix Hofstätter, Francis Rhys Ward, HarrietW, LAThomson, Ollie J, Patrik Bartak and Sam F. Brown
Nov 8, 2023, 11:37 AM
49
points
0
comments
18
min read
LW
link