Opinions expressed are my own and not endorsed by anyone.
Formerly @ ARC Evals aka METR
#onlyReadBadWriters #hansonFTW
From the frontpage:
https://www.lesswrong.com/posts/zAqqeXcau9y2yiJdi/can-we-build-a-better-public-doublecrux
https://www.lesswrong.com/posts/bkr9BozFuh7ytiwbK/my-hour-of-memoryless-lucidity
https://www.lesswrong.com/posts/ANGmJnZL2fskHX6tj/dyslucksia
Like all of them basically.
Most of the value is in even figuring out how to diagram the posts.
Think of it like a TLDR. There are many ways to TLDR but any method that’s not terrible is fantastic
The job would of course be done by a diagramming god, not a wordpleb like me
If I got double dog dared...
“Lo-salt” salt is salt with potassium. That’s been my table salt for 5 years.
Put your phone in the oven and stand in the grass and eat some grass and see how it tastes
LW mods, please pay somebody to turn every post with 20+ karma into a diagram. Diagrams are just so vastly superior to words.
That title!! I was even a fan of you and yam specifically and had even gone through a number of your old works looking for nuggets! Figure 22.3 makes up for it all though haha. Diagrams are so far superior to words...
Bump
Okay now I know why I got this one wrong. It’s your fault. You hid it in chapter 22 of a book! Not even a clickbait title for the chapter! I even bought that book when it came out and read a good portion of it but never saw the chapter :(
Btw, why didn’t we have vending machines for everything 50 years ago?
I got all the questions you mentioned wrong and definitely feel like I should’ve gotten them all right.
I think it just takes a lot of time and effort to find the obvious future and it isn’t super fun. You don’t get to spend most of the time building up your tower of predictions. A lot of digging up foundations, pouring new foundations, digging them up...
It probably can be fun with the right culture within a small group of friends. Damn maybe that’s what those people who were correct had...
Justify this extensively right now or you’re a phony
I was thinking the bear would scare other stuff off yeah. But now I think I’m doing this wrong and the code is broken. Can you fix my code?
A possum or whatever will scratch mine like half the time
Original post that introduced the technique is best explanation of steering stuff. https://www.lesswrong.com/posts/5spBue2z2tw4JuDCx/steering-gpt-2-xl-by-adding-an-activation-vector
What monster downvoted this
Hmm I think the damaging effect would occur over many years but mainly during puberty. It looks like there’s only two studies they mention lasting over a year. One found a damaging effect and the other found no effect.
The “love minus hate” thing really holds up
I assumed somebody had. Maybe everyone did haha