Independent alignment researcher
Garrett Baker
Taking the parameters which seem to matter and rotating them until they don’t
How (not) to choose a research project
Pessimistic Shard Theory
Announcing Suffering For Good
I agree with Conjecture’s reply that this reads more like a hit piece than an even-handed evaluation.
I don’t think your recommendations follow from your observations, and such strong claims certainly don’t follow from the actual evidence you provide. I feel like your criticisms can be summarized as follows:
1. Conjecture was publishing unfinished research directions for a while.
2. Conjecture does not publicly share details of their current CoEm research direction, and that research direction seems hard.
3. Conjecture told the government they were AI safety experts.
4. Some people (who?) say Conjecture’s governance outreach may be net-negative and upsetting to politicians.
5. Conjecture’s CEO Connor used to work on capabilities.
6. One time during college, Connor said that he had replicated GPT-2, then found out he had a bug in his code.
7. Connor has at times said that open-source models were good for alignment, then changed his mind.
8. Conjecture’s infohazard policy can be overturned by Connor or their owners.
9. They’re trying to scale when it is common wisdom for startups to try to stay small.
10. It is unclear how they will balance profit and altruistic motives.
11. Sometimes you talk with people (who?) and they say they’ve had bad interactions with Conjecture staff or leadership when trying to tell them what they’re doing wrong.
12. Conjecture seems like they don’t talk with ML people.
I’m actually curious about why they’re doing 9, and would like further discussion of 10 and 8. But I don’t think any of the other points matter, at least to the depth you’ve covered them here, and I don’t know why you’re spending so much time on things that don’t matter or that you can’t support. This could have been so much better if you had taken the research time spent on everything that wasn’t 8, 9, or 10, used it to do deeper analyses of 8, 9, and 10, and then actually had a conversation with Conjecture about your disagreements with them.
I especially don’t think your arguments support your suggestions that:
1. People shouldn’t work at Conjecture.
2. Conjecture should be more cautious when talking to media, because Connor seems unilateralist.
3. Conjecture should not receive more funding until they reach levels of organizational competence similar to OpenAI or Anthropic.
4. Readers should rethink whether or not they want to support Conjecture’s work non-monetarily. For example, maybe reconsider inviting them to table at EAG career fairs, inviting Conjecture employees to events or workspaces, and taking money from them when doing field-building.
(1) seems like a pretty strong claim, which is left unsupported. I know of many people who would be excited to work at Conjecture, and I don’t think your points support the claim that their research would be net-negative if they did alignment work there.
For (2), I don’t know why you’re saying Connor is unilateralist. Are you saying this because he used to work on capabilities?
(3) is just absurd! OpenAI will perhaps be the most destructive organization to date. I do not think your above arguments make the case that Conjecture is less organizationally responsible than OpenAI. Even having an infohazard policy puts them leagues above both OpenAI and Anthropic in my book. And on top of that, their primary way of getting funded isn’t building extremely large models… In what way do Anthropic or OpenAI have better corporate governance structures than Conjecture?
(4) is just… what? Ok, I’ve thought about it, and I’ve come to the conclusion that this makes no sense given your previous arguments. Maybe there’s a case to be made here: if they are less organizationally competent than OpenAI, then yeah, you probably don’t want to support their work. That seems pretty unlikely to me, though! And you definitely don’t provide anything close to the level of analysis needed to elevate such hypotheses.
Edit: I will add to my note on (2): In most news articles in which I see Connor or Conjecture mentioned, I feel glad he talked to the relevant reporter, and think he/Conjecture made that article better. It is quite an achievement in my book to have sane conversations with reporters about this type of stuff! So mostly I think they should continue doing what they’re doing.
I’m not myself an expert on PR (I’m skeptical anyone is), so maybe my impressions of the articles are naive and backwards in some way. But if you think this is important, it would likely be good to explain somewhere why you think their media outreach is net-negative, ideally pointing to particular things you think they did wrong rather than offering vague and menacing criticisms of unilateralism.
My hopes for alignment: Singular learning theory and whole brain emulation
I like this post, with one exception: I don’t think putting out fires feels like putting out fires. I think it feels like being utterly confused, and, when you explain the confusion and people try to resolve it but you don’t understand them, continuing to actively notice and chase the confusion rather than nodding along, no matter how much people lower your status for not being able to understand what they’re saying. It feels far more like going to school wearing a clown suit than like heroically putting out obvious-to-you fires.
We should expect something similar to this fiasco to happen if/when Anthropic’s responsible scaling policies tell them to stop scaling.
On Complexity Science
Outside the three major AGI labs, I’m reasonably confident no major organization is following a solid roadmap to AGI; no-one else woke up. A few LARPers, maybe, who’d utter “we’re working on AGI” because that’s trendy now. But nobody who has a gears-level model of the path there, and what its endpoint entails.
This seems pretty false. In terms of large players, there are also Meta and Inflection AI. There are also many smaller players who care about AGI, and no doubt many AGI-motivated workers at the three labs mentioned would start their own orgs if the org they’re currently working under shuts down.
So You Created a Sociopath—New Book Announcement!
Neuroscience and Alignment
What are the actual rationality concepts LWers are basically required to understand to participate in most discussions?
My prior is that this bar should be set pretty high, at something like 80–100% of the Sequences’ level. I remember years ago, when I finished the Sequences, I spent several months practicing everyday rationality in isolation, and only then deigned to visit LessWrong and talk to other rationalists. I was pretty disappointed with the average quality level, and felt like I had dodged a bullet by spending those months thinking alone rather than with the wider community.
It also seems like average quality has decreased over the years.
Predictable confusion some will have: I’m talking about average quality here, not 90th-percentile posters.
Can you list a concrete research path which you’re pursuing in light of this strategy? This all sounds ok in principle, but I’d bet alignment problems show up in concrete pathways.
Good AGI-notkilleveryoneism-conscious researchers should, in general, prioritize working at big AGI labs marginally more than they currently do, relative to working independently, at alignment-focused labs, or in academia.
The ratio of good alignment work done at labs vs independently mostly skews toward labs
“Good” means something different from “impactful” here. Obviously AGI labs will pay more attention to their own researchers, or to researchers from respectable institutions, than to independent researchers. Your answer should factor out such considerations.
Edit: Also normalize for quantity of researchers.
I want people to not discuss things in DMs, and discuss things publicly more. I also don’t think this is embarrassing for Quintin, or at all a public spectacle.
It’s an easy mistake to make: both things are called “AI”, after all. But you wouldn’t study manually-written FPS bots circa 2000s, or MNIST-classifier CNNs circa 2010s, and claim that your findings generalize to how LLMs circa 2020s work. By the same token, LLM findings do not necessarily generalize to AGI.
My understanding is that many of those studying MNIST-classifier CNNs circa 2010 were in fact studying them because they believed similar neural-network-inspired mechanisms would go much further, and they would not have been surprised to find very similar mechanisms at play inside LLMs. And they were correct! Such studies led to ReLU activations, improved backpropagation-based training, residual connections, autoencoders for generative AI, and ultimately the scaling laws we see today.
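To make the “very similar mechanisms” point concrete, here is a minimal sketch of my own (not anything from the post I’m replying to, and all class and variable names are made up for illustration) showing the same ReLU-plus-residual-connection pattern in a 2010s-style convolutional block and in a transformer-style MLP block of the kind used in LLMs:

```python
import torch
import torch.nn as nn


class ConvResidualBlock(nn.Module):
    """2010s-vintage vision block: convolution + ReLU + residual connection."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # Residual connection: add the block's output back onto its input.
        return x + self.relu(self.conv(x))


class TransformerMLPBlock(nn.Module):
    """Transformer-era block: the same nonlinearity-plus-residual pattern."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x):
        # Same residual trick, now on token representations.
        return x + self.mlp(self.norm(x))


# Toy usage: both blocks are trained with the same backpropagation machinery.
image_features = torch.randn(1, 16, 28, 28)  # e.g. an MNIST-sized feature map
token_features = torch.randn(1, 10, 64)      # e.g. a short token sequence
print(ConvResidualBlock(16)(image_features).shape)
print(TransformerMLPBlock(64, 256)(token_features).shape)
```

The point of the sketch is just that the building blocks carried over largely intact; what changed between the two eras was mostly the data modality and the scale.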
If you traveled back to 2010, and you had to choose between already extant fields, having that year’s GPU compute prices and software packages, what would you study to learn about LLMs? Probably neural networks in general, both NLP and image classification. My understanding is there was & is much cross-pollination between the two.
Of course, maybe this is just a misunderstanding of history on my part. Interested to hear if my understanding’s wrong!
To my ears it sounded like Shane’s solution to “alignment” was to make the models more consequentialist. I really don’t think he appreciates most of the difficulty and traps of the problems here. This type of thinking, on my model of their models, should make even alignment optimists unimpressed, since much of the reason for optimism lies in observing current language models, and interpreting their outputs as being nonconsequentialist, corrigible, and limited in scope yet broad in application.
We should expect something similar to this fiasco to happen if/when Anthropic’s oversight board tries to significantly exercise their powers.