jungofthewon

Karma: 357

coo @ ought.org. by default please assume i am uncertain about everything i say unless i say otherwise :)

jungofthewon 12 Sep 2022 0:03 UTC
4 points
0
in reply to: Evan R. Murphy’s comment on: Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible
Sure! Prior to this survey I would have thought:
1. Fewer NLP researchers would have taken AGI seriously, identified understanding its risks as a significant priority, and considered it catastrophic.
  1. I particularly found it interesting that underrepresented researcher groups were more concerned (though less surprising in hindsight, especially considering the diversity of interpretations of catastrophe). I wonder how well the alignment community is doing with outreach to those groups.
2. There were more scaling maximalists (as the survey respondents themselves assumed).
I was also encouraged that the majority of people thought the majority of research is crap.
...Though not sure how that math exactly works out. Unless people are self-aware of their publishing crap :P

Ought will host a factored cognition “Lab Meeting”

jungofthewon and stuhlmueller

9 Sep 2022 23:46 UTC

35 points

1 comment, 1 min read, LW link

jungofthewon 1 Sep 2022 18:47 UTC
1 point
0
in reply to: elifland’s comment on: (My understanding of) What Everyone in Technical Alignment is Doing and Why
All good, thanks for clarifying.

jungofthewon 1 Sep 2022 16:05 UTC
LW: 2 AF: 2
0
AF
on: Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible
This was really interesting, thanks for running and sharing! Overall this was a positive update for me.
“Results are here”
I think this just links to PhilPapers, not your survey results?

jungofthewon 31 Aug 2022 14:07 UTC
3 points
0
on: (My understanding of) What Everyone in Technical Alignment is Doing and Why
“and Ought either builds AGI or strongly influences the organization that builds AGI.”
“strongly influences the organization that builds AGI” applies to all alignment research initiatives, right? Alignment researchers at e.g. DeepMind have less of an uphill battle but they still have to convince the rest of DeepMind to adopt their work.

jungofthewon 25 Aug 2022 20:23 UTC
LW: 8 AF: 5
0
AF
on: Common misconceptions about OpenAI
I also appreciated reading this.

jungofthewon 20 Aug 2022 16:32 UTC
3 points
0
on: Deliberate Grieving
I found this post beautiful and somber in a sacred way. Thank you.

jungofthewon 20 Aug 2022 16:28 UTC
LW: 5 AF: 3
0
AF
on: How to do theoretical research, a personal perspective
This was really helpful and fun to read. I’m sure it was nontrivial to get to this level of articulation and clarity. Thanks for taking the time to package it for everyone else to benefit from.

jungofthewon 7 Aug 2022 20:33 UTC
LW: 7 AF: 6
0
AF
on: Rant on Problem Factorization for Alignment
If anyone has questions for Ought specifically, we’re happy to answer them as part of our AMA on Tuesday.

jungofthewon 5 Aug 2022 19:54 UTC
LW: 17 AF: 12
1
AF
on: Rant on Problem Factorization for Alignment
I think we could play an endless and uninteresting game of “find a real-world example for / against factorization.”
To me, the more interesting discussion is around building better systems for updating on alignment research progress:
1. What would it look like for this research community to effectively update on results and progress?
2. What can we borrow from other academic disciplines? E.g. what would “preregistration” look like?
3. What are the ways more structure and standardization would be limiting / taking us further from truth?
4. What does the “institutional memory” system look like?
5. How do we coordinate the work of different alignment researchers and groups to maximize information value?

jungofthewon 12 Apr 2022 2:43 UTC
1 point
0
in reply to: Alex_Shleizer’s comment on: Supervise Process, not Outcomes
Thanks for that pointer. It’s always helpful to have analogies in other domains to take inspiration from.

Elicit: Language Models as Research Assistants

stuhlmueller and jungofthewon

9 Apr 2022 14:56 UTC

73 points

6 comments, 13 min read, LW link

Supervise Process, not Outcomes

stuhlmueller and jungofthewon

5 Apr 2022 22:18 UTC

146 points

9 comments, 10 min read, LW link

jungofthewon 25 Feb 2022 0:03 UTC
6 points
0
on: Learning By Writing
I enjoyed reading this, thanks for taking the time to organize your thoughts and convey them so clearly! I’m excited to think a bit about how we might imbue a process like this into Elicit.
This also seems like the research version of being hypothesis-driven / actionable / decision-relevant at work.

jungofthewon 2 Feb 2022 23:21 UTC
16 points
0
on: Reflections on six months of fatherhood
love. very beautifully written. today i will also try to scoot n+1 inches.

jungofthewon 29 Nov 2021 2:15 UTC
2 points
0
in reply to: Samuel Knoche’s comment on: Vitalik: Cryptoeconomics and X-Risk Researchers Should Listen to Each Other More
It does—ty!

jungofthewon 29 Nov 2021 0:14 UTC
1 point
0
on: Vitalik: Cryptoeconomics and X-Risk Researchers Should Listen to Each Other More
I think the discord link is broken?

jungofthewon 1 Apr 2021 14:55 UTC
LW: 5 AF: 3
0
AF
on: How do we prepare for final crunch time?
Access
Alignment-focused policymakers / policy researchers should also be in positions of influence.
Knowledge
I’d add a bunch of human / social topics to your list e.g.
- Policy
- Every relevant historical precedent
- Crisis management / global logistical coordination / negotiation
- Psychology / media / marketing
- Forecasting
Research methodology / Scientific “rationality,” Productivity, Tools
I’d be really excited to have people use Elicit with this motivation. (More context here and here.)
Re: competitive games of introducing new tools, we ran an internal Elicit vs. Google speed test to see which tool was more efficient for finding answers or mapping out a new domain in 5 minutes. We’re broadly excited to structure and support competitive knowledge work and optimize research this way.

jungofthewon 19 Mar 2021 4:32 UTC
LW: 9 AF: 5
0
AF
on: The case for aligning narrowly superhuman models
This is exactly what Ought is doing as we build Elicit into a research assistant using language models / GPT-3. We’re studying researchers’ workflows and identifying ways to productize or automate parts of them. In that process, we have to figure out how to turn GPT-3, a generalist by default, into a specialist that is a useful thought partner for domains like AI policy. We have to learn how to take feedback from the researcher and convert it into better results within session, per person, per research task, across the entire product. Another spin on it: we have to figure out how researchers can use GPT-3 to become expert-like in new domains.
We’re currently using GPT-3 for classification e.g. “take this spreadsheet and determine whether each entity in Column A is a non-profit, government entity, or company.” Some concrete examples of alignment-related work that have come up as we build this:
- One idea for making classification work is to have users generate explanations for their classifications. Then have GPT-3 generate explanations for the unlabeled objects. Then classify based on those explanations. This seems like a step towards “have models explain what they are doing.”
- I don’t think we’ll do this in the near future but we could explore other ways to make GPT-3 internally consistent, for example:
  - Ask GPT-3 why it classified Harvard as a “center for innovation.”
  - Then ask GPT-3 if that reason is true for Microsoft.
    Or just ask GPT-3 if Harvard is similar to Microsoft.
  - Then ask GPT-3 directly if Microsoft is a “center for innovation.”
  - And fine-tune results until we get to internal consistency.
- We eventually want to apply classification to the systematic review (SR) process, or some lightweight version of it. In the SR process, there is one step where two human reviewers identify which of 1,000-10,000 publications should be included in the SR by reviewing the title and abstract of each paper. After narrowing it down to ~50, two human reviewers read the whole paper and decide which should be included. Getting GPT-3 to skip these two human processes but be as good as two experts reading the whole paper seems like the kind of sandwiching task described in this proposal.
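To make the internal-consistency idea in the list above a bit more concrete, here is a minimal sketch. It is not how Elicit works: `ask` stands in for any function that sends a prompt to a language model and returns its text answer, and `toy_model`, `is_yes`, and the prompt wording are all hypothetical placeholders invented for illustration.

```python
from typing import Callable, Dict


def is_yes(answer: str) -> bool:
    """Treat any answer that begins with 'yes' (case-insensitive) as affirmative."""
    return answer.strip().lower().startswith("yes")


def consistency_check(
    ask: Callable[[str], str], known: str, other: str, label: str
) -> Dict[str, object]:
    """Compare two routes to the same classification of `other`.

    Route 1: get the model's reason for labeling `known`, then ask whether
             that reason also holds for `other`.
    Route 2: ask directly whether `other` has the label.
    The model is flagged as inconsistent when the two routes disagree.
    """
    reason = ask(f'Why is {known} a "{label}"?')
    reason_holds = is_yes(ask(f"Is the following true of {other}? {reason}"))
    direct = is_yes(ask(f'Is {other} a "{label}"?'))
    return {
        "reason": reason,
        "reason_holds_for_other": reason_holds,
        "direct_label_for_other": direct,
        "consistent": reason_holds == direct,
    }


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model, with canned answers."""
    if prompt.startswith("Why"):
        return "It funds and publishes a large amount of research."
    if prompt.startswith("Is the following true of"):
        return "Yes"
    return "No"


result = consistency_check(toy_model, "Harvard", "Microsoft", "center for innovation")
print(result["consistent"])
```

With the canned answers, the reason is judged to hold for Microsoft while the direct question says no, so the pair is flagged as inconsistent. Flagged pairs like this would be the candidates for the fine-tuning step in the last bullet of that sub-list.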
We’d love to talk to people interested in exploring this approach to alignment!

jungofthewon 10 Mar 2021 2:23 UTC
8 points
0
in reply to: Ben Pace, the Vacationing Vagabond’s comment on: Open & Welcome Thread – March 2021
Ought is building Elicit, an AI research assistant using language models to automate and scale parts of the research process. Today, researchers can brainstorm research questions, search for datasets, find relevant publications, and brainstorm scenarios. They can create custom research tasks and search engines. You can find demos of Elicit here and a podcast explaining our vision here.
We’re hiring for the following roles:
Each job description contains sample projects from our roadmap.
Research is one of the primary engines by which society moves forward. We’re excited about the potential language models and ML have for making this engine orders of magnitude more effective.
