Soren

Karma: 15

Soren 5 Sep 2022 18:06 UTC
9 points
6
on: Worlds Where Iterative Design Fails
I think this is a very useful post that is talking about many of the right things. One question though: isn’t it only worth focusing on the worlds where iterative design does not work for alignment to the extent that progress can still be made towards mitigating those worlds? It appears to me that progress in technical fields is usually accomplished through iterative design, so it makes sense to have a high prior on non-iterative approaches being less effective. Depending on your specific numbers here, it seems like it could be worth paying attention to either the areas more tractable for iterative design or the areas less tractable. I think it’s also misleading to think of iterative design as either working or failing. Fields vary by degree in how prompt and high-quality their feedback is and in how readily they allow repeated trials. It also seems like problems that initially seem hard to iterate on can often be formulated in ways that allow better iteration (like the ELK problem being formulated in a way that allows for testing toy solutions and counterexamples). I worry that focusing in an unnuanced way on worlds where iterative design fails may miss opportunities to formulate some of these hard problems in ways that make them easier to iterate on.

Soren 12 Oct 2022 21:08 UTC
3 points
0
on: AI Timelines via Cumulative Optimization Power: Less Long, More Short
Really good post. Based on this, it seems extremely valuable to me to test the assumption that we already have animal-level AIs. I understand that this is difficult due to built-in brain structure in animals, different training distributions, and the difficulty of creating a simulation as complex as real life. It still seems like we could test this assumption by doing something along the lines of training a neural network to perform as well as a cat’s visual cortex on image recognition. I predict that, if this were done in a way that accounted for the flexibility of real animals, the AI wouldn’t perform better than an animal at around cat or raven level (80% confidence). I predict that even if an AI were able to outperform a part of an animal’s brain in one area, it would not be able to outperform the animal in more than 3 separate areas as broad as vision (60% confidence). I am quite skeptical of a greater than 20% probability of AGI in less than 10 years, but contrary evidence here could definitely make me change my mind.