Bachelor's degree in general and applied physics. AI safety/Agent foundations researcher wannabe.
I love talking to people, and if you are an alignment researcher we will have at least one topic in common (but I am very interested in talking about topics that are new to me too!), so I encourage you to book a call with me: https://calendly.com/roman-malov27/new-meeting
Email: roman.malov27@gmail.com
GitHub: https://github.com/RomanMalov
TG channels (in Russian): https://t.me/healwithcomedy, https://t.me/ai_safety_digest
Roman Malov
I feel so exhausted by the double hype.
On the one side, there are people who are saying that AI is going to be a machine god, backing up their claims with various exponential graphs. On the other side, there are people who are claiming that this is all bullshit, that they’re stochastic parrots and slop generators, that nothing revolutionary has happened, and that technology stagnated 2 years ago.
And I feel like both groups are onto something? There are obviously a lot of limitations to current AI systems, and big tech can overblow their capabilities. But also, the speed of advancement is insane, and you can see where the trajectory is going.
But my point is more about how difficult it is to form an adequate picture of reality from people’s opinions, because both groups often ignore obvious facts. Like, can we at least agree on something?
For example, some people live in a reality where a year ago an internal OpenAI model won gold on the IMO. Other people believe that’s somehow “hype” and “fake”.
Some people believe Mythos is a revolution in cybersecurity capabilities. Other people replicated the same vulnerability findings with open-source models and find Anthropic’s noise around Project Glasswing annoying.[1]
I’m just so tired of us being incapable of agreeing on basic facts.
[1] In both of those cases I actually do not know who is correct. Maybe I haven’t invested enough time in this, but that is kinda the point of my complaint.
From the first impression, this sounds like an algorithmic information theory word salad, but I might be wrong, and this may be a sensible concept. Though I am sure this concept is not “superintelligence” because it doesn’t have any reference point. Ants are superintelligent relative to a cell. Humans are superintelligent relative to dogs. This definition either describes so much that even a simple cellular automaton falls under it, or it is so restrictive that machine intelligence capable of building a Dyson sphere doesn’t fall under it.
Maybe I’m not looking in the right place, but the obvious question about this benchmark is: how do humans fare on it? If humans score 0 too, then models scoring 0 is not a huge signal (even though the authors claim that this benchmark is supposed to be closer to the work of a real engineer).
We need an LW reaction for that!
Would an airplane flying on fuel synthesized from horses count?
Legible vs. Illegible AI Safety Problems somewhat resonates with this.
I also had to google it, and Google's AI said that LTV means “Lifetime Value”.
What are those weird equality symbols?
This is a paragraph from the description of a future where AI companies try to solve alignment by automating it with LLM agents. Did I guess correctly?
This post of mine might be relevant?
Have you seen Harder Drive?
I browsed your website a bit but did not find a link to any call. Could you please help?
Aspects of math that are shaped by humans and not by math’s structure are what my post is about.
Oh, I plan to post on the topic of alien math. But in short: aliens are going to be guided by beauty/interestingness/utility for the same reason evolution pushed humans to value them, so a lot of our math could intersect (but you still need aliens or humans to pluck out those valuable math bits; you can’t force math to look at itself hard enough and present those parts to you).
And they would have group theory because our universe is just full of symmetries.
I mean, math is invented in the same sense fridges are invented. Is there a space of designs in which some search process could stumble upon, and therefore “discover”, a fridge design? Sure. But at that point, we call this invention.
I’m not saying that such a formalised objective couldn’t exist. My claim is that we (probably) haven’t yet found one. And if one is found, it wouldn’t be “metaphysically objective”; it would just spit out very insightful theorems very fast.
Unlike chess, there is no “optimal play” for math. And if there is, I think it would be considered slow/boring (if we do have computational constraints).
I will adjust the post. Thanks.
My second example, about Mythos. There are also other examples of “capabilities jumps” that were disproven by showing the same capabilities in earlier models.
Though, of course, groups are inhomogeneous and there are probably examples of people who do agree with any particular belief of the other side.