Chess. Mistakes in chess usually become noticeable quickly, in just a move or two, and you have no RNG or teammates to blame them on. But to get better you have to acknowledge your mistakes and avoid making the same mistakes again.
Nate Showell
the part where there is no such thing as pleasure, just distractions and the absence of suffering, also seems kind of crazy to me.
This is an incorrect description of the Buddhist position. Pleasure traditionally plays a really important role in Buddhist worldviews and practices! The first three jhanas have pleasure (sukha) as one of their defining factors, and it’s also part of the definition of one of the brahmaviharas, mudita (sympathetic joy).
As far as I can tell, fiction-writing ability and humor are lacking in the best models. But it’s the type of thing that we should not be surprised to see fall this year.
I would be surprised. In my experience, AI creative writing abilities have stagnated in the past year. And since creative writing is less amenable to RLVR than activities like math and programming, AI labs don’t have an easy path toward large advances in it. Labs would have to rely on pretraining dataset collection and curation, where the low-hanging fruit has already been picked.
Registering a prediction: by the beginning of 2030, no novel in which more than 50% of the text is AI-generated will reach #1 on the New York Times bestseller list (94%).
I recently experienced turning up my level-of-detail dial when I played the video game Blue Prince. It’s a puzzle game that takes place in a house with a draftable layout of rooms that changes from day to day, and most of the puzzles are broken up across multiple rooms. It’s not always obvious what’s part of a puzzle and what isn’t; you might walk past a seemingly inconsequential object in a room dozens of times, only to find out hours later that it’s a clue in a puzzle. I ended up taking a huge number of notes to record everything I saw that looked like it might be a clue, and I still missed some things.
That increased tendency to notice details, and to feel like I was observing part of a puzzle, carried over at times to my experience outside the game. I started seeing more detail in visual art and appreciating it more. My work involves some puzzle-like elements, along with note-taking, and noticing those similarities to the activities I was doing when playing Blue Prince made my work more enjoyable.
On the first question, reaching superintelligence might require designing, testing, at-scale manufacturing, and installation of new types of computing hardware, which would probably take more than two years.
In the pre-LLM era, it seemed more likely (compared to now) that there was an algorithmically simple core of general intelligence, rather than intelligence being a complex aggregation of skills. If you’re operating under the assumption that general intelligence has a simple algorithmic structure, decision theory is an obvious place to search for it. So the early focus on decision theory wasn’t random.
There are the terms “closed individualism,” “open individualism,” and “empty individualism” used in this Qualia Computing post.
My own experience is very different from those described in this post. I find it relaxing instead of stressful to spend time doing nothing, and felt this way even when I was a child and hadn’t started meditating regularly. I also don’t enjoy using a smartphone, due to the small screen size and reliance on touch inputs, so I don’t fill gaps in activities by browsing the Internet on my phone. It’s also common for me to have brief interactions with strangers even though I’m young. People frequently ask me for directions when I’m on my way to or from work.
The easiest way to promote justice is to focus on punishing people who behave badly (since that’s easier than rewarding people who behave well).
The premises of the toy model don’t require this to be true. Whether it’s true, and to what extent, can vary between environments.
The orthogonality question is an engineering question
People usually think about the orthogonality question (“Is the orthogonality thesis true?”) as a philosophical question. The usual way of approaching the orthogonality question is by taking a starting point of “assume an AGI exists” and then reasoning about what goals the AGI would have. But one can flip the usual starting point around and ask, for a specific goal, “is it realistically achievable to create a general intelligence that has this goal?” This reframing turns the orthogonality question into an engineering question that has more direct practical relevance than the philosophical version. The engineering version is a question about the types of results an AI developer can expect from different engineering decisions, rather than speculation about an idealized AGI; it’s grounded in what’s realistically achievable instead of what might be theoretically possible.
Instances of the engineering version of the orthogonality question also open the broader orthogonality question up to empirical testing. And so far, the empirical evidence we’ve received has pointed toward the answer “no.” Ever since the early days of reinforcement learning, researchers have been creating models with narrow goals, and so far, none of those systems has shown full generalization in the type of intelligence it’s developed. Protein-folding models only fold proteins; chess engines don’t model their environments outside the confines of the 64 squares. Language prediction has generalized further than most other training objectives, but language models still perform poorly at non-linguistic tasks (understanding images, acting within physical environments) and have jagged capabilities even within the set of language-based problems. Each new failure to create general intelligence from a narrow training objective is (usually weak) empirical evidence that narrow training signals are too impoverished to let a model develop highly general capabilities. Maybe general intelligence from a narrow goal would be possible with truly gargantuan amounts of compute, but recall that the engineering version of the orthogonality question is about what’s practically achievable.
There was likely a midwit-meme effect going on at the philosophy meetup, where, in order to distinguish themselves from the stereotypical sports-bar-goers, the attendees were forming their beliefs in ways that would never occur to a true “normie.” You might have a better experience interacting with “common people” in a setting where they aren’t self-selected for trying to demonstrate sophistication.
Just spitballing, but maybe you could incorporate some notion of resource consumption, like in linear logic. You could have a system where the copies have to “feed” on some resource in order to stay active, and data corruption inhibits a copy’s ability to “feed.”
I don’t remember; it was something I saw in the New York Times Book Review section a few years ago.
The spiralism attractor is the same type of failure mode as GPT-2 getting stuck repeating a single character or ChatGPT’s image generator turning photos into caricatures of black people. The only difference between the spiralism attractor and other mode collapse attractors is that some people experiencing mania happen to find it compelling. That is to say, the spiralism attractor is centrally a capabilities failure and only incidentally an alignment failure.
I once read a positive review of a novel that, in one brief passage, described reading that novel as feeling similar to reading Twitter. That one sentence alone made the review useful to me by giving me a strong signal that I wouldn’t like the book, even though the author liked it.
The “Use New Feed” checkbox is stuck checked for me. Clicking on it doesn’t uncheck it.
Second, we could take condensation as inspiration and try to create new machine-learning models which resemble condensation, in the hopes that their structure will be more interpretable.
Condensation could also be applied to model scaffolding design or the interpretability of scaffolded systems. Some AI memory storage and retrieval systems already have structures that resemble the tagged-notebook analogy, with documents stored in a database along with tags or summaries. A condensation-inspired memory structure could potentially have low retrieval latency while also being highly interpretable. Condensation might also be useful for interpreting why a model retrieves a specific set of documents from its memory system when responding to a query.
It’s worth distinguishing between epistemic and instrumental forms of heroic responsibility. Shapley values are the mathematically precise way of apportioning credit or blame for an outcome among a group of people. Heroic responsibility as a belief about one’s own share of credit or blame is a dark art of rationality, since it involves explicitly deviating from the Shapley value assignment in one’s beliefs about credit or blame. But taking heroic responsibility as an action, while acknowledging that you’re not trying to be mathematically precise in your credit assignment, can still be useful as a way of solving coordination problems.
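To make the Shapley value assignment concrete, here is a minimal sketch that computes it by averaging each person's marginal contribution over every order in which the group could assemble. The function names and the toy game are my own illustrative assumptions (a project worth 1 that succeeds only if A takes part along with at least one of B or C), not anything from the original discussion.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: total / len(orders) for p, total in totals.items()}


def project_value(coalition):
    """Hypothetical toy game: success needs A plus at least one of B or C."""
    return 1 if "A" in coalition and ({"B", "C"} & coalition) else 0


credit = shapley_values(["A", "B", "C"], project_value)
for person, share in credit.items():
    print(person, share)
```

In this toy game A ends up with 2/3 of the credit and B and C with 1/6 each, and the shares sum to 1. Heroic responsibility as a belief would have each person assign themselves more than their share here, which is the deviation described above. Enumerating every order only works for small groups, since the number of orders grows factorially.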
Williamson and Dai both appear to describe philosophy as a general-theoretical-model-building activity, but there are other conceptions of what it means to do philosophy. In contrast to both Williamson and Dai, if Wittgenstein (either early or late period) is right that the proper role of philosophy is to clarify and critique language rather than to construct general theses and explanations, LLM-based AI may be quickly approaching peak-human competence at philosophy. Critiquing and clarifying writing are already tasks that LLMs are good at and widely used for. They’re tasks that AI systems improve at from the types of scaling-up that labs are already doing, and labs have strong incentives to keep making their AIs better at them. As such, I’m optimistic about the philosophical competence of future AIs, but according to a different idea of what it means to be philosophically competent. AI systems that reach peak-human or superhuman levels of competence at Wittgensteinian philosophy-as-an-activity would be systems that help people become wiser on an individual level by clearing up their conceptual confusions, rather than tools for coming up with abstract solutions to grand Philosophical Problems.
Have you experienced any of the negative effects on memory that the PNSE paper describes as sometimes occurring at Location 4?