I keep running into conceptual confusion around the term “alignment,” particularly when reading older Less Wrong posts. Some people say “aligned AI” and mean an AI that works for human flourishing; some say an AI “is aligned” if it reliably advances the intended objectives of some person or group (and doesn’t have a secret set of goals / isn’t scheming); and still others use “alignment” to mean something like “the ability of any system to reliably work toward some pre-defined goal.” I usually have to work out on the spot which meaning is intended, which is annoying given that the implications of each are very different.
Is there one commonly accepted definition? Is this confusion just a thing we’ve all accepted?
I’m a little surprised by the number of disagree reacts, given that no one has replied.