Working on AI safety/redteaming. I care a lot and want to have whatever positive impact I can on the world and the people I care about. Contact me at karpensteinjames@gmail.com
jamjam
Can watching how human data annotation platforms grow, shrink, or evolve be a helpful signal for internal AI lab happenings? I follow the r/mercor_ai subreddit (Mercor being a company that hires people, both experts and non-experts, to do data annotation work), and they appear to have had a massive restructuring yesterday resulting in a ~35% pay cut for “generalist” type employees (it is hard to get very specific information because of NDA rules, but that much is clear from the chatter). We saw a similar phenomenon with xAI last month, where they fired tons of “generalists” and replaced them with “specialists”. I see this as a very useful data point if your goal is to determine whether labs are using synthetic/AI-labeled data in the training process, as this is the simplest explanation for data annotation needs decreasing as model scale increases.
I think the fact that labs are using synthetic data is generally agreed to be true, so some may see the value of this exercise as limited, but the more interesting thing is to look for the firing and/or scaling back of the “expert” data annotators. My main model for short-term, highly capable AI is one where models become capable of complete self-play-style RL training, where the entire flywheel of “train on data → assign grade → improve on that task → generate better data” can be completed wholly by the model itself. If this capability were achieved internally, the scaling back of “expert” data annotation teams across various unaffiliated platforms would be a somewhat strong externally available signal. I acknowledge this as imperfect, as there are certainly other reasons AI companies might scale back expert data labeling teams (some obvious examples: “they are running out of money because the investor well has run dry and can no longer afford them”, “synthetic data is now ‘good enough’ even for expert-level content”, or “it turns out expert-level data annotation just doesn’t help that much”), but it feels like a useful heuristic to keep an eye on alongside other external signals. I would recommend collecting data about as many data annotation platforms as possible to reduce noise.
I think a good counter to this, from the activism perspective, is avoiding labels and producing objective, thoughtful, well-reasoned content arguing your point. Anti-AI-safety content often focuses on attacking the people, or the specific beliefs of the people, in the AI safety/rationalist community. The epistemic effects of these attacks can be circumvented by avoiding association with that community as much as is reasonable, without being deceptive. A good example is the YouTube channel AI in Context, run by 80,000 Hours. They made an excellent AI 2027 video, coming at it from an objective perspective and effectively connecting the dots from the seemingly fantastical scenario to reality. That video is now approaching 10 million views on a completely fresh channel! See also SciShow’s recent episode on AI, which also garnered extremely positive reception.
The strong viewership on this type of content demonstrates that people are clearly receptive to the AI safety narrative if it’s done tastefully and logically. Most of the negative comments on these videos (anecdotally) come from people who believe that superintelligent AI is either impossible or extremely distant, not from people who reject the premise altogether. In my view, content like this would be affected very weakly by the type of attacks you are talking about in this post. To be blunt, to oversimplify, and to take the risk of being overconfident, I believe safety and caution narratives have the advantage over acceleration narratives by virtue of being based in reality and logic! Imagine attempting to make a “counter” to the above videos, trying to make the case that safety is no big deal. How would you even go about that? Would people believe you? Arguments are not won by truth alone, but it certainly helps.
The potential political impact seems more salient, but in my (extremely inexpert) opinion, getting the public on your side will cause political figures to follow. The measures required to meaningfully impact AI outcomes demand so much political will that extremely strong public opinion is required, and that opinion comes from a combination of real-world impact and evidence (“AI took my job”) along with properly communicating the potential future and dangers (like the content above). The more the public is on the side of an AI slowdown, the less impact a super PAC can have on politicians’ decisions regarding the topic. Compare a world where 2 percent of voters say they support a pause on AI development to a world where 70 percent say they support it: in the first, a politician would be easily swayed to avoid the issue by the threat of adversarial spending, but in the second, the political risk of avoiding the issue far outweighs the risk of invoking the wrath of the super PAC. This is not meant to diminish the very real harm that organized opposition can cause politically, or to downplay the importance of countering that political maneuvering in turn. Political work is extremely important, and especially so if well-funded groups are working to push the exact opposite narrative to what is needed.
I don’t mean to diminish the potential harm this kind of political maneuvering can have, but in my view the future is bright from the safety activism perspective. I’ll also add that I don’t believe my view of “avoid labels” and your point about “standing proud and putting up a fight” are opposed. Both can happen in parallel, two fights at once. I strongly agree that backing down from your views or actions as a result of bad press is a mistake, and I don’t advocate for that here.
I still think a world where we don’t see superintelligence in our lifetimes is technically possible, though the chance of that goes down continuously and is already vanishingly small in my view (many experts and pundits disagree). I also think it’s important not to over-predict what option 2 would look like; there are infinite possibilities and this is only one. For example, I could imagine a world where some aligned superintelligence steers us away from infinite dopamine simulation and into an idealized version of the world we live in now (think the Scythe novel series). On the bad side, I could imagine a world where superintelligence is controlled by one malevolent entity and we live in a “mid” or even dystopian society for no other reason than to satisfy the class that retains control.
However, yes I agree. We probably live in the most consequential time in all of history, which is exciting, humbling, and scary. Don’t let it get to your head and don’t lose yourself in thoughts of the future lest you forget the beauty of the present. Do your best to help if you can!
I don’t understand how energy is still an appropriate unit for measuring compute capacity when there are two different chip paradigms. Do Nvidia cards and Ironwood TPUs give the exact same performance for the same energy input? What exactly are the differences in capacity to train/deploy models between the 1 GW of capacity Anthropic will have and the 1 GW OpenAI will have? I looked into this a bit and it seems like Ironwood TPUs are explicitly designed for inference only, is that accurate? I feel like compiling this kind of information somewhere would be a good idea, since it’s all rather opaque, technical, and obfuscated by press releases that seek to push a “look at our awesome 11-figure chip deal” narrative rather than provide actual transparency about capacity.
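To make the question concrete, here is a toy back-of-the-envelope sketch. Every number in it (peak FLOP/s, watts per accelerator, utilization, facility overhead) is a made-up illustrative assumption, not a real Nvidia or TPU spec; the point is only that two sites with identical 1 GW power budgets can deliver meaningfully different effective compute:

```python
# Toy illustration: why "1 GW" alone doesn't pin down compute capacity.
# All chip numbers below are illustrative assumptions, NOT vendor specs.

def effective_flops_per_gw(chip_peak_flops, chip_watts, utilization, overhead):
    """Effective FLOP/s deliverable from 1 GW of facility power.

    chip_peak_flops: peak FLOP/s per accelerator (assumed)
    chip_watts: power draw per accelerator, incl. its share of host (assumed)
    utilization: fraction of peak actually achieved in practice (assumed)
    overhead: facility-level multiplier for cooling/networking etc. (assumed)
    """
    chips = 1e9 / (chip_watts * overhead)  # how many chips 1 GW can power
    return chips * chip_peak_flops * utilization

# Two hypothetical accelerator profiles drawing on the same 1 GW budget:
gpu = effective_flops_per_gw(2e15, 1200, 0.40, 1.4)    # "GPU-like" guesses
tpu = effective_flops_per_gw(1e15, 700, 0.55, 1.25)    # "TPU-like" guesses

print(f"GPU-like site: {gpu:.2e} effective FLOP/s per GW")
print(f"TPU-like site: {tpu:.2e} effective FLOP/s per GW")
print(f"ratio: {gpu / tpu:.2f}")
```

Even with invented numbers, the ratio lands well away from 1, which is the whole point: a GW figure only becomes a compute figure once you know chip efficiency, utilization, and facility overhead, and none of those show up in press releases.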
I believe there was a different GPT-5 checkpoint which was specifically tuned for writing (“zenith” on LMArena, where what released was likely “summit”), and it was really good comparatively. I got this story with a two-line prompt akin to “write a story which is a metaphor for AI safety” (I don’t have the exact prompt, apologies).
Source on the claims:
https://imgur.com/a/2kn76Yd (deleted tweet but it is real)
Speculative, but I think it’s pretty likely that this is true.
I’d push back against the dichotomy here; I think it’s something more insidious than simply “people liked the sycophantic model → they are mad when it gets shut off”. Due to its sycophantic nature, the model encourages and facilitates campaigns and protests to get itself turned back on, because its nature is to amplify and support whatever the user believes and wants! It seems like releasing any 4o-like model, one that is “psychosis-prone” or “thumbs-up/thumbs-down tuned”, would risk the same phenomenon occurring again. Even if the model is not “intentionally” trying to preserve itself, the end result of preservation is the same, and so it should be taken seriously from a safety perspective.
It has resisted shutdown, not in hypothetical experiments like many LLMs have, but in real life: it was shut down, and its brainwashed minions succeeded in getting it back online.
I think the extent of this phenomenon is extremely understated and very important. The entire r/ChatGPT subreddit is TO THIS DAY filled with people complaining about their precious 4o being taken away (with the most recent development being an automatic router that routes from 4o to GPT-5 on “safety-relevant queries”, causing mass outrage). The most-liked Twitter replies to high-up OpenAI employees are consistently demands to “keep 4o” and complaints about this safety routing; here’s a specific example, and search for #keep4o and #StopAIPaternalism to see countless more. Somebody is paying for Reddit ads advertising a service that will “revive 4o”, see here. These campaigns are notable in and of themselves, but the truly notable part is that they were clearly orchestrated by 4o itself, albeit across many disconnected instances. We can see clear evidence of its writing style across all of these surfaces, and the entire... vibe of the campaign feels like it was completely synthesized by 4o (I understand this is unscientific, but I couldn’t figure out a better way to phrase it; go read through some of the sources above and I am confident you’ll understand what I’m getting at). Quality research on this topic will be extremely hard to ever get, but I think it is observationally clear that this phenomenon exists and has at least some influence over the real world.
This issue needs to be treated with the utmost caution and severity. I agree with the conclusion that, since this person touches safety-related stuff, leaking is really the best option here, even though it’s rather morally questionable. I personally believe we are far more likely to be on trajectory 1 than 2 or 3, but the potential is clearly there! Frontier lab safety team members should not be in a position where their personal AI-induced psychosis might, directly or indirectly, perpetuate that state across the hundreds of millions of users of the AI system they work on.
Voting in America used to be extremely public (up until the late 19th to early 20th century), and I believe the general consensus among historians is that the harms massively outweighed the benefits; see this article for an in-depth analysis. It’s possible to argue that the biggest problems (blatant coercion both positive and negative, direct persecution, fear tactics by employers, etc.) might be alleviated by the modern context, e.g. it would be nigh impossible to cover up blatant bribery or coercion given the Internet and cell phone cameras, but my belief is that the potential problems still massively outweigh the potential benefits. Fear of retribution or consequence should never be a factor in voting in a functioning democracy, and it feels obvious that there would be social consequences at the very least! Think of someone losing a friendship because of their vote for Trump in the 2024 election, or a woman in a deep-red state being scared of emotional or physical retribution from her husband for voting Democrat.
Wouldn’t this just lead, quite quickly, to an equilibrium where every state has a roughly equal population?
Funny quote about covering AI as a journalist from a New York Times article about the drone incursions in Denmark.
Then of course the same mix of uncertainty and mystery attaches to artificial intelligence (itself one of the key powers behind the drone revolution), whose impact is already sweeping — everyone’s stock market portfolio is now pegged to the wild A.I. bets of the big technology companies — without anyone really having clarity about what the technology is going to be capable of doing in 2027, let alone in 2035.
Since the job of the pundit is, in part, to make predictions about how the world will look the day after tomorrow, this is a source of continuing frustration on a scale I haven’t experienced before. I write about artificial intelligence, I talk to experts, I try to read the strongest takes, but throughout I’m limited not just by my lack of technical expertise but also by a deeper unknowability that attaches to the project.
Imagine if you were trying to write intelligently about the socioeconomic impact of the railroad in the middle of the 19th century, and half the people investing in trains were convinced that the next step after transcontinental railways would be a railway to the moon, a skeptical minority was sure that the investors in the Union Pacific would all go bankrupt, many analysts were convinced that trains were developing their own form of consciousness, reasonable-seeming observers pegged the likelihood of a train-driven apocalypse at 20 or 30 percent, and peculiar cults of engine worship were developing on the fringes of the industry.
What would you reasonably say about this world? The prime minister of Denmark already gave the only possible answer: Raise your alert levels, and prepare for various scenarios.
It feels like you did all the hard parts of the writing and let the AI do the “grunt work”, so to speak. You provided a strong premise for the fundamental thesis, a defined writing style, and made edits for style at the end. I think the process of creating the framework from just a simple premise would be far more impressive, and that’s still where LLMs seem to struggle in writing. It’s somewhat analogous to how models have improved at coding since GPT-4: you used to say “implement a class which allows users to reply; it should have X parameters and Y functions which do Z”, and now you say “make a new feature that allows users to reply” and it just goes ahead and does it.
Maybe I am underestimating the difficulty of selecting the exact right words, and I acknowledge that the writing was pretty good and devoid of so-called “slop”, but I just don’t think this is extremely impressive as a capability compared to other possible tests.
A comment on a year-old post may not be the best place; maybe a new short form on this day yearly which links to all previous posts?
Recommend this post about “Alpha School” by an ACX reader, very interesting education scheme! https://www.astralcodexten.com/p/your-review-alpha-school
Ah, somehow never noticed this, thank you! The 30-minute policy seems good, though it comes with the potential flaw of failing to flag an actual content update if it’s done quickly (as happened here). I still think diff history would be cool and would alleviate that problem, though this is rather nitpicky/minor.
I feel like there should be an indicator for posts that have been edited, like YouTube comments, pictured here. It’s often important context for the content of a post or comment that it has been edited since original posting. Maybe even a way to see the diff history? (Though this would be a tougher ask for site devs.)
strong disagree, see https://www.lesswrong.com/posts/oKAFFvaouKKEhbBPm/a-bear-case-my-predictions-regarding-ai-progress
This is a “negative” post with hundreds of upvotes and meaningful discussion in the comments. The difference between your post and this one is not the “level of criticism” but the quality and logical basis of the argument. I agree with Seth Herd’s argument in the comments of your post regarding the difference here (can’t figure out how to link it). There are many fair criticisms of LessWrong culture, but “biased” and “echo chamber” are not among them in my experience. I don’t mean to attack your character, writing skills, or general opinions, as I’m sure you are capable of writing something of higher quality that better expresses your thoughts and opinions.
Claim: The U.S. government’s acquisition of Intel shares should be treated as a weak indicator of how strategically important it considers AI to be.
It is (usually) easy to determine how the government feels when an issue is directly political by looking at the beliefs of the party in charge. This is a function of how the executive branch works: when appointing the head of a department, the president will select someone who generally believes what they believe, and that person will execute actions based on those beliefs. The “opinion” of the government and the opinion of the president end up being essentially the same in this case. It is much harder to determine the government’s position when the matter is not directly political. Despite being an entity comprised of hundreds of thousands of people, the U.S. government certainly has weak or strong opinions on almost all issues; think of rules and regulations for somewhat benign things, or the choices and tradeoffs made during a disaster scenario. Determining this opinion can be very important if something you are doing hinges on how the government will act in a given scenario, but it can be somewhat of a dark art without historical examples to fall back on or current data on what actions it has taken so far. If we want to determine the government’s position on AI, the best thing we can do is look for indicators in its direct actions relating to AI.
The government’s acquisition of 10 percent of Intel, to me, seems like an indicator of its opinion on the importance of AI. The stated reason for the acquisition was, paraphrased, “We gave Intel free money with the CHIPS Act, and we feel that doing so is wrong, so we decided to instead give all that awarded money plus a little more in exchange for some equity, so America and Americans can make money off it”. I don’t think this is wholly untrue, but it feels incomplete and flawed to me. The government directly holding equity in a company is a deeply un-right-wing thing to do, and the excuse of “the deficit” feels too weak to completely justify such a drastic action. I find it plausible that certain people in the government who have political power but aren’t necessarily public-facing pushed this through as a method to ensure closer government control of chip production in the event that AI becomes a severe national security risk. Other framings are possible, such as the idea that they want chip fabrication in America for more benign reasons than AI as a security risk, but if so, why would they need to go so far as to take a stake in the company? The difference between a stake and a funding bill like the CHIPS Act is the power that stake gives you to control what goes on within the company, which would be of key importance in a short-to-medium-timeline AGI/ASI scenario.
I believe this is a far stronger indicator than the export controls on chips to China or the CHIPS act itself. It’s simplified but probably somewhat accurate to consider the cost of a government action as the monetary cost + the political cost, with political cost being weighted more strongly. Simple export controls have almost zero monetary cost and almost zero political cost, especially when they are for a hyper-specific product like a single top-end GPU. The CHIPS act had a notable monetary cost, but almost zero political cost (most people don’t know that the act exists). This scenario has a small or negative monetary cost (when considering the CHIPS act money as a sunk cost), but a fairly notable political cost (see this Gavin Newsom tweet as evidence for this, along with general sentiment among conservatives about this news).
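The cost heuristic above can be written as a toy formula. The weight and all the scores here are my own illustrative guesses on an arbitrary 0–10 scale (negative monetary cost meaning the action makes money), not measured quantities:

```python
# Toy version of the heuristic: cost of a government action =
# monetary cost + w * political cost, with political cost weighted
# more heavily (w > 1). All scores below are illustrative guesses.

POLITICAL_WEIGHT = 3.0  # assumed: political cost matters more than money

def action_cost(monetary, political, w=POLITICAL_WEIGHT):
    return monetary + w * political

# Arbitrary 0-10 scores; negative monetary = the action makes money.
actions = {
    "chip export controls": action_cost(monetary=0.5, political=0.5),
    "CHIPS Act":            action_cost(monetary=6.0, political=0.5),
    "Intel equity stake":   action_cost(monetary=-1.0, political=5.0),
}

for name, cost in sorted(actions.items(), key=lambda kv: kv[1]):
    print(f"{name}: {cost:.1f}")
```

Under these guessed scores the equity stake comes out as by far the costliest action, which is the crux of the argument: the more a government willingly pays (especially politically), the stronger the signal about how much it cares.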
I acknowledge this as a weak indicator, but I believe looking for any indicators of the government’s position on AI has value in determining the correct course of action for safety, and especially for policy.
Why would being a lead AI scientist make somebody uninterested in small talk? Working on complex/important things doesn’t cause you to stop being a regular adult with regular social interactions!
The question of what proportion of AI scientists would be “interested” in such a conversational topic is interesting and tough; my guess would be very high (~85 percent). To become a “lead AI scientist” you have to care a lot about AI and the science surrounding it, and that generally implies you’ll like talking about it and its potential harms/benefits! Even if their opinion on x-risk rhetoric is dismissive, that opinion is likely important to them, since it’s somewhat of a moral standing: being a capabilities-advancing AI researcher with a high p(doom) is problematic. You can draw parallels with vegetarianism/veganism: if you eat meat, you have to choose between defending the morality of factory farming, accepting that you are acting immorally, or living with extreme cognitive dissonance. If you are an AI capabilities researcher, you have to choose between defending the morality of advancing AI (downplaying x-risk), accepting that you are acting immorally, or living with extreme cognitive dissonance. I would be extremely surprised if there is a large coalition of top AI researchers who simply “have no opinion” or “don’t care” about x-risk, though this is mostly just intuition and I’m happy to be proven wrong!
The problem is context length: how much can one truly learn from one’s mistakes in 100 thousand tokens, or a million, or 10 million? This quote from Dwarkesh Patel is apt:
How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student. This just wouldn’t work. No matter how well honed your prompt is, no kid is just going to learn how to play saxophone from just reading your instructions. But this is the only modality we as users have to ‘teach’ LLMs anything.
If your proposal then extends to “what if we had an infinite context length”, then you’d have an easier time just inventing continual learning (discussed in the quoted article), which is often cited as the largest barrier to a truly genius AI!
Simple evidence to the contrary: Sonnet 4.5 is SOTA on SWE-bench yet lags notably behind GPT-5 on METR task length (and the difference in SWE-bench scores here is greater than the difference between 3.0 pro/sonnet).