Fabien Roger 4 Apr 2024 4:42 UTC
LW: 68 AF: 26
2
AF
on: Fabien’s Shortform
I listened to the book This Is How They Tell Me the World Ends by Nicole Perlroth, a book about cybersecurity and the zero-day market. It describes in detail the early days of bug discovery, and the social dynamics and moral dilemmas of bug hunts.
(It was recommended to me by some EA-adjacent guy very worried about cyber, but the title is mostly bait: the tone of the book is alarmist, but there is very little content about potential catastrophes.)
My main takeaways:
- Vulnerabilities used to be dirt-cheap (~$100) and are still relatively cheap (~$1M even for big zero-days);
- If you are very good at cyber and extremely smart, you can hide vulnerabilities in 10k-line programs in a way that less smart specialists will have trouble discovering even after days of examination, so code generation/analysis is not really defense-favored;
- Bug bounties are a relatively recent innovation, and it felt very unnatural to tech giants to reward people trying to break their software;
- A big lever companies have on the US government is the threat that overseas competitors will be favored if the US gov meddles too much with their activities;
- The main effect of a market being underground is not making transactions harder (people find ways to exchange money for vulnerabilities by building trust), but making it much harder to figure out what the market price is and reducing the effectiveness of the overall market;
- Being the target of an autocratic government is an awful experience, and you have to be extremely careful if you put anything they dislike on a computer. And because of the zero-day market, you can’t assume your government will suck at hacking you just because it’s a small country;
- It’s not that hard to reduce the exposure of critical infrastructure to cyber-attacks by just making companies air gap their systems more—Japan and Finland have relatively successful programs, and Ukraine is good at defending against that in part because they have been trying hard for a while—but it’s a cost companies and governments are rarely willing to pay in the US;
- Electronic voting machines are extremely stupid, and the federal gov can’t dictate how the (red) states should secure their voting equipment;
- Hackers want lots of different things—money, fame, working for the good guys, hurting the bad guys, having their effort be acknowledged, spite, … and sometimes look irrational (e.g. they sometimes get frog-boiled).
- The US government has a good amount of people who are freaked out about cybersecurity and have good warning shots to support their position. The main difficulty in pushing for more cybersecurity is that voters don’t care about it.
  - Maybe the takeaway is that it’s hard to build support behind the prevention of risks that (1) are technical/abstract, (2) fall on the private sector and not individuals, and (3) have a heavy right tail. Given these challenges, organizations that find prevention inconvenient often succeed in lobbying themselves out of costly legislation.
Overall, I don’t recommend this book. It’s very light on details compared to The Hacker and the State despite being longer. It targets a non-technical and very scope-insensitive audience, and is very light on actual numbers, technical details, realpolitik considerations, estimates, and forecasts. It is wrapped in an alarmist journalistic tone I really disliked, covers stories that do not matter for the big picture, and is focused on finding who is in the right and who is to blame. I gained almost no evidence either way about how bad it would be if the US and Russia entered a no-holds-barred cyberwar.

Fabien Roger 4 Jan 2023 17:45 UTC
LW: 68 AF: 23
28
AF
on: Basic Facts about Language Model Internals
I’m surprised you put the emphasis on how Gaussian your curves are, while your curves are much less Gaussian than you would naively expect if you agreed with the “LLMs are a bunch of small independent heuristics” argument.
Even ignoring outliers, some of your distributions don’t look like Gaussian distributions to me. In Geogebra, exponential decays fit well, Gaussians don’t.
I think your headlines are misleading, and that you’re providing evidence against “LLMs are a bunch of small independent heuristics”.

If influence functions are not approximating leave-one-out, how are they supposed to help?

Fabien Roger 22 Sep 2023 14:23 UTC

66 points

4 comments · 3 min read · LW link

Fabien Roger 22 Mar 2024 2:34 UTC
LW: 61 AF: 23
4
AF
on: Fabien’s Shortform
I just finished listening to The Hacker and the State by Ben Buchanan, a book about cyberattacks and the surrounding geopolitics. It’s a great book to start learning about the big state-related cyberattacks of the last two decades. Some big attacks/leaks he describes in detail:
- Wire-tapping/passive listening efforts from the NSA, the “Five Eyes”, and other countries
- The multi-layer backdoors the NSA implanted and used to get around encryption, and that other attackers eventually also used (the insecure “secure random number” trick + some stuff on top of that)
- The shadow brokers (that’s a *huge* leak that went completely under my radar at the time)
- Russia’s attacks on Ukraine’s infrastructure
- Attacks on the private sector for political reasons
- Stuxnet
- The North Korean attack on Sony when it released a comedy film mocking North Korea’s leader, and misc North Korean cybercrime (e.g. WannaCry, some bank robberies, …)
- The leak of Hillary’s emails and Russian interference in US politics
- (and more)
Main takeaways (I’m not sure how much I buy these, I just read one book):
- Don’t mess with states too much, and don’t think anything is secret—even if you’re the NSA
- The US has a “nobody but us” strategy, which states that it’s fine for the US to use vulnerabilities as long as it is the only one powerful enough to find and use them. This looks somewhat nuts and naive in hindsight. There don’t seem to be strong incentives to protect the private sector.
- There are a ton of different attack vectors and vulnerabilities, more big attacks than I thought, and a lot more is publicly known than I would have expected. The author goes into great detail about ~10 big secret operations, often speaking as if he were an omniscient narrator.
- Even the biggest attacks didn’t inflict that much (direct) damage (never >$10B in damage?). Unclear if it’s because states are holding back, if it’s because they suck, or if it’s because it’s hard. It seems that even when attacks aim to do what some people fear the most (e.g. attack infrastructure, …) the effect is super underwhelming.
  - The bottleneck in cyberattacks is remarkably often the will/the execution, much more than actually finding vulnerabilities/entry points to the victim’s network.
  - The author describes a space where most of the attacks are led by clowns that don’t seem to have clear plans, and he often seems genuinely confused why they didn’t act with more agency to get what they wanted (does not apply to the NSA, but does apply to a bunch of Russia/Iran/Korea-related attacks)
- Cyberattacks are not amazing tools to inflict damage or to threaten enemies if you are a state. The damage is limited, and it really sucks that (usually) once you show your capability, it reduces your capability (unlike conventional weapons). And states don’t like to respond to such small threats. The main effect you can have is scaring off private actors from investing in a country / building ties with a country and its companies, and leaking secrets of political importance.
- Don’t leak secrets when the US presidential election is happening if they are unrelated to the election, or nobody will care.
(The author seems to be a big skeptic of “big cyberattacks” / cyberwar, and describes cyber as something that always happens in the background and slowly shapes the big decisions. He doesn’t go into the estimated trillion dollars in damages from everyday cybercrime, nor the potential tail risks of cyber.)

Some negative steganography results

Fabien Roger 9 Dec 2023 20:22 UTC

55 points

5 comments · 2 min read · LW link

A quick investigation of AI pro-AI bias

Fabien Roger 19 Jan 2024 23:26 UTC

52 points

1 comment · 2 min read · LW link

Toy models of AI control for concentrated catastrophe prevention

Fabien Roger and Buck

6 Feb 2024 1:38 UTC

50 points

2 comments · 7 min read · LW link

Fabien Roger 11 Apr 2024 20:56 UTC
LW: 48 AF: 21
7
AF
on: Fabien’s Shortform
I listened to The Failure of Risk Management by Douglas Hubbard, a book that vigorously criticizes qualitative risk management approaches (like the use of risk matrices), and praises a rationalist-friendly quantitative approach. Here are 4 takeaways from that book:
- There are very different approaches to risk estimation that are often unaware of each other: you can do risk estimations like an actuary (relying on statistics, reference class arguments, and some causal models), like an engineer (relying mostly on causal models and simulations), like a trader (relying only on statistics, with no causal model), or like a consultant (usually with shitty qualitative approaches).
- The state of risk estimation for insurance is actually pretty good: it’s quantitative, and there are strong professional norms around different kinds of malpractice. When actuaries tank a company because they ignored tail outcomes, they are at risk of losing their license.
- The state of risk estimation in consulting and management is quite bad: most risk management is done with qualitative methods which have no positive evidence of working better than just relying on intuition alone, and qualitative approaches (like risk matrices) have weird artifacts:
  - Fuzzy labels (e.g. “likely”, “important”, …) create illusions of clear communication. Just defining the fuzzy categories doesn’t fully alleviate that (when you ask people to say what probabilities each box corresponds to, they often fail to look at the definition of categories).
  - Inconsistent qualitative methods make cross-team communication much harder.
  - Coarse categories mean that you introduce weird threshold effects that sometimes encourage ignoring tail effects and make the analysis of past decisions less reliable.
  - When choosing between categories, people are susceptible to irrelevant alternatives (e.g. if you split the “5/5 importance (loss > $1M)” category into “5/5 ($1-10M), 5/6 ($10-100M), 5/7 (>$100M)”, people answer a fixed “1/5 (<10k)” category less often).
  - Following a qualitative method can increase confidence and satisfaction, even in cases where it doesn’t increase accuracy (there is an “analysis placebo effect”).
  - Qualitative methods don’t prompt their users to seek empirical evidence to inform their choices.
  - Qualitative methods don’t prompt their users to measure their risk estimation track record.
- Using quantitative risk estimation is tractable and not that weird. There is a decent track record of people trying to estimate very-hard-to-estimate things, and a vocal enough opposition to qualitative methods that they are slowly getting pulled back from risk estimation standards. This makes me much less sympathetic to the absence of quantitative risk estimation at AI labs.
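To make the threshold effect of coarse categories from the list above concrete, here is a minimal sketch. Only the “1/5 (<10k)” and “5/5 (loss > $1M)” boundaries come from the example above; the middle bin edges and the sample losses are hypothetical, chosen purely for illustration.

```python
import bisect

# Upper bounds (in $) of categories 1 to 4; anything above the last bound is 5/5.
# Only the 10k and 1M edges come from the text; the middle edges are made up.
UPPER_BOUNDS = [10_000, 100_000, 500_000, 1_000_000]

def importance_category(loss):
    """Map an estimated loss in dollars to a 1-5 importance score."""
    return bisect.bisect_left(UPPER_BOUNDS, loss) + 1

if __name__ == "__main__":
    # Two nearly identical losses land in different boxes, while two losses
    # that differ by a factor of ~500 land in the same box.
    for loss in (999_000, 1_000_001, 500_000_000):
        print(f"${loss:,} -> {importance_category(loss)}/5")
```

The score hides the difference between a $1M loss and a $500M loss, which is the kind of tail outcome a matrix makes easy to ignore.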
A big part of the book is an introduction to rationalist-type risk estimation (estimating various probabilities and impact, aggregating them with Monte Carlo, rejecting Knightian uncertainty, doing calibration training and prediction markets, starting from a reference class and updating with Bayes). He also introduces some rationalist ideas in parallel while arguing for his thesis (e.g. isolated demands for rigor). It’s the best legible and “serious” introduction to classic rationalist ideas I know of.
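For readers who have not seen it, “estimating probabilities and impacts, then aggregating them with Monte Carlo” can look roughly like the sketch below. This is not taken from the book: the risk register, the numbers, and the choice of a lognormal loss distribution fitted to a 90% interval are all assumptions made for illustration.

```python
import math
import random

# Hypothetical risk register (all numbers are made up for illustration):
# (name, probability the event happens in a year, 90% interval for the loss in $)
RISKS = [
    ("data breach", 0.05, (100_000, 5_000_000)),
    ("key supplier failure", 0.10, (50_000, 1_000_000)),
    ("regulatory fine", 0.02, (500_000, 20_000_000)),
]

Z_90 = 1.645  # z-score bounding the central 90% of a normal distribution

def sample_loss(low, high, rng):
    """Draw a loss from a lognormal whose 5th/95th percentiles are low/high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z_90)
    return rng.lognormvariate(mu, sigma)

def simulate_year(risks, rng):
    """Total loss for one simulated year, treating the risks as independent."""
    total = 0.0
    for _name, probability, (low, high) in risks:
        if rng.random() < probability:
            total += sample_loss(low, high, rng)
    return total

def simulate(risks, n_trials=10_000, seed=0):
    """Return the simulated yearly losses for n_trials independent years."""
    rng = random.Random(seed)
    return [simulate_year(risks, rng) for _ in range(n_trials)]

def exceedance_probability(losses, threshold):
    """Fraction of simulated years whose total loss exceeds the threshold."""
    return sum(loss > threshold for loss in losses) / len(losses)

if __name__ == "__main__":
    losses = simulate(RISKS)
    for threshold in (0, 1_000_000, 10_000_000):
        p = exceedance_probability(losses, threshold)
        print(f"P(yearly loss > ${threshold:,}) ~ {p:.3f}")
```

The output is a loss exceedance curve (probability of losing more than X in a year) rather than a single box on a matrix, so the tail stays visible and the estimates can later be checked against what actually happened.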
The book also contains advice if you are trying to push for quantitative risk estimates in your team / company, and a very pleasant and accurate dunk on Nassim Taleb (and in particular his claims about models being bad, without a good justification for why reasoning without models is better).
Overall, I think the case against qualitative methods and for quantitative ones is somewhat strong, but it’s far from being a slam dunk because there is no evidence of some methods being worse than others in terms of actual business outputs. The author also fails to acknowledge and provide conclusive evidence against the possibility that people may have good qualitative intuitions about risk even if they fail to translate these intuitions into numbers that make any sense (your intuition sometimes does the right estimation and math even when you suck at doing the estimation and math explicitly).

A Mystery About High Dimensional Concept Encoding

Fabien Roger 3 Nov 2022 17:05 UTC

46 points

13 comments · 7 min read · LW link

Some ML-Related Math I Now Understand Better

Fabien Roger 9 Mar 2023 16:35 UTC

45 points

4 comments · 4 min read · LW link

Open consultancy: Letting untrusted AIs choose what answer to argue for

Fabien Roger 12 Mar 2024 20:38 UTC

35 points

4 comments · 5 min read · LW link

Protocol evaluations: good analogies vs control

Fabien Roger 19 Feb 2024 18:00 UTC

35 points

10 comments · 11 min read · LW link