Herb Ingram

Karma: 70

Herb Ingram 3 Jul 2023 21:38 UTC
13 points
5
in reply to: johnlawrenceaspden’s comment on: When do “brains beat brawn” in Chess? An experiment
That makes sense to me but to make any argument about the “general game of life” seems very hard. Actions in the real world are made under great uncertainty and aggregate in a smooth way. Acting in the world is trying to control (what physicists call) chaos.

In such a situation, great uncertainty means that an intelligence advantage only matters “on average over a very long time”. It might not matter for a given limited contest, such as a struggle for world domination. For example, you might be much smarter than me and a meteorologist, but you’d find it hard to predict the weather in a year’s time better than me if it’s a single-shot-contest. How much “smarter” would you need to be in order to have a big advantage? Pretty much regardless of your computational ability and knowledge of physics, you’d need such an amount of absurdly precise knowledge about the world that it might still take (both you and even much less intelligent actors) less resources to actively control the entire planet’s weather than predict it a year in advance.

The way that states of the world are influenced by our actions is usually in some sense smooth. For any optimal action, there are usually lots of similar “nearby actions”. These may or may not be near-optimal but in practice only plans that have a sufficiently high margin for error are feasible. The margin of error depends on the resources that allow finely controlled actions and thus increase the space of feasible plans. This doesn’t have a good analogy in chess: chess is much further from smooth than most games in the real world.

Maybe RTS games are a slightly better analogy. They have “some smoothness of action-result mapping” and high amounts of uncertainty. Based on AlphaStar’s success in StarCraft, I would expect we can currently build super-human AIs for such games. They are superior to humans both in their ability to quickly and precisely perform many actions, as well as find better strategies. An interesting restriction is to limit the numbers of actions the AI may take to below what a human can to see the effect of these abilities individually. Restricting the precision and frequency of actions reduces the space of viable plans, at which point the intelligence advantage might matter much less.

All in all, what I’m trying to say is that the question “how much does what intelligence imbalance matter in the world” is hard. The question is not independent of access to information and access to resources or ability to act on the world. To make use of a very high intelligence, you might need a lot more information and also a lot more ability to take precise actions. The question for some system “taking over” is whether its initial intelligence, information and ability to take actions is sufficient to bootstrap quickly enough.

These are just some more reasons you can’t predict the result just by saying “something much smarter is unbeatable at any sufficiently complex game”.

Herb Ingram 14 Jul 2023 7:24 UTC
12 points
2
in reply to: jbash’s comment on: Jailbreaking GPT-4′s code interpreter
I agree. To me, the most interesting aspects of this (quite interesting and well-executed) exercise are getting a glimpse into OpenAI’s approach to cybersecurity, as well as the potentially worrying fact that GPT3 made meaningful contributions to finding the “exploits”.

Given what was found out here, OpenAI’s security approach seems to be “not terrible” but also not significantly better than what you’d expect from an average software company, which isn’t necessarily encouraging because those get hacked all the time. It’s definitely not what people here call “security mindset”, which casts doubt on OpenAI’s claim to be “taking the dangers very seriously”. I’d expect to hear about something illegal being done with one of these VMs before too long, assuming they continue and expand the service, which I expect they will.

I’m sure there are also security experts (both at OpenAi and elsewhere) looking into this. Given OpenAI’s PR strategy, they might be able to shut down such services “due to emerging security concerns” without much reputational damage. (Many companies are economically compelled to keep services running that they know are compromised or that have known vulnerabilities and instead pretend not to know about them or at least not inform customers as long as possible.) Not sure how much e.g. Microsoft would push back on that. All in all, security experts finding something might be taken seriously.

I’m increasingly worried (while ascribing a decent chance, mind you, that “AI might well go about as bad for us as most of history but not worse”) about what happens when GPT-X has hacking skills that are, say, on par with the median hacker. Being able to hack easy-ish targets at scale might not be something the internet can handle, potentially resulting in, e.g , an evolutionary competition between AIs to build a super-botnet.

Herb Ingram 22 Aug 2023 22:44 UTC
9 points
6
in reply to: AlanCrowe’s comment on: Large Language Models will be Great for Censorship
Indeed, systems controlling the domestic narrative may become sophisticated enough that censorship plays no big role. No regime is more powerful and enduring than one which really knows what poses a danger to it and what doesn’t, one which can afford to use violence, coercion and censorship in the most targeted and efficient way. What a small elite used to do to a large society becomes something that the society does to itself. However, this is hard and I assume will remain out of reach for some time. We’ll see what develops faster: sophistication of societal control and the systems through which it is achieved, or technology for censorship and surveillance. I’d expect at least a “transition period” of censorship technology spreading around the world as all societies that successfully use it become sophisticated enough to no longer really need it.

What seems more certain is that AI will be very useful for influencing societies in other countries, where the sophisticated domestically optimal means aren’t possible to deploy. This goes very well with exporting such technology.

Herb Ingram 1 Sep 2023 17:33 UTC
6 points
1
on: Biosecurity Culture, Computer Security Culture
I think it makes a huge difference that most cybersecurity desasters only cost money (or cause damage to a company’s reputation and loss of confidential information of customers) while a biosecurity desaster can kill a lot of people. This post seems to ignore this?

Herb Ingram 21 Jun 2023 6:31 UTC
6 points
1
on: Lessons On How To Get Things Right On The First Try
Very interesting exercise on modeling, with some great lessons. I don’t really like the AI analogy though.

The ramp problem is a situation where idealizations are well-understood. The main steps to solving it seem to be realizing that these idealizations are very far from reality and measuring (rather than modeling) as much as possible.

On the first step, comparing with AI progress and risks there, nobody thinks they have a detailed mechanistic model of what should happen. Rather, most people just assume there is no need to get anything right on the first try because that approach has “worked” so-far for new technology. People also anticipate strong capabilities to take a lot longer to develop and generally underdeliver on the industry’s promises, again because that’s how it usually goes with new hyped technology. You could say “that’s the model one should resist using there” but the analogy is very stretched in my opinion. It only applies if the “potential models to be resisted” are taken to be extremely crude base-rate estimates from guestimated base-rates for how “technological development in general” is supposed to work. Such a model would be just as established as the fact that “such very crude models give terribly inaccurate predictions”. There is no temptation there.

On the second step, I don’t seee what one might reasonably try to measure concerning AI progress. E.g., extrapolating some curve of “capability advancement over time” rather than just being sceptical-by-default isn’t going to make a difference for AI risk.

I think a better metaphorical experiment/puzzle relevant to AI risk would be one where you naively think you have a lot of tries, but it turns out that you only get one due to some catastrophic failure which you could have figured out and mitigated if only you thought and looked more carefully. In the ramp problem, the “you get one try” part is implicit in how the problem is phrased.

My argument is based on models concerned with the question whether “you only get one try for AI”. Maybe some people are unconcerned because they assume that others have detailed and reasonably accurate models of what a given AI will do. I doubt that because “it’s a blackbox” is the one fact one hears most often about current AI.

Herb Ingram 31 Jul 2023 21:22 UTC
5 points
0
in reply to: Dave Lindbergh’s comment on: Lack of Social Grace Is an Epistemic Virtue

just in case it turns out he’s heir to a giant fortune or something.

That seems like a highly dubious explanation to me. I guess, the woman’s honest account (or what you’d get by examining her state of mind) would say that she does it as a matter of habit, aiming to be nice and conform to social conventions.

If that’s true, the question becomes where the convention comes from and what maintains it despite the naively plausible benefits one might hope to gain by breaking it. I don’t claim to understand this (that would hint at understanding a lot of human culture at a basic level). However, I strongly suspect the origins of such behavior (and what maintains it) to be social. I.e., a good explanation of why the woman has come to act this way involves more than two people. That might involve some sort of strategic deception, but consider that most people in fact want to be lied to in such situations. An explanation must go a lot deeper than that kind of strategic deception.

Herb Ingram 18 Jun 2023 23:17 UTC
5 points
2
on: Solomonoff induction still works if the universe is uncomputable, and its usefulness doesn’t require knowing Occam’s razor
This post seems to be about justifying why Solomonoff induction (I presume) is a good way to make sense of the world. However, unless you use it as another name for “Bayesian reasoning”, it clearly isn’t. Rather, it’s a far-removed idealization. Nobody “uses” Solomonoff induction, except as an idealized model of how very idealized reasoning works. It has purely academic interest. The post sounds (maybe I’m misunderstanding this completely) like this weren’t the case, e.g., discussing what to do about a hypothetical oracle we might find.

In practice, humanity will only ever care about a finite number of probability distributions, each of which has finite support and will be updated a finite number of times using finite memory and finite processing power. (I guess one might debate this, but I personally usually can’t think of anything interesting to say in that debate) As such, for all practical purposes, the solution to any question that has any practical relevance is computable. You could also put an arbitrary fixed but very large bound on complexity of hypotheses, keeping all hypotheses we might ever discuss in the race, and thus make everything computable. This would change the model but make no difference at all in practice, since the size of hypotheses we may ever actually assess is miniscule. The reason why we discus the infinite limit is because it’s easier to grasp (ironically) and prove things about.

Starting from this premise, what is this post telling me? It’s saying something about a certain idealized conception of reasoning. I can see how to transfer certain aspects of Solomonoff induction to the real world: you have Occam’s razor, you have Bayesian reasoning. Is there something to transfer here? Or did I misunderstand it completely, leading me to expect there to be something?

Of course, I’m not sure if this “focus on finite things and practicality” is the most useful way to think about it, and I’ve seen people argue otherwise elsewhere, but always very unconvincingly from my perspective. Perhaps someone here will convince me that computability should matter in practice, for some reasonable concept of practice?

Herb Ingram 18 Jul 2023 7:13 UTC
4 points
1
on: Proof of posteriority: a defense against AI-generated misinformation
Unfortunately for this scheme, I would expect rendering time for AI videos to eventually be faster than real time. So, as the post implies, even if we had a reasonably good way to prove posteriority, this may not do to certify videos as “non-AI” for long.

On the other hand, as long as rendering AI videos is slower than real time, poof of priority alone might go a long way. You can often argue that prior to some point in time you couldn’t reasonably have known what kind of video you should fake.

The “analog requirement” reminds me of physical unclonable functions, which might have some cross-pollination with this issue. I couldn’t think of a way to make use of them but maybe someone else will.

Herb Ingram 10 Jul 2023 0:12 UTC
4 points
−2
on: Some reasons to not say “Doomer”
Who is the target audience for this?

I doubt anyone has been calling themselves a “doomer”. There are people on this site who wouldn’t ever get called that but I haven’t seen anyone else here label anyone a “doomer” yet. So it seems that you’re left with people who don’t frequent this site and would probably dismiss your arguments as “a doomer complaining about being called a doomed”?

Did I miss people call each other “doomer” on LW? Did you also post something like this on Twitter?

Herb Ingram 3 Jul 2023 22:53 UTC
4 points
0
in reply to: johnlawrenceaspden’s comment on: When do “brains beat brawn” in Chess? An experiment
What point are you trying to make? I’m not sure how that relates to what I was trying to illustrate with the weather example. Assuming for the moment that you didn’t understand my point.

The “game” I was referring to was one where it’s literally all-or-nothing “predict the weather a year from now”, you get no extra points for tomorrow’s weather. This might be artificial but I chose it because it’s a common example of the interesting fact that chaos can be easier to control than simulate.

Another example. You’re trying to win an election and “plan long-term to make the best use of your intelligence advantage”, you need to plan and predict a year ahead. Intelligence doesn’t give you a big advantage in predicting tomorrow’s polls given today’s polls. I can do that reasonably well, too. In this contest, resources and information might matter a lot more than intelligence. Of course, you can use intelligence to obtain information and resources. But this bootstrapping takes time and it’s hard to tell how much depending where you start off.

Herb Ingram 8 Sep 2023 12:21 UTC
3 points
0
in reply to: lbThingrb’s comment on: AI#28: Watching and Waiting
Formally proving that some X you could realistically build has property Y is way harder than building an X with property Y. I know of no exceptions (formal proof only applies to programs and other mathematical objects). Do you disagree?

I don’t understand why you expect the existence of a “formal math bot” to lead to anything particularly dangerous, other than by being another advance in AI capabilities which goes along other advances (which is fair I guess).

Human-long chains of reasoning (as used for taking action in the real world) neither require nor imply the ability to write formal proofs. Formal proofs are about math and making use of math in the real world requires modelling, which is crucial, hard and usually very informal. You make assumptions that are obviously wrong, derive something from these assumptions, and make an educated guess that the conclusions still won’t be too far from the truth in the ways you care about. In the real world, this only works when your chain of reasoning is fairly short (human-length), just as arbitrarily complex and long-term planning doesn’t work, while math uses very long chains of reasoning. The only practically relevant application so-far seems cryptography because computers are extremely reliable and thus modeling is comparatively easy. However, plausibly it’s still easier to break some encryption scheme than to formally prove that your practically relevant algorithm could break it.

LLMs that can do formal proof would greatly improve cybersecurity across the board (good for delaying some scenarios of AI takeover!). I don’t think they would advance AI capabilities beyond the technological advances used to build them and increasing AI hype. However, I also don’t expect to see useful formal proofs about useful LLMs in my lifetime (you could call this “formal interpretability”? We would first get “informal interpretability” that says useful things about useful models.) Maybe some other AI approach will be more interpretable.

Fundamentally, the objection stands that you can’t prove anything about the real world without modeling, and modeling always yields a leaky abstraction. So we would have to figure out “assumptions that allow to prove that AI won’t kill us all while being only slightly false and in the right ways”. This doesn’t really solve the “you only get one try problem”. Maybe it could help a bit anyway?

I expect a first step might be an AI test lab with many layers of improving cybersecurity, ending at formally verified, air-gapped, no interaction to humans. However, it doesn’t look like people are currently worried enough to bother building something like this. I also don’t see such an “AI lab leak” as the main path towards AI takeover. Rather, I expect we will deploy the systems ourselves and on purpose, finding us at the mercy of competing intelligences that operate at faster timescales than us, and losing control.

Herb Ingram 15 Aug 2023 23:27 UTC
3 points
0
in reply to: the gears to ascension’s comment on: We Should Prepare for a Larger Representation of Academia in AI Safety
No offense, this reads to me as if it was deliberately obfuscated or AI-generated (I’m sure you didn’t do either of these, this is a comment on writing style). I don’t understand what you’re saying. Is it “LW should focus on topics that academia neglects”?

I also didn’t understand at all what the part starting with “social justice” is meant to tell me or has to do with the topic.

Herb Ingram 14 Jul 2023 22:15 UTC
3 points
0
in reply to: Kaj_Sotala’s comment on: Jailbreaking GPT-4′s code interpreter
I guess it depends on whether this post found anything at all that can be called questionable security practice. Maybe it didn’t but the author was also no cybersecurity expert. Upon reflection, my earlier judgement was premature and the phrasing overconfident.

In general, I assume that OpenAI would view a serious hack as quite catastrophic, as it might e.g. leak their model (not an issue in this case), severely damage their reputation and undermine their ongoing attempt at regulatory capture. However, such situations didn’t prevent shoddy security practices in countless cybersecurity desasters.

I guess for this feature even the most serious vulnerabilities “just” lead to some Azure VMs being hacked, which has no relevance for AI safety. It might still be indicative of OpenAIs approach to security, which usually isn’t so nuanced within organizations as to differ wildly between applications where stakes are different. So it’s interesting how secure the system really is, which we won’t know how untill someone hacks it or some whistleblower emerges.

Some of my original reasoning was this:

You might argue that the “inner sandbox” is only used to limit resource use (for users who do not bother jailbreaking it) and to examine how users will act, as well as how badly exactly the LLM itself will fare against jailbreaking. In this case studying how people jailbreak it may be an integral part of the whole feature.

However, even if that is the case, to count as “security mindset”, the “outer sandbox” has to be extremely good and OpenAI needs to be very sure that it is. To my (very limited) knowledge im cybersecurity, it’s an unusual idea that you can reconcile very strong security requirements with purposely not using every opportunity to make it more secure. Maybe the idea that comes closest would be a “honeypot”, which this definitely isn’t.

So that suggests they purposely took a calculated security risk for some mixture of research and commercial reasons, which they weren’t compelled to do. Depending on how dangerous they really think such AI models are or may soon become, how much what they learn from the experiment benefits future security and how confident they are in the outer sandbox, the calculated risk might make sense. Assuming by default that the outer sandbox is “normal indusitry standard”, it’s incompatible with the level of worry they claim when pursuing regulatory capture.

Herb Ingram 10 Jul 2023 19:29 UTC
3 points
2
in reply to: Richard_Ngo’s comment on: Is the Endowment Effect Due to Incomparability?
Someone who you’re likely to trade with (either because they offer you a trade or because they are around when you want to trade) are on average more experienced than you at trading. So trades available to you are disproportionately unfavorable and you cannot figure out which ones “are likely to lead to favorable trades in the future”, by assumption that they are incomparable.

This is what you mean by “trades are often adversarialy chosen” in (1.), right? I don’t understand why or in what situation you’re dismissing that argument in (1.).

There can be a lot of other reasons to avoid incomparable trades. In accepting a trade where you don’t clearly gain anything, you’re taking a risk to be cheated and reveal information to others about your preferences, which can risk social embarrassment and might enable others to cheat you in the future. You’re investing the mental effort to evaluate these things despite already having decided that you don’t stand to gain anything.

An interesting counterexample are social contexts where trading is an established and central activity. For example, people who exchange certain collectibles. In such a context, people feel that the act of trading itself has positive value and thus will make incomparable trades.

I think this situation is somewhat analogous to betting. Most people (cultures?) are averse to betting in general. Risk aversion and the known danger of gambling addiction explains aversion to betting for money/valuables. However, many people also strongly dislike betting without stakes. In some social contexts (horse racing, LW) betting is encouraged, even between “incomparable options”, where the odds correctly reflect your credence.

In such cases, most people seem to consider it impolite to answer an off-hand probability estimate by offering a bet. It is understood perhaps as questioning their credibility/competence/sincerity, or as an attempt to cheat them when betting for money. People will decline the bet but maintain their off-hand estimate. This might very well make sense, especially if they don’t explicitly value “training to make good probability estimates”, and perhaps for some of the same reasons as apply to trades?

Herb Ingram 4 Jul 2023 2:54 UTC
3 points
0
in reply to: X4vier’s comment on: When do “brains beat brawn” in Chess? An experiment
I really have no idea, probably a lot?

I don’t quite see what you’re trying to tell me. That one (which?) of my two analogies (weather or RTS) is bad? That you agree or disagree with my main claim that “evaluating the relative value of an intelligence advantage is probably hard in real life”?

Your analogy doesn’t really speak to me because I’ve never tried to start a company and have no idea what leads to success, or what resources/time/information/intelligence helps how much.

Herb Ingram 22 Aug 2023 23:08 UTC
2 points
1
on: Ideas for improving epistemics in AI safety outreach
I think any outreach must start with understanding where the audience is coming from. The people most likely to make the considerable investment of “doing outreach” are in danger of being too convinced of their position and thinking it obvious; “how can people not see this?”.

If you want to have a meaningful conversation with someone and interest them in a topic, you need to listen to their perspective, even if it sounds completely false and missing the point, and be able to empathize without getting frustrated. For most people to listen and consider any object level arguments about a topic they don’t care about, there must first be a relationship of mutual respect, trust and understanding. Getting people to consider some new ideas, rather than convincing them of some cause, is already a very worthy achievement.

Herb Ingram 16 Aug 2023 23:10 UTC
2 points
0
on: What does it mean to “trust science”?
Uncharitably, “Trust the Science” is a talking point in debates that have some component which one portrays as “fact-based” and which one wants to make an “argument” about based on the authority of some “experts”. In this context, “trust the science” means “believe what I say”.

Charitably, it means trusting that thinking honestly about some topic, seeking truth and making careful observations and measurements actually leads to knowledge, that knowledge is inteligibly attainable. This isn’t obvious, which is why there’s something there to be trusted. It means trusting that the knowledge gained this way can be useful, that it’s worth at least hearing out people who seem or claim to have it, that it’s worth stopping for a moment to honestly question one’s own motivations and priors, the origins of one’s beliefs and to ponder the possible consequences in case of failure, whenever one willingly disbelieves or dismisses that knowledge. In this context, “trust the science” means “talk to me and we’ll figure it out”.

Herb Ingram 10 Aug 2023 22:52 UTC
2 points
0
on: LLMs are (mostly) not helped by filler tokens
There has been some talk recently about long “filler-like” input (e.g. “a a a a a [...]”) somewhat derailing GPT3&4, e.g. leading them to output what seems like random parts of it’s training data. Maybe this effect is worth mentioning and thinking about when trying to use filler input for other purposes.
What links here?
- gwern's comment on LLMs are (mostly) not helped by filler tokens by Kshitij Sachan (13 Aug 2023 1:39 UTC; 5 points)

Herb Ingram 28 Aug 2023 23:20 UTC
1 point
0
in reply to: Raemon’s comment on: Assume Bad Faith
Besides thinking it fascinating and perhaps groundbreaking, I don’t really have original insights to offer. The most interesting democracies on the planet in my opinion are Switzerland and Taiwan. Switzerland shows what a long and sustained cultural development can do. Taiwan shows the potential for reform from within and innovation.

There’s a lot of material to read, in particular the events after the sunflower movement in Taiwan. Keeping links within lesswrong: https://www.lesswrong.com/posts/5jW3hzvX5Q5X4ZXyd/link-digital-democracy-is-within-reach and https://www.lesswrong.com/posts/x6hpkYyzMG6Bf8T3W/swiss-political-system-more-than-you-ever-wanted-to-know-i

Herb Ingram 28 Aug 2023 23:10 UTC
1 point
0
on: Assume Bad Faith
What’s missing in this discussion is why one is talking to the “bad faith” actor in the first place.

If you’re trying to get some information and the “bad faith” actor is trying to deceive you, you walk away. That is, unless you’re sure that you’re much smarter or have some other information advantage that allows you to get new useful information regardless. The latter case is extremely rare.

If you’re trying to convince the “bad faith” actor, you either walk away or transform the discussion into a negotiation (it arguably was a negotiation in the first place). The post is relevant for this case. In such situations, people often pretend to be having an object level discussion although all parties know it’s a negotiation. This is interesting.

Even more interesting, Politics: you’re trying to convince an amateur audience that you’re right and someone else is wrong. The other party will almost always act “in bad faith” because otherwise the discussion would be taking place without an audience. You can walk away while accusing the other party of bad faith but the audience can’t really tell if you were “just about to loose the argument” or if you were arguing “less in bad faith than the other party”, perhaps because the other party is losing the argument. Crucially, given that both parties are compelled to argue in bad faith, the audience is to some extent justified in not being moved by any object level arguments since they mostly cannot check if they’re valid. They keep to the opinions they have been holding and the opinions of people they trust.

In this case, it might be worth it to move from the above situation, where the object level being discussed isn’t the real object-level issue, as in the bird example, to one where a negotiation is taking place that is transparent to the audience. However, this is only possible if there is a competent fourth party arbitrating, as the competing parties really cannot give up the advantage of “bad faith”. That’s quite rare.

An upside: If the audience is actually interested in the truth, however, and if it can overcome the tribal drive to flock to “their side”, they can maybe force the arguing parties to focus on the real issue and make object-level arguments in such a way that the audience can become competent enough to judge the arguments.Doing this is a huge investment of time and resources. It may be helped by all parties acknowledging the “bad faith” aspect of the situation and enforcing social norms that address it. This is what “debate culture” is supposed to do but as far as I know never really has.

My takeaway: don’t be too proud of your debate culture where everyone is “arguing in good faith”, if it’s just about learning about the word. This is great, of course, but doesn’t really solve the important problems.

Instead, try to come up with a debate culture (debate systems?) that can actually transform a besides-the-point bad-faith apparent disagreement into a negotiation where the parties involved can afford to make their true positions explicitly known. This is very hard but we shouldn’t give up. For example, some of the software used to modernize democracy in Taiwan seems like an interesting direction to explore.