David James

Karma: 10

My top interest is AI safety, followed by reinforcement learning. My professional background is in software engineering, computer science, and machine learning. I have degrees in electrical engineering, liberal arts, and public policy. I currently live in the Washington, DC metro area; before that, I lived in Berkeley for about five years.

David James 5 Jun 2024 20:25 UTC
1 point
0
in reply to: jimrandomh’s comment on: The Pavlov Strategy
Reinforcement learning is not required for the analysis above. Only evolutionary game theory is needed.
- In evolutionary game theory, the population’s mix of strategies changes via replicator dynamics.
- In RL, each individual agent modifies its policy as it interacts with its environment using a learning algorithm.
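
As a rough illustration of the first bullet, here is a minimal sketch of discrete-time replicator dynamics. The payoff matrix, the strategy labels, and the number of steps are assumptions chosen for illustration, not taken from the post. Note that no individual agent learns anything; only the population shares change.

```python
def replicator_step(shares, payoff):
    """One discrete-time replicator update: shares grow in proportion to fitness."""
    n = len(shares)
    # Expected payoff of each strategy against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(shares[i] * fitness[i] for i in range(n))
    return [shares[i] * fitness[i] / mean_fitness for i in range(n)]

# Hypothetical one-shot prisoner's dilemma payoffs (row player's payoff).
# Index 0 = cooperate, index 1 = defect. Values are illustrative only
# and are kept positive so that mean fitness is never zero.
PAYOFF = [[3.0, 1.0],
          [5.0, 2.0]]

shares = [0.9, 0.1]  # start with 90% cooperators
for _ in range(50):
    shares = replicator_step(shares, PAYOFF)
```

With these made-up payoffs, defection earns more than cooperation against any mix, so the defecting share grows every generation. An RL version would instead keep one agent fixed in identity and update its policy from experienced rewards.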

David James 4 Jun 2024 16:32 UTC
1 point
−2
on: AI safety from first principles: Conclusion

Personally, I am most confident in 1, then 4, then 3, then 2 (in each case conditional on all the previous claims)

Oops. A previous version of this comment was wrong, so I edited it. The author’s confidence can be written as:

$P(1) \geq P(4 \mid 3, 2, 1) \geq P(3 \mid 2, 1) \geq P(2 \mid 1)$

$P(1) \geq P(4 \mid 3) \geq P(3 \mid 2) \geq P(2 \mid 1)$

Also, independent of the author’s confidence:

$P(1) \geq P(2) \geq P(3) \geq P(4)$
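
For reference, one standard way to relate conditional confidences like these to a joint probability is the chain rule. This is a general identity rather than something stated in the post, and reading the ordering in terms of cumulative conjunctions is an assumption here:

$P(1, 2, 3, 4) = P(1) \, P(2 \mid 1) \, P(3 \mid 2, 1) \, P(4 \mid 3, 2, 1)$

Because each factor is at most 1, it follows that $P(1) \geq P(1, 2) \geq P(1, 2, 3) \geq P(1, 2, 3, 4)$, whatever the individual values are.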

David James 30 May 2024 3:10 UTC
1 point
0
on: My hour of memoryless lucidity
thereby writing directly into your brain’s long-term storage and bypassing the cache that would otherwise get erased
What do we know about “writing directly” into long-term storage versus a short-term cache? What studies? Any theories about the mechanism(s)?

[Question] Inviting discussion of “Beat AI: A contest using philosophical concepts”

David James 29 May 2024 11:55 UTC

2 points

1 comment · 1 min read · LW link

David James 27 May 2024 15:06 UTC
1 point
0
in reply to: Johannes C. Mayer’s comment on: Spaghetti Towers
First, thank you for writing this. I would ask that you continue to think & refine and share back what you discover, prove, or disprove.

To me, it seems quite likely that B will have a lot of regularity to it. It will not be good code from the human perspective, but there will be a lot of structure I think, simply because that structure is in T and the environment.

I’m interested to see if we can (i) do more than claim this is likely and (ii) unpack reasons that might require that it be the case.

One argument for (ii) would go like this. Assume the generating process for A has a preference for shorter programs. Then we can think of A as tending to find shorter descriptions that match task T.

Claim: shorter (and correct) descriptions reflect some combination of environmental structure and compression.
- by ‘environmental structure’ I mean the laws underlying the task.
- by ‘compression’ I mean using information theory embodied in algorithms to make the program smaller
I think this claim is true, but let’s not answer that too quickly. I’d like to probe this question more deeply.
1. Are there more than two factors (environmental structure & compression)?
2. Is it possible that the description gets the structure wrong but makes up for it with great compression? I think so. One can imagine a clever trick by which a small program expands itself into something like a big ball of mud that solves the task well.
3. Any expansion process takes time and space. This makes me wonder if we should care not only about description length but also run time and space. If we pay attention to both, it might be possible to penalize programs that expand into a big ball of mud.
4. However, penalizing run time and space might be unwise, depending on what we care about. One could imagine a program that starts with first principles and derives higher-level approximations that are good enough to model the domain. It might be worth paying the cost of setting up the approximations because they are used frequently. (In other words, the amortized cost of the expansion is low.)
5. Broadly, what mathematical tools can we use on this problem?
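
One existing tool that bears on points 3 and 4 is Levin's Kt complexity, which charges a program for its length plus the logarithm of its running time. The sketch below only illustrates that trade-off; the candidate names, lengths, and step counts are invented, and `kt_cost` is a hypothetical helper.

```python
import math

def kt_cost(length_bits, steps):
    """Levin-style cost: description length plus log2 of running time."""
    return length_bits + math.log2(steps)

# Hypothetical candidates that all solve task T (all numbers are invented).
candidates = {
    "structured": {"length_bits": 400, "steps": 10_000},
    "expanding_trick": {"length_bits": 120, "steps": 2 ** 300},
    "lookup_table": {"length_bits": 5_000, "steps": 100},
}

costs = {name: kt_cost(**spec) for name, spec in candidates.items()}
# Winner under description length alone.
shortest = min(candidates, key=lambda name: candidates[name]["length_bits"])
# Winner once running time is also charged for.
cheapest = min(costs, key=costs.get)
```

Under these assumed numbers, the tiny program that expands into a big ball of mud wins on length alone but loses once its running time is counted. Because the time penalty is only logarithmic, a moderate one-off setup cost of the kind described in point 4 is penalized lightly.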

David James 27 May 2024 14:51 UTC
3 points
0
in reply to: Christian Z R’s comment on: Spaghetti Towers
See also Nomic, a game by Peter Suber where a move in the game is a proposal to change the rules of the game.

David James 27 May 2024 14:46 UTC
1 point
0
in reply to: Davidmanheim’s comment on: Spaghetti Towers
I grant that legalese increases the total page count, but I don’t think it necessarily changes the depth of the tree very much (by depth I mean how many documents refer back to other documents).

I’ve seen spaghetti towers written in very concise computer languages (such as Ruby) that nevertheless involve perhaps 50+ levels (in this context, a level is a function call).

David James 27 May 2024 14:42 UTC
1 point
2
in reply to: mako yass’s comment on: Spaghetti Towers
In my experience, programming languages with {static or strong} typing are considerably easier to refactor in comparison to languages with {weak or dynamic} typing.*

* The {static vs dynamic} and {strong vs weak} dimensions are sometimes blurred together, but this Stack Overflow Q&A unpacks the differences pretty well.

David James 27 May 2024 14:25 UTC
1 point
0
in reply to: waveman’s comment on: Spaghetti Towers

No source code

I get the intended meaning, but I would like to make the words a little more precise. While we can find the executable source code (DNA) for an organism, that DNA is far from a high-level language.

David James 26 May 2024 12:52 UTC
1 point
0
on: Smart People are Probably Dangerous
I got minimal value from the article as written, but I’m hoping that a steel-man version might be useful. In that spirit, I can grant a narrower claim: Smart people have more capability to fool us, all other things equal. Why? Because increased intelligence brings increased capability for deception.
- This is as close to a tautology as I’ve seen in a long time. What predictive benefit comes from tautologies? I can’t think of any.
- But why focus on capability? Probability of harm is a better metric.
- Now, with that in mind, one should not assume a straight line between capability and probability of harm. One should look at all potential causal factors.
- More broadly, the “all other things equal” part is problematic here. I will try to write more on this topic when I have time. My thoughts are not fleshed out yet, but I think my unease has to do with how ceteris paribus imposes constraints on a system. The claim I want to examine would go something like this: those constraints “bind” the system in ways that prevent proper observation and analysis.

David James 23 May 2024 4:54 UTC
3 points
0
in reply to: Daniel Kokotajlo’s comment on: The Bottom Line
If instead you keep deliberating until the balance of arguments supports your preferred conclusion, you’re almost guaranteed to be satisfied eventually!

Inspired by the above, I offer the pseudo code version...
```
loop {
    if assess(args, weights) > 1 { // assess active arguments
        break; // preferred conclusion is "proved"
    } else {
        arg = biased_sample(remaining_args); // without replacement
        args.insert(arg);
        optimize(args, weights); // mutates weights to maximize `assess(args, weights)`
    }
}
```
… the code above implements “the balance of arguments” as a function parameterized with weights. This allows for using an optimization process to reach one’s desired conclusion more quickly :)
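
For anyone who wants to run it, here is one possible concrete version of the loop in Python. Everything beyond the loop structure is an assumption made for illustration: arguments are modeled as signed strengths, `biased_sample` always grabs the most favorable remaining argument, and `optimize` simply zeroes the weight of anything unfavorable.

```python
def assess(args, weights):
    """Weighted balance of the active arguments (positive favors the conclusion)."""
    return sum(weights[name] * strength for name, strength in args.items())

def biased_sample(remaining):
    """Pick the remaining argument most favorable to the preferred conclusion."""
    return max(remaining, key=remaining.get)

def optimize(args, weights):
    """Choose weights in [0, 1] that maximize assess(): keep pros, ignore cons."""
    for name, strength in args.items():
        weights[name] = 1.0 if strength > 0 else 0.0

def deliberate(pool):
    """Keep adding arguments until the preferred conclusion is 'proved'."""
    remaining = dict(pool)
    args, weights = {}, {}
    while assess(args, weights) <= 1:
        if not remaining:
            return None  # ran out of arguments before being "convinced"
        name = biased_sample(remaining)
        args[name] = remaining.pop(name)  # sample without replacement
        optimize(args, weights)
    return sorted(args)

# Hypothetical argument strengths; negative values count against the conclusion.
POOL = {"a": 0.7, "b": -0.9, "c": 0.6, "d": -0.4}
considered = deliberate(POOL)
```

With this made-up pool, the loop stops after looking at only the two favorable arguments and never examines the unfavorable ones.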

David James 23 May 2024 1:03 UTC
1 point
0
in reply to: Raemon’s comment on: You Have About Five Words
Thanks for your quick answer—you answered before I was even done revising my question. :) I can personally relate to Dan Luu’s examples. / This immediately makes me want to find potential solutions, but I won’t jump to any right now. / For now, I’ll just mention the ways in which Jacob Collier can explain music harmony at many levels.

David James 23 May 2024 0:24 UTC
1 point
0
on: You Have About Five Words
Preface: I feel like I’m wearing the clown suit to a black tie event here. I’m new to LW and respect the high standards for discussion. So, I’ll treat this as an experiment. I’d rather be wrong, downvoted, and (hopefully) enlightened & persuaded than have this lingering suspicion that the emperor has no clothes.

I should also say that I personally care a lot about the topic of communication and brevity, because I have a tendency to say too much at one time and/or use the wrong medium in doing so. If anyone needs to learn how to be brief, it is me, and I’ll write a few hundred words if necessary to persuade you of it.

Ok, that said, here are my top two concerns with the article: (1) This article strikes me as muddled and unclear. (i) I don’t understand what “get” five words even means. (ii) I don’t understand how coordination relates to the core claims or insight. My confusion leads to my second concern: (2) what can I take from this article?

Let’s start with the second part. Is the author saying if I’m a CEO of a company of thousands I only “get” five words?

A quick aside: to me, “get” is an example of muddled language. What does the author mean w.r.t. (a) time period; (b) … struggling for the right words here … meaning? As to (a), do I “get” five words per message? Or five words per some (unspecified) time frame? As to (b), is “get” a proxy for how many words the recipient/audience will read? But reading isn’t enough for coordination, so I expect the author means something more. Does the author mean “read and understand” or “read and internalize” or “read and act on”?

Anyhow, due to the paragraph above, I don’t know how to convert “You only get five words” into a prediction. In this sense, to me, the claim isn’t even wrong, because I don’t know how to put it into practice.

Normally I would stop here, put the article aside, and move on. However, this article is featured here on LW and has many up-votes which suggests that others get a lot of value out of it. So I’m curious: what am I missing? Is there some connection to EA that makes this particularly salient, perhaps?

I have a guess that fans of the article have some translation layer that I’m missing. Perhaps if I could translate what the author means by get and coordination I would have the ah-ha moment.

To that end, would someone be so kind as to (a) summarize the key point(s) as simply as possible; with (b) clear intended meanings for “coordinate” and “get” (as in you only “get” X words), including what timeframe we’re talking about; and (c) the logic and evidence for the claims?

It is also possible that I’m not “calibrated” with the stated Epistemic Status:

all numbers are made up and/or sketchily sourced. Post errs on the side of simplistic poetry – take seriously but not literally.

Ok, but what does this mean for the reader? The standards of rationality still apply, right? There should still be some meaningful, clear, testable takeaway, right?

David James 8 May 2024 10:57 UTC
1 point
0
in reply to: Kabir Kumar’s comment on: Workshop (hackathon, residence program, etc.) about for-profit AI Safety projects?
Would you please expand on how ai-plans.com addresses the question from the post above … ?

Maybe let’s try to make a smart counter-move and accelerate the development of for-profit AI Safety projects [...] ? With the obvious idea to pull some VC money, which is a different pool than AI safety philanthropic funds.

I took a look at ai-plans, but I have yet to find information about:
1. How does it work?
2. Who created it?
3. What is the motivation for building it?
4. What problem(s) will ai-plans help solve?
5. Who controls / curates / moderates it?
6. What is the process/algorithm for: curation? moderation? ranking?
I would suggest (i) answering these questions on the ai-plans website itself then (ii) adding links here.

David James 7 May 2024 10:42 UTC
1 point
0
in reply to: JonStall’s comment on: If You Demand Magic, Magic Won’t Help
Let’s step back. This thread of the conversation is rooted in this claim: “Let’s be honest: all fiction is a form of escapism.” Are we snared in the Disputing Definitions trap? To quote from that LW article:

if the issue arises, both sides should switch to describing the event in unambiguous lower-level constituents, like acoustic vibrations or auditory experiences. Or each side could designate a new word, like ‘alberzle’ and ‘bargulum’, to use for what they respectively used to call ‘sound’; and then both sides could use the new words consistently. That way neither side has to back down or lose face, but they can still communicate. And of course you should try to keep track, at all times, of some testable proposition that the argument is actually about.

I propose that we recognize several lower-level testable claims, framed as questions. How many people read fiction to …
1. entertain?
2. distract from an unpleasant reality?
3. understand the human condition (including society)?
4. think through alternative scenarios?
Now I will connect the conversation to these four points:
- Luke_A_Somers wrote “Why would I ever want to escape from my wonderful life to go THERE?” which relates to #2.
- thomblake mentions The Philosophy of Horror. Consider this quote from the publisher’s summary: ”… horror not only arouses the senses but also raises profound questions about fear, safety, justice, and suffering. … horror’s ability to thrill has made it an integral part of modern entertainment.” which suggests #1 and #3.
- JonStall pulls out the dictionary in the hopes of “settling” the debate. He’s talking about #1.
- Speaking for myself, when reading e.g. the embedded story The Tale of the Omegas in Life 3.0, my biggest takeaway was #4.
Does this sound about right?

David James 6 May 2024 3:55 UTC
1 point
0
in reply to: Dawn Drescher’s comment on: MIRI announces new “Death With Dignity” strategy
If we know a meteor is about to hit earth, having only D days to prepare, what is rational for person P? Depending on P and D, any of the following might be rational: throw an end of the world party, prep to live underground, shoot ICBMs at the meteor, etc.

David James 6 May 2024 3:06 UTC
2 points
0
in reply to: Jim Fisher’s comment on: Announcing the LessWrong Curated Podcast
I listened to part of “Processor clock speeds are not how fast AIs think”, but I was disappointed by the lack of a human narrator. I am not interested in machine readings; I would prefer to go read the article.

David James 1 May 2024 9:52 UTC
1 point
0
in reply to: Cyan2’s comment on: How An Algorithm Feels From Inside
For Hopfield networks in general, convergence is not guaranteed. See [1] for convergence properties.

[1] J. Bruck, “On the convergence properties of the Hopfield model,” Proc. IEEE, vol. 78, no. 10, pp. 1579–1585, Oct. 1990, doi: 10.1109/5.58341.
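
A small sketch of one way convergence can fail: with synchronous updates, even a symmetric two-unit network can oscillate, whereas asynchronous updates of the same network settle. The weights, the starting state, and the tie-breaking rule in `sign` are chosen for illustration and are not taken from [1].

```python
def sign(x):
    """Threshold activation; ties at zero are broken toward +1 (an assumption)."""
    return 1 if x >= 0 else -1

def synchronous_step(weights, state):
    """Update every unit at once, all reading the previous state."""
    n = len(state)
    return [sign(sum(weights[i][j] * state[j] for j in range(n))) for i in range(n)]

def asynchronous_sweep(weights, state):
    """Update units one at a time, each seeing the latest values."""
    state = list(state)
    n = len(state)
    for i in range(n):
        state[i] = sign(sum(weights[i][j] * state[j] for j in range(n)))
    return state

# Two units with a symmetric inhibitory connection and no self-connections.
W = [[0, -1],
     [-1, 0]]

start = [1, 1]
sync_1 = synchronous_step(W, start)       # both units flip
sync_2 = synchronous_step(W, sync_1)      # both flip back: a cycle of length 2
async_1 = asynchronous_sweep(W, start)    # only the first unit flips
async_2 = asynchronous_sweep(W, async_1)  # nothing changes: a fixed point
```

So whether the network converges depends on the update schedule as well as on the weights.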

David James 1 May 2024 9:32 UTC
1 point
0
on: How An Algorithm Feels From Inside
The audio reading of this post [1] mistakenly uses the word hexagon instead of pentagon; e.g. “Network 1 is a hexagon. Enclosed in the hexagon is a five-pointed star”.

[1] [RSS feed](https://intelligence.org/podcasts/raz); various podcast sources and audiobooks can be found [here](https://intelligence.org/rationality-ai-zombies/)

David James 27 Apr 2024 23:56 UTC
1 point
0
in reply to: Chris_Leong’s comment on: Paul Christiano named as US AI Safety Institute Head of AI Safety
I’m not so sure.

I would expect that a qualified, well-regarded leader is necessary, but I’m not confident it is sufficient. Other factors might dominate, such as: budget, sustained attention from higher-level political leaders, quality and quantity of supporting staff, project scoping, and exogenous factors (e.g. AI progress moving in a way that shifts how NIST wants to address the issue).

What are the most reliable signals for NIST producing useful work, particularly in a relatively new field? What does history show us? What kind of patterns do we find when NIST engages with: (a) academia; (b) industry; (c) the executive branch?
