Why did FHI get closed down? In the end, because it did not fit in with the surrounding administrative culture. I often described Oxford as a coral reef of calcified institutions built on top of each other, a hard structure that had emerged organically and haphazardly and hence had many little nooks and crannies where colorful fish could hide and thrive. FHI was one such fish, but it grew too big for its hole. At that point it either became vulnerable to predators or had to enlarge the hole, upsetting the neighbors. When an organization grows in size or influence, it needs to scale in the right way to function well internally – but it also needs to scale its relationships to the environment to match what it is.
I don’t quite get what actions are available in the heat engine example.
Is it just choosing either a random bit from H or C (in which case we can't see whether it's 0 or 1) or a specific bit from W (in which case we do know whether it's 0 or 1), and moving it to another pool?
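For concreteness, here's a minimal sketch of the action space as I understand the question; the pool names H, C, W and the two move types come from the question itself, while the pool sizes and everything else are my assumptions:

```python
import random

# Sketch (assumptions mine): H (hot) and C (cold) are pools of bits whose
# values we cannot inspect; W (work/memory) is a pool whose bits we can
# inspect before choosing one to move.
pools = {
    "H": [random.randint(0, 1) for _ in range(8)],  # hot pool, values hidden
    "C": [random.randint(0, 1) for _ in range(8)],  # cold pool, values hidden
    "W": [0] * 4,                                   # work pool, values visible
}

def move_random_bit(src: str, dst: str) -> None:
    """Action type 1: move a uniformly random bit from H or C, unseen."""
    i = random.randrange(len(pools[src]))
    pools[dst].append(pools[src].pop(i))

def move_known_bit(index: int, dst: str) -> None:
    """Action type 2: move a specific, inspected bit from W."""
    pools[dst].append(pools["W"].pop(index))

move_random_bit("H", "W")  # we don't learn the value of the bit we moved
move_known_bit(0, "C")     # we picked this bit knowing its value
```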
Any thoughts on Symbolica? (or “categorical deep learning” more broadly?)
All current state-of-the-art large language models, such as ChatGPT, Claude, and Gemini, are based on the same core architecture. As a result, they all suffer from the same limitations.
Extant models are expensive to train, complex to deploy, difficult to validate, and infamously prone to hallucination. Symbolica is redesigning how machines learn from the ground up.
We use the powerfully expressive language of category theory to develop models capable of learning algebraic structure. This enables our models to have a robust and structured model of the world, one that is explainable and verifiable.
It’s time for machines, like humans, to think symbolically.
How likely is it that Symbolica [or something similar] produces a commercially viable product?
How likely is it that Symbolica creates a viable alternative to current/classical deep learning?
I don’t think it’s that different from the intentions behind Conjecture’s CoEms proposal. (And it looks like Symbolica have more theory and experimental results backing up their ideas.)
Symbolica don’t use the framing of AI [safety/alignment/X-risk], but many people behind the project are associated with the Topos Institute that hosted some talks from e.g. Scott Garrabrant or Andrew Critch.
What is the expected value of their research for safety/verifiability/etc?
Sounds relevant to @davidad’s plan, so I’d be especially curious to know his take.
How likely is it that whatever Symbolica produces meaningfully contributes to doom (e.g. by advancing capabilities research without at the same time sufficiently/differentially advancing interpretability/verifiability of AI systems)?
(There’s also PlantingSpace but their shtick seems to be more “use probabilistic programming and category theory to build a cool Narrow AI-ish product” whereas Symbolica want to use category theory to revolutionize deep learning.)
(I skipped straight to ch7, following your advice, so I may be missing relevant parts from the previous chapters if there are any.)
I probably agree with you on the object level regarding phenomenal consciousness.
That being said, I think it’s “more” than a meme. I witnessed at least two people not exposed to the scientific/philosophical literature on phenomenal consciousness reinvent/rediscover the concept on their own.
It seems to me that the first-person perspective we necessarily adopt inclines us to ascribe to sensations/experiences some ineffable, seemingly irreducible quality. My guess is that we (re)perceive our perception as a meta-modality different from ordinary modalities like vision, hearing, etc., and that this causes the illusion. It's plausible that being raised in a WEIRD culture contributes to that inclination.
A butterfly conjecture: While phenomenal consciousness is an illusion, there is something to be said about the first-person perspective being an interesting feature of some minds (sufficiently sophisticated? capable of self-reflection?). It can be viewed as a computational heuristic that makes you “vulnerable” to certain illusions or biases, such as phenomenal consciousness, but also:
the difficulty of accepting one-boxing in Newcomb's problem
mind-body dualism
the naive version of the free-will illusion, and difficulty in accepting physicalism/determinism
(maybe) the illusion of being in control over your mind (various sources say that meditation-naive people are often surprised to discover how little control they have over their own mind when they first try meditation)
A catchy term for this line of investigation could be “computational phenomenology”.
Wouldn’t total updatelessness amount to constantly taking one action?
If not, I’m missing something important and would appreciate an explanation of what it is that I’m missing.
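A toy contrast that might locate the confusion (my framing, not from the original discussion): on the standard updateless setup, the agent fixes a policy from observations to actions in advance, which is not the same as fixing a single action.

```python
# Toy sketch (my framing): "never updating" fixes a *policy*, not a single
# action, so behavior can still vary with observations.

def constant_action(obs: str) -> str:
    """Always the same action, regardless of observation."""
    return "press_button"

def committed_policy(obs: str) -> str:
    """Chosen once, up front, for prior expected utility; never revised,
    but it still maps different observations to different actions."""
    return "press_button" if obs == "red" else "wait"

for obs in ["red", "green"]:
    print(obs, "->", constant_action(obs), "|", committed_policy(obs))
```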
If a reasonable agent expects itself to perform some function satisfactorily, then according to that agent, that agent ought to perform that function satisfactorily.
[this] is somewhat subtle. If I use a fork as a tool, then I am applying an “ought” to the fork; I expect it ought to function as an eating utensil. Similar to using another person as a tool (alternatively “employee” or “service worker”), giving them commands and expecting that they ought to follow them.
Can you taboo ought? I think I could rephrase these as:
I am trying to use a fork as an eating utensil because I expect that if I do, it will function like I expect eating utensils to function.
I am giving a person commands because I expect that if I do, they will follow my commands. (Which is what I want.)
More generally, there’s probably a difference between oughts like “I ought to do X” and oughts that could be rephrased in terms of conditionals, e.g.
“I believe there’s a plate in front of me because my visual system is a reliable producer of visual knowledge about the world.”
to
“Conditional on my visual system being a reliable producer of visual knowledge about the world, there's a plate in front of me; and because I have a very high credence in the condition, I have a similarly high credence in the conclusion.”
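To make the inference explicit with probabilities (numbers made up for illustration):

$$P(\text{plate}) \;\ge\; P(\text{plate} \mid \text{reliable})\, P(\text{reliable})$$

so if, say, $P(\text{reliable}) = 0.95$ and $P(\text{plate} \mid \text{reliable}) = 0.99$, then $P(\text{plate}) \ge 0.94$.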
Maybe it's a good time to make something like a semi-official/curated LW playlist? Do we have enough material for that? Aside from this album and the foreign aid song, I only recall a song about killing the dragon (as an analogy for defeating death), but I can't find it right now.
For people who (like me immediately after reading this reply) are still confused about the meaning of "humane/acc", the header photo of Critch's X profile is reasonably informative.
Moreover, legal texts are not super strict (much is left to interpretation) and we are often selective about “whether it makes sense to apply this law in this context” for reasons not very different from religious people being very selective about following the laws of their holy books.
I fully agree that something like persistence/[continued existence in ~roughly the same shape] is the most natural/appropriate/joint-carving way to think about whatever-natural-selection-is-selecting-for in its full generality. (At least that’s the best concept I know at the moment.)
(Although there is still some sloppiness in what it means for a thing at time t0 to be "the same" as some other thing at time t1.)
This view is not entirely novel; see, e.g., Bouchard's PhD thesis (from 2004) or the SEP entry on "Fitness" (Ctrl+F "persistence").
I also agree that [humans are]/[humanity is] obviously massively successful on that criterion.
I’m very uncertain as to what implications this has for AI alignment.
I think it might have been kinda the other way around. We wanted to systematize (put on a firm, principled grounding) a bunch of related stuff like care-based ethics, individuality, identity (and the void left by the abandonment of the concept of “soul”), etc, and for that purpose, we coined the concept of (phenomenal) consciousness.
Reminds me of https://atheistethicist.blogspot.com/2011/12/basic-review-of-desirism.html?m=1
I’m not aware of any, but you may call it “hybrid ontologies” or “ontological interfacing”.
There is an unsolved meta-problem but the meta-problem is an easy problem.
My impression was that people stopped seriously working on debate a few years ago. ETA: I was wrong.
I think a proof of not-not-X[1] is more apt to be called “a co-proof of X” (which implies X if you [locally] assume the law of the excluded middle).
Or, weaker, very strong [evidence of]/[argument for] not-not-X.
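A minimal Lean sketch of the asymmetry (standard constructive-logic facts, not from the original comment): X → ¬¬X is provable intuitionistically, while the converse ¬¬X → X needs classical reasoning.

```lean
-- X → ¬¬X holds intuitionistically (no classical axioms needed):
theorem intro_double_neg {X : Prop} (h : X) : ¬¬X :=
  fun hn => hn h

-- The converse, ¬¬X → X, requires the law of excluded middle
-- (here via Lean's Classical namespace):
theorem elim_double_neg {X : Prop} (h : ¬¬X) : X :=
  Classical.byContradiction h
```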
The Litany of Tarrrrski is beyond wholesome!
Thank you for doing this!
Missing subject: who was performing? I guess WIV?
FYI this link is broken
I have the mild impression that Jacqueline Carey's Kushiel trilogy is somewhat popular in the community?[1] Is that true, and if so, why?
E.g. Scott Alexander references Elua in Meditations on Moloch, and I know of at least one prominent LWer who was a big enough fan of it to reference Elua in their discord handle.