Hi, I am a Physicist, an Effective Altruist and AI Safety researcher.
Linda Linsefors
Rotations in Superposition
Hugs!
I would recommend that anyone with dependents, or any other need for economic stability (e.g. lack of a safety net from your family or country), should focus on earning money.
You can save up and fund yourself. Or, if that takes too long, you can give 10% (or whatever works for you) to support someone else.
Definitely yes to more honesty!
However, I think it’s unfair to describe all the various AI safety programs as “MATS clones”. E.g. AISC is both older and quite different.
But no amount of “creative ways to bridge the gap” will solve the fundamental problem, because there isn’t really a gap. It’s not as if there are lots of senior jobs waiting to be filled if only we could level people up faster. The simple fact is that there isn’t enough money.
Solstice Singalong Watch Party
So the section headings are not about the transmission type investigated, but which transmission type the studies pointed to as the leading one?
Datapoint: I found EAG to be valuable when I lived in Sweden. After moving to London, I completely lost interest. I don’t need it anymore.
I’m confused by the section headings.
“The large particle test” and “The small particle test” you write about under “Fomites” seem to be about Aerosols.
The experiments described under “Aerosols” seem to be either about mixed transmission or Fomites only, e.g. passing around cards and poker chips.
Am I misunderstanding something?
I remember reading that some activation steering experiments used steering vectors scaled up by x100. Is this correct? That would be much larger than the normal activations in that layer, right?
How does the model deal with this? If there is any superposition going on, I’d expect activations to spill over everywhere, breaking everything.
If you amplify the activations in one layer, what effect does that have on the magnitude of the activations in the next layer? If the effect is smaller, by what mechanism? That mechanism is probably an error correction algorithm. Or just some suppression aimed at not letting the network think about too many things at once, in order to keep interference down.
Does anyone have experience with activation steering, and would be up for helping me out? E.g. answer my questions, jump on a call, or help me set up my own experiments?
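A toy sketch of the spill-over worry, not a claim about how real models behave: three hypothetical feature directions are packed into two dimensions (so they cannot be orthogonal), and a steering vector along one feature is added at scale 1 and scale 100. The directions, the linear readout and the function names are all assumptions made up for illustration.

```python
import math

# Hypothetical setup: 3 unit feature directions in 2 dimensions, 120 degrees apart.
# With more features than dimensions, the directions necessarily overlap.
features = [
    (math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
    for k in range(3)
]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def steer(activation, direction, scale):
    """Add `scale` times a steering direction to an activation vector."""
    return tuple(a + scale * d for a, d in zip(activation, direction))


def readout(activation):
    """Assumed linear readout: project the activation onto each feature direction."""
    return [dot(activation, f) for f in features]


baseline = features[1]  # an activation where only feature 1 is "on"

for scale in (1, 100):
    steered = steer(baseline, features[0], scale)
    print(scale, [round(r, 2) for r in readout(steered)])
```

In this toy, the readouts of the two features that were not steered shift in proportion to the scale, because each overlaps with the steered direction. Whether anything downstream (normalisation, nonlinearities, learned suppression) damps this in a real network is exactly the open question above.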
I mean trying to signal something more specific than, e.g., dressing according to the norms of one’s profession. Anything that the person would expect others to understand as some other information than “I belong here” or “I have X official role”.
E.g. wearing a high-vis vest if you’re a road worker, or wearing nicer clothes if you’re at a dress-up occasion, does not count. Wearing a T-shirt advertising that you like chess counts if and only if you’re not currently at a chess club and you chose it deliberately.
Thanks :)
I will reveal the true answer to 2 in about a week, in case anyone else wants to take a guess.
To some extent “goodness” is some ever-moving negotiated set of norms of how one should behave.
I notice that when I use the word “good” (or invoke this concept using other words such as “should”), I don’t use it to point to the existing norms, but as a bid for what I think these norms should be. This sometimes overlaps with the existing norms and sometimes not.
E.g. I might say that it’s good to allow lots of different subcultures to co-exist. This is a vote for a norm where people who don’t like my subculture leave me and my friends alone, in exchange for us leaving them alone. This is not unrelated to me getting what is yummy to me, but it is at least one step removed.
“Good” is the set of norms we use to coordinate cooperation. If most people don’t like it when you pick your nose in public, then it’s good to make an effort not to do so, and similarly for a lot of other values. Even if you don’t care about the nose picking, you probably care about some of the other things “good” coordinates around. For most people it’s probably worth supporting the package deal. But I also think you “should” use your voice to help improve the notion of what is “good”.
In this example, Mr. A has learned the average numbers of red, yellow, and green orders for some past days and wants to update his predictions of today’s orders on this information. So he decides that the expected values of his distributions should be equal to those averages, and that he should find the distribution that makes the least assumptions, given those constraints. I at least agree that entropy is a good measure of how little assumptions your distribution makes. The point I’m confused about is how you get from “the average of this number in past observations is N” to “the expected value of our distribution for a future observation has to be N but we should put no other information in it”.
I agree that it’s implausible that Mr A has enough data to be confident of the averages, but not enough data to draw any other conclusions. Such is often the case with math exercises. :shrug:
Second, why are you even finding a distribution that is constrainedly optimal in the first place, rather than just taking your prior distribution over sequences of results and your observations, and using Bayes’ Theorem to update your probabilities for future results? Even if you don’t know anything other than the average value, you can still take your distribution over sequences of results, update it on this information (eliminating the possible outcome sequences that don’t have this average value), and then find the distribution P(NextResult|AverageValue) by integrating P(NextResult|PastResults)P(PastResults|AverageValue) over the possible PastResults. This seems like the correct thing to do according to Bayesian probability theory, and it’s very different from doing constrained optimization to find a distribution.
In the example in the post, what would you say is the “prior distribution over sequences of results”? All Mr A has is a probability distribution for widgets each day. If I naively turn that into a distribution over sequences of widget orders, the simplest option is to assume an independent draw from that distribution each day. But then Mr A is in the same situation as the “poorly informed robot”.
The reason one can’t use Bayes’ rule in this case is a type error. If Mr A had a prior probability distribution over probability distributions, P[P_i], then he could use Bayes’ rule to calculate a posterior P[P_i], and then sum over it: P_final = Sum_i P[P_i] P_i. But the problem with this is that the answer will depend on how you generalise from P[N,N,N] to P[P_i], and there isn’t a unique way to do this.
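A minimal sketch of that last point, with made-up numbers: given a handful of hypothetical candidate distributions P_i and some invented observations, the recipe P_final = Sum_i P[P_i] P_i can be carried out, but two different choices of the prior P[P_i] give two different answers. The candidates, the observations and the function name are all assumptions for illustration, not anything from the post.

```python
from math import prod

# Hypothetical candidate distributions P_i over three outcomes (0, 1, 2).
candidates = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
]


def posterior_predictive(hyperprior, observations):
    """Update P[P_i] with Bayes' rule, then return P_final = Sum_i P[P_i] P_i."""
    # Likelihood of the observations under each candidate, assuming independent draws.
    likelihoods = [prod(p[x] for x in observations) for p in candidates]
    unnormalised = [w * l for w, l in zip(hyperprior, likelihoods)]
    total = sum(unnormalised)
    posterior = [u / total for u in unnormalised]
    n_outcomes = len(candidates[0])
    return [
        sum(w * p[k] for w, p in zip(posterior, candidates))
        for k in range(n_outcomes)
    ]


observations = [1, 1, 2, 0, 1]  # invented data

p_uniform = posterior_predictive([1 / 3, 1 / 3, 1 / 3], observations)
p_skewed = posterior_predictive([0.8, 0.1, 0.1], observations)

print("uniform prior over P_i:", [round(x, 3) for x in p_uniform])
print("skewed prior over P_i: ", [round(x, 3) for x in p_skewed])
```

Both runs see the same data and follow the same rule, yet the resulting P_final differs, because the choice of P[P_i] is extra input that the original problem statement does not supply.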
The same concept was independently invented by a larp organiser I know. Unfortunately I strongly dislike the words they chose, so I will not repeat them. But it occurs to me that the concept of “final responsibility”, or “the buck stops here”, is so universally useful that it’s weird that there isn’t some more common term for it.
I notice that everything you list has to do with finding things. This matches my experience. Printing is hell whenever I try to print somewhere new. And since I print so rarely nowadays, this is the typical experience. But I remember a time when I printed more often; then it was mostly just click “print” and it worked.
It seems like printers are built to be set up once, and then be your forever printer? Which is no longer a good match for how you (and I) use printers.
Questions for John or anyone that feels like answering:
What percentage of people around you do you think are trying to signal anything with their outfit?
(If you’ve met and remember me) Do you think I’m trying to signal anything, and in that case what?
I designed and had printed physical Hero Licences (business card size), which I’ve handed out at various EAGs. If anyone wants a stack to boost your Mysterious Old Wizard powers, let me know.
I got them because I thought it would be a good idea, because I noticed that some people just need permission. But even so, I was surprised how much these were appreciated.
This post is also a good description of why I’m typically not interested in someone else’s steel-manning or devil’s-advocating for a position they don’t hold. The result is often a shallow simulation, in some ways similar to an LLM output, and uninteresting for the same reasons.
I didn’t have this analogy until now, because I’ve been annoyed at this since before the LLM era, and I didn’t make the connection until this post.
I’m surprised that instrumental convergence wasn’t covered in the book. I didn’t even notice it was left out until reading this review.
Here are some alternative sources if anyone prefers text over video:
Yes, I just remembered that I forgot to do this. Oops.
I chose my clothing based on:
1. Comfort
2. Fitting in (not always the same as blending in)
3. Things I like how they look
The list is roughly in order of priority, and I don’t wear anything that does not at least satisfice some base level of each.
Point 2 depends on the setting. E.g. I wouldn’t go to a costume party without at least an attempt at a costume. Also, at a costume party, a great costume scores better on 2 than an average one; this is an example of fitting in not being the same as blending in.
In general 2 is not very constraining. There are a lot of different looks that qualify as fitting in, in most places I hang out, but I would still probably experiment with more unusual looks if I was less conformist. And I would be naked a lot more, if that was normal.
I’m emotionally conformist. But I expect a lot of people I meet don’t notice this, because I’m also bad at conforming. There is just so much else pulling in other directions.