Karma: 35

# Idea: NV⁻ Centers for Brain Interpretability

18 Feb 2024 5:28 UTC
10 points
• Perhaps “fit”, from the Latin fio (come about) + English fit (fit). An object must fit, survive, and spread.

• To see how much the minimal point contributes to the integral we can integrate it in its vicinity

I think you should be looking at the entire stable island, not just integrating from zero to one. I expect you could get a decent approximation with Lie transform perturbation theory, and this looks similar to the idea of macro-states in condensed matter physics, but I’m not knowledgeable in these areas.

• $-\sum_{i=1}^{N} \log p(y_i \mid x_i, w)$

You have a typo, the equation after Free Energy should start with

Also the third line should be , not minus.

Also, usually people use for model parameters (rather than ). I don’t know the etymology, but game theorists use the same letter (for “types” = models of players).

• Also sometimes when I explain what a hyperphone is well enough for the other person to get it, and then we have a complex conversation, they agree that it would be good. But very small N, like 3 to 5.

It’s difficult to understand your writing, and I feel like you could improve in general at communication based on this quote. The concept of a hyperphone isn’t that complex—the ability to branch in conversations—so the modifiers “well enough”, “complex”, and “very small N” make me believe it’s only complex because you’re unclear.

For example, the blog post you linked to is titled “Hyperphone”, yet you never define a hyperphone. I can infer from the section on streaming what you imagine, but that’s the second-to-last section!

• There’s the automorphism

which turns a switchy distribution into a sticky one, and vice versa. The two have to be symmetric, so your conclusion cannot be correct.
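A small sketch of one map with this property. The specific map is an assumption for illustration: flip every second observation, so that every switch becomes a stay and every stay becomes a switch. The names `flip_alternate` and `count_switches` are hypothetical.

```python
def flip_alternate(seq):
    """Flip the bits at odd positions (assumed form of the automorphism)."""
    return [bit ^ (i % 2) for i, bit in enumerate(seq)]


def count_switches(seq):
    """Number of adjacent pairs whose values differ."""
    return sum(a != b for a, b in zip(seq, seq[1:]))


seq = [0, 0, 0, 1, 1, 0]
flipped = flip_alternate(seq)

# Each of the len(seq) - 1 adjacent pairs is either a switch or a stay,
# and flipping exactly one element of every pair swaps the two cases.
print(count_switches(seq), count_switches(flipped))
```

Applying the map twice returns the original sequence, so it is a bijection on sequences, and a sticky source is mapped to a switchy one with the same probabilities.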

• This means the likelihood distribution over data generated by Steady is closer to the distribution generated by Switchy than to the distribution generated by Sticky.

Their KL divergences are exactly the same. Suppose Baylee’s observations are . Let be the probability if there’s a chance of switching, and similar for . By the chain rule,

In particular, when either or is equal to one half, this divergence is symmetric for the other variable.
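A quick numerical check of the symmetry claim, under an assumed model: the first observation is uniform, and each later observation switches with a fixed probability. The function names and the example probabilities (0.7 and 0.3) are illustrative, not taken from the post.

```python
from itertools import product
from math import isclose, log


def seq_prob(seq, p_switch):
    """Probability of a binary sequence when each step switches with p_switch."""
    prob = 0.5  # assumed uniform first observation
    for prev, cur in zip(seq, seq[1:]):
        prob *= p_switch if cur != prev else 1.0 - p_switch
    return prob


def kl(p_switch, q_switch, n=6):
    """KL(P || Q) over all length-n sequences, by brute-force enumeration."""
    total = 0.0
    for seq in product((0, 1), repeat=n):
        p = seq_prob(seq, p_switch)
        q = seq_prob(seq, q_switch)
        if p > 0:
            total += p * log(p / q)
    return total


def kl_per_step(a, b):
    """Per-transition term that the chain rule reduces the divergence to."""
    return a * log(a / b) + (1 - a) * log((1 - a) / (1 - b))


# Steady (0.5) against Switchy (0.7) and against Sticky (0.3).
print(kl(0.5, 0.7), kl(0.5, 0.3))
# With a non-steady reference the two divergences differ.
print(kl(0.6, 0.7), kl(0.6, 0.3))
```

Under these assumptions the full divergence is just the per-step term times the number of transitions, and that term is unchanged by swapping `b` with `1 - b` exactly when `a` is one half.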

• The problem with etching specific models is scale. It costs around \$1M to design a custom chip mask, so it needs to be amortized over tens or hundreds of thousands of chips to become profitable. But no companies need that many.

Assume a model takes 3e9 flops to infer the next token, and these chips run as fast as H100s, i.e. 3e15 flops/s. A single chip can then infer 1e6 tokens/s. If you have 10M active users, then 100 chips can provide each user a token every 100 ms, around 600 wpm.

Even OpenAI would only need hundreds, maybe thousands of chips. The solution is smaller-scale chip production. There are startups working on electron beam lithography, but I’m unaware of a retailer Etched could buy from right now.

EDIT: 3 trillion flops/token (similar to GPT-4) is 3e12, so that would be 100,000 chips. The scale is actually there.
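The back-of-the-envelope arithmetic can be sketched as follows. The inputs are the figures assumed in this comment (3e15 flops/s per chip, 10M users, \$1M per mask); the target of 10 tokens per second per user (roughly 600 wpm if one token is about one word) and the helper name `chips_needed` are assumptions for illustration.

```python
import math


def chips_needed(flops_per_token, chip_flops_per_s, users, tokens_per_user_per_s):
    """Chips required to serve every user at the given token rate."""
    tokens_per_chip_per_s = chip_flops_per_s / flops_per_token
    demand_tokens_per_s = users * tokens_per_user_per_s
    return math.ceil(demand_tokens_per_s / tokens_per_chip_per_s)


CHIP_FLOPS = 3e15        # assumed H100-class throughput, flops/s
USERS = 10_000_000       # assumed active users
RATE = 10                # assumed tokens per second per user
MASK_COST = 1_000_000    # assumed mask design cost, dollars

for flops_per_token in (3e9, 3e12):
    chips = chips_needed(flops_per_token, CHIP_FLOPS, USERS, RATE)
    # Mask cost spread evenly over the chips actually needed.
    print(flops_per_token, chips, MASK_COST / chips)
```

The only thing that changes between the two cases is flops per token, and the chip count scales linearly with it, which is why the amortized mask cost per chip drops by the same factor of 1,000.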

• so

It should be .

• Graph Utilitarianism:

People care about others, so their utility function naturally takes into account utilities of those around them. They may weight others’ utilities by familiarity, geographical distance, DNA distance, trust, etc. If every weight is nonnegative and the graph of who cares about whom is strongly connected, there is a unique global utility function up to scaling (Perron-Frobenius).

Some issues it solves:

• Pascal’s mugging.

• The argument “utilitarianism doesn’t work because you should care more about those around you”.

Big issue:

• In a war, people assign negative weights towards their enemies, leading to multiple possible utility functions (which say the best thing to do is exterminate the enemy).
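A minimal sketch of how the global weights could be computed, under one possible reading of the idea: `W[i][j]` is how much person `i` weights person `j`'s utility, all entries are positive, and the global utility is the weighted sum of individual utilities using the left Perron eigenvector of `W`. The function name, the example matrix, and that reading are all assumptions for illustration.

```python
def perron_weights(W, iters=1000, tol=1e-12):
    """Left Perron eigenvector of a positive matrix W, via power iteration.

    Returns nonnegative weights normalized to sum to 1.
    """
    n = len(W)
    v = [1.0 / n] * n
    for _ in range(iters):
        # Left multiplication: new_v[j] = sum_i v[i] * W[i][j]
        new_v = [sum(v[i] * W[i][j] for i in range(n)) for j in range(n)]
        total = sum(new_v)
        new_v = [x / total for x in new_v]
        if max(abs(a - b) for a, b in zip(v, new_v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical three-person example; each row sums to 1 for convenience.
W = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
]
weights = perron_weights(W)
base_utilities = [1.0, 0.0, 2.0]  # made-up individual utilities
global_utility = sum(w * u for w, u in zip(weights, base_utilities))
print(weights, global_utility)
```

Power iteration is used here only because it is short; with negative weights, as in the war case, nothing guarantees that it converges to a unique nonnegative vector.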

# James Camacho’s Shortform

22 Jul 2023 1:55 UTC
2 points
• Did you check if there was a significant age difference between the two groups? I would expect proto-rationalists to be younger, so they would have less money and fewer chances to have signed up for cryonics.

• Have you considered that signaling could play a large part in this? A European friend of mine once said, “people in the US try to do everything in high school.” That is because, to get into a top undergraduate program, Americans have to signal very hard. Worse, a master’s degree is quickly becoming the new high school diploma, due to signaling to employers.

When kids are spending their lives trying to signal stronger, it’s a lot harder to balance it with friends. It used to be dating as an undergraduate made sense—people would actually get married during or out of college! Now, it makes less sense to date for a year or two and try to maintain a long-distance relationship as you split off into different PhD programs.

• I think the correct reasoning is, if you didn’t get the job you didn’t pray hard enough. You weren’t faithful enough to be rewarded. Or maybe you were, and this is just a trial of your faith. It’s easy to have faith when faith seems to work; it’s only when all the experiments you perform seem to contradict your faith that it is really tested.

• I think this needs to be done for >18 year-olds as well. Most research positions require a PhD as a prerequisite, when there are many talented undergraduates who could drop out of college and perform the research after a few weeks’ training.

• AIs need immense databases to provide decent results. For example, to recognize if something is a potato, an AI will take 1,000 pictures of a potato and 1,000 pictures of not-a-potato, so that it can tell you if something is a potato with 95% accuracy.

Well, 95% accurate isn’t good enough—that’s how you get Google labelling images of African Americans as gorillas. So what’s the solution? More data! But how do you get more data? Tracking consumers.

Websites track everything you do on the internet, then sell your data to Amazon, Netflix, Facebook, etc. to bolster their AI predictions. Phone companies track your location, and credit card companies track your purchases.

Eventually, true AI will replace these pattern matching pretenders, but in the meantime data has become a new currency, and it’s being stolen from the general public. Many people know and accept their cookies being eaten by every website, but more have no idea.

Societally, this threatens a disaster for AI research. Already people say to leave your phones at home when you go to a protest—no matter which side of the political spectrum it’s on. Soon enough, people will turn on AI altogether if this negative perception isn’t fixed.

So, to tech executives: Put more funds into true AI, and less into growing databases. Not only is it fiscally costly, but the social cost is too high.

To policymakers: Get your data from consenting parties. A checkbox at the end of a three-page legal statement is hardly consent. Instead, follow the example of statisticians. Use studies, but instead of a month-long trial, all you ask for is a picture and a favorite movie.

To both: Invest more money in the future of AI. In the past ten years we’ve gone from 64x64 pixel ghoulish faces to high-definition GANs and chess grandmasters trained in hours on a home computer. Imagine how much better AI will be in another ten years. Fifteen thousand now could save you fifteen million or more over your company’s lifetime.