Dan Elton blog: https://moreisdifferent.substack.com/ website: http://www.moreisdifferent.com twitter: https://twitter.com/moreisdifferent
delton137
Special AI Meetup feat Joscha Bach & Rachel St. Clair—are LLMs enough?
Hi, organizer here. I only just saw your message, right after the event. There were a couple of people from Microsoft there, but I’m not sure whether they were interested in alignment research. The audience at this event was mostly general, coming mainly through the website AIcamp.ai. We also had some people from the local ACX meetup and transhumanist meetup. PS: I sent you an invitation to connect on LinkedIn, let’s stay in touch (I’m https://www.linkedin.com/in/danielelton/).
AI futurists ponder AI and the future of humanity—should we merge with AI?
Unfortunately they have a policy that they check ID at the door and only allow those over 21 in. I’m going to update the post now to make this clear. Even when the outdoor patio is open it’s still only 21+.
The way I would describe it now is there’s a large bar in the main room, and then there’s a side room (which is also quite large) with a place that serves Venezuelan food (very good), and Somerville Chocolate (they make and sell chocolate there).
The age restriction has never been a problem in the past, although I do vaguely recall someone mentioning it once. I’m going to try to hold future meetups I run at a public library (probably Cambridge Public Library); it’s just tricky getting the room reservations sometimes. We have also been thinking of trying out the food court in the CambridgeSide mall, although the tables there are rather small and, from what I remember, I don’t think they can be moved and joined together.
Book Swap
Sorry for the late reply. In the future we will try to have a Zoom option for big events like this.
We did manage to record it, but the audio isn’t great (and we didn’t cover the Q&A)
This is pretty interesting. Any outcome you can share? (I’ll bug you about this next time I see you in person, so you can just tell me then rather than responding, if you’d like.)
Good idea to just use the time you fall asleep rather than the sleep stage tracking, which isn’t very accurate. I think the most interesting metric is just boring old total sleep time. (Unfortunately, sleep trackers in my experience are really bad at actually capturing sleep quality, but I suppose if there’s a sleep quality score you have found useful, that might be interesting to look at also.) Something else I’ve noticed is that by looking at the heart rate you can often get a more accurate idea of when you fell asleep and woke up.
Alex Hoekstra talk on Open Source Vaccines and the Mind First Foundation
Board games @ Aeronaut Brewing
“The AI Safety Problem”—talk by Richard Ngo at Harvard
Boston ACX Spring Schelling Point Meetup
Board games @ Aeronaut Brewing
I would modify the theory slightly by noting that the brain may become hypersensitive to sensations arising from the area that was originally damaged, even after it has healed. Sensations that are otherwise normal can then trigger pain. I went to the website about pain reprocessing therapy and stumbled upon an interview with Alan Gordon where he talked about this. I suspect that high-level beliefs about tissue damage, etc., also play a role here in causing the brain to become hyperfocused on sensations coming from a particular region and to interpret them as painful.
Something else that comes to mind here is the rubber hand illusion. Watch this video—and look at the flinches! Interesting, eh?
edit: (ok, the rubber hand illusion isn’t clearly related, but it’s interesting!)
That’s really cool, thanks for sharing!
Since nobody else posted these:
Bay Area is Sat Dec 17th (Eventbrite) (Facebook)
South Florida (about an hour north of Miami) is Sat Dec 17th (Eventbrite) (Facebook)
On current hardware, sure.
It does look like scaling will hit a wall soon if hardware doesn’t improve; see this paper: https://arxiv.org/abs/2007.05558
But Gwern has responded to this paper, pointing out several flaws… (I’m having trouble finding his response right now, ugh.)
However, we have lots of reasons to think Moore’s law will continue … in particular future AI will be on custom ASICs / TPUs / neuromorphic chips, which is a very different story. I wrote about this long ago, in 2015. Such chips, especially asynchronous and analog ones, can be vastly more energy efficient.
I disagree. In fact, I think you can argue this development points in the opposite direction when you look at what they had to do to achieve it and the architecture they use.
I suggest you read Ernest Davis’ overview of Cicero. Cicero is a special-purpose system that took enormous work to produce: a team of multiple people labored on it for three years. They had to assemble a massive dataset from 125,300 online human games. They also had to get expert annotations on thousands of preliminary outputs. Even that was not enough; they had to generate synthetic datasets as well to fix issues with the system! Even then, the dialogue module required a specialized filter to remove nonsense. This is a break from the scaling idea, which says that to solve new problems you just need to scale existing architectures to more parameters (and train on a large enough dataset).
Additionally, they argue that this system appears very unlikely to generalize to other problems, or even to slight modifications of the game of Diplomacy. It’s not even clear how well it would generalize to non-blitz games. If the rules were modified slightly, the entire system would likely have to be retrained.
I also want to point out that scientific research is not as easy as you make it sound. Professors spend the bulk of their time writing proposals, so perhaps AI could help there by summarizing existing literature. Note, though, that a typical paper, even a low-value one, generally takes a graduate student with specialized training about a year to complete, assuming the experimental apparatus and other necessary infrastructure are all in place. Not all science is data-driven either; science can also be observation-driven or theory-driven.
I looked into these methods a lot in 2020 (I’m not so up to date on the latest literature). I wrote a review in my 2020 paper, “Self-explaining AI as an alternative to interpretable AI”.
There are a lot of issues with saliency mapping techniques, as you are aware (I saw you link to the “sanity checks” paper below). Funnily enough, though, the super simple technique of occlusion mapping does seem to work very well! It’s kinda hilarious, actually, that there are so many complicated mathematical techniques for saliency mapping, but I have seen no good arguments as to why they are better than just occlusion mapping. I think this is a symptom of people optimizing for paper publishing and trying to impress reviewers with novelty and math rather than actually building stuff that is useful.
You may find this interesting: “Exemplary Natural Images Explain CNN Activations Better than State-of-the-Art Feature Visualization”. What they show is that a very simple model-agnostic technique (finding the image that maximizes an output) allows people to make better predictions about how a CNN will behave than Olah’s activation maximization method, which produces images that can be hard to understand. This is exactly the sort of empirical testing I suggested in my Less Wrong post from Nov last year.
The comparison isn’t super fair because Olah’s techniques were designed for detailed mechanistic understanding, not allowing users to quickly be able to predict CNN behaviour. But it does show that simple techniques can have utility for helping users understand at a high level how an AI works.
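For anyone who hasn’t seen occlusion mapping, here is a minimal sketch of the idea: cover one patch of the image at a time with a baseline value and record how much the model’s score drops. This is only an illustration under stated assumptions, not code from any of the papers mentioned. The names `occlusion_map` and `toy_model`, and the `patch`, `stride`, and `fill` parameters, are hypothetical, and the “model” is a stand-in function rather than a real CNN.

```python
import numpy as np

def occlusion_map(model, image, patch=4, stride=4, fill=0.0):
    """Heatmap of how much the score drops when each patch is occluded.

    model: any callable mapping an (H, W) or (H, W, C) array to a scalar
           score (hypothetical interface, assumed for this sketch).
    """
    base = float(model(image))
    h, w = image.shape[:2]
    heat = np.zeros((h, w), dtype=float)
    counts = np.zeros((h, w), dtype=float)
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            drop = base - float(model(occluded))
            heat[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1
    # Average overlapping patches; pixels never covered keep a value of 0.
    return heat / np.maximum(counts, 1)

def toy_model(img):
    # Stand-in for a classifier: the score is the mean brightness of the
    # top-left 4x4 region, so only that region should light up.
    return img[:4, :4].mean()

image = np.ones((8, 8))
heat = occlusion_map(toy_model, image)
print(heat)
```

With a real network you would swap `toy_model` for a function that returns the logit or probability of the class of interest. The patch size and fill value are judgment calls that affect the resulting map.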
hah… actually not a bad idea… too late now. BTW the recording will be available eventually if you’re interested.