I've thought a bit about datasets before, and to me it seems like what most needs collecting is detailed personal preference datasets: e.g., input-output examples of how you generally prefer information to be filtered, processed, communicated to you, and refined with your input; what your success criteria for tasks are; and where the places in your day flow / thought flow are where the thing needs to actively intervene and correct you. Especially those places where you feel you can benefit most from cognitive extensions, based on your bottlenecks. This could initially be too hard to infer from screen logs alone.
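For concreteness, here's a minimal sketch of what one record in such a dataset could look like; the schema and field names are just my illustration, not an existing format.

```python
# Illustrative only: one hypothetical record of a personal preference dataset.
# All field names and values are made up for this sketch.
from dataclasses import dataclass, field

@dataclass
class PreferenceExample:
    context: str                       # where in the day flow / thought flow this applies
    raw_input: str                     # the information as it originally arrived
    preferred_output: str              # how you actually wanted it filtered/communicated
    success_criteria: list[str] = field(default_factory=list)
    should_intervene: bool = False     # should the assistant actively interrupt/correct here?

example = PreferenceExample(
    context="morning triage of unread messages",
    raw_input="40 unread emails, 3 meeting invites, 1 urgent bug report",
    preferred_output="One paragraph: the bug report first, then only the invites that conflict.",
    success_criteria=["nothing urgent missed", "under 30 seconds to read"],
    should_intervene=True,
)
```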
Papers on protein design
Random idea about preventing model stealing. After finetuning a mixture of experts model with your magic sauce, place the trained experts on geographically distinct servers with heterogeneous tech stacks and security systems to avoid common vulnerabilities. Horcrux vibes
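To make the idea slightly more concrete, here's a minimal sketch of the routing side under the assumption that each expert sits behind its own HTTP endpoint; the hosts, route, and function names are hypothetical.

```python
# Hypothetical sketch: a gating service dispatches activations to experts hosted on
# geographically separate servers, so no single host ever holds the full model.
import requests  # assumes each expert exposes a simple HTTP "forward" API

# Illustrative endpoints; in practice each would run on a different stack/provider.
EXPERT_ENDPOINTS = {
    0: "https://expert-eu.example.com/forward",
    1: "https://expert-us.example.com/forward",
    2: "https://expert-apac.example.com/forward",
}

def route(hidden_state: list[float], gate_scores: list[float], top_k: int = 2) -> list[float]:
    """Send the hidden state only to the top-k experts and mix their outputs by gate score."""
    top = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)[:top_k]
    mixed = [0.0] * len(hidden_state)
    for i in top:
        resp = requests.post(EXPERT_ENDPOINTS[i], json={"h": hidden_state}, timeout=5)
        expert_out = resp.json()["h"]
        for j, v in enumerate(expert_out):
            mixed[j] += gate_scores[i] * v
    return mixed
```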
Vaguely related paper: Self-Destructing Models: Increasing the Costs of Harmful Dual Uses in Foundation Models is an early attempt to prevent models from being re-purposed via fine-tuning.
It doesn't seem like a meaningfully positive result. For example, all their plots only track finetuning on up to 200 examples; I imagine they might even have had clear negative results in conditions with more than 200 examples available for finetuning. After 50-100 examples, the gap between normal finetuning and finetuning from a random init, while still small, grows fast. There are also no plots with finetuning iterations on the x-axis. And when they optimize for "non-finetunability", they don't aim to maintain language modeling performance; instead, they only impose the constraint of maintaining finetunability on a single downstream "professions detection" task.
I expect naive solutions to continue to work very poorly on this problem.
I think "on most cognitive tasks" means that for an AGI, its t is defined as the first t at which it meets the expert level on most tasks. However, what exactly counts as a cognitive task does seem to introduce ambiguity, and it would be cool to clarify it, e.g. by pointing to a clear protocol for sampling all such task descriptions from an LLM.
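As one possible shape for such a protocol (purely my illustration; `query_llm` is a stand-in for whatever completion API you use):

```python
# Hypothetical sketch of a protocol for sampling cognitive task descriptions from an LLM.
import json
import random

def query_llm(prompt: str) -> str:
    """Placeholder: call an LLM of your choice and return its text output."""
    raise NotImplementedError

def sample_task_descriptions(n_tasks: int, seed: int = 0) -> list[dict]:
    random.seed(seed)
    domains = ["mathematics", "law", "software engineering", "medicine",
               "writing", "scientific research", "finance", "education"]
    tasks = []
    while len(tasks) < n_tasks:
        domain = random.choice(domains)
        prompt = (
            f"Describe one concrete cognitive task in {domain} that a paid human expert "
            "performs, as a single JSON object with keys 'task' and 'success_criteria'. "
            "Output only the JSON."
        )
        try:
            tasks.append(json.loads(query_llm(prompt)))
        except json.JSONDecodeError:
            continue  # skip malformed samples
    return tasks
```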
Several-months-AGI is required to be coherent in the sense of coherence defined with reference to human experts today. I think this is pretty distinct from the coherence that humans were being optimized to have before behavioral modernity (50K years ago).
I agree that evolution optimized hard for some kind of coherence: persistent self-schema, attitudes, emotional and behavioral patterns, attachments, long-term memory access. But what humans have going for them is the combination of this prior coherence and just 50K years of evolution after humans unlocked access to the abstract thinking toolkit. I don't think we can expect that to enable much in terms of the ability to coherently plan complex tasks or to write and reason abstractly.
This makes me think that humans struggling at coherence is not good evidence that building agents with large t is much more difficult than building ones with small t: there simply wasn't enough optimization pressure.
on most cognitive tasks, it beats most human experts
I think this specifies both thresholds to be 50%.
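One way to make that reading explicit (my own formalization, not from the post): with task set $\mathcal{T}$, expert set $E$, and a scalar score $s$ per task, the condition is $|\{\tau \in \mathcal{T} : s_{\mathrm{AI}}(\tau) > \mathrm{median}_{e \in E}\, s_e(\tau)\}| > \frac{1}{2}|\mathcal{T}|$, i.e. on more than half of the tasks it beats more than half of the experts (equivalently, the median expert).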
It doesn't seem like "shorter timelines" in the safest quadrant has much to do with their current strategy, given that the GPT-4 paper has a section on how they postponed the release to reduce acceleration.
why it is so good in general (GPT-4)
What are the examples indicating it’s at the level of performance at complex tasks you would expect from GPT-4? Especially performance which is clearly attributable to improvements that we expect to be made in GPT-4? I looked through a bunch of screenshots but haven’t seen any so far.
Can confirm I consistently had non-deterministic temp-0 completions on older davinci models accessed through the API last year.
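For reference, a minimal way to reproduce the check, assuming the legacy openai Python client (pre-1.0 Completion API) that those older models were accessed through; the model name and prompt are just examples.

```python
# Sketch: issue the same temperature-0 request several times and compare the outputs.
import openai

openai.api_key = "sk-..."  # your key

def temp0_outputs(prompt: str, n_trials: int = 5) -> set[str]:
    outs = set()
    for _ in range(n_trials):
        resp = openai.Completion.create(
            model="davinci",      # an older completion model
            prompt=prompt,
            temperature=0,
            max_tokens=32,
        )
        outs.add(resp["choices"][0]["text"])
    return outs

# More than one distinct string here means temp-0 was not deterministic.
print(temp0_outputs("The capital of France is"))
```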
Bloomberg reported on plans to invest $10B today
Have you seen this implemented in any blogging platform other people can use? I’d love to see this feature implemented in some Obsidian publishing solution like quartz, but for now they mostly don’t care about access management.
Wow, the Zvi example is basically what I've been doing recently with hyperbolic discounting too, after spending a fair amount of time thinking about Joe Carlsmith's "Can you control the past". It seems to work. "It gives me a lot of the kind of evidence about my future behavior that I like" is now the dominant reason behind certain decisions.
How much time do you expect the form, the coding test, and the interview to take for an applicant?
This idea tries to discover translations between the representations of two neural networks, but without necessarily discovering a translation into our representations.
I think this has been under investigation for a few years in the context of model fusion in federated learning, model stitching, and translation between latent representations in general.
Relative representations enable zero-shot latent space communication—an analytical approach to matching representations (though this is new work and I haven't checked how good it is; a minimal sketch of the core idea follows this list)
Git Re-Basin: Merging Models modulo Permutation Symmetries—recent model stitching work with some nice results
Latent Translation: Crossing Modalities by Bridging Generative Models—some random application of unsupervised translation to translation between autoencoder latent codes (probably not the most representative example)
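To illustrate the first of these, a minimal sketch of the core idea of relative representations as I understand it (variable names are mine): each sample is re-encoded as its cosine similarities to a fixed set of anchor samples, which makes latent spaces of two different models directly comparable.

```python
# Minimal sketch of relative representations: re-encode each embedding as its
# cosine similarities to a shared set of anchor samples.
import numpy as np

def relative_representation(embeddings: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """embeddings: (n, d), anchors: (k, d) -> (n, k) cosine-similarity matrix."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return e @ a.T

# Two models with different latent spaces, but the same anchor samples embedded by each:
#   rel_A = relative_representation(emb_A(X), emb_A(anchor_X))
#   rel_B = relative_representation(emb_B(X), emb_B(anchor_X))
# rel_A and rel_B now live in the same k-dimensional space and can be compared directly.
```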
I don't expect Putin to use your interpretation of "d" instead of his own interpretation of it, which he publicly advertises whenever he gives a big public speech on the topic.
From the latest speech:
> In the 80s they had another crisis they solved by “plundering our country”. Now they want to solve their problems by “breaking Russia”.
This directly references an existential threat.
From the speech a week ago:
> The goal of that part of the West is to weaken, divide and ultimately destroy our country. They are saying openly now that in 1991 they managed to split up the Soviet Union and now is the time to do the same to Russia, which must be divided into numerous regions that would be at deadly feud with each other.
Same.
Also, consider nuclear false flags—the frame for them, including in these same speeches, was created and maintained throughout the entire year.
From my experience of playing VR games on mobile devices (Quest 1 and Quest 2), the majority of in-game characters look much better than this and it doesn’t impact the framerate at all. This seems like a 100% stylistic choice.
Seems related to https://www.lesswrong.com/posts/qpgkttrxkvGrH9BRr/superintelligence-is-not-omniscience, https://www.lesswrong.com/posts/epgCXiv3Yy3qgcsys/you-can-t-predict-a-game-of-pinball, and similar objections might be applicable.