I thought that the point was that either managed-interface-only access, or API access with rate limits, monitoring, and appropriate terms of service, can prevent use of some forms of scaffolding. For a staged release, this makes sense to do, at least for a brief period while confirming that there are no security or safety issues.
"These days it's rare for a release to advance the frontier substantially."
This seems to be one crux. Sure, there’s no need for staged release if the model doesn’t actually do much more than previous models, and doesn’t have unpatched vulnerabilities of types that would be identified by somewhat broader testing.
The other crux, I think, is around public release of model weights. (Often referred to, incorrectly, as "open sourcing.") Staged release implies not releasing weights immediately, and I think this is one of the critical problems with what companies like X have done, and why it is important to demand staged release for any models claiming to be as powerful as, or more powerful than, current frontier models. (In addition to testing and red-teaming, which they also don't do.)
It is funny, but it also showed up on April 2nd in Europe and anywhere farther east...
I think there are two very different cases of "almost works" being referred to here. In the first, the added effort is going in the right direction; in the second, it is slightly wrong. For the first case, if you have a drug that doesn't quite treat your symptoms, it might be because it addresses all of them somewhat, in which case increasing the dose might make sense. For the second case, you could have one that addresses most of the symptoms very well but makes one worse, or has an unacceptable side effect, in which case increasing the dose wouldn't help. Similarly, imagine an uncomfortable muscle. The second case would be a stretch that targets almost the right muscle; doing it more isn't going to help. The first case, on the other hand, would be a stretch that targets the right muscle but isn't doing enough, and it could obviously be great to do it more often, or for longer.
Again, I think it was a fine and enjoyable post.
But I didn't see where you "demonstrate how I used very basic rationalist tools to uncover lies," which could have improved the post, and I don't think this really explored any underappreciated parts of "deception and how it can manifest in the real world," which I agree is underappreciated. Unfortunately, this post didn't provide much clarity about how to find deception, or how to think about it. So again, it's a fine post with good stories, and I agree they illustrate being more confused by fiction than by reality, among other rationalist virtues, but as I said, it was not "the type of post that leads people to a more nuanced or better view of any of the things discussed."
I disagree with this decision, not because I think it was a bad post, but because it doesn’t seem like the type of post that leads people to a more nuanced or better view of any of the things discussed, much less a post that provided insight or better understanding of critical things in the broader world. It was enjoyable, but not what I’d like to see more of on Less Wrong.
(Note: I posted this response primarily because I saw that lots of others also disagreed with this, and think it’s worth having on the record why at least one of us did so.)
“Climate change is seen as a bit less of a significant problem”
That seems shockingly unlikely (5%). Even if we have essentially eliminated all net emissions (10%), we will still be seeing continued warming (99%) unless we have widely embraced geoengineering (10%). If we have, it is a source of significant geopolitical contention (75%) due to uneven impacts (50%) and pressure from environmental groups (90%) worried that it is promoting continued emissions and/or causing other harms. Progress on carbon capture is starting to pay off (70%) but is not (90%) deployed at anything like the scale needed to stop or reverse warming.
Adaptation to climate change has continued (99%), but it is increasingly obvious how expensive it is and how badly it is impacting the developing world. The public still seems to think this is the fault of current emissions (70%), and carbon taxes or similar legal limits are in place for a majority of G7 countries (50%) but less than half of other countries (70%).
To start, the claim that it was found 2 miles from the facility is an important mistake, because WIV is 8 miles from the market. For comparison with a city people might know better: in New York, that's the distance between the World Trade Center and either Columbia University or Newark Airport. Wuhan's downtown is around 16 miles across; 8 miles away just means it was in the same city.
And you're over-reliant on the evidence you want to pay attention to. For example, even restricting ourselves to "nearby coincidence" evidence, the Huanan market is the largest in central China, so what are the odds that a natural spillover event occurs immediately surrounding the largest animal market? If the disease actually emerged from WIV, what are the odds that the cases centered around the Huanan market, 8 miles away, instead of the Baishazhou live animal market, 3 miles away, or the Dijiao market, also 8 miles away?
So I agree that an update can be that strong, but this one simply isn’t.
Yeah, but I think it's more than it not being taken literally; the exercise is fundamentally flawed when used as an argument rather than, very narrowly, for honest truth-seeking, which is almost never possible in a discussion without unreasonably high levels of trust and confidence in others' epistemic reliability.
What is the relevance of the "posterior" that you get after updating on a single claim, chosen post hoc as the one you want to use as an example?
Using a weak prior biases you towards thinking that the information you are updating on is strong evidence (illustrated below). How did you decide on that particular prior? You should presumably have some reference class for your prior. (If you can't do that, you should at least have equipoise between all reasonable hypotheses being considered. Instead, you're updating "Yes Lableak" versus "No Lableak". But in fact, "from a Bayesian perspective, you need an amount of evidence roughly equivalent to the complexity of the hypothesis just to locate the hypothesis in theory-space. It's not a question of justifying anything to anyone.")
How confident are you in your estimate of the Bayes factor here? Do you have calibration data for roughly similar estimates you have made? Should you be adjusting for less-than-perfect confidence?
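To make the prior-sensitivity point concrete, here is a minimal sketch, using made-up numbers rather than anyone's actual estimates, of how the same claimed Bayes factor yields very different posteriors depending on the prior odds:

```python
# Posterior odds = prior odds * Bayes factor.
# All numbers here are made up, purely to show prior sensitivity.
def posterior_prob(prior_odds: float, bayes_factor: float) -> float:
    odds = prior_odds * bayes_factor
    return odds / (1 + odds)

bf = 20.0  # a hypothetical claimed Bayes factor
for label, prior_odds in [("1:1", 1.0), ("1:10", 0.1), ("1:100", 0.01)]:
    print(f"prior {label} -> posterior {posterior_prob(prior_odds, bf):.0%}")
# prior 1:1   -> posterior 95%
# prior 1:10  -> posterior 67%
# prior 1:100 -> posterior 17%
```

The same reported evidence strength moves a 50% prior to about 95%, but a 1% prior only to about 17%; whoever chooses the prior, and its reference class, largely chooses the conclusion.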
Thank you for writing this.
I think most points here are good points to make, but I also think this is useful as a general caution against this type of exercise being used as an argument at all! So I'd obviously caution against anyone taking your response itself as a reasonable attempt at an estimate of the "correct" Bayes factors, because this is all very bad epistemic practice! Public explanations and arguments are social claims, and usually contain heavily filtered evidence (even if unconsciously). Don't do this in public.
That is, this type of informal Bayesian estimate is useful as part of a ritual for changing your own mind, when done carefully. That requires a significant degree of self-composure, a willingness to change one's mind, and a high degree of justified confidence in your own mastery of unbiased reasoning. Here, though, it is presented as an argument, which is not how any of this should work. And in this case, it was written by someone who already had a strong view of what the outcome should be, repeated publicly and frequently, which makes it doubly hard to accept at face value the implicit but necessary claim that it was performed starting from an unbiased point! At the very least, we would need strong evidence that it was not an exercise in motivated reasoning, that the bottom line wasn't written before the evaluation started. That statement is completely missing, though to be fair, it would be unbelievable even if it had been stated.
I agree that releasing model weights is "partially open sourcing" the model, in much the same way that freeware is "partially open sourcing" software, or that a restrictive license with code availability is.
But that's exactly the point: you don't get to call something X because it's kind-of-like X; it needs to actually fulfill the requirements in order to get the label. What is being called Open Source AI doesn't actually do the thing that it needs to.
Thanks! I agree that this discusses the licenses, which would be enough to make Llama not qualify, but I think there's a strong claim, put forward in the full linked piece, that even if the model weights were released under a GPL license, those "open" model weights wouldn't make the model open in the sense that Open Source means elsewhere.
I agree that the reasons someone wants the dataset generally aren't the same reasons they'd want to compile from source code. But there's a lot of utility for research in having access to the dataset even if you don't recompile. Checking whether there was test-set leakage for metrics, for example (sketched below), or assessing how much of LLM ability is stochastic parroting of specific passages versus recombination. And if it were actually open, these would not be hidden from researchers.
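As an illustration of the kind of check dataset access enables, here is a minimal, hypothetical sketch of a naive n-gram overlap test for leakage. The file names and the 8-gram window are assumptions for the example, not a standard method:

```python
# Naive test-set leakage check: flag any benchmark example that shares
# a long n-gram with the training corpus. Window size and file names
# are illustrative assumptions; a real check on a web-scale corpus
# would need to stream or shard the data rather than load it at once.
def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def flag_leaked(test_examples: list[str], training_corpus: str) -> list[str]:
    corpus_ngrams = ngrams(training_corpus)
    return [ex for ex in test_examples if ngrams(ex) & corpus_ngrams]

# Hypothetical usage:
# train_text = open("training_corpus.txt").read()
# suspects = flag_leaked(open("benchmark.txt").read().splitlines(), train_text)
```

Without the dataset, nothing like this is possible, which is the point: the claim "our benchmark numbers are uncontaminated" becomes unverifiable.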
And supply chain is a reasonable analogy, but many open-source advocates make sure that their code doesn't depend on closed/proprietary libraries. It's not actually "libre" if you need a closed-source component, or need to pay someone, to make the thing work. Some advocates, including those who built or control quite a lot of the total open-source ecosystem, also put effort into ensuring that the entire toolchain needed to compile their code is open, because replicability shouldn't be contingent on companies that can restrict usage or hide things in the code. It's not strictly required, but it's certainly relevant.
The vast majority of uses of software are via changing configuration and inputs, not modifying code and recompiling the software. (Though lots of Software as a Service doesn’t even let you change configuration directly.) But software is not open in this sense unless you can recompile, because it’s not actually giving you full access to what was used to build it.
The same is the case for what Facebook calls open-source LLMs: they don't actually give you full access to what was used to build them.
Thanks! RedPajama definitely looks like it fits the bill, but it shouldn't need to bill itself as making "fully-open, reproducible models," since that's what "open source" is already supposed to mean. (Unfortunately, the largest model they have is 7B.)
Yes, agreed—as I said in the post, “Open Source AI simply means that the models have the model weights released—the equivalent of software which makes the compiled code available. (This is otherwise known as software.)”
“Freely remixable” models don’t generally have open datasets used for training. If you know of one, that’s great, and would be closer to open source. (Not Mistral. And Phi-2 is using synthetic data from other LLMs—I don’t know what they released about the methods used to generate or select the text, but it’s not open.)
But the entire point is that weights are not the source code for an LLM; they are the compiled program. Yes, they're modifiable via LoRA and similar, but that's not open source! Open source would mean I could replicate it from the ground up. For Facebook's models, at least, the details of the training methods, the RLHF training they do, and where they get the data are all secrets. But they call it "Open Source AI" anyway.
Good point, and I agree that it’s possible that what I see as essential features might go away—“floppy disks” turned out to be a bad name when they ended up inside hard plastic covers, and “deepware” could end up the same—but I am skeptical that it will.
I agree that early electronics were buggy until we learned to build them reliably, and perhaps we can solve this for gradient-descent-based learning, though many are skeptical of that, since many of the problems have been shown to be fairly fundamental. I also agree that any system is inscrutable until you understand it; but unlike early electronics, no one understands these massive lists of numbers that produce text, and human brains can't build them; they just program a process to grow them. (Yes, composable NNs could solve some of this, as you point out when mentioning separable systems, but I still predict they won't be well understood, because the components individually are still deepware.)
Completely as an aside: coordination problems among ASIs don't go away, so this is a highly non-trivial claim.