Quinn
Thinking about a top-level post on FOMO and research taste
Fear of missing out, defined as the inability to execute on a project cuz there’s a cooler project if you pivot
but it also gestures at more of a strict negative, where you think your project sucks before you finish it, so you never execute
was discussing this with a friend: “yeah I mean lesswrong is pretty egregious cuz it sorta promotes this idea of research taste as the ability to tear things down, which can be done armchair”
I’ve developed strategies to beat this FOMO and gain more depth and detail with projects (too recent to see returns yet, but getting there), but I also suspect it was nutritious of me to develop discernment about which projects are valuable or not valuable for various threat models and theories of change (in such a way that being a phd student off of lesswrong wouldn’t have been as good in crucial ways, tho way better in other ways).
but I think the point is you have to turn off this discernment sometimes, unless you want to specialize in telling people why their plans won’t work, which I’m more dubious on the value of than I used to be
Idk maybe this shortform is most of the value of the top level post
He had become so caught up in building sentences that he had almost forgotten the barbaric days when thinking was like a splash of color landing on a page.
a $B$-valued quantifier is any function $(X \to B) \to B$, so when $B$ is bool, quantifiers are the functions that take predicates as input and return bool as output (same for prop). The standard max and min functions on arrays count as real-valued quantifiers $(I \to \mathbb{R}) \to \mathbb{R}$ for some index set $I$. I thought I had seen forall as the max of the Prop-valued quantifiers, and exists as the min somewhere, which has a nice mindfeel since forall has this “big” feeling (if you determined $\forall(P)$ (of which $\forall x, P(x)$ is just syntax sugar since the variable name is irrelevant) by exhaustive checking, it would cost $O(|X|)$ whereas $\exists(P)$ would cost $O(1)$ unless the derivation of the witness was dependent on the size of the domain somehow).
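A minimal sketch of the above in python (my own toy rendering, with made-up names):

```python
from typing import Callable, Iterable

# A B-valued quantifier over a domain X eats a predicate X -> B and returns a B.
# For B = bool, forall and exists are the familiar pair.

def forall(domain: Iterable, p: Callable[[object], bool]) -> bool:
    return all(p(x) for x in domain)   # exhaustive checking: O(|X|)

def exists(domain: Iterable, p: Callable[[object], bool]) -> bool:
    return any(p(x) for x in domain)   # O(1) if you're handed a witness; O(|X|) blind

# For B = real, max and min over an index set are quantifiers in the same sense:
def sup(domain: Iterable, f: Callable[[object], float]) -> float:
    return max(f(x) for x in domain)

def inf(domain: Iterable, f: Callable[[object], float]) -> float:
    return min(f(x) for x in domain)

assert forall(range(10), lambda n: n < 10)
assert exists(range(10), lambda n: n == 7)
```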
Incidentally, however, in differentiable logic it seems forall is the “minimal expectation” and existential is the “maximal expectation”. See page 10 of the LDL paper, where a minimal expectation is (roughly) the limit as $\gamma \to 0$ of the expectation of $f$ taken over a $\gamma$-ball about the min of $f$, rather than over the entire domain of $f$. So in this sense, the interpretation of a universally quantified prop is a minimal expectation, dual to how an existentially quantified prop is a maximal expectation.
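Here’s a toy numeric illustration of that reading (my own construction, not the paper’s exact operator; I’m fuzzy on whether the ball lives in the domain or the codomain): the expectation of $f$ restricted to a $\gamma$-ball about its min tends to $\min f$ as $\gamma \to 0$.

```python
import random

def min_expectation(f, domain, gamma):
    # expectation of f over points whose value is within gamma of the min --
    # a toy "minimal expectation"; converges to min f as gamma -> 0
    lo = min(f(x) for x in domain)
    ball = [f(x) for x in domain if f(x) <= lo + gamma]
    return sum(ball) / len(ball)

xs = [random.uniform(-1, 1) for _ in range(1000)]
square = lambda x: x * x
for gamma in (1.0, 0.1, 0.001):
    print(gamma, min_expectation(square, xs, gamma))
```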
I didn’t like the way this felt aesthetically, since as I said, forall feels “big”, which mood-affiliates toward a max. But that’s notes from barely-remembered category theory I saw once. Anyway, I asked a language model and it said that forall is minimal because it imposes the strictest or “most conservative” requirement; so “max” in the sense of “exists is interpreted as maximal expectation” refers to maximal freedom.
I suppose this is fine.
I’m excited for language model interpretability to teach us about the difference between compilers and simulations of compilers. In the sense that chatgpt and I can both predict what a compiler of a suitably popular programming language will do on some input, what’s going on there? Surely we’re not reimplementing the compiler on our substrate, even in the limit of perfect prediction? This will be an opportunity for a programming language theorist in another year or two of interp progress.
firefox is giving me a currently broken redirect on https://atlascomputing.org/safeguarded-ai-summary.pdf
Does anyone use a vim-style / mouse-minimal browser? I like Tridactyl better than the other one I tried, but it’s not great that when there’s a vim mode inside a browser window, everything starts to step on each other (like in jupyter, colab, leetcode, codesignal).
claude and chatgpt are pretty good at ingesting textbooks and papers and making org-drill cards.
here’s my system prompt https://chat.openai.com/g/g-rgeaNP1lO-org-drill-card-creator though i usually tune it a little further per session.
Here are takes on the idea from the anki ecosystem
I tried ankigpt a little and it was fine; i haven’t tried the direct plugin from ankiweb. I’m opting for org-drill here cuz I really like plaintext.
one exercise remark: swimming! might have to pick up a little technique at first, but the way it combines a ton of muscle groups and cardio is unparalleled
(if anyone reading this is in the bay I will 100% teach you the 0->1 on freestyle and breaststroke technique, just DM)
is lower than it “should” be.
I think the lesson of social desirability bias is that valuable services having lower status than they “ought” to is the system working as intended.
(Though I’m sympathetic with your view: no reason to be extreme about social desirability bias)
Proof cert memo
In the Safeguarded AI programme thesis[1], proof certificates or certifying algorithms are relied upon in the theory of change. Let’s discuss!
From the thesis:
Proof certificates are a quite broad concept, introduced by [33]: a certifying algorithm is defined as one that produces enough metadata about its answer that the answer’s correctness can be checked by an algorithm which is so simple that it is easy to understand and to formally verify by hand.
The abstract of citation 33 (McConnell et al)[2]:
A certifying algorithm is an algorithm that produces, with each output, a certificate or witness (easy-to-verify proof) that the particular output has not been compromised by a bug. A user of a certifying algorithm inputs x, receives the output y and the certificate w, and then checks, either manually or by use of a program, that w proves that y is a correct output for input x. In this way, he/she can be sure of the correctness of the output without having to trust the algorithm.
In this memo, I do an overview by distinguishing a proof cert from a constructive proof, and dive shallowly into the paper. Then I ruminate on how this might apply in AI contexts.
What’s the difference between a proof cert and a constructive proof?
If a cert is just a witness, you might think that proof certs are just proofs from constructive math as distinct from classical math. For example, under some assumptions and caveats (in the calculus/topology setting) a function can be “known” to have a fixed point without us having any guaranteed method of finding it, while under other assumptions and caveats (namely the lattice-theoretic setting) we know there’s a fixed point because we have a guaranteed procedure for computing it. However, they go further: a certifying algorithm produces with each output a cert/witness that the particular output has, in the case of McConnell et al, “not been compromised by a bug”. This isn’t a conceptual leap from constructive math, in which a proof is literally a witness-building algorithm; it looks to me a bit like a memo saying “btw, don’t throw out the witness”, along with notes about writing cheap verifiers that do not simply reimplement the input program logic.
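(To make the lattice-theoretic side concrete, a hedged python toy of mine: iteration doesn’t just assert a fixed point exists, it hands you one.)

```python
def fixed_point_by_iteration(f, seed, max_iter=10_000):
    # For a monotone f on a finite lattice, iterating from a seed *constructs*
    # a fixed point -- contrast with Brouwer-style existence, which names no procedure.
    x = seed
    for _ in range(max_iter):
        fx = f(x)
        if fx == x:
            return x
        x = fx
    raise RuntimeError("no fixed point reached (hypotheses violated?)")

# Toy: reachability in a graph as the fixed point of one-step expansion
# on the powerset lattice ordered by inclusion.
edges = {0: {1}, 1: {2}, 2: set(), 3: {0}}
step = lambda s: s | {v for u in s for v in edges[u]}
assert fixed_point_by_iteration(step, frozenset({0})) == {0, 1, 2}
```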
One way of looking at a certificate is that it’s dependent on metadata. This may simply be something read off of an execution trace, or some state that logs important parts of the execution trace to construct a witness, like in the bipartite test example. There, all you need is an algorithm that searches for a two-coloring or an odd cycle and, crucially, returns either the two-coloring or the odd cycle as a decoration on its boolean output (where “true” means “is bipartite”). Then a cheap verifier (or checker) is none other than the pre-existing knowledge that two-colorable and bipartite are equivalent, or conversely that the existence of an odd cycle is equivalent to disproving bipartiteness.
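Here’s a hedged python sketch of that example (my own rendering, not McConnell et al’s code):

```python
from collections import deque

def certified_bipartite(n, edges):
    # Certifying bipartite test: returns (True, two_coloring) or (False, odd_cycle).
    # The second component is the witness decorating the boolean answer.
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    color, parent = [None] * n, [None] * n

    def to_root(x):
        out = [x]
        while parent[out[-1]] is not None:
            out.append(parent[out[-1]])
        return out

    for s in range(n):
        if color[s] is not None:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v], parent[v] = 1 - color[u], u
                    q.append(v)
                elif color[v] == color[u]:
                    # same-colored edge: splice the two BFS paths at their
                    # meeting point into an odd cycle (closing edge (v,u) implied)
                    pu, pv = to_root(u), to_root(v)
                    meet = next(x for x in pu if x in set(pv))
                    return False, pu[:pu.index(meet) + 1] + pv[:pv.index(meet)][::-1]
    return True, color

def check_certificate(edges, answer, cert):
    # The cheap checker only uses the pre-existing equivalences:
    # two-colorable <=> bipartite, odd cycle <=> not bipartite.
    edge_set = {frozenset(e) for e in edges}
    if answer:
        return all(cert[a] != cert[b] for a, b in edges)
    k = len(cert)
    return k % 2 == 1 and all(
        frozenset((cert[i], cert[(i + 1) % k])) in edge_set for i in range(k)
    )

triangle = [(0, 1), (1, 2), (2, 0)]
ans, cert = certified_bipartite(3, triangle)
assert not ans and check_certificate(triangle, ans, cert)
```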
Chapter 11 of [3] in fact discusses certification and verification supporting each other, which seems to relax the constraint that a cheap verifier doesn’t simply reimplement the certifier logic.
The main difference between a standard proof as in an applied type theory (like lean or coq) and a proof cert is kinda cultural rather than fundamental, in that proof certs prefer to emphasize single IO pairs where lean/coq proofs often like to emphasize entire input types. Just don’t get confused on page 34 when it says “testing on all inputs”: it means that a good certifying algorithm is a means to generate exhaustive unit tests, not that the witness predicate actually runs on all inputs at once. It seems like picking out only the inputs of interest, rather than quantifying over the whole input type, will be the critical consideration when we discuss making this strategy viable for learned components or neural networks.
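A toy lean sketch of that cultural difference (my own example, not from the paper; assumes a recent lean 4 with omega):

```lean
def double (n : Nat) : Nat := n + n

-- lean/coq culture: a spec quantified over the entire input type
theorem double_even (n : Nat) : ∃ k, double n = 2 * k :=
  ⟨n, by show n + n = 2 * n; omega⟩

-- proof-cert culture: certify a single IO pair; the witness k = 21
-- is the certificate shipped alongside the output 42
example : ∃ k, double 21 = 2 * k := ⟨21, rfl⟩
```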
Formally:
Skip this if you’re not super compelled by what you know so far about proof certificates, and jump down to the section on learned programs.
Consider algorithms $f : X \to Y$. A precondition $\varphi : X \to \mathbb{B}$ and a postcondition $\psi : X \times Y \to \mathbb{B}$ form an IO-spec or IO-behavior. An extension $Y \cup \{\bot\}$ of the output set is needed to account for when the precondition is violated. We have a type $W$ of witness descriptions. Finally, we describe the witness predicate with $\mathcal{W} : X \times (Y \cup \{\bot\}) \times W \to \mathbb{B}$, where $\mathcal{W}(x, y, w)$ says IO pair $(x, y)$ is witnessed by $w$. So when $\psi(x, y)$ is true, $w$ must be a witness to the truth, else a witness to the falsehood. McConnell et al distinguish witness predicates from strong witness predicates, where the former can prove the disjunction $\neg\varphi(x) \vee \psi(x, y)$ but the latter knows which disjunct holds.
A checker for $\mathcal{W}$ is an algorithm sending $(x, y, w) \mapsto \mathcal{W}(x, y, w)$, i.e., deciding the witness predicate. Its desiderata / nice-to-haves are correctness, runtime linear in the input, and simplicity.
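To ground the notation with a classic instance (extended GCD is a standard example of a certifying algorithm; this python rendering is mine): the input is the pair $(x, y)$, the output is $g$, and the Bezout coefficients $(a, b)$ are the witness description $w$.

```python
def certifying_gcd(x: int, y: int):
    # extended Euclid: output g plus Bezout coefficients (a, b) as witness w
    old_r, r = x, y
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    return old_r, (old_a, old_b)

def witness_predicate(x: int, y: int, g: int, w) -> bool:
    # Checking the witness is a few multiplications, far simpler than Euclid:
    # g is a common divisor, and a*x + b*y == g forces g to be the greatest one.
    a, b = w
    common = (x % g == 0 and y % g == 0) if g != 0 else (x == 0 and y == 0)
    maximal = a * x + b * y == g  # any common divisor of x and y divides g
    return common and maximal

g, w = certifying_gcd(91, 35)
assert g == 7 and witness_predicate(91, 35, g, w)
```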
I’m not that confident this isn’t covered in the document somewhere, but McConnell et al don’t seem paranoid about false functions. If Alice (sketchy) claims that a $y$ came from $f$ on $x$, but she lied to Bob and in fact ran $g$, there’s no requirement for witness predicates to have any means of finding out. (I have a loose intuition that a properly adversarial adaptation of proof certs would be possible on a restricted algorithm set, but would impose a lot of costs.) Notice that $\mathcal{W}$ is with respect to a fixed $f$; quantifying over which program actually ran would be a different approach! We should highlight that a witness predicate only tells you about one input.
Potential for certificates about learning, learned artefacts, and learned artefacts’ artefacts
The programme thesis continues:
We are very interested in certificates, because we would like to rely on black-box advanced AI systems to do the hard work of searching for proofs of our desired properties, yet without compromising confidence that the proofs are correct. In this programme, we are specifically interested in certificates of behavioural properties of cyber-physical systems (ranging from simple deterministic functions, to complex stochastic hybrid systems incorporating nuances like nondeterminism and partiality).
If AIs are developing critical software (like airplanes, cars, and weaponry) and assuring us that it’s legit, introducing proof certs removes trust from the system. The sense in which I think Safeguarded AI primarily wants to use proof certs certainly doesn’t emphasize certs of SGD or architectures, nor does it even emphasize inference-time certs (though I think it’d separately be excited about progress there). Instead, an artefact that is second-order removed from the training loop seems to be the target of proof certs, like a software artefact written by a model but whose runtime is not the inference time of the model. I see proof certs in this context as primarily an auditing/interpretability tool. There’s no reason a model should write code that is human readable; constraining it to write human readable code seems hard; letting the code be uninterpretable but forcing it to generate interpretable artefacts seems pretty good! The part I’m mostly not clear on is why proof certs would be better than anything else in applied type theory (where proofs are just lambdas, which are very decomposable: you can’t read them all at once but they’re easy to pick apart) or model checking (specs are logical formulae, even if reading the proof that an implementation obeys a spec is a little hairy).
It’s important that we’re not required to quantify over a whole type to get some really valuable assurances, and this proof cert framework is amenable to that. On the other hand, you can accomplish the same relaxation in lean or coq, and we can really only debate over ergonomics. I think ultimately, the merits of this approach vs other approaches are going to be decided entirely on the relative ergonomics of tools that barely exist yet.
Overall impression
McConnell et al is 90 pages, and I only spent a few hours with it. But I don’t find it super impressive or compelling. Conceptually, a constructive proof gets me everything I want out of proof certificates. A tremendous value proposition would be to make these certs easier to obtain for real world programs than what coq or lean could provide, but I haven’t seen those codebases yet. There’s more literature I may look at around proof-carrying code (PCC)[4], which covers similar ideas, but I’m skeptical that I’ll be terribly compelled, since the insight “just decorate the code with the proof” isn’t very subtle or difficult.
Moreover, this literature just seems kinda old, and I don’t see obvious paths from it to current research.
consider how our nonconstructive existence proof of nash equilibria creates an algorithmic search problem, which we then study with computational complexity. For example, finding an equilibrium of a 2-player 0-sum game is in P, but for general-sum games it’s PPAD-complete. I wonder if every nonconstructive existence proof is like this? In the sense of inducing a computational complexity exercise to find what class it’s in, before coming up with greedy heuristics to accomplish an approximate example in practice.
I like thinking about “what it feels like to write computer programs if you’re a transformer”.
Does anyone have a sense of how to benchmark or falsify Nostalgebraist’s post on the subject?
me:
any interest in a REMIX study group? in meatspace in berkeley, a few hours a week. https://github.com/redwoodresearch/remix_public/
How were the distributions near the top elicited from each participant?
Any tips for getting out of a “rise to the occasion mindset” and into a “sink to your training” mindset?
I’m usually optimizing for getting the most out of my A-game bursts. I want to start optimizing for my baseline habits, instead. I should cover the B-, C-, and Z-game; the A-game will cover itself.
Mathaphorically, “rising to the occasion” is taking a max of a max, whereas “sinking to the level of your habits” looks like a greatest lower bound.
I used to think “community builder” was a personality trait I couldn’t switch off, but once I moved to the bay I realized that I was just desperate for serendipity and knew how to take it from 0% to 1%. Since the bay is constantly humming at 70-90% serendipity, I simply lost the urge to contribute.
Benefactors are so over / beneficiaries are so back / etc.
Cope isn’t a very useful concept.
For every person who has a bad reason that they catch because you say “sounds like cope”, there are 10x as many people who find their reason actually compelling. Saying “if that was my reason it would be a sign I was in denial of how hard I was coping” or “I don’t think that reason is compelling” isn’t really relevant to the person you’re ostensibly talking to, who’s trying to make the best decisions for the best reasons. Just say you don’t understand why the reason is compelling.
the author re-reading one year+ out:
My views on the value of infographics that actually look nice have changed; perhaps I should have had nicer-looking figures.
“Unlike spreadsheets, users can describe in notebooks distributions, capturing uncertainty in their beliefs.” seems overconfident. Monte Carlo has been an Excel feature for years. My implicit dismissal of this makes sense as a thing to say because “which usecases are easy?” is a more important question than “which usecases can you get away with if you squint?”, but I could’ve done a way better job of actually reading and critiquing Excel programs that feature Monte Carlo.
Viv told me to fix this and I ignored them because I liked the aesthetics of the sentence at the time: “which defines compositional systems as fluid de- and re-composition under the abolition of emergent properties”. Viv was right. I’ve changed my views on the importance of writing being boring / non-idiosyncratic; also, even my prose aesthetic preferences change over time (I no longer enjoy this sentence).
“(this use case isn’t totally unlike what Ergo accomplishes)” I keep thinking about ergo; perhaps a paragraph in the “whiggish history” section would’ve been appropriate, since API access to some zoo of scoring rules / crowd wisdom is a pretty obvious feature that many platforms would foreseeably prioritize.
I underrated how strong the claims about compositionality (being precisely the sum of parts) were, since statistical noise is such a fundamental piece of the puzzle.
I haven’t been working on this stuff except a little on the side for most of the last year, but still get excited here and there. I returned to this post because I might have another post about module systems in software design, package management, and estimational programming written up in the not too distant future.
Overall, this remains a super underrated area.
I eventually decided that human chauvinism approximately works most of the time because good successor criteria are very brittle. I’d prefer to avoid lock-in to my or anyone’s values at t=2024, but such a lock-in might be “good enough” if I’m threatened with what I think are the counterfactual alternatives. If I did not think good successor criteria were very brittle, I’d accept something adjacent to E/Acc that focuses on designing minds which prosper more effectively than human minds. (the current comment will not address defining prosperity at different timesteps).
In other words, I can’t beat the old fragility of value stuff (but I haven’t tried in a while).
I wrote down my full thoughts on good successor criteria in 2021 https://www.lesswrong.com/posts/c4B45PGxCgY7CEMXr/what-am-i-fighting-for
AI welfare: matters, but when I started reading lesswrong I literally thought that disenfranchising AIs from the definition of prosperity was equivalent to subjecting them to suffering, and I don’t think this anymore.