Good point, thanks.
Ilio
You may enjoy this old paper (functional properties of somatosensory maps are not only plastic but actively constructed):
Of course there’s some pigeonhole principle that works against you if you try to contact, say, Geoffrey Hinton, or the authors of the latest buzz-making paper. But otherwise yes, most researchers are glad to talk about their work. And it’s kind of a professional duty, which is why most papers include the email of the corresponding author(s).
More generally, congrats to KevinRoWang for this post! I started my own journey in ML before he was born, and I’m impressed by the maturity of his advice.
The standard proof of the halting problem might not apply to quantum computers with an arbitrary unknown quantum state (I guess it doesn’t, because arbitrary states may encode nondeterministic advice). But if we stick to quantum computers that admit a classical description (e.g. the usual ones, as Charlie Steiner and JBlack already mentioned), then no-cloning doesn’t apply.
Contra #1: Imagine you order a huge stack of computers for massively multiplayer gaming purposes. Would you expect it might collapse under its own weight, or would you expect the builders to be cautious enough that it won’t collapse like passive dust in free fall?
Contra #4: nope. Landauer’s principle implies that reversible computation costs nothing (until you want to read the result, which then costs next to nothing times the size of the result you want to read, irrespective of the size of the computation proper). Present-day computers are obviously very far from this limit, but you can’t assume « computronium » is too.
#2 and #3 sound stronger, imo. Could you provide a glimpse of the confidence intervals and how they vary from one survey to the next?
Any time you use an « IF » statement: 1) you’re not performing a reversible computation (i.e. your tech is not the one that minimizes energy consumption); 2) the minimal cost is one bit, irrespective of the size of your program. Using MWI you could interpret this single bit as representing « half the branches », but not half the size in memory.
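To put a number on that one-bit cost: the Landauer bound says erasing a bit dissipates at least kB·T·ln 2, independent of the size of the program. A minimal sketch (the 300 K room temperature is just an illustrative choice):

```python
# Landauer bound: minimum energy to erase bits at temperature T,
# irrespective of how large the computation that produced them was.
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_cost_joules(bits: int, temperature_k: float) -> float:
    """Minimum dissipation for erasing `bits` bits at `temperature_k` kelvin."""
    return bits * K_B * temperature_k * math.log(2)

# One bit erased at room temperature (300 K):
e_bit = landauer_cost_joules(1, 300.0)
print(f"{e_bit:.3e} J")  # ~2.87e-21 J, i.e. about 0.018 eV
```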
1-3: You are certainly right that cold and homogeneous dark matter is the scientific consensus right now (at least if by consensus we mean « most experts would either think that’s true or admit there is no data strong enough to convince most experts it’s wrong »).
The point I’m trying to make is: as soon as we say « computronium », we are outside of normal science. In normal science, you don’t suppose matter can choose to deploy itself like a solar sail and use that to progressively reach the outer regions of the galaxy, where dangerous supernovae are less frequent. You suppose that if it exists it has no aim, then find the best non-weird model that fits the data.
In other words, I don’t think we can assume that the scientific consensus is automatically 10^4 or 10^8 strong evidence for « how sure are we that dark matter is not a kind of matter that astrophysicists usually don’t bother to consider? », especially when the scientific consensus also includes « we need to keep spending resources on figuring out what dark matter is ». You do agree that’s also the scientific consensus, right? (And not just to keep labs open, but really to add data and visit and revisit new and old models, because we’re still not sure what it is.)
4: in the theory of purely reversible computation, the size of what you read dictates the size you must throw out. Your computation is however more sound than the theory of pure reversible computation, because pure reversible computation may well be as impossible as perfectly analog computation. Now, suppose all dark matter emits 0.16 meV/bit. How much computation per second and per kilogram would keep the thermal radiation well below our ability to detect it?
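The question above is easy to turn around: pick a radiated-power budget and divide by the per-bit cost. A rough sketch, where the 1 pW/kg budget is a made-up placeholder, not a real detectability limit:

```python
# If computronium dissipates 0.16 meV per erased bit (roughly the
# Landauer cost near the CMB temperature), how many bit-erasures per
# second per kg fit under a given radiated-power budget?
EV_TO_J = 1.602176634e-19          # electron-volt in joules
cost_per_bit_j = 0.16e-3 * EV_TO_J  # 0.16 meV per bit, in joules

def max_bit_rate(power_budget_w_per_kg: float) -> float:
    """Bit erasures per second per kg allowed under the power budget."""
    return power_budget_w_per_kg / cost_per_bit_j

# With a hypothetical 1 picowatt-per-kilogram budget:
print(f"{max_bit_rate(1e-12):.2e} bits/s/kg")  # ~3.9e10
```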
Pardon the half-sneering tone, but Old Nan can’t resist: « Oh, my sweet summer child, what do you know of fearing noob gains? Fear is for AI winter, my little lord, when the vanishing gradient problem was a hundred feet deep and the ice wind came howling out of funding agencies, cutting every budget, dispersing the students, freezing the sparse spared researchers… »
Seriously, three years is just a data point, and you want to conclude on the rate of change! I guess you would agree 2016-2022 saw more gains than 2010-2016, and not because the latter were boring times. I disagree that finding out what big transformers could do over the last three years was not a big deal, or even that it was low-hanging fruit. I guess it was low-hanging fruit for you, because of the tools you had access to, and I interpret your post as a deep and true intuition that the next step will demand different tools (I vote for: « clever inferences from functional neuroscience & neuropsychology »). In any case, welcome to LessWrong and thanks for your precious input! (even if Old Nan was amazed you were expecting even faster progress!)
I wish I had been wise enough at your age to post my gut feeling on the internet so that I could better update later. Well, the internet did not exist, but you get the idea.
One question after gwern’s reformulation: do you agree that, in the past, technical progress in ML almost always came first (before fundamental understanding)? In other words, is the crux of your post that we should no longer hope for practical progress without truly understanding why what we do should work?
Love it, and love the general idea of seeing more ml-like interpretations of neuroscience knowledge.
One disagreement (but maybe I should say: one addition to a good first-order approximation) is over local information: I think it includes some global information, such as sympathetic/parasympathetic level through heartbeat, and that the brain may actually use that to help construct/stabilize long-range networks, such as the default mode network.
Love the idea. How efficient! :)
About mental breaks, I guess this might help creativity for the same reason meditation and naps help partial consolidation of memory traces (see below for a recent thesis showing these effects).
Specifically, I would speculate that consolidation means reorganizing memories, and that reorganizing memories helps make sense of the information.
In what sense is the functional behavior different from the internals/actual computations? Could you provide a few toy examples?
Thanks, that clarifies your aims a lot. Have you given some thought to how your approach would deal with cases of embodied cognition and uses of external memories?
Thanks for opening minds to the possibility that agents & their utility functions may not be the most fruitful way to think about these questions. Could you provide a few pointers to the « notably not all » cases from point 5?
Thanks! By interpretability work, you mean in the vein of Colah and the like?
P1 sounds contra the evidence: when an action potential travels along a myelinated axon, its precise amplitude does not matter (as long as it’s enough to make sodium channels open at the next node of Ranvier). In other words, we could add or subtract a lot of ions at most of the 10^11 nodes of Ranvier in the human brain without changing any information in the mind.
https://en.m.wikipedia.org/wiki/Saltatory_conduction https://bionumbers.hms.harvard.edu/bionumber.aspx?s=n&v=5&id=100692 https://en.m.wikipedia.org/wiki/Node_of_Ranvier
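The all-or-none point above can be caricatured in a couple of lines: once the depolarization clears the sodium-channel threshold, the regenerated spike is identical whatever the exact amplitude. A toy model in arbitrary units (the threshold value is illustrative, not physiological):

```python
# Toy all-or-none node of Ranvier: the spike is regenerated at full
# strength if the incoming depolarization clears threshold, else it dies.
def node_output(amplitude: float, threshold: float = 1.0) -> int:
    """1 if the incoming depolarization clears threshold, 0 otherwise."""
    return 1 if amplitude >= threshold else 0

# Adding or subtracting ions (perturbing the amplitude) changes nothing
# to the propagated signal, as long as it stays above threshold:
for amp in (1.1, 1.5, 3.0):
    assert node_output(amp) == 1
print(node_output(0.9))  # below threshold: the spike dies, prints 0
```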
However I didn’t get how you went from « tiny physical changes should correspond to tiny mental changes » (clearly wrong from above, unless tiny includes zero) to « non-infinitesimal mental change can have an infinitesimal (practically zero) effect on the physical world », so maybe I’m missing your point entirely. Could you rephrase or develop the latter?
I’m puzzled by the apparent tension between upvoting the importance of continuous learning on one hand and downvoting agreement with agency on the other. When transformers produce something that doesn’t sound human, it’s usually because of consistency mistakes (like explaining at length that it can’t speak Danish… in well-formed Danish sentences). Maybe it’s true that continuous learning can solve the problem (if that includes learning from its own responses, maybe?). But wouldn’t we perceive that as exhibiting agency?
I’ve left so many things unpublished over the years that I know your feeling very well. If you’re anything like me, there may be a part of you that feels like you need either slack or to work on more important things.
Interesting, thanks! Two quick questions about energy and space:
Typo? « Eb > kb.ln2 » should read « Eb > kb.T.ln2 ».
« Shrinking the brain by a factor of 10 at the same power output would result in a 3x temp increase to around 1180K »: shouldn’t we take into account that less volume means less total wire length, hence less power output?
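For what it’s worth, the quoted 3x figure checks out under the constant-power assumption: with the Stefan–Boltzmann law, P = σ·A·T⁴, fixed P and a 10x linear shrink (100x less area) give T ∝ A^(-1/4), i.e. a 100^(1/4) ≈ 3.16x rise. A quick sketch, where the 373 K baseline is taken from the quoted figure and the wire-length objection above is deliberately ignored:

```python
# Sanity check of the shrink claim: at constant radiated power,
# Stefan-Boltzmann (P = sigma * A * T^4) gives T proportional to A**(-1/4).
def temp_after_shrink(t0_k: float, linear_shrink: float) -> float:
    """New equilibrium temperature after shrinking linear size by
    `linear_shrink` at constant radiated power (area scales as shrink**2)."""
    return t0_k * linear_shrink ** 0.5  # (shrink**2) ** (1/4)

# 10x linear shrink from a 373 K baseline:
print(f"{temp_after_shrink(373.0, 10.0):.0f} K")  # ~1180 K
```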