• I hon­estly don’t think the trade­off is real (but please tell me if you don’t find my rea­sons com­pel­ling). If I study cat­e­gory the­ory next and it does some cool stuff with the base map, I won’t re­ject that on the ba­sis of it con­tra­dict­ing this book. Ditto if I ac­tu­ally use LA and want to do calcu­la­tions. The philo­soph­i­cal un­der­stand­ing that ma­trix-vec­tor mul­ti­pli­ca­tion isn’t ul­ti­mately a thing can peace­fully co­ex­ist with me do­ing ma­trix-vec­tor mul­ti­pli­ca­tion when­ever I want to. Just like the un­der­stand­ing that the nat­u­ral num­ber 1 is a differ­ent ob­ject from the in­te­ger num­ber 1 peace­fully co­ex­ists with me treat­ing them as equal in any other con­text.

I don’t agree that this view is the­o­ret­i­cally limit­ing (if you were mean­ing to im­ply that), be­cause it al­lows any calcu­la­tion that was pos­si­ble be­fore. It’s even com­pat­i­ble with the base map.

• I wouldn’t be heartbroken if it was defined like that, but I wouldn’t do it if I were writing a textbook myself. I think the LADR approach makes the most sense – vectors and matrices are fundamentally different – and if you want to bring a vector into the matrix world, then why not demand that you do it explicitly?

If you actually use LA in practice, there is nothing stopping you from writing . You can be ‘sloppy’ in practice if you know what you’re doing, while still thinking that drawing this distinction is a good idea in a theoretical textbook.

• That looks like it also works. It’s a different philosophy, I think, where LADR says “vectors and matrices are fundamentally different objects and vectors aren’t dependent on bases, ever” and your view says “each basis defines a bijective function that maps vectors from the no-basis world into the basis-world (or from the basis1 world into the basis2 world)” but it doesn’t insist on them being fundamentally different objects. Like if then they’re the same kind of object, and you just need to know which world you’re in (i.e. relative to which basis, if any, you need to interpret your vector).

I don’t think not having matrix-vector multiplication is an issue. The LADR model still allows you to do everything you can do in normal LA. If you want to multiply a matrix with a vector , you just make into the n-by-1 matrix and then multiply two matrices. So you multiply rather than . It forces you to be explicit about which basis you want the vector to be relative to, which seems like a good thing to me. If is the standard basis, then will have the same entries as , it’ll just be written as rather than .
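
To make the "turn the vector into an n-by-1 matrix first" step concrete, here is a minimal sketch in plain Python. The names `matrix_of_vector` and `matmul` are hypothetical helpers made up for this illustration, and the basis handling is restricted to the 2-dimensional case (via Cramer's rule) to keep it short; it is not LADR's notation, just one way to mirror the idea.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def matrix_of_vector(v, basis):
    """Return the 2-by-1 matrix of v relative to a basis (b1, b2) of R^2.

    Hypothetical helper: solves c1*b1 + c2*b2 = v with Cramer's rule.
    """
    b1, b2 = basis
    det = b1[0] * b2[1] - b2[0] * b1[1]
    if det == 0:
        raise ValueError("the two vectors do not form a basis")
    c1 = Fraction(v[0] * b2[1] - b2[0] * v[1], det)
    c2 = Fraction(b1[0] * v[1] - v[0] * b1[1], det)
    return [[c1], [c2]]


standard = ((1, 0), (0, 1))
other = ((1, 1), (1, -1))

A = [[1, 2], [3, 4]]

# Relative to the standard basis, the entries are the same as the vector's.
Mv_standard = matrix_of_vector((5, 6), standard)

# The product is matrix times matrix; no matrix-vector product is needed.
product = matmul(A, Mv_standard)

# Relative to another basis, the same kind of call gives different entries.
Mv_other = matrix_of_vector((3, 1), other)
```

The basis is an explicit argument, so there is no way to get the column without saying which basis it is relative to.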

• Afaik, in ML, the term bias is used to de­scribe any move away from the uniform /​ mean case. But in com­mon speech, such a move would only be called a bias if it’s in­ac­cu­rate. So if the al­gorithm learns a true pat­tern in the data (X is more likely to be clas­sified as 1 than Y is) that wouldn’t be called a bias. Un­less I mi­s­un­der­stand your point.

• Ow. Yes, you do. This wasn’t a typo ei­ther, I re­mem­bered the re­sult in­cor­rectly. Thanks for point­ing it out, and props for be­ing at­ten­tive enough to catch it.

Or to be more pre­cise, you only need one scalar, but the scalar is for not , be­cause isn’t given. The the­o­rem says that, given and , there is a scalar and a vec­tor such that and is or­thog­o­nal to .
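
For reference, the standard orthogonal decomposition can be written out as follows. The symbol names are my own choice for this sketch: $u$ and $v$ are the given vectors with $v \neq 0$, $c$ is the scalar, and $w$ is the remaining vector.

$$
u = c\,v + w, \qquad c = \frac{\langle u, v\rangle}{\lVert v\rVert^{2}}, \qquad w = u - c\,v,
$$

$$
\langle w, v\rangle = \langle u, v\rangle - c\,\langle v, v\rangle = \langle u, v\rangle - \frac{\langle u, v\rangle}{\lVert v\rVert^{2}}\,\lVert v\rVert^{2} = 0 .
$$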

• I won­der, what do you think about the chap­ter about dual spaces, dual maps, an­nihila­tor, etc.?

Noth­ing, be­cause it wasn’t in the ma­te­rial. I worked through the sec­ond edi­tion of the book, and the parts on du­al­ity seem to be new to the third edi­tion.

I believe when mathematicians say that in general P(x) holds, they mean that for any x in the domain of interest P(x) holds. Perhaps you want to use typical instead of general here. E.g. there is a notion called typical tensor rank of tensors of given shape, which means a tensor rank which occurs with non-zero probability when a random tensor of given shape is sampled.

Thanks for that, I changed it.

# In­sights from Lin­ear Alge­bra Done Right

• It seems to me that one of the qual­ities of a good ques­tion for the LBO is that a cor­rect an­swer has high util­ity, while an­other is the abil­ity to judge the or­a­cle’s an­swer. If we take only the in­ter­sec­tion be­tween the two sets of ques­tions that meet 1 and 2, it’ll be a rel­a­tively small set. But if there is a set of ques­tions where one might or might not be able to judge the an­swer but which are high util­ity, one could ask the LBO a set of such ques­tions. Then, once the epi­sode is over, give re­ward on those where one can eval­u­ate the an­swers, and give null re­ward on the oth­ers.

Not re­ally a sub­mis­sion, just think­ing out loud.

• Sub­mis­sion for LBO:

In­put a cor­pus of text (could be mul­ti­ple posts) de­scribing tech­ni­cal ap­proaches to al­ign a pow­er­ful AI. Split this into a finite num­ber of items that are rel­a­tively short (such as para­graphs). Ask the or­a­cle to choose the part that is most worth spend­ing more time on. (For ex­am­ple, there might be a para­graph with a dan­ger­ous hid­den as­sump­tion in an oth­er­wise promis­ing ap­proach, and think­ing more about it might re­veal that and lead to con­cep­tual progress.)

Have a team of researchers look into it for an adequate amount of time, which is fixed (and told to the oracle) in advance (maybe three months?). After the time is over, have them rate the progress they made compared to some sensible baseline. Use this as the oracle’s reward.

Of course this has the prob­lem of max­i­miz­ing for ap­par­ent in­sight rather than ac­tual in­sight.

• Either char­i­ties like the Gates Foun­da­tion and Good Ven­tures are hoard­ing money at the price of mil­lions of pre­ventable deaths

My as­sump­tion be­fore read­ing this has been that this is the case. Given that, does a rea­son re­main to up­date away from the po­si­tion that the GiveWell claim is ba­si­cally cor­rect?

For the rest of this post, let’s suppose the true amount of money needed to save a life through GiveWell’s top charities is $50,000. I don’t think anything about Singer’s main point changes. For one, it’s my understanding that decreasing animal suffering is at least an order of magnitude more effective than decreasing human suffering. If the arguments you make here apply equally to that (which I don’t think they do), and we take the above number, well, that’s $5,000 for a benefit-as-large-as-one-life-saved, which is still sufficient for Singer’s argument.

Se­condly, I don’t think your ar­gu­ments ap­ply to ex­is­ten­tial risk pre­ven­tion and even if they did and we de­crease effec­tive­ness there by one or­der of mag­ni­tude, that’d also still val­i­date Singer’s ar­gu­ment if we take my pri­ors.

I notice that I’m very annoyed at your on-the-side link to the article about OpenAI with the claim that they’re doing the opposite of what the argument justifying the intervention recommends. It’s my understanding that the article, though plausible at the time, was very speculative and has been falsified since it was written. In particular, OpenAI has pledged not to take part in an arms race under reasonable conditions, which directly contradicts one of the points of that article. Quote:

There­fore, if a value-al­igned, safety-con­scious pro­ject comes close to build­ing AGI be­fore we do, we com­mit to stop com­pet­ing with and start as­sist­ing this pro­ject. We will work out speci­fics in case-by-case agree­ments, but a typ­i­cal trig­ger­ing con­di­tion might be “a bet­ter-than-even chance of suc­cess in the next two years.”

That, and they seem to have an ethics board with sig­nifi­cant power (this is based on de­cid­ing not to re­lease the full ver­sion of GPT). I be­lieve they also said that they won’t pub­lish ca­pa­bil­ity re­sults in the fu­ture, which also con­tra­dicts one of the main con­cerns (which, again, was rea­son­able at the time). Please ei­ther re­ply or amend your post.

• I’ll also be at­tend­ing the full 10 day ver­sion. I’ve only been med­i­tat­ing for a cou­ple of months so the prospect of such a long re­treat feels fairly threat­en­ing, but look­ing at the mean out­come, I think it’s the cor­rect call.

• What is the best text­book on anal­y­sis out there?

My go-to source is MIRI’s guide, but analysis seems to be the one topic that’s missing. TurnTrout mentioned this book, which looks decent at first glance. Are there any competing opinions?

• I’ve no­ticed that I can­not tell, from ca­sual con­ver­sa­tion, whether some­one is in­tel­li­gent in the IQ sense.

I can’t re­ally do any­thing ex­cept to state this as a claim: I think a few min­utes of con­ver­sa­tion with any­one al­most always gives me sig­nifi­cant in­for­ma­tion about their in­tel­li­gence in an IQ sense. That is, I couldn’t tell you the ex­act num­ber, and prob­a­bly not even re­li­ably pre­dict it with an er­ror of less than 20 (maybe more), but nonethe­less, I know sig­nifi­cantly more than zero. Like, if I talked to 9 peo­ple evenly spaced within [70, 130], I’m pretty con­fi­dent that I’d get most of them into the cor­rect half.

This does not translate into any kind of disagreement wrt GPT’s texts seeming normal if I just skim them. Or to Robin Hanson’s thesis.

• No, but I’ve read almost all of the sequences on the website, I think. I didn’t do it systematically, so it’s almost a guarantee that I missed a few, but not many. Read some stuff twice, but again, not systematically.

I think they’re amaz­ing, and they’ve had a profound im­pact on me.

• I do have a spread­sheet where I keep track of pre­dic­tions, though only track­ing the pre­dic­tion, my con­fi­dence, and whether it came true or false. It’s low effort and I think worth do­ing, but I can’t con­fi­dently say that it has im­proved my cal­ibra­tion.

# In­sights from Munkres’ Topology

