Holden is a smart guy, but he’s also operating under a severe set of political constraints, since his organization depends so strongly on its ability to raise funds. So we shouldn’t make too much of the fact that he thinks academia is pretty good—obviously he’s going to say that.
DanB
I would add two ideas:
Try to find a good role model—someone who is similar to you in relevant respects, is a couple of years ahead of you, who has done something you think is awesome, and who you can talk to and observe to some extent. Bill Gates is probably not a good role model.
Try to form a realistic assessment of how important college actually is; people often err in imagining it to be more or less important than it is in reality (these errors seem to be correlated with social class). I would estimate that the 4 years of college are only modestly more important than other years of your life. What you do right after college is important. What you do when you’re in your late 20s is important.
The Rediscovery of Interiority in Machine Learning
I started this essay last year, and procrastinated on completing it for a long time, until recently the GPT-3 announcement gave me the motivation to finish it up.
If you are familiar with my book, you will notice some of the same ideas, expressed with different emphasis. I congratulate myself a bit on predicting some of the key aspects of the GPT-3 breakthrough (data annotation doesn’t scale; instead learn highly complex interior models from raw data).
I would appreciate constructive feedback and signal-boosting.
Not a stupid question, this issue is actually addressed in the essay, in the section about interior modeling vs unsupervised learning. The latter is very vague and general, while the former is much more specific and also intrinsically difficult. The difficulty and preciseness of the objective make it much better as a goal for a research community.
Fight Akrasia and Decision Fatigue with DIY Productivity Software
Cool concepts! What tech stack did you use? Was it painful to get the Facebook API working?
In my PhD thesis I explored an extension of the compression/modeling equivalence that’s motivated by Algorithmic Information Theory. AIT says that if you have a “perfect” model of a data set, then the bitstream created by encoding the data using the model will be completely random: every statistical test for randomness applied to the bitstream will return the expected value. For example, the proportion of 1s should be 0.5, the proportion of 1s following the prefix 010 should be 0.5, and so on. Conversely, if you find a “randomness deficiency”, you have found a shortcoming of your model. And it turns out you can use this information to create an improved model.
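To make the idea of a randomness test concrete, here is a minimal sketch (my own toy illustration, not code from the thesis): given a bitstream, we check the two example statistics mentioned above—the overall proportion of 1s, and the proportion of 1s following the prefix 010. For a stream with no randomness deficiency, both should come out near 0.5.

```python
import random

def ones_after_prefix(bits, prefix):
    """Proportion of 1s that immediately follow a given prefix in the bitstream."""
    k = len(prefix)
    follows = [bits[i + k] for i in range(len(bits) - k)
               if bits[i:i + k] == prefix]
    return sum(follows) / len(follows) if follows else None

random.seed(0)
bits = [random.randint(0, 1) for _ in range(100_000)]

# For a genuinely random stream, both statistics should be close to 0.5.
# A systematic deviation in either one would signal a randomness deficiency,
# i.e. structure the model failed to capture.
p_overall = sum(bits) / len(bits)
p_after_010 = ones_after_prefix(bits, [0, 1, 0])
```

A real encoder's output stream would replace the simulated `bits` here; the test logic is the same.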
That gives us an alternative conceptual approach to modeling/optimization. Instead of maximizing a log-likelihood, take an initial model, encode the dataset, and then search the resulting bitstream for randomness deficiencies. This is very powerful because there is an infinite number of randomness tests that you can apply. Once you find a randomness deficiency, you can use it to create an improved model, and repeat the process until the bitstream appears completely random.
The key trick that made the idea practical is that you can use “pits” instead of bits. Bits are tricky, because as your model gets better, the number of bits goes down—that’s the whole point—so the relationship between bits and the original data samples gets murky. A “pit” is a [0,1) value calculated by applying the Probability Integral Transform to the data samples using the model. The same randomness requirements hold for the pitstream as for the bitstream, and there are always exactly as many pits as data samples. So now you can define randomness tests based on intuitive context functions, like “how many pits fall in the [0.2,0.4] interval when the previous word in the original text was a noun?”
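A toy sketch of the pit idea (my own illustration with a made-up Gaussian model, not code from the thesis): applying the model’s CDF to each sample gives the pitstream, and if the model is right the pits are uniform on [0,1). An interval-count test then exposes a wrong model as a randomness deficiency.

```python
import math
import random

def gaussian_cdf(x, mu, sigma):
    """Model CDF; applying it to each sample is the Probability Integral Transform."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(50_000)]

# Pits under the *correct* model: uniform on [0, 1).
pits_good = [gaussian_cdf(x, 0.0, 1.0) for x in data]
# Pits under a *wrong* model (sigma too large): uniformity breaks.
pits_bad = [gaussian_cdf(x, 0.0, 2.0) for x in data]

def frac_in_interval(pits, lo, hi):
    """One simple randomness test: mass of the pitstream inside [lo, hi)."""
    return sum(lo <= p < hi for p in pits) / len(pits)

# A uniform pitstream should put ~0.2 of its mass in [0.2, 0.4);
# the misspecified model piles up noticeably more there.
good = frac_in_interval(pits_good, 0.2, 0.4)
bad = frac_in_interval(pits_bad, 0.2, 0.4)
```

The context-function version would simply restrict the count to pits whose preceding word is a noun (or any other condition) before comparing against the expected fraction.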
I’m not sure exactly what you mean, but I’ll guess you mean “how do you deal with the problem that there are an infinite number of tests for randomness that you could apply?”
I don’t have a principled answer. My practical answer is just to use good intuition and/or taste to define a nice suite of tests, and then let the algorithm find the ones that show the biggest randomness deficiencies. There’s probably a better way to do this with differentiable programming—I finished my PhD in 2010, before the deep learning revolution.
One very important observation related to this issue is the fact that we often observe specific cognitive deficits (e.g. people who can’t use nouns), but those specific deficits are almost always related to a brain trauma (stroke, etc.). If there were significant cognitive logic coded into the genome, we should see specific cognitive deficits in otherwise healthy young people caused by mutations.
The Japanese Quiz: a Thought Experiment of Statistical Epistemology
Why isn’t this an argument for banning all politically powerful people from Twitter?
Compositionality: SQL and Subways
Thanks for the tip about Kusto—it actually does look quite nice.
A Small Vacation
Thanks for the positive feedback and interesting scenario. I’d never heard of Birobidzhan.
A budget where initial creation is essentially free (fun!) while maintenance is extremely expensive (drudgery!) is a dramatic exaggeration for most software development.
My feeling is that most software development has exactly the same cost parameters; the difference is just that BigTech companies have so much money they are capable of paying thousands of engineers handsome salaries, to do the endless drudgery required to keep the tech stacks working.
The SQLite devs pledge to support the product until 2050.
Copied from a previous comment on Hacker News
I wish you well and I hope you win (ed.: by “win” I mean I hope the proposal is approved)
I am pessimistic though. I don’t think people really understand how much current homeowners do not want additional housing to be built. It makes sense if you consider that the net worth of a typical homeowner is very substantially made up of a highly leveraged long position in real estate. If that position goes south—because of an increase in housing supply, or because of undesirable new people moving into the neighborhood—the homeowner’s net worth could be decimated.
Now, most people will not come out and say directly that they are opposed to new housing for the obvious economic reason, because they don’t want to seem selfish and greedy and maybe racist. So they have to find a socially acceptable cover story to oppose new housing—environmentalism, concerns about safety, etc etc.
Interesting analysis. I hadn’t heard of Goodman before so I appreciate the reference.
In my view the problem of induction has been almost entirely solved by the ideas from the literature on statistical learning, such as VC theory, MDL, Solomonoff induction, and PAC learning. You might disagree, but you should probably talk about why those ideas prove insufficient in your view if you want to convince people (especially if your audience is up-to-date on ML).
One particularly glaring limitation of Goodman’s argument is that it depends on natural language predicates (“green”, “grue”, etc). Natural language is terribly ambiguous and imprecise, which makes it hard to evaluate philosophical statements about natural language predicates. You’d be better off casting the discussion in terms of computer programs that take a given set of input observations and produce an output prediction.
Of course you could write “green” and “grue” as computer functions, but it would be immediately obvious how much more contrived the program using “grue” is than the program using “green”.
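To illustrate (a sketch of my own; the cutoff date is an arbitrary stand-in for Goodman’s time T): written as programs, “grue” needs an extra input and an extra branch that “green” doesn’t, which is exactly the contrivance the natural-language framing obscures.

```python
import datetime

# Hypothetical cutoff time T from Goodman's setup (any future date works).
T = datetime.datetime(2030, 1, 1)

def green(observed_color):
    """An object is green iff it looks green whenever it is observed."""
    return observed_color == "green"

def grue(observed_color, observation_time):
    """An object is grue iff it looks green when observed before T,
    and looks blue when observed at or after T."""
    if observation_time < T:
        return observed_color == "green"
    return observed_color == "blue"
```

Note that `grue` cannot even be defined without smuggling in a clock and a branch on the observation time, while `green` is a one-liner over the observation alone.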