DavidHolmes

Karma: 196

https://www.davidholmes.nl

DavidHolmes 8 Apr 2026 10:49 UTC
LW: 2 AF: 2
0
AF
on: [Paper] Stringological sequence prediction I
Thank you for sharing this interesting work! I will be very interested to see how far you can push this to non-repetition-based complexity measures in future.

I’d be interested to know what kind of results your LZP algorithm gives for text prediction? I’m assuming “not great”, but to me it would be an interesting point of comparison to the algorithm you are (I guess) eventually working towards.

A couple of tiny comments: for me it would have been helpful to briefly recall what is on page 2; and there’s a typo just after it first appears (“number of mistake ”).

DavidHolmes 15 Mar 2026 8:48 UTC
3 points
0
in reply to: dr_s’s comment on: New LessWrong Editor! (Also, an update to our LLM policy.)
I agree with you to some extent; in the end a false statement is a false statement, whether it came from an LLM or a bad use of Google (or anywhere else). But I think there are a lot of people who over-estimate the reliability of the LLMs they are using in their writing, so that the overall effect is more confident wrong claims than we had pre-LLM-use (there’s a reason the term “AI-slop” exists despite the fact that humans can also produce nonsense). I am generally in favour of policies that nudge authors towards extra checking in case of heavy LLM use.

DavidHolmes 15 Mar 2026 8:39 UTC
3 points
0
in reply to: RobertM’s comment on: New LessWrong Editor! (Also, an update to our LLM policy.)
Do you still stand by this comment in the light of the comment of Jeffrey Heninger on the Solar Storms post saying that he showed it to an expert and “The plasma physics in this post is mostly wrong.”? I think I was the first person to call into question whether the post was basically correct. I hesitated to do so because I knew I might be wrong and there was a risk of causing a pile-on. But in the light of the comment I mentioned above, I am inclined to think I made the right call?

DavidHolmes 14 Mar 2026 14:09 UTC
4 points
1
on: New LessWrong Editor! (Also, an update to our LLM policy.)
For me, knowing when I am reading “text written by a human, which includes facts, arguments, examples, etc, which were researched/discovered/developed with LLM assistance” is in fact way more important than knowing whether or not the actual words of the text were written by an LLM. This site is called LessWrong, and LLMs are not yet good at being it.

Perhaps a policy that facts which have been produced by an LLM and not independently verified should be flagged as such?

DavidHolmes 12 Mar 2026 7:35 UTC
6 points
0
in reply to: Croissanthology’s comment on: Solar Storms
Thanks! For me that helps a lot. I do really appreciate the effort you have put into this, and I don’t want to suggest that no-one is allowed to talk about anything without becoming/consulting experts. At the same time I definitely agree with Gwern that in the age of LLM writing, it is more important than ever to be really clear about the epistemic status of our work.

DavidHolmes 11 Mar 2026 15:58 UTC
8 points
1
in reply to: Croissanthology’s comment on: Solar Storms
Maybe this is a communication issue? The style of writing comes across as rather authoritative; the way you write gave me the impression that you are an expert on this topic. The only red flag that I found in the text was the “research by Claude” thanks at the end. Personally I would have appreciated a disclaimer near the start of the article. Saying that epistemics were discussed on a Twitter thread not linked from the article is not helpful to me. I do not have a Twitter account, so I’m afraid I’ve still not read it.

DavidHolmes 11 Mar 2026 14:14 UTC
47 points
19
on: Solar Storms
I feel like a terrible person for writing this, so apologies in advance. But when I read “Thanks to … Opus 4.6 for a lot the research”, and then in the comments people are pointing out what seem to be multiple factual errors, I can’t help but wonder whether this is all true? More precisely, it’s not clear to me how much I should update in the direction of any of the claims made in this post. Could you tell us a bit more about what fact-checking happened?

DavidHolmes 17 Dec 2025 14:42 UTC
9 points
0
on: Scientific breakthroughs of the year
I love the idea of this! But it worries me a bit that when I look through the ones under “mathematics” the ordering seems pretty erratic. I’m a professional mathematician and managing editor for a good mathematics journal, so this should be the field I know best, and my doubts here make me question the rest.

It’s awkward for me to criticise the ranking of specific papers publicly, but to give one example the paper “Progress in the mirror symmetry program?: a criterion for the rationality of cubic fourfolds” seems vastly under-rated on the “big if true” axis relative to many other works (I think the p(generalises) for that paper is fair).

On the other hand, mathematics has a reputation for being hard for outsiders to evaluate; I’m curious what people think of the rankings in other fields?

DavidHolmes 27 Jan 2025 9:30 UTC
1 point
0
in reply to: Dmitry Vaintrob’s comment on: On polytopes
Hmm, so I’m very wary of defending tropical geometry when I know so little about it; if anyone more informed is reading please jump in! But until then, I’ll have a go.

tropical geometry might be relevant to ML, for the simple reason that the functions coming up in ML with ReLU activation are PL

I’m not sure I agree with this argument.

Hmm, even for a very small value of ‘might’? I’m not saying that someone who wants to contribute to ML needs to seriously consider learning some tropical geometry, just that if one already knows tropical geometry it’s not a crazy idea to poke around a bit and see if there are applications.

The use of PL functions is by no means central to ML theory, and is an incidental aspect of early algorithms.

I agree this is an important point. I don’t actually have a good idea what activation functions people use in practice these days. Thinking about asymptotic linearity makes me think about the various papers appearing using polynomial activation functions. Do you have an opinion on this? For people in algebraic geometry it’s appealing as it generates lots of AG problems (maybe v hard), but I don’t have a good feeling as to whether it’s got anything much to do with ‘real life’ ML. I can link to some of the papers I’m thinking of if that’s helpful, or maybe you are already a bit familiar.

I don’t see why one wouldn’t just use ordinary currents here (currents on a PL manifold can be made sense of after smoothing, or in a distribution-valued sense, etc.).

I think you’re right; this paper just came to mind because I was reading it recently.

whether tropical geometry has ever been useful (either in proving something or at least in reconceptualizing something in an interesting way) in linear programming.

A little googling suggests there are some applications. This paper seems to give an application of tropical geometry to complexity of linear programming: https://inria.hal.science/hal-03505719/document and this list of conference abstracts seems to give other applications: https://him-application.uni-bonn.de/fileadmin/him/Workshops/TP3_21_WS1_Abstracts.pdf Whether they are ‘convincing’ I leave up to you.

1 Algebraic geometry in general (including tropical geometry) isn’t good at dealing with deep compositions of functions, and especially approximate compositions.

Fair, though one might also see that as an interesting challenge. I don’t have a feeling as to whether this is for really fundamental reasons, or people haven’t tried so hard yet.

2 [….] I simply can’t think of any behavior that is at all meaningful from an AG-like perspective where the questions of fan combinatorics and degrees of polynomials are replaced by questions of approximate equality.

There are plenty of cases where “high degree” is enough (Faltings’s Theorem is the first thing that comes to mind, but there are lots). But I agree that “degree approximately 5” feels quite unnatural.

DavidHolmes 26 Jan 2025 18:53 UTC
4 points
0
on: On polytopes
Hi Dmitry,

To me it seems not unreasonable to think that some ideas from tropical geometry might be relevant to ML, for the simple reason that the functions coming up in ML with ReLU activation are PL, and people in tropical geometry have thought seriously about PL functions. Of course this does not guarantee that there is anything useful to be said!

One possible example that comes to mind in the context of your post here is the concept of polyhedral currents. As I understand it, here the notion of “density of polygons” is used as a kind of proxy for the derivative of a PL function? But I think the theory of polyhedral currents gives a much more general theory of differentiation of PL functions. Very naively, rather than just recording the locus where the function fails to be linear, one also records how much the derivative changes when crossing the walls. I learnt about this from a paper of Mihatsch: https://arxiv.org/pdf/2107.12067 but I’m certain there are older references.
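To make the "record how much the derivative changes" idea concrete, here is a minimal sketch in the simplest possible setting: a one-dimensional, one-hidden-layer ReLU network. Everything in it is an illustrative assumption rather than anything taken from the Mihatsch paper: the function names `evaluate`, `slope_jumps` and `numeric_jump`, the `(a, w, b)` encoding of a unit, and the toy network itself. In one variable the "walls" are just points, and the weight attached to each point is the jump in slope there.

```python
def evaluate(units, x):
    """Evaluate f(x) = sum_i a_i * relu(w_i * x + b_i) for units given as (a, w, b)."""
    return sum(a * max(0.0, w * x + b) for a, w, b in units)


def slope_jumps(units):
    """Map each breakpoint of f to the jump in slope when crossing it left to right.

    A unit a * relu(w * x + b) with w != 0 has its kink at x = -b / w, and the
    slope increases by a * |w| there. Units with w == 0 are constant and
    contribute no kink. Jumps at coinciding breakpoints are added together.
    """
    jumps = {}
    for a, w, b in units:
        if w == 0:
            continue
        x = -b / w + 0.0  # adding 0.0 turns -0.0 into 0.0
        jumps[x] = jumps.get(x, 0.0) + a * abs(w)
    return jumps


def numeric_jump(f, x, eps):
    """Finite-difference check: right-hand slope minus left-hand slope at x.

    This is exact for a PL function provided no other breakpoint lies within eps of x.
    """
    right = (f(x + eps) - f(x)) / eps
    left = (f(x) - f(x - eps)) / eps
    return right - left


# Hypothetical toy network: relu(x) - 2 * relu(x - 1) + relu(2 - x)
units = [(1.0, 1.0, 0.0), (-2.0, 1.0, -1.0), (1.0, -1.0, 2.0)]
jumps = slope_jumps(units)
```

The set of keys of `jumps` is the locus where the function fails to be linear; the values are the extra data. In one variable this is the same as saying that the second distributional derivative of the function is a sum of point masses weighted by the slope jumps.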

I’m really a log person, I don’t know the tropical world very well; sorry if what I write does not make sense!

DavidHolmes 19 Jan 2025 11:58 UTC
17 points
10
in reply to: Mikhail Samin’s comment on: meemi’s Shortform

Get that agreement in writing.

I’m not sure that would be particularly reassuring to me (writing as one of the contributors). First, how would one check that the agreement had been adhered to (maybe it’s possible, I don’t know)? Second, people in my experience often don’t notice they are training on data (as mentioned in a post above by ozziegooen).

DavidHolmes 21 Dec 2023 12:38 UTC
1 point
0
in reply to: Dirichlet-to-Neumann’s comment on: What makes teaching math special
I think this is a key point. Even the best possible curriculum, if it has to work for all students at the same rate, is not going to work well. What I really want (both for my past-self as a student, and my present self as a teacher of university mathematics) is to be able to tailor the learning rate to individual students and individual topics (for student me, this would have meant ‘go very fast for geometry and rather slowly for combinatorics’). And while we’re at it, can we also customise the learning styles (some students like to read, some like to sit in class, some to work in groups, etc)?

This is technologically more feasible than it was a decade ago, but seems far from common.

DavidHolmes 9 Dec 2022 7:13 UTC
1 point
0
in reply to: Charlie Steiner’s comment on: Neural networks biased towards geometrically simple functions?
Thanks Charlie.

Just to be double-sure, the second process was choosing the weight in a ball (so total L2 norm of weights was <= 1), rather than on a sphere (total norm == 1), right?

Yes, exactly (though $\leq T$ for some constant $T$, which may not be $1$, but this turns out not to matter).
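For readers who want to see the ball/sphere distinction spelled out, here is a small sketch of one standard way to draw weights uniformly from each. This is an assumption about how such sampling could be done, not a description of the code behind the post; the function names and the use of Python's `random` module are purely illustrative.

```python
import math
import random


def sample_on_sphere(n, T=1.0, rng=random):
    """Uniform point on the sphere of L2 radius T in R^n.

    Uses the rotation invariance of the standard Gaussian: normalising a
    Gaussian vector gives a uniformly distributed direction.
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [T * v / norm for v in g]


def sample_in_ball(n, T=1.0, rng=random):
    """Uniform point in the ball of L2 radius T in R^n.

    Volume scales like r^n, so the radius is T * U^(1/n) with U uniform on [0, 1).
    """
    direction = sample_on_sphere(n, 1.0, rng)
    radius = T * rng.random() ** (1.0 / n)
    return [radius * v for v in direction]
```

In high dimension the factor U^(1/n) is very close to 1, so almost all of the ball's mass sits near the sphere, which is one heuristic reason the two choices might behave similarly for large networks.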

Is initializing weights that way actually a thing people do?

Not sure (I would like to know). But what I had in mind was initialising a network with small weights, then doing a random walk (‘undirected SGD’), and then looking at the resulting distribution. Of course this will be more complicated than the distributions I use above, but I think the shape may depend quite a bit on the details of the SGD. For example, I suspect that the result of something like adaptive gradient descent may tend towards more spherical distributions, but I haven’t thought about this carefully.

If training large neural networks only moves the parameters a small distance (citation needed), do you still think there’s something interesting to say about the effect of training in this lens of looking at the density of nonlinearities?

I hope so! I would want to understand what norm the movements are ‘small’ in (L2, L$\infty$, …).

LayerNorm looks interesting, I’ll take a look.

DavidHolmes 10 Nov 2022 16:15 UTC
1 point
0
in reply to: redlizard’s comment on: Exams-Only Universities
Maths at my Dutch university also has homework for quite a few of the courses, which often counts for something like 10-20% of final grade. It can usually be submitted online, so you only need to be physically present for exams. However, there are a small number of courses that are exceptions to this, and actually require attendance to some extent (e.g. a course on how to give a scientific presentation, where a large part of the course consists of students giving and commenting on each other’s presentations—not so easy to replace the learning experience with a single exam at the end).

But this differs between Dutch universities.

DavidHolmes 27 Aug 2022 5:39 UTC
4 points
3
in reply to: Ramana Kumar’s comment on: Your posts should be on arXiv
I suspect the arXiv might not be keen on an account that posts papers by a range of people (not including the account-owner as coauthor). This might lead to heavier moderation/whatever. But I could be very wrong!

DavidHolmes 27 Aug 2022 5:34 UTC
6 points
5
on: Your posts should be on arXiv
Some advice for getting papers accepted on arXiv

As some other comments have pointed out, there is a certain amount of moderation on arXiv. This is a little opaque, so below is an attempt to summarise some things that are likely to make it easier to get your paper accepted. I’m sure the list is very incomplete!

In writing this I don’t want to give the impression that posting things to arXiv is hard; I currently have 28 papers there, have never had a single problem or delay with moderation, and the submission process generally takes me <15 mins these days.
1. Endorsement. When you first attempt to submit a paper you may need to be endorsed. JanBrauner kindly offered below to help people with endorsements; I might also be able to do the same, but I’ve never posted in the CS part of arXiv, so I’m not sure how effective this will be. However, it is even better to avoid the need for moderation. To this end, use an academic email address if you have one; this is quite likely to already be enough. Also, see below on subject classes (endorsement requirements depend on which subject class(es) you want to post in).
2. Choosing subject classes. Each paper gets one or more subject classes, like cs.AI; see [https://arxiv.org/category_taxonomy] for a list. Some subject classes attract more junk than others, and the ones that attract more junk are more heavily moderated. In mathematics, it is math.GM (General Mathematics) that attracts most junk, hence is most heavily moderated. I guess most people here are looking at cs.AI; I don’t know what this is like. But one easy thing is to minimise cross-listing (adding additional subject classes for your paper), as then you are moderated by all of them.
3. Write in (la)tex, submit the tex file. You don’t have to do this, but it is standard and preferred by the arXiv, and I suspect it makes it less likely that your paper gets flagged for moderation. It is also an easy way to make sure your paper looks like a serious academic paper.
4. It is possible to submit papers on behalf of third parties. I’ve never done this, and I suspect such papers will be more heavily moderated.
5. If you have multiple authors, it doesn’t really matter who submits. After the submission is posted you are sent a ‘paper password’ allowing coauthors to ‘claim’ the paper; it is then associated to their arXiv account, ORCID etc (ORCID is optional, but a really good idea, and free).
Finally, a request: please be nice to the moderators! They are generally unpaid volunteers doing a valuable service to the community (e.g. making sure I don’t have to read nonsense proofs of the Riemann hypothesis every morning). Of course it doesn’t feel good if your paper gets held up, but please try not to take it personally.

DavidHolmes 27 Aug 2022 5:12 UTC
3 points
0
in reply to: habryka’s comment on: Your posts should be on arXiv
The arXiv really prefers that you upload in tex. For the author this makes it less likely that your paper will be flagged for moderation etc (I guess). So if it were possible to export to TeX I think that for the purposes of uploading to arXiv this would be substantially better. Of course, I don’t know how much more/less work it is…

DavidHolmes 21 Aug 2022 19:07 UTC
1 point
0
in reply to: Charlie Steiner’s comment on: Bias towards simple functions; application to alignment?
Hi Charlie, If you can give a short (precise) description for an agent that does the task, then you have written a short programme that solves the task. I think then if you need more space to ‘explain what the agent would do’ then you are saying there also exists a less efficient/compact way to specify the solution. From this perspective I think the latter is then not so relevant. David

DavidHolmes 19 Aug 2022 19:29 UTC
5 points
0
on: AI Safety bounty for practical homomorphic encryption
1. I think that
provable guarantees on the safety of an FHE scheme that do not rely on open questions in complexity theory such as the difficulty of lattice problems.

is far out of reach at present (in particular to the extent that there does not exist a bounty which would affect people’s likelihood of working on it). It is hard to do much in crypto without assuming some kind of problem to be computationally difficult. And there are very few results proving that a given problem is computationally difficult in an absolute sense (rather than just ‘at least as hard as some other problem we believe to be hard’). Cf. P vs NP. Or perhaps I misunderstand your meaning; are you ok with assuming e.g. integer factorisation to be computationally hard?

Personally I also don’t think this is so important; if we could solve alignment modulo assuming e.g. integer factorisation (or some suitable lattice problem) is hard, then I think we should be very happy…
2. More generally, I’m a bit sceptical of the effectiveness of a bounty here because the commercial applications of FHE are already so great.
3. About 10 years ago, when I last talked to people in the area about this, I got the impression that FHE schemes were generally expected to be somewhat less secure than non-homomorphic schemes, just because the extra structure gives an attacker so much more to work with. But I have no idea if people still believe this.

DavidHolmes 19 Aug 2022 14:14 UTC
1 point
0
in reply to: evhub’s comment on: Bias towards simple functions; application to alignment?
P.S. The main thing I have taken so far from the link you posted is that the important part is not exactly about the biases of SGD. Rather, it is about the structure of the DNN itself; the algorithm used to find a (local) optimum plays less of a role than the overall structure. But probably I’m reading too much into your precise phrasing.