I might respond in more depth later, and I am sure other team members have opinions, but roughly:
React, our frontend framework, has chosen a kind of weird path where if you want to use the latest set of features in React 19, you basically have to use NextJS (more concretely, server functions and server components are two features that would be extremely hard to use without framework support, and NextJS is the only framework with support).
We’ve been using NextJS for all of the other web projects that we’ve been building (including AI 2027, the new MIRI book website, our internal Lighthaven booking infrastructure, our conference scheduling software Writehaven, and our internal company infrastructure), and it’s generally been a great experience in almost every respect (it’s been a less great experience for LessWrong, which isn’t surprising since it’s a much much bigger and more complicated codebase).
Jim also has some not-great experience working on some non-Lightcone projects.
AWS Beanstalk was a kind of terrible deployment/hosting service, or at least we never figured out how to use it properly. Our deploys would routinely take 20+ minutes, and then take another 20+ minutes to roll back, which means we had multiple instances of ~1 hour downtime that could have instead been a 5-minute downtime if deploys and rollbacks had been fast.
NextJS is a serverless framework. There are some developer experience benefits you get from restructuring things in a serverless way. The one I am most excited about is having preview deployments. PR review is much easier if every pull request just has a link to a deployed version of the app attached to it that you can click to visit, click around in to find any bugs or visual issues, and leave comments on directly.
There are some more reasons, but these are the big ones from my perspective.
Here are some other reasons, though I think they’re a bit less central than the ones in Habryka’s comment.
1. I think current AI systems find it much easier to help with NextJS web apps than they did our sui generis palimpsest of frameworks and approaches. It’s a bit unclear if this is on a trajectory to fix itself, but for now it seems like a relatively big difference. I think partly they’re just way more familiar with this newer stuff, and partly serverless stuff is a bit more architecturally suited to LLMs making narrow changes.
2. Another reason is that we had a lot of technical debt that we wanted to pay down. The project that became the hosting transfer was originally known as the “debungle”[1].
The codebase had a bunch of very particular ways of doing things. For example, you weren’t supposed to just write and export new React components, but call a registration function on them. And you weren’t supposed to write direct queries against our GraphQL server, but use a system of helpers.
I don’t think this stuff is necessarily bad. But because Lightcone is largely composed of generalists, onboarding costs are a bit higher. If you have a blessed way to make a query, and that blessed way is itself changing (as it needs to shift for performance or feature reasons), someone who is working on LessWrong one month in three is paying more cost for keeping up with the internal, undocumented framework magic.
There have been several times I’ve asked a distracted Habryka what the Standard Way to do something in our codebase is, implemented his quick answer, only to get a PR review from Robert asking why I’m doing stuff in a semi-deprecated way.
3. Habryka mentioned wanting to use newer React features. I think possibly a bigger issue was the transitive out-of-date dependencies you get if you stick on an old React. You can’t update Material UI, you can’t update some other library, some of them have security holes, so you vendor the old version and patch it by hand, … That stuff starts to grow as a maintenance and jank burden over time.
In general, I’m pro things being crufty and janky and not spending too much time “rewriting things to be nice”, and a lot of the stuff I listed above can be worked around. I think probably my list alone wouldn’t be worth the effort of the shift. To be clear, I’m unsure if the combination of my list, Habryka’s, the other arguments I’m aware of, and the expected strength of the arguments I’m not aware of, overall make this a worthwhile shift. I’m guessing yes, but it’s too soon to say.
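For readers unfamiliar with the pattern in point 2, here is a minimal sketch of what a registration-function convention can look like next to a plain module export. All names (`registerComponent`, `getComponent`, `PostTitle`) are hypothetical, and components are modeled as string-returning functions so the snippet runs without React. It illustrates the general idea only and is not the actual LessWrong code.

```typescript
// Hypothetical sketch: none of these names come from the LessWrong codebase.
type ComponentFn<P> = (props: P) => string;

// Central registry: components are looked up by string name, not imported.
const componentRegistry = new Map<string, ComponentFn<any>>();

function registerComponent<P>(componentName: string, component: ComponentFn<P>): void {
  if (componentRegistry.has(componentName)) {
    throw new Error(`Component "${componentName}" is already registered`);
  }
  componentRegistry.set(componentName, component);
}

function getComponent<P>(componentName: string): ComponentFn<P> {
  const component = componentRegistry.get(componentName);
  if (!component) {
    throw new Error(`Component "${componentName}" was never registered`);
  }
  return component as ComponentFn<P>;
}

// Registry convention: define the component, then register it under a name.
const PostTitle: ComponentFn<{ title: string }> = ({ title }) => `<h1>${title}</h1>`;
registerComponent("PostTitle", PostTitle);

// Plain convention: an ordinary function that other files import directly
// (in a real module this would be written as `export function ...`).
function PostTitlePlain({ title }: { title: string }): string {
  return `<h1>${title}</h1>`;
}

// Both routes produce the same output; they differ in how callers find the code.
const viaRegistry = getComponent<{ title: string }>("PostTitle")({ title: "Hello" });
const viaImport = PostTitlePlain({ title: "Hello" });
console.log(viaRegistry === viaImport);
```

The registry version has an extra step that tooling and newcomers have to know about, which is the kind of onboarding cost described above. The plain version is what most React tutorials assume.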
We had our eyes on a NextJS switch early on. But we thought it was valuable to do even without that.
My stance at the beginning was that the entire project was a mistake, and going through the process of actually doing it did not change my mind.
It’s true! May history judge who was right in the end.
My prediction is that a year from now Jim will still think it was a mistake and Habryka will still think it was a good call because they value different things.
Yeah, I’m also a bit puzzled. Most features of a forum like LW can be implemented as concatenating HTML strings on the server, which is a very simple mental model, and has plenty of simple implementations that can run on generic hosting. The DOM-based mental model of React/Next doesn’t seem to bring much benefit in this case, and carries a ton of overhead.
I am sure that mental model has nothing to do with why Jim thinks this is/was a bad idea. I think we are all really quite happy we are built on React (or something of that family). Gluing HTML strings together would be a crazy nightmare.
I see, yeah, then your team is a different culture than me. To me simple server-side rendering (well, not literally concatenating strings, but using templating and the like) is basically the only non-“crazy nightmare” way to build web stuff. While a lot of React stuff (like hooks, reducers, hydration) gives me crazy nightmare vibes. But since this isn’t a programming forum, maybe not much use arguing :-)
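To make the "templating rather than literal string concatenation" idea concrete, here is a minimal sketch of server-side HTML templating with automatic escaping, written as a TypeScript tagged template. The helper names (`html`, `SafeHtml`, `renderThread`, `ForumComment`) are made up for illustration and do not come from LW, GW, or any particular templating library.

```typescript
// Hypothetical sketch of server-side templating; not code from LW or GW.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Wrapper marking a string as already-escaped HTML, so it is not escaped twice.
class SafeHtml {
  readonly value: string;
  constructor(value: string) {
    this.value = value;
  }
}

// Converts an interpolated value to HTML: nested templates pass through,
// arrays are rendered element by element, everything else is escaped.
function renderValue(value: unknown): string {
  if (value instanceof SafeHtml) return value.value;
  if (Array.isArray(value)) return value.map((item) => renderValue(item)).join("");
  return escapeHtml(String(value));
}

// Tagged template: literal parts are trusted, interpolated values are escaped.
function html(parts: TemplateStringsArray, ...values: unknown[]): SafeHtml {
  let out = parts[0] ?? "";
  values.forEach((value, i) => {
    out += renderValue(value) + (parts[i + 1] ?? "");
  });
  return new SafeHtml(out);
}

interface ForumComment {
  author: string;
  body: string;
}

function renderComment(comment: ForumComment): SafeHtml {
  return html`<li><b>${comment.author}</b>: ${comment.body}</li>`;
}

function renderThread(comments: ForumComment[]): string {
  return html`<ul>${comments.map(renderComment)}</ul>`.value;
}

const threadHtml = renderThread([
  { author: "alice", body: "1 < 2" },
  { author: "<script>", body: "hi" },
]);
console.log(threadHtml);
```

The whole page is a pure function from data to a string, which is the simple mental model being described. What it does not give you is any client-side state or reactivity, which is the objection raised in the replies.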
Yeah, I care a lot about client-side reactivity, which I think you just can’t really achieve that way (unless you want to glue together javascript strings using templates, which I would not recommend).
I think people should just treat the web as an application platform. Doing a roundtrip for each piece of interactivity, or needing to pre-render each piece of interactivity is IMO really not viable at the complexity level of something like LW.
Yeah, this is maybe also about user taste. I use GW because it feels more like a website, while LW feels a bit too much like an application. There’s a certain “website UI feel” that’s distinct from “application UI feel” and makes me happier somehow. Though of course other people can feel differently.
(Also, to clarify, we were already on React—it’s mostly other bits of framework glue that got tossed out/replaced/etc.)
What do you think about the denial-of-wallet risk with this migration? From what I’ve read about Vercel on ServerlessHorrors (a partisan source) and in random internet comments, you can make costly mistakes with Vercel, but they’ll waive the charges.
LW has a continuous onslaught of crawlers that will consume near-infinite resources if allowed (more so than other sites, because of its deep archives), so we’ve already been through a bunch of iteration cycles on rate limits and firewall rules, and we kept our existing firewall (WAF) in place. When stuff does slip through, while it’s true that Vercel will autoscale more aggressively than our old setup, our old setup did also have autoscaling. It can’t scale to too large a multiple of our normal size before some parts of our setup that don’t autoscale (our Postgres DB) fall over and we get paged.
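As an illustration of the kind of per-client rate limit mentioned above, here is a minimal token-bucket sketch. The class and function names, the bucket size of 5, and the refill rate of 1 request per second are all assumptions chosen for the example; the actual LW rate-limit and WAF rules are not described in this thread.

```typescript
// Hypothetical sketch; names and numbers are illustrative, not LW's actual rules.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so a short burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request may proceed, false if it should be rejected
  // (for example with an HTTP 429 response).
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per client key (for example an IP address or crawler class).
const buckets = new Map<string, TokenBucket>();

function allowRequest(clientKey: string, nowMs: number): boolean {
  let bucket = buckets.get(clientKey);
  if (!bucket) {
    bucket = new TokenBucket(5, 1, nowMs); // assumed: burst of 5, then 1 req/s
    buckets.set(clientKey, bucket);
  }
  return bucket.tryConsume(nowMs);
}

// A crawler sending 6 requests in the same instant gets 5 through, then is cut off.
const burstResults = [0, 0, 0, 0, 0, 0].map((t) => allowRequest("crawler-1", t));
console.log(burstResults);
```

The time is passed in explicitly rather than read from the clock so the behavior is easy to test. A real deployment would more likely enforce this at the firewall or edge layer than in application code.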
Yeah, my model is if someone does this once they’ll waive the charges. We already had autoscaling in our previous hosting context, and under both the current setup and the previous setup people could DDoS us if they wanted to take us down. Within a week or so we could likely switch things around to be robust against most forms of DDoS (probably at some cost to user experience and development experience).
If someone does this a lot, we can just turn on billing limits, and then go down instead of going bankrupt, which is roughly the same situation we were in before.