Disclaimer 1: These views are my own and don’t necessarily reflect the views of anyone else (Eric, Steph, or Eliezer).
Disclaimer 2: Most of the events happened at least a year ago. My memory is not particularly great, so the dates are fuzzy and a few things might be slightly out of order. But this post has been reviewed by Eric, Steph, and Eliezer, so it should mostly be okay.
I’m going to list events chronologically. At times I’ll insert a “Reflection” paragraph, where I’m going to outline my thoughts as of now. I’ll talk about what I could have done differently and how I would approach a similar problem today.
Chapter 0: Eliezer pitches Arbital and I say ‘no’
Around the summer of 2014 Eliezer approached me with the idea for what later would become Arbital. At first, I vaguely understood the idea as some kind of software to map out knowledge. Maybe something like a giant mind map, but not graphical. I took some time to research existing and previous projects in that area and found a huge graveyard of projects that have been tried. Yes, basically all of them were dead. Most were hobby projects, but some seemed pretty serious. None were successful, as far as I could tell. I didn’t see how Eliezer’s project was different, so I passed on it.
Reflection: Today, I’d probably try to sit down with Eliezer for longer and really try to understand what he is seeing that I’m not. It’s likely back then I didn’t have the right skills to extract that information, but I think I’m much better at it today.
Reflection: Also, after working with Eliezer for a few years, I’ve got a better feeling for how things he says often seem confusing / out of alignment / tilted, until you finally wrap your mind around it, and then it’s crystal clear and easy.
Chapter 1: Eliezer and I start Arbital
Early January 2015 I was sitting in my room, tired from looking in vain for a decent startup idea, when Arbital popped back into my mind. There were still a lot of red flags around the idea, but I rationalized to myself that given Eliezer’s track record, there was probably something good here. And, in the worst case, I’d just create a tool that would be useful to Eliezer alone. That didn’t seem like a bad outcome, so I decided to do it. I contacted Eliezer, he was still interested, and so we started the project.
Reflection: The decision process sounds a bit silly, but I don’t think it’s a bad one. I really prefer to do something decently useful, rather than sit around waiting for something perfect. I also still approve of the heuristic of accepting quests / projects from people you think are good at coming up with quests / projects. But if I did it again, I’d definitely put a lot more effort upfront to understand the entire vision before committing to it.
Reflection: Paul Graham wrote in one of his essays that it’s okay (though not ideal) to initially build a product for just one user. There are, of course, several caveats. The user needs to use the product extensively, otherwise you don’t get the necessary feedback on all the features you’re building. And the user needs to be somewhat typical of other users you hope to attract to the platform.
Reflection: Unfortunately, both of these turned out to be false. I’ll elaborate on the feature usage below. But the “typical” part probably could have been foreseen. There are only a few people in the world who write explanations at the scale and complexity that Eliezer does. The closest cluster is probably people writing college textbooks. So, in the beginning, I didn’t have any sense for who the first 10-100 users were going to be. That would have been fine if I was just building a tool for Eliezer, but since my goal was explicitly to create a for-profit consumer startup, this was a big mistake.
Eliezer provided the product vision and design, and I did all the coding. At first, I thought I’d code for a few months and then we would have an MVP that we could show to a few people to gather more interest and get some potential users. But, as I began to understand the overall vision better myself, the shipping date began drifting further and further back. At the time this worried me greatly, because I didn’t want to build a thing that nobody else would use. Eliezer’s argument was that we needed to build a product that was the best tool for a particular workflow. (I’m the Startup Founder 1 in conversation 2.) This made sense to me, but I still felt anxious that we were flying blind. So around April, I went around and showed what I had to some people. There wasn’t much to look at, and what was there wasn’t pretty, so it was mostly me explaining the idea. The reception was lukewarm. People said it seemed interesting, but maybe not particularly for them. This was a bit discouraging, but it was also clear that people weren’t getting the full vision.
Reflection: Sigh, this is complicated. In general, I agree that if you are showing / talking about your product to potential users and they are not interested then either you’re talking to the wrong people, your product isn’t useful, or you’re presenting it wrong. In the case of Arbital, though, I think lack of enthusiasm was due to how hard it was to explain the entire vision. There were a lot of moving parts, and a lot of what made Arbital good eventually was the full combination of all those parts.
Reflection: I think the correct thing to do would have been to create detailed UI screens. Then print them and show them to people (and Eliezer). This probably would have taken a month or two, but it would have been worthwhile. The reason I never got around to it, aside from the ugh-field around doing UI mockups, was because it always felt like in a month or two we would be done with the MVP.
Reflection: Eliezer requested a lot of features, and most of them had good justifications for why the final product needed to have them. But, neither of us was very good at prioritizing. (I wouldn’t say we were bad, but we probably could have sped up the development by about 25% if we were better.) It was only around autumn when we finally got better at it.
Reflection: One such feature was a pretty nifty system for questions and answers. Of course, since nobody was using the platform, we didn’t really get any questions or answers, so it was hard to test that feature, and maintaining it felt pointless. Another feature: a private domain, where you could basically have your own private instance of Arbital at your_subdomain.arbital.com.
Around summer of 2015, I finally started to get a grasp for the entire vision. The grand plan had five major problems that needed to be solved: Explanations → Debate → Notifications → Rating → Karma. (Done roughly in that order, but also in parallel.)
Explanations: Arbital as a better Wikipedia. (1, 2) Each page would explain a specific concept (as opposed to Wikipedia pages that list a bunch of facts); the system would create a sequence of pages for you to read to understand a topic, where the sequence would be tailored specifically to you based on your preferences and what you already know.
Notifications: Make sure the user is notified about various events that might interest them (e.g. a new comment in a thread they are subscribed to, a new article to read). Also, if they are a writer, they need to be notified of various related events as well (e.g. someone commented, someone proposed an edit).
Rating: How will the system know which pages, explanations, or comments are good? How will the system be resistant to people trying to game it to make their pages, explanations, or comments appear better than they are? If we do this right, we could replace Yelp (or other services whose primary function is to provide ratings).
Karma: How will we rate users? How will their ratings affect what they can do? How do ratings interact between domains (e.g. math domain vs. art domain)?
Later that year Eliezer wrote a 55 page document describing Arbital and how and why it was different and necessary. (If Eliezer ever gets around to it, he might edit and publish it at some point. I’m mostly mentioning it here to underline the size and complexity of the project.)
Reflection: Once I understood how Arbital was different, it was clear that no previous (nor current) project has even come close to trying to capture that vision. Over the years I’ve had a lot of people send me messages that they or their friend were working on a similar project. And it’s true, for most people who give a cursory glance at Arbital, it seems similar to the other “organize all knowledge” projects. But I’ll still maintain that Arbital is a different kind of beast. And certainly in scope and ambition, I haven’t seen anything close.
Reflection: Now you can probably see how the meme of “Arbital will solve that too” was born. It was a hugely ambitious project for sure, but looking back the only problem with that was that for a while we just didn’t have a good, short explanation of what Arbital was. This made it hard to talk to people about the project and get them excited. It also made prioritizing features more difficult.
So, the first major problem we wanted to solve was Explanations. If we solved it well, it’s possible we could become the next Wikipedia (or at least a much better Quora). Our goal was for Arbital to be the best tool to write and organize online explanations. The primary topic we wanted to explain was, of course, AI safety. But we reasoned that if we just had AI safety content, especially if it was mostly written by Eliezer, the website wouldn’t become generally used and its content widely accepted. (And then we definitely wouldn’t become the next Wikipedia.) This is why later we focused mostly on math explanations.
At the end of 2015 we launched a private beta version for MIRI. A few weeks before, I sat down with a UX designer, Greg Schwartz. We spent a few sessions going over all the screens and redesigning them to be simpler and more understandable. He often pushed me to simplify the project and drop various features. I also had another friend look at UI and help with font and colors. This was definitely time well spent (only about a month), and we later got many compliments on the look and feel of the website.
Reflection: It occurs to me now that while Greg’s feedback had some specifics wrong, it was overall correct in that it was pointing out a deep problem: the project had too many moving parts and a lot of those parts weren’t really used. It would have been hard to guess which parts would end up necessary, but the right solution was to find more users who would want to use the platform now (or very soon) and talk to them.
I was excited about the launch, because I thought that finally some people aside from Eliezer would be using Arbital. Unfortunately, it was only many many months later that other people from MIRI slowly started using it.
Reflection: I think after we reached our “MVP”, I should have switched into “find users” mode. (Ideally, I would have had users lined up at the outset, but even this timing would have been okay.) For example, I could have pushed for Agent Foundations forum to be ported to Arbital. Even though that was more of a Discussion project, these were very reachable users, still within the overall strategy. I think we should have used a greedy user acquisition strategy, instead of trying to stick to our rigid sequential plan.
Reflection: I’d describe one of the main struggles of 2015 as: “we need to build a small MVP quickly and get feedback from users” (Alexei) vs. “users don’t know what they want, and they won’t be able to give you meaningful feedback until they see and use the product” (Eliezer). Like I mentioned above, I think the correct solution here is detailed mockups.
Reflection: Another struggle was: “we need users to make sure we are building things correctly” (Alexei) vs. “I can tell when we are building things correctly, I can get us users as soon as the product is ready” (Eliezer). Unfortunately, we never got the product “ready” enough to test Eliezer’s claim. I think it would have taken a long while to get there. But, given how things ended up, it’s possible that would have been a better path.
Chapter 2: Eric and Steph join Arbital, and we take destiny into our own hands
Around April of 2016 Eric Rogstad and Stephanie Zolayvar joined the team. We continued following Eliezer’s vision and having him dictate features and their design. Since focusing on AI alignment alone wouldn’t have resulted in a respected platform, we shifted our primary topic to math, specifically: intuitive math explanations.
Reflection: When we ran this idea by people, we got a lot of positive feedback. A lot of people said they wanted that kind of website, but it took me some time to realize that everyone wanted to read intuitive math explanations, but almost nobody would actually spend the time creating them, even if they could in principle.
We invited some people to write the content. We hosted a writing party. We had a Slack channel, where with Eric Bruylant’s help we built a small community. Some people wrote pretty good math explanations, but overall things moved way too slow. We talked to some of our users; we tried various things, like creating projects. But, we simply didn’t have enough writers, and we didn’t know how to find more.
Reflection: I think we should have dropped most of the development and focused on user acquisition at this point. There were several times when I considered pivoting to a “math blogging” platform, but it felt like too big of a shift from the wiki-focused plan we were pursuing. Again, I think a greedy “acquire users now!” strategy would have served us well.
One of the biggest features we built around this time was dynamic explanations. A lot of effort went into designing and implementing a system of requisites. Basically each page could teach and/or require certain requisites, which were other pages. It was not clear what overall ontology we wanted, so it took us a while to iterate this feature and we ended up with a lot of edge cases. We built something that worked okay, but, again, it was hard to test because there wasn’t quite enough dense content.
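To make the requisites idea concrete, here is a minimal sketch of how a reading sequence could be derived from pages that teach and require requisites. This is an illustration only, not Arbital's actual implementation: the `Page` record, the `reading_sequence` function, and the rule that the first page teaching a requisite wins are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical page record: what it requires and what it teaches."""
    title: str
    requires: set = field(default_factory=set)
    teaches: set = field(default_factory=set)


def reading_sequence(pages, goal, known):
    """Return page titles to read, in order, so that `goal` is learned.

    Depth-first walk: before a page is read, every requisite the reader
    does not already know is learned from some page that teaches it.
    """
    teacher = {}
    for page in pages:
        for req in page.teaches:
            teacher.setdefault(req, page)  # assumption: first teaching page wins

    known = set(known)      # copy so the caller's set is not modified
    sequence = []
    in_progress = set()     # requisites currently being resolved (cycle check)

    def learn(req):
        if req in known:
            return
        if req in in_progress:
            raise ValueError(f"circular requisites around {req!r}")
        if req not in teacher:
            raise ValueError(f"no page teaches {req!r}")
        in_progress.add(req)
        page = teacher[req]
        for needed in sorted(page.requires):
            learn(needed)
        in_progress.discard(req)
        sequence.append(page.title)
        known.update(page.teaches)

    learn(goal)
    return sequence


pages = [
    Page("Probability basics", teaches={"probability"}),
    Page("Conditional probability", requires={"probability"}, teaches={"conditional"}),
    Page("Bayes' rule", requires={"conditional", "probability"}, teaches={"bayes"}),
]
full_path = reading_sequence(pages, "bayes", known=set())
short_path = reading_sequence(pages, "bayes", known={"probability"})
```

Even this toy version hints at the edge cases mentioned above: several pages may teach the same requisite, requisites can form cycles, and a requisite may have no page at all.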
Reflection: I’d say we iterated that feature for way too long. In part this was because Eliezer was consistently not satisfied with what we implemented. At some point things became way too “hacky.” I think if we simply had more pages and more people constructing explanations, it would have helped us answer a lot of the internal debates we had. But instead we were trying to wrangle a set of about 30 pages to work in just the right way. We should have left the feature as good enough and moved on. (But really, we should have been getting more users.)
Not only was it hard to find writers, but the explanations were hard to write as well. In general, writing modular explanations is very hard. Doubly so, when you also want to string those explanations together to form a coherent sequence.
Reflection: we were also trying to build a two-sided marketplace. We needed writers, but writers wanted readers, but readers wanted good content. I think the correct way to solve that would have been to attract people with existing blogs / readership to switch to Arbital and bring their audience with them.
Reflection: Team-wise we absolutely needed someone who would be going after users all the time and talking to them, recruiting them, marketing, etc… Nobody on the team had experience with or affinity for doing that.
To help us showcase the platform, Eliezer wrote the Bayes’ Rule Guide. We went through several iterations of it over the course of a few months, tweaking features and improving retention metrics. The somewhat dense set of pages helped us test a few features more easily. Lots of people read the guide and loved it, but it wasn’t obvious whether the Arbital format helped or Eliezer’s writing was simply good. I think people also didn’t appreciate the magic that happened behind the scenes. (How do you communicate to a reader that they could have had a much worse reading experience but didn’t?)
Nate Soares helped us by writing the Logarithm Guide. We thought if we could produce good sequences like that frequently and post them online, we might slowly get traction. Unfortunately, it’s really time consuming to produce content that good, and there are just simply not that many people who can write content of that quality (and have the time to do it for free).
Here is what the front page looked like around that time. At the height of it, we had about a dozen regular users who would come and write a few pages every week. They enjoyed the small community we had and frequently hung out in our Slack channel. They wanted to write math explanations for themselves and their friends. I don’t remember how many readers we had, but it was around 50-200 / day, most of them redirected from Eliezer’s old guide to Bayes’ Theorem.
In August we raised a $300k pre-seed round. We had about 9 investors. Most of them invested because of Eliezer, but a few knew me personally as well.
Also around that time, it became clear to us that things just weren’t going well. The primary issue was that we completely relied on Eliezer to provide guidance on which features to implement and how to implement them. Frequently when we tried to do things our way, we were overruled. (Never without a decent reason, but I think in many of those cases either side had merit. Going with Eliezer’s point of view meant we frequently ended up blocked on him, because we couldn’t predict the next steps.) Also, since Eliezer was the only person seriously using the product, there wasn’t enough rapid feedback for many of the features. And since he wasn’t in the office with us every day, we were often blocked.
So, we decided to take the matter into our own hands. The three of us would decide what to do, and we would occasionally talk to Eliezer to get his input on specific things.
Reflection: Working with Eliezer was interesting to say the least. He certainly had a great overall vision for the product; one that I’m still astonished by to this day. He often had good insight into specific features and how to implement them. But sometimes he would get way too bogged down by certain details and spend longer on a feature than I thought was necessary. (In most of those cases he needed things to work a certain way to solve a particular problem he had, but it was wasting our time because we were building something ultra specific to his use case.) This was especially painful for features that nobody, including Eliezer, would end up using.
Reflection: Eliezer also had a tendency sometimes to overcomplicate things and design systems that I could barely wrap my head around. (I often joked that we would end up building a website that only one person in the world could use.) But then again, there were also many moments where a complicated, messy feature would suddenly click into place, and then it seemed obvious and simple.
Reflection: I’m tempted to draw a lesson like: never ever build a product you don’t understand yourself. But if I did that, I’d certainly miss a huge learning opportunity of working with Eliezer and leveling up my product and UX skills. So, instead, I think the lesson is: if you’re running the project, never ever do anything that doesn’t make sense to you. As soon as you start delegating / accepting things that don’t make sense, you muddy the water for yourself. Now the strategy has opaque components that you don’t understand, can’t explain, and sometimes actively disagree with. There is just no way you can move at the necessary startup speed like that, and you’re also not learning from your mistakes.
Reflection: This is especially true with respect to the overall strategy. Yes, maybe some paths are objectively better or easier. But if it’s not one that makes sense to you, if it’s not one you can execute, then you should take another path.
Reflection: Looking at arbital.com today, I’m actually still very much impressed with it. It’s a good piece of software, and if I wanted to write explanations, I think I’d be hard pressed to find a better website. Ironically, a large part of what makes it really good is all the features that it has.
Chapter 3: Pivot to discussion
It was clear that we couldn’t scale a community around math. So we decided to pivot. It wasn’t a clean and easy pivot; if I remember correctly it took us about a month of struggling and deliberating to decide that our current approach wasn’t working and then settle on a new one.
We decided to skip the Explanations part and go straight for the Discussion. We started building a new design around claims. A claim is a page with a proposition that users can vote on by assigning a probability estimate or by marking the degree of (dis)agreement. The idea was that people would blog on Arbital, and create pages for claims they discussed. People could vote on claims, and thus everyone could see where people mostly agreed and mostly disagreed. Claims could also be reused in other blog posts by the same author or other people.
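As a rough illustration of the claim concept, the sketch below models a claim as a proposition with one probability vote per user and a summary showing where voters land. It is not Arbital's actual data model: the `Claim` class, its method names, and the use of mean and standard deviation as the summary are assumptions chosen for the example.

```python
from statistics import mean, pstdev


class Claim:
    """Hypothetical claim page: a proposition plus per-user probability votes."""

    def __init__(self, proposition):
        self.proposition = proposition
        self.votes = {}  # user name -> probability in [0, 1]

    def vote(self, user, probability):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self.votes[user] = probability  # voting again overwrites the old vote

    def summary(self):
        """Return voter count, mean estimate, and spread; None if no votes."""
        if not self.votes:
            return None
        values = list(self.votes.values())
        return {
            "voters": len(values),
            "mean": mean(values),      # where the crowd sits on average
            "spread": pstdev(values),  # larger spread means more disagreement
        }


claim = Claim("Example proposition for illustration")
claim.vote("alice", 0.25)
claim.vote("bob", 0.75)
claim.vote("alice", 0.75)  # alice changes her mind
result = claim.summary()
```

Because a claim is its own object rather than text inside a post, the same instance can be referenced from many blog posts, which is the reuse described above.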
We kept most of the original architecture, but remade the homepage. We also shifted the focus to the regional rationalist community. We did multiple user interviews. We did UI mockups. We talked to some rationalist bloggers and got some mild support.
One of my favorite artifacts to come out from that time period is this SlateStarCodex predictions page.
Reflection: At the time, I think the pivot decision was correct. And if we continued going with it, it’s possible Arbital would have become LW 2.0, though that wasn’t exactly our intention at the time.
Reflection: One thing we messed up during this time was diluting leadership. Since Eliezer was no longer in charge of product, the responsibility fell on all of us. This resulted in many, many discussions about what to build and how to build it, down to the minute details. Our pace really slowed down and it took us a while to patch it up.
Chapter 4: End of Arbital 1.0
In the beginning of 2017 I experienced my first burnout. There was simply no way I could work, so I apologized to the team and spent a month playing video games, which I desperately needed. This gave me the time, space, and distance to think about the project. When I came back, I sat down with the team and we had an extensive discussion about the direction of the company.
Eric and Steph wanted to stay the course. I no longer believed that was going to work, and I wanted to try a different approach. My biggest realization during my break was that people (and in this case specifically: most rationalists) were not actually interested in putting any serious effort in improving the state of online debate. While almost everyone wanted better online discussions, just like with math explanations, almost nobody was willing to put in any kind of work.
Furthermore, when we talked to a few rationalists directly, I just didn’t get the feeling of genuine helpfulness or enthusiasm. This was upsetting, because there aren’t that many big projects that the community does. So when I was doing Arbital, I guess I expected that more people would be on board, that more people would put in a bit of an extra effort to help us. But at best people put in minimal work (to satisfy us or themselves, I’m not sure). However, there was a limit to how upset I could be, because I very clearly recognized the same trait in me. So, while it’s still a sad state of affairs, I’d be a hypocrite for being upset with any particular person.
Reflection: I think the rationality community can produce great insights, but mostly due to individual effort. There are great posts, but they rarely lead to prolonged conversations. And you very rarely see debates summarized for public consumption. (No wonder, it takes a lot of time and hard work!) There are a few counterexamples, but I think they prove the point by how much they stand out. (Best recent example I can think of is Jessica Taylor mediating a discussion between Eliezer and Paul Christiano and then writing it up.) (And, of course, not only do those things need to be written, but they also have to be read! And who has time to read…)
Reflection: I’m pleasantly surprised by the currently active LW 2.0. I think this is some evidence against my claim, but overall, I still think that when it comes to building out more detailed models with explicit claims, especially when it involves working with other people, most people are not willing to put in the extra work. (Especially if their name isn’t attached to it.)
It was clear to me how to address this issue. People are willing to do what they are already doing. In particular: blogging. It didn’t seem that hard to take the software we had and really optimize it to be a better blogging platform, at least for some audience (like math bloggers). And it seemed obvious to me that we would at least get some users that way. The key difference from our path at the time was that instead of solving the Discussion problem and trying to get people to do new things, we’d simply focus on building a better tool for a thing people already do. Then once we had people on our platform, we could help improve the ongoing discussions.
Reflection: This was me finally channeling the “greedy” user acquisition strategy.
At some point during the debate we considered trying both projects in parallel, but in the end, Eric and Steph decided to leave. I’d take Arbital in the new direction by myself. (Huge thank you to Anna Salamon for helping to mediate that discussion. I’d say it went pretty well, all things considered.)
Chapter 5: Arbital 2.0
I spent the rest of 2017 working on Arbital 2.0. At first it was going very well. The vision felt very clear to me, I had my mind totally wrapped around it, and all parts of the strategy made sense. But for some reason, around summer of 2017 it became really hard to work. I spent a lot of the time trying to code, but being unable to. Even though intellectually I believed in the idea very much, my spirit was burned out / my System 1 just didn’t believe in the project. After struggling with it on and off for the remaining half of the year, I finally had to admit to myself that it just didn’t have enough momentum to succeed.
(The rest of this chapter is a Reflection.)
My best guess is that I was burnt out again. Even though I didn’t feel as bad as I did in January, the feeling of being unable to even touch the laptop was very similar.
For those curious and for those looking for a startup idea, I’m going to describe my plan for Arbital 2.0. In short, it’s Tumblr for mathematicians. You could use it as a blog, but it’s really a social network. What makes it radically different is the ability for one person to create and own multiple topic-centered channels. (One big issue I see with FB is that it doesn’t scale well with the number of friends. With most friends I only want to talk about certain specific topics. But FB is broadcast-to-all by design.) On Arbital 2.0, I would be able to post about improv, Rick and Morty, AI, scuba-diving, and all my other interests to different channels. People could subscribe to the channels they were interested in. So if you never wanted to listen to politics, you wouldn’t follow people on their political channel. (Or hide all posts with #politics.) Each channel could be grown into a community, where other people could submit their posts too.
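The channel idea can be sketched in a few lines. The following is only a toy model under stated assumptions, not the Arbital 2.0 code: the `Network` class and its `post`, `subscribe`, and `feed` methods are hypothetical names, and a channel is identified simply by an (owner, channel name) pair.

```python
from collections import defaultdict


class Network:
    """Toy model: each user owns several topic channels; readers follow channels."""

    def __init__(self):
        self.posts = defaultdict(list)         # (owner, channel) -> list of posts
        self.subscriptions = defaultdict(set)  # reader -> set of (owner, channel)

    def post(self, owner, channel, text):
        self.posts[(owner, channel)].append(text)

    def subscribe(self, reader, owner, channel):
        self.subscriptions[reader].add((owner, channel))

    def feed(self, reader):
        """Return (owner, channel, text) only for channels the reader follows."""
        items = []
        for key in sorted(self.subscriptions.get(reader, set())):
            for text in self.posts.get(key, []):
                items.append((key[0], key[1], text))
        return items


net = Network()
net.post("alexei", "ai", "Thoughts on AI safety")
net.post("alexei", "politics", "Election post")
net.post("alexei", "improv", "Show tonight")
net.subscribe("reader", "alexei", "ai")
net.subscribe("reader", "alexei", "improv")
reader_feed = net.feed("reader")
```

The point of the design is that following a person is replaced by following one of their channels, so the reader above never sees the politics post.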
I still think this approach is very likely to work:
Write the software. (I have a 70% baked version.)
Go to mathematicians and offer to host their blog on Arbital. (Why mathematicians? There is basically no good blogging software with great support for LaTeX. That feature plus a few others will convince many people to switch or at least to try Arbital.) When I cold emailed 100+ math bloggers, I got a good number of pretty enthusiastic responses. The path from 0 to 1000 users seems straightforward.
Most math bloggers also blog about other things. This naturally will lead them to use the channels feature.
Many people have topics that they can’t discuss on Facebook because they don’t want to spam their friends. (I’d make at least one Rick and Morty post a day if I could.) Arbital would be the perfect outlet for those. (I’d assign 20% probability that it’s possible to skip and succeed with this step directly without bothering with recruiting math bloggers first.)
Then follow the original Arbital plan: Social Network → Explanations → Debate → Notifications → Rating → Karma.
And, of course, put the entire thing on a blockchain to make crazy money in the meantime. ;)
It’s pretty clear to me that for Arbital to work at scale it has to be a social network. Part of why I don’t think most other paths will work is that social media ate all the free time. It’s not that people became lazy, it’s just when it’s a choice between spending another 15 mins on FB or spending that 15 mins creating and linking claims to your new blog post, most people will choose FB. (And while FB is the best example, the problem is more widespread, of course. Everything became more addictive.) This is why the new approach was to create a social network that’s better than FB and would allow you to manage your time and attention. And then from there we could actually put that saved time and attention to more useful things. (Although, I’m still sceptical that there are enough people who will have constructive debates to warrant all this effort.)
One reason I’m not pursuing this right now (aside from being burnt out with the whole enterprise) is that it no longer obviously helps with AI safety. If you recall, one of the assumptions was that if we did Arbital specifically for AI safety, the website wouldn’t get enough credibility. With some recent developments, I think that’s no longer the case. AI safety in general is more accepted, as well as MIRI’s research in particular. So, if I did any work in this space again, I’d, ironically enough, go back to the original vision of creating an explanation website for Eliezer and other people who wanted to write about AI safety. (But, actually, I think the bottleneck is good content and people actually reading it.)
What’s going to happen with Arbital?
I’m currently in the process of shutting down the company. All the software and IP is going to be turned over to MIRI. A few people expressed interest in having Arbital be open sourced, including one of our investors, so that’s likely to happen.
Arbital tech stack
Arbital 1.0: Golang on BE, Angular 1.0 on FE, MySQL DB, markdown editor.
Arbital 2.0: NodeJS on BE, React on FE, ArangoDB, SlateJS editor. (Much much better choices all around!)
How much do you think technical skill mattered on the margin?
A lot. This was a pretty complex project, so managing code complexity was important. We also needed to continuously optimize things to make sure everything loaded decently fast. We had to make sure there weren’t many bugs, because most things were user-facing. And being able to code decently fast helped a lot, since the number of features we had to implement was fairly large.
I hated working with our lawyers. This was maybe the most frustrating part of the entire project.
Lesson 1: work only with a team who is recommended to you personally by someone you trust and who has worked with them before.
Lesson 2: Ask how much something will take upfront. If the lawyer wants to spend more than an hour on anything, have them double check that you want it done. Have them send you a weekly report of time spent.
Lesson 3: Consider just not going with lawyers and using standard paperwork. Before Series A it just doesn’t matter and you can restructure things any way you want later.
Being a single founder is not great, but there are actual reasons for it that in principle could be mitigated.
It’s unlikely you have all the skills. (Note that the situation is not that different if you have a co-founder, but they are very similar to you.) More important than the skills though, is your personality / inclination. Personally, I’d rather code than talk to users. So an ideal co-founder for me would be someone outgoing, who’d rather talk to users than do other things.
Not having someone to talk to day-to-day means you might end up with tunnel vision / stuck / doing unimportant things / forgetting to take a step back / making basic mistakes. Having someone to talk to on a frequent basis is important.
It feels bad when you don’t work / are stuck and the project doesn’t move forward. When you’re working with someone else, they usually make progress even when you don’t.
It’s important that in each area there is a single person who is ultimately responsible for it. Product: one person. Business: one person. And, overall, for the company: one person. Assigning a responsibility to more than one person will significantly increase the communication overhead and slow things down.
Again, a heartfelt thank you to everyone who has participated in this adventure. It has been quite a journey and, at the end of the day, I wouldn’t take any of it back.