GraphQL tutorial for LessWrong and Effective Altruism Forum

This post is a tutorial on using GraphQL to query for information about LessWrong and the Effective Altruism Forum. It’s mostly intended for people who have wanted to explore LW/EA Forum data but have found GraphQL intimidating (this was the case for me until several weeks ago).

General steps for writing a query

(This section will make more sense if you have seen some example queries; see the next section.)

For the queries that I know how to write, here is the general outline of steps:

  1. Go to https://www.lesswrong.com/graphiql or https://forum.effectivealtruism.org/graphiql depending on which forum you want to query data for.

  2. Figure out what the output type should be (e.g. comments, comment, posts, post).

  3. Type {output_type(input)} into GraphiQL and hover over input.

    Hovering shows a tooltip giving the input type: SingleCommentInput for the comment output type and MultiCommentInput for the comments output type.

  4. Click on the type that appears after input (e.g. MultiCommentInput, SingleCommentInput). A column on the right should appear (if it was not there already). Depending on the fields listed in that column, there are two ways to proceed. (Generally, it seems that singular output types (e.g. comment) have selector and plural output types (e.g. comments) have terms.)

    For the comment output type, clicking on SingleCommentInput shows selector in the documentation (rightmost) column. For the comments output type, clicking on MultiCommentInput shows terms there instead.

    If the fields listed include selector (e.g. for comment):

    • Click on the selector type (e.g. CommentSelectorUniqueInput). Use one of the fields (e.g. _id) to pick out the specific item you want.

    If the fields listed include terms (e.g. for comments):

    • Go to the collections directory in the LessWrong 2.0 codebase, and find the views.js file for your output type. For example, if your output type is comments, then the corresponding views.js file is located at collections/comments/views.js.

    • Look through the various “views” in the views.js file to see if there is a relevant view. (There is also a default view if you don’t select any view.) The main things to pay attention to are the selector block (which controls how the results will be filtered) and the options block (which mainly controls how the results are sorted).

    • Pass in parameters for that view using keys in the terms block.

  5. Start a results block, and select the fields you want to see for this output type. (If you don’t select any fields, it will default to all fields, so you can do that once and delete the fields you don’t need.)
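
The steps above can be sketched as a small helper that assembles the query text. This is a minimal sketch under stated assumptions, not code from the forum codebase: the function names (format_args, build_multi_query, build_single_query) are hypothetical, the "abc" id is a placeholder, and the use of a singular result block for singular output types is an assumption that should be checked against the GraphiQL documentation column.

```python
import json


def format_args(args):
    """Render a flat dict as GraphQL input fields.

    Keys are left unquoted and values are written as JSON literals, which
    matches GraphQL syntax for plain strings, numbers, booleans and null.
    Nested dicts are not supported by this sketch.
    """
    return ", ".join(f"{key}: {json.dumps(value)}" for key, value in args.items())


def build_multi_query(output_type, terms, fields):
    """Plural output type (e.g. "comments"): filter with a terms block."""
    return (
        "{ " + output_type + "(input: {terms: {" + format_args(terms) + "}}) "
        + "{ results { " + " ".join(fields) + " } } }"
    )


def build_single_query(output_type, selector, fields):
    """Singular output type (e.g. "comment"): pick one item with a selector.

    Assumption: the singular form wraps its fields in `result`, not `results`.
    """
    return (
        "{ " + output_type + "(input: {selector: {" + format_args(selector) + "}}) "
        + "{ result { " + " ".join(fields) + " } } }"
    )


# View name and limit as used for the comments example in this post.
print(build_multi_query("comments", {"view": "recentComments", "limit": 10}, ["_id", "postId"]))
# "abc" is a placeholder _id, not a real comment.
print(build_single_query("comment", {"_id": "abc"}, ["_id", "pageUrl"]))
```

The printed text can be pasted into GraphiQL as-is. Nested selections such as user { slug } can be passed as a single string in the fields list.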

Examples

I’ve built a sample interface (a “reader”) for both LessWrong and EA Forum that makes it easy to see the queries used to generate pages.

By passing format=queries in the URL to any page, you can view the GraphQL queries that were made to generate that page. Rather than showing many examples in this post, I will show just one and let you explore the reader.
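
As a small illustration of the format=queries parameter, the sketch below adds it to a reader URL while keeping the other parameters. The helper name with_queries_format is hypothetical, and the only thing taken from this post is the parameter itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_queries_format(page_url):
    """Return page_url with format=queries set, keeping other parameters."""
    parts = urlsplit(page_url)
    # keep_blank_values preserves empty parameters such as "before=".
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "format"
    ]
    params.append(("format", "queries"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_queries_format("https://eaforum.issarice.com/?view=top"))
# prints https://eaforum.issarice.com/?view=top&format=queries
```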

As an example, consider the page https://eaforum.issarice.com/?view=top. Clicking on “Queries” at the top of the page takes you to the page https://eaforum.issarice.com/?view=top&offset=0&before=&after=&format=queries, where you will see the following:

    {
      posts(input: {
        terms: {
          view: "top"
          limit: 50
          meta: null  # this seems to get both meta and non-meta posts
        }
      }) {
        results {
          _id
          title
          slug
          pageUrl
          postedAt
          baseScore
          voteCount
          commentsCount
          meta
          question
          url
          user {
            username
            slug
          }
        }
      }
    }

Run this query



    {
      comments(input: {
        terms: {
          view: "recentComments"
          limit: 10
        }
      }) {
        results {
          _id
          post {
            _id
            title
            slug
          }
          user {
            _id
            slug
          }
          plaintextExcerpt
          htmlHighlight
          postId
          pageUrl
        }
      }
    }

Run this query

Clicking on “Run this query” (not linked in this tutorial, but linked on the actual page) below each query will take you to the GraphiQL page with the query preloaded. There, you can click on the “Execute Query” button (which looks like a play button) to actually run the query and see the result.
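
If you later work with a result outside GraphiQL, note that a GraphQL response is JSON that mirrors the shape of the query under a top-level data key, with any problems reported under errors. Here is a minimal sketch of pulling the results list out of such a response. The sample response is entirely made up (placeholder ids, titles, scores and usernames), and the helper names are hypothetical; only the nesting follows from the posts query above.

```python
import json

# Made-up response text for illustration only. The nesting
# (data -> posts -> results) mirrors the posts query; the values are not real.
SAMPLE_RESPONSE = """
{"data": {"posts": {"results": [
  {"_id": "id1", "title": "First placeholder post", "baseScore": 12,
   "user": {"username": "alice"}},
  {"_id": "id2", "title": "Second placeholder post", "baseScore": 30,
   "user": null}
]}}}
"""


def extract_results(response, output_type):
    """Return the results list for a plural output type, or raise on errors."""
    if response.get("errors"):
        raise RuntimeError(f"GraphQL errors: {response['errors']}")
    return response["data"][output_type]["results"]


def summarize_posts(posts):
    """Return (baseScore, title, username) rows, highest score first."""
    rows = []
    for post in posts:
        user = post.get("user") or {}  # tolerate a missing or null user
        rows.append((post["baseScore"], post["title"], user.get("username", "(unknown)")))
    return sorted(rows, reverse=True)


posts = extract_results(json.loads(SAMPLE_RESPONSE), "posts")
for score, title, username in summarize_posts(posts):
    print(score, title, username)
```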

I should note that my reader implementation is optimized for my own (probably unusual) consumption and learning. For article-reading and commenting purposes (i.e. not for learning how to use GraphQL), most users will probably prefer to use the official versions of the forums or the GreaterWrong counterparts.

Tips

  • In GraphiQL, hovering over some words like input and results and then clicking on the resulting tooltip will show the parameters that can be passed to that block.

  • Forum search is not done via GraphQL. Rather, a separate API (the Algolia search API) is used. Use of the search API is outside the scope of this tutorial. This is also why the search results page on my reader has no “Queries” link (for now).

  • For queries that use a terms block: even though a “view” is just a shorthand for a selector/options pair, it is not possible to pass in arbitrary selector/options pairs (due to the way security is handled by Vulcan). If you don’t use a view, the default view is selected. The main consequence of this is that you won’t be able to make some queries that you might want to make.

  • Some queries are hard/impossible to do. Examples: (1) getting comments of a user by placing conditions on the parent comment or post (e.g. finding all comments by user 1 where they are replying to user 2); (2) querying and sorting posts by a function of arbitrary fields (e.g. as a function of baseScore and voteCount); (3) finding the highest-karma users looking only at the past days of activity.

  • GraphQL vs GraphiQL: /graphiql seems to be the endpoint for the interactive explorer for GraphQL, whereas /graphql is the endpoint for the actual API. So when you are actually querying the API (via a program you write) I think you want to be using https://www.lesswrong.com/graphql and https://forum.effectivealtruism.org/graphql (or at least, that is what I am doing and it works).
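
To make the last tip concrete, here is a minimal sketch of querying the /graphql endpoint from a program, using only the Python standard library. It assumes the endpoint accepts the common GraphQL-over-HTTP convention of a POST with a JSON body containing a query key. The function names and the User-Agent string are hypothetical, and whether a custom User-Agent is needed at all is an assumption.

```python
import json
import urllib.request

GRAPHQL_URL = "https://www.lesswrong.com/graphql"

QUERY = """
{
  comments(input: {terms: {view: "recentComments", limit: 10}}) {
    results {
      _id
      postId
      pageUrl
    }
  }
}
"""


def make_request(query, url=GRAPHQL_URL):
    """Build (but do not send) a POST request carrying the query as JSON."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: a descriptive User-Agent may be needed if the
            # server rejects urllib's default one.
            "User-Agent": "graphql-tutorial-example",
        },
        method="POST",
    )


def run_query(query, url=GRAPHQL_URL):
    """Send the query and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(make_request(query, url), timeout=30) as response:
        return json.load(response)
```

Calling run_query(QUERY) performs the network request. For the EA Forum, pass url="https://forum.effectivealtruism.org/graphql" instead.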

Acknowledgments

Thanks to:

  • Louis Francini for helping me with some GraphQL queries and for feedback on the post and the reader.

  • Oliver Habryka for answering some questions I had about GraphQL.

  • Vipul Naik for funding my work on this post and some of my work on the reader.