I’m curious how well a model finetuned on the Alignment Newsletter performs at summarizing new content (probably blog posts; I’d assume papers are too long and rely too much on figures). My guess is that it doesn’t work very well even for blog posts, which is why I haven’t tried it yet, but I’d still be interested in the results and would love it on the off chance that it actually was good enough to save me some time.
We could definitely look into making the project evolve in this direction. In fact, we’re building a dataset of alignment-related texts and a small part of the dataset includes a scrape of arXiv papers extracted from the Alignment Newsletter. We’re working towards building GPT models fine-tuned on the texts.
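To make the idea concrete, here is a minimal sketch of how such a fine-tuning dataset might be assembled: pairing each source text with its Alignment Newsletter summary as prompt/completion records. The function name, the `TL;DR:` separator, and the data layout are all illustrative assumptions, not what the project actually uses.

```python
import json

def make_finetune_examples(pairs):
    """Turn (source_text, newsletter_summary) pairs into
    prompt/completion records in the JSONL style commonly used
    for GPT fine-tuning. Field names here are assumptions."""
    examples = []
    for source_text, summary in pairs:
        examples.append({
            # A fixed separator tells the model where the summary starts.
            "prompt": source_text.strip() + "\n\nTL;DR:",
            # Leading space so the completion tokenizes cleanly after ":".
            "completion": " " + summary.strip(),
        })
    return examples

pairs = [("Some alignment blog post text...", "A one-paragraph summary.")]
jsonl_lines = [json.dumps(ex) for ex in make_finetune_examples(pairs)]
```

Each line of `jsonl_lines` would then be one training example for the fine-tune.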
Ya, I was even planning on trying:
Then feed that input to.
to see if that produces higher-quality summaries.
Well, one “correct” generalization there is to produce much longer summaries, which is not actually what we want.
(My actual prediction is that changing the karma makes very little difference to the summary that comes out.)
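For what it's worth, the karma-conditioning idea presumably amounts to something like prepending the score to the prompt so the model can (in principle) associate higher karma with better summaries. A minimal sketch, with a hypothetical prompt format that is my assumption rather than anything the thread specifies:

```python
def karma_prompt(post_text, karma):
    """Build a karma-conditioned summarization prompt.
    The exact format ("Karma: N" header, "Summary:" suffix)
    is a hypothetical choice for illustration."""
    return f"Karma: {karma}\n\n{post_text.strip()}\n\nSummary:"

low = karma_prompt("Some alignment blog post...", 5)
high = karma_prompt("Some alignment blog post...", 200)
```

The prediction above is then the claim that swapping `5` for `200` in this header barely changes the completion the model produces.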