[Epistemic Status: This is an artifact of my self study. I am using it to remember links and help manage my focus. As such, I don’t expect anyone to fully read it. If you have particular interest or expertise, skip to the relevant sections, and please leave a comment, even just to say “good work/good luck”. I’m hoping for a feeling of accountability and would like input from peers and mentors. This may also help to serve as a guide for others who wish to study in a similar way to me. ]

Review of 3rd Sprint

My goals for this sprint were:

SSJ--1 -- Write
- AIA Terminology Lit Review
- Math in my AI Alignment Goals
SSJ--2 -- Read
- Shallow Review of Technical AI Safety 2024
SSJ--3 -- Math
- Do some Linear Algebra reading and practice.
SSJ--4 -- Experimentation
- Go through Transformers From Scratch.
SSJ--5 -- Tooling
- Do an informal literature review on MI Tooling and Data Visualization for High Dimensional Data.
- Places to start for MI Tooling:
  - The Interpretability Toolkit
  - TransformerLens & Callum McDougall’s guide for it.
  - Nostalgebraist’s transformer-utils library
  - Google PAIR’s Learning Interpretability Tool (LIT)
  - Google PAIR’s What-If Tool
  - Jesse Vig’s BERTViz
  - LOOM
  - CircuitVis
SSJ--6 -- Social
- Email math profs after I finish writing “Math in my AI Alignment Goals”.
- Consider other places to find potential mentors.
  - Consider what kinds of feedback I am looking for.
  - Should I reach out to specific people for general advice on my SSJ or only once I have specific questions for them and their work?
  - Make some posts in various forums asking for people willing to review and comment on my SSJ.
- I am also networking with the goal of finding future funding or paid roles. Consider strategies for that.

So how did I do?

Daily Worklog

Date	Progress
We, July 23	No progress. I did talk on the PauseAI discord a bit and think about OIS, but it’s not really actively relevant. Alas.
Th, July 24 - We, Aug 13	No progress with SSJ goals while studying for my final exam, but I did write two LW articles while procrastinating. I also went camping, so no progress there either. But now I’m done with school and ready to fully focus on this : )
Th, Aug 14	Spent about 5 hours thinking about the social media concept I’m now calling “maat” and wrote Tristan’s Projects which I plan to keep as an updated index of the projects I’m interested in developing or contributing to.
Fr, Aug 15	Spent 3 hours compiling an overview of my ndsp project.
Mo, Aug 18	Spent 5h 36m applying to SPAR. There are a lot of cool researchers with a lot of cool projects there. I could have spent a lot more time reading about their work and answering their questions, but I want to move on.
Tu, Aug 19	Spent an hour or two moving my maat ideas closer to a post. Also spent some time reading and considering “Agent foundations: not really math, not really science”… I’ll probably write a post with some of my thoughts.
Aug - Oct	For the rest of August through September and October I got various combinations of too busy, distracted, and depressed to continue proper documentation. Within that time I: applied to several fellowships; attended an EA Summit in Vancouver; continued discussing and considering AIA; and recovered some standards of self-care lost during my BSc. I continue to find it surprising how easy it is to lose large amounts of time and how difficult it is to treat a self-managed project with the same seriousness as a full-time job.

Sprint Summary

Overview

During the past months I neglected all of my chosen goals in favour of ad-hoc fellowship applications. This suggests both that my goals are not in alignment with what I think is really important, and that I have difficulty prioritizing overhead organization and documentation. I think both are valuable, so I would like to improve and get back to doing this journal work.

SSJ--1 -- Write

I still like the idea of doing an AIA Terminology Review. I am less convinced of the value of discussing Math in my AIA goals specifically. Rather, multiple people have suggested I should focus on writing and communicating about my “Theory of Change”, focusing on both the importance of my (potential) role and the importance of my agendas. I would also like to try putting together a better article to briefly explain my OIS concept based on various conversations and thoughts, and finally write the MAAT article I was planning to write.

SSJ--2 -- Read

I started skimming the Shallow Review of Technical AI Safety 2024 but didn’t really engage and then got busy. I think it may be a good resource to include in my AIA Terminology Review.

SSJ--3 -- Math

I didn’t actually practice any linear algebra. It’s much harder to do so when I’m not handing it in to a professor for marks. I did, however, pick up my copy of C. Kosniowski’s “algebraic topology”, which I find engaging. The concepts may or may not have value for my thinking on semantic spaces. I think I still have a shallow understanding of topology, which I would like to deepen towards a solid understanding of manifolds.

SSJ--4 -- Go through Transformers From Scratch.

This still seems high value. I still haven’t started.

SSJ--5 -- Literature review on MI Tooling and Etc...

I didn’t focus on this. I’m torn between jumping in to begin implementation work and doing a review first, which is probably quite valuable.

SSJ--6 -- Social

The goals in this section may have been of higher benefit than the various fellowship applications I worked on instead. Additionally, I often got an email about a fellowship deadline, worked on an application before the deadline, submitted it, and moved on to something else. This approach seems somewhat aimless and disorganized. I would like to keep better track of which fellowships I have targeted and with what level of effort, and to have a better sense of what fellowships are out there and what other options I can consider.

Goals for 4th Sprint

The Goals:

SSJ--1 -- Write
- Make an article or doc to contain and organize articles I would like to write.
- Theory of Change
- OIS explainer
- MAAT
- AIA Terminology Review
SSJ--2 -- Read
- Search and read various articles for AIA Terminology Review.
- Spend some time reading and commenting on one random LW article 4 days / week.
SSJ--3 -- Math
- Low priority: Continue reading C. Kosniowski’s “algebraic topology”
SSJ--4 -- Experimentation (copied from last sprint)
- Go through Transformers From Scratch.
  - SERIOUSLY! Clock in some time on this!
SSJ--5 -- Tooling (copied from last sprint)
- Do an informal literature review on MI Tooling and Data Visualization for High Dimensional Data.
- Places to start for MI Tooling:
  - The Interpretability Toolkit
  - TransformerLens & Callum McDougall’s guide for it.
  - Nostalgebraist’s transformer-utils library
  - Google PAIR’s Learning Interpretability Tool (LIT)
  - Google PAIR’s What-If Tool
  - Jesse Vig’s BERTViz
  - LOOM
  - CircuitVis
SSJ--6 -- Social
- Develop my networking plan
  - Create a list of people I respect who may be worth reaching out to for mentorship or networking.
  - Research and reach out to people where possible and pragmatic.
  - Clarify the problems I am interested in focusing on and the capacity in which I am interested in focusing on them. (High overlap with SSJ--1 “Theory of Change”)

Well. The last sprint went off the rails. Somewhat disappointing, but I am wishing myself luck getting back on track.

List of common acronyms:

Mechanistic Interpretability (MI)
AI Alignment (AIA)
Outcome Influencing System (OIS)
n-Dimensional Interactive Scatter Plot (NDISP)
Machine Learning (ML)
Large Language Model (LLM)

TT Self Study Journal # 4