TT Self Study Journal # 4
[Epistemic Status: This is an artifact of my self study. I am using it to remember links and help manage my focus. As such, I don’t expect anyone to fully read it. If you have particular interest or expertise, skip to the relevant sections, and please leave a comment, even just to say “good work/good luck”. I’m hoping for a feeling of accountability and would like input from peers and mentors. This may also help to serve as a guide for others who wish to study in a similar way to me. ]
Review of 3rd Sprint
My goals for this sprint were:
SSJ--1 -- Write
AIA Terminology Lit Review
Math in my AI Alignment Goals
SSJ--2 -- Read
SSJ--3 -- Math
Do some Linear Algebra reading and practice.
SSJ--4 -- Experimentation
Go through Transformers From Scratch.
SSJ--5 -- Tooling
Do an informal literature review on MI Tooling and Data Visualization for High Dimensional Data.
Places to start for MI Tooling:
TransformerLens & Callum McDougall’s guide for it.
Nostalgebraist’s transformer-utils library
Google PAIR’s Learning Interpretability Tool (LIT)
Google PAIR’s What-If Tool
Jesse Vig’s BERTViz
SSJ--6 -- Social
Email math profs after finished writing “Math in my AI Alignment Goals”.
Consider other places to find potential mentors.
Consider what kinds of feedback I am looking for.
Should I reach out to specific people for general advice on my SSJ or only once I have specific questions for them and their work?
Make some posts in various forums asking for people willing to review and comment on my SSJ.
I am also networking with the goal of finding future funding or paid roles. Consider strategies for that.
So how did I do?
Daily Worklog
| Date | Progress |
| --- | --- |
| We, July 23 | No progress. I did talk on the PauseAI discord a bit and think about OIS, but it’s not really actively relevant. Alas. |
| Th, July 24 - We, Aug 13 | No progress with SSJ goals while studying for final exam, but I did write two LW articles while procrastinating. Also went camping so no progress. But now I’m done with school and ready to fully focus on this : ) |
| Th, Aug 14 | Spent about 5 hours thinking about the social media concept I’m now calling “maat” and wrote Tristan’s Projects which I plan to keep as an updated index of the projects I’m interested in developing or contributing to. |
| Fr, Aug 15 | Spent 3 hours compiling an overview of my ndsp project. |
| Mo, Aug 18 | Spent 5h 36m applying to SPAR. There are a lot of cool researchers with a lot of cool projects there. I could have spent a lot more time reading about their work and answering their questions, but I want to move on. |
| Tu, Aug 19 | Spent an hour or two moving my maat ideas closer to a post. Also spent some time reading and considering “Agent foundations: not really math, not really science”… I’ll probably write a post with some of my thoughts. |
| Aug—Oct | For the rest of August through September and October I got various combinations of too busy, distracted, and depressed to continue proper documentation. Within that time I:
I continue to find it surprising how easy it is to lose large amounts of time and how difficult it is to treat a self-managed project with the same seriousness as a full-time job. |
Sprint Summary
Overview
During the past months I neglected all of my chosen goals in favour of ad-hoc applications to fellowships. This suggests both that my goals are not aligned with what I think is really important and that I have difficulty prioritizing overhead organization and documentation. I think both are valuable, so I would like to improve and get back to doing this journal work.
SSJ--1 -- Write
I still like the idea of doing an AIA Terminology Review. I am less convinced of the value of discussing Math in my AIA goals specifically. Instead, multiple people have suggested I should focus on writing and communicating about my “Theory of Change”, covering both the importance of my (potential) role and the importance of my agendas. I would also like to try putting together a better article to briefly explain my OIS concept, based on various conversations and thoughts, and finally write the MAAT article I was planning to write.
SSJ--2 -- Read
I started skimming this but didn’t really engage and then got busy. I think it may be a good resource to be included in my AIA Terminology Review.
SSJ--3 -- Math
I didn’t actually practice any linear algebra. It’s much harder to do so when I’m not handing it in to a professor for marks. I did, however, pick up my copy of C.Kosniowski’s “algebraic topology”, which I find engaging. The concepts may or may not have value for my thinking on semantic spaces. I think I still have a shallow understanding of topology, which I would like to deepen towards a solid understanding of manifolds.
SSJ--4 -- Go through Transformers From Scratch.
This still seems high value. I still haven’t started.
SSJ--5 -- Literature review on MI Tooling and Etc...
I didn’t focus on this. I’m torn between jumping in to begin implementation work and doing a review beforehand, which is probably quite valuable.
SSJ--6 -- Social
The focuses in this section may have been of higher benefit than applying to the various fellowships I have applied to. Additionally, I often got an email about a fellowship deadline, worked on an application before the deadline, submitted it, and then moved on to something else. This approach seems somewhat aimless and disorganized. I would like to keep better track of which fellowships I have targeted and with what level of effort, and to have a better sense of what fellowships are out there and what other options I can consider.
Goals for 4th Sprint
The Goals:
SSJ--1 -- Write
Make an article or doc to contain and organize articles I would like to write.
Theory of Change
OIS explainer
MAAT
AIA Terminology Review
SSJ--2 -- Read
Search and read various articles for AIA Terminology Review.
Spend some time reading and commenting on one random LW article 4 days / week.
SSJ--3 -- Math
Low priority: Continue reading C.Kosniowski’s “algebraic topology”
SSJ--4 -- Experimentation (copied from last sprint)
Go through Transformers From Scratch.
SERIOUSLY! Clock in some time on this!
SSJ--5 -- Tooling (copied from last sprint)
Do an informal literature review on MI Tooling and Data Visualization for High Dimensional Data.
Places to start for MI Tooling:
TransformerLens & Callum McDougall’s guide for it.
Nostalgebraist’s transformer-utils library
Google PAIR’s Learning Interpretability Tool (LIT)
Google PAIR’s What-If Tool
Jesse Vig’s BERTViz
SSJ--6 -- Social
Develop my networking plan
Create a list of people I respect who may be worth reaching out to for mentorship or networking.
Research and reach out to people where possible and pragmatic.
Clarify the problems I am interested in focusing on and the capacity in which I am interested in focusing on them. (High overlap with SSJ--1 “Theory of Change” )
Well. The last sprint went off the rails. Somewhat disappointing, but I am wishing myself luck getting back on track.
List of common acronyms:
Mechanistic Interpretability (MI)
AI Alignment (AIA)
Outcome Influencing System (OIS)
n-Dimensional Interactive Scatter Plot (NDISP)
Machine Learning (ML)
Large Language Model (LLM)
Good job, and I wish you all the luck in your endeavors! I think the format of the journal could benefit from adding something like a quantitative assessment of how much you have done compared to what you planned to do. It would (hopefully) help you calibrate, state clear and achievable goals, ease readability, and, in addition, help others calibrate which tasks are harder or easier.
Thank you!
I agree about the value of quantitative assessments, but assigning numbers based on vibes feels hokey to me, so I’ll only do it if I can find quantifiable metrics that seem valuable. I think “do task x on y number of days” seems quantifiable, so I may include more goals like that.
I definitely want to improve clarity and readability. I have a bit of animosity towards “achievable goals”. I’m the sort of person who wants to set the goals I want regardless of whether they’re achievable, and then stubbornly keep trying at my impossible goals until I make progress. But as mentioned in “A Pragmatic Vision for Interpretability”, separating those kinds of “impossible” or “north star / guiding” goals from the object level goals I expect to actually achieve is probably a good idea. I’ll put some thought into that.
Thanks again for engaging. It means a lot to me : )
Nice work in keeping up your public journal.