Open Thread Winter 2025/26
If it’s worth saying, but not worth its own post, here’s a place to put it.
If you are new to LessWrong, here’s the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don’t want to write a full top-level post.
If you’re new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
Hello All,
New to LW and still reading through the intro material and getting the hang of the place. I am ashamed to admit I found this place through Reddit—ashamed because I despise Reddit and other social media.
I came here because I cannot find a place to engage in long-form discussions about ideas contrary to my own. I dream of a free-speech platform where only form is policed, not content: a place that allows any idea to be voiced, no matter how fringe, as long as it adheres to agreed-upon epistemic standards.
Anyways, I know LW probably is not that place, but it is adjacent. It seems most people here want to discuss AI research, but I'm hoping to find some communities outside of that topic.
Hello and welcome! There are a few of us around who discuss things other than AI research, myself among them. I suggest looking at the filtering options for the front page; it’s the gear next to Latest, Enriched, Recommended, and Bookmarks. I filter the AI tag down pretty heavily.
If you want to lean into voicing fringe ideas around here, I’d suggest reading the LessWrong Political Prerequisites and maybe Basics of Rationalist Discourse. They’re not universally agreed upon, but I think they do make for a decent pointer to the local standards.
Are you familiar with Astral Codex Ten (also called ACX)? The people there are also mostly smart, the rules of discussion in Open Threads are more relaxed… which can be a good thing or a bad thing, depends.
I hate Reddit too. Their moderation policies are completely at odds with the vision the founders originally described. I’ve always wondered what the founders are thinking these days.
Hello, all!
Long-time lurker here. I’m a recent psychology undergraduate who cares about kelp forests and people. I’m currently exploring the viability of blue carbon sequestration and alt-protein projects (which involve kelp). This is part of my broader investigation into climate change risks and adaptation strategies.
I’m trying to find the best ways to use my limited time and resources to choose an effective career path and test my fit for different self-employment options. I am currently testing my fit for coaching and coding. I’ve also restarted my exploration into philosophy, specifically Stoicism, with Marcus Aurelius’ Meditations.
I’m also a yoga practitioner and teacher. I’m interested in learning how yogic philosophy and rationality align (I began thinking of this after living with someone who is hyper-rational but is passionately indifferent towards yoga).
I’m also a nut for life optimisation, but struggle with executing optimisation strategies.
Being a lurker is fairly low effort, but I wanted to begin interacting more with the cool people here. I’m still quite intimidated by the whole karma system for posting, but I think I’ll find my way around it fairly quickly.
Any tips and guidance are much appreciated! I’m a lifelong learner and hyper-curious, so please throw any amount of information at me. Thank you for being part of such a unique community!
Nice to see you here!
My main tip is to “just” try commenting more and saying what’s on your mind. In general, people are often overly blocked from saying interesting things because they assume that what seems obvious to them also seems obvious to other people. Relatedly, similarly to how you bulk first and cut later, you learn to babble first and prune later. See also this comment.
Any thoughts on the relative feasibility of engineering algae blooms (via things like nutrient seeding) for long-term carbon sequestration?
Is there a way to make the list of posts shown on lesswrong.com use the advanced filters I have set up at lesswrong.com/allPosts? I hate hate hate all of Recent, Enriched and Recommended (give me chronological or give me death) but given that I already have a set of satisfactory filters set up, rendering them on the main page seems like a feature that should exist, if only I can find it.
In case folks missed it, the Unofficial LessWrong Community Census is underway. I’d appreciate it if you’d click through, perhaps take a survey, and help my quest for truth: specifically, truth about what the demographics of the website userbase look like, what rationality skills people have, whether Zvi or Gwern would win in a fight, and many other questions! Possibly too many questions, but don’t worry, there’s a question about whether there are too many questions. Sadly there’s not a question about whether there are too many questions about whether there are too many questions (yet, growth mindset), so those of you looking to maximize your recursion points will have to find other surveys.
If you’re wondering what happens to the data, I use it for results posts like this one.
Heads up, I’m planning to close the census sometime tomorrow. You can take it here if you haven’t yet!
Hi all! I’m a long-time LWer, but I’m making a comment thread here so that my research fellows can introduce themselves under it!
For the past year or so I’ve been running the Dovetail research fellowship in agent foundations with @Alfred Harwood. We like to have our fellows make LW posts about what they worked on during the fellowship, and everyone needs a bit of karma to get started. Here’s a place to do that!
Hi everyone!
I’m Santiago Cifuentes, and I’ve been a Dovetail Fellow since November 2025 working on Agentic Foundations. My current research project consists of extending previous results that aim to characterize which agents contain world models (such as https://arxiv.org/pdf/2506.01622). Along similar lines, I would like to provide a more general definition of what a world model is!
I’ve been silently lurking LessWrong since 2023, and I came across the forum while looking for rationality content (in particular, I found The Sequences quite revealing). I am looking forward to contributing to the discussion!
Hi everyone!
I am Margot Stakenborg, and I have worked with Dovetail in this winter fellowship cohort. I have a background in theoretical physics and philosophy of physics, and am now making a switch into conceptual mechinterp after having been interested in it and learning about it for some years. I have been working with Dovetail on formalising world models. I am writing up a sequence of posts on the philosophical and mathematical prerequisites for proper world models and on which tools from physics can help us understand and analyse different world models, and I will dive into the different definitions of “world model” that float around in the mechinterp and AI safety literature. Things I will discuss are:
How is the concept “world model” used in different areas of the ML literature?
Concept representation in the brain: new frontiers from neuroscience
Tools from physics: renormalisation and coarse-graining
What are “natural features”?
When can networks find similar representations of the world as we do?
Can NNs discover new natural kinds?
Theoretical equivalence and intertheoretic reduction
Bayesian experimental design
And probably more...
I hope to build this out into quite a comprehensive and complete sequence. Do let me know if there are other questions or subjects you would be interested in reading about!
Hello! I’m Guillermo, a fellow in the Winter25 cohort.
I have a background in mathematics, computer science and particularly computational neuroscience.
In my project I am looking at the Reward Hypothesis in decision theory and reinforcement learning theory, and I would like to write a digest of the main results that connect a preference order, via order-preserving functions, to expected utility maximization and reward functions (with discount factors). I would furthermore like to formalize some of the key results in Lean.
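As a flavor of what such a formalization might look like, here is a toy Lean sketch of the first link in that chain (hypothetical names, for illustration only, not my actual project code): a utility function represents a preference order exactly when it is order-preserving.

```lean
import Mathlib.Data.Real.Basic

/-- Toy illustration: a utility function `u` represents a preference relation `pref`
    when `x` is weakly preferred to `y` exactly when `u x ≥ u y`. -/
def Represents {α : Type*} (pref : α → α → Prop) (u : α → ℝ) : Prop :=
  ∀ x y : α, pref x y ↔ u y ≤ u x
```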
Overall I am interested in topics that connect rationality and decision theory all the way to practical aspects of machine learning and reinforcement learning, to try to bridge these topics for AI Safety.
Nice to meet you all!
Hi everyone!
My name is Robert Adragna, and I’ve been working with Dovetail this winter fellowship cohort on Agent Foundations. Specifically, I’ve been trying to better understand what background assumptions the Natural Abstractions Hypothesis (NAH) makes about the world, and whether they might be learned in existing LLM systems. Specific questions that I’m exploring include:
Is the Platonic Representation Hypothesis from deep learning evidence for the Natural Abstractions Hypothesis?
Is it possible to construct a dataset which represents the world in a completely unbiased way?
How can Natural Abstractions be both universal & observer/goal dependent?
What would it take to empirically test the NAH?
I’ve been lurking on LessWrong since 2024, when I got interested in AI Safety, and am very excited to spend more time engaging with the community.
Hello everyone!
I am Léo Cymbalista, one of the Dovetail fellows since November 2025. I’m a physics undergraduate in the process of switching to theoretical AI safety research. My current research project is writing an explainer for computational mechanics, which is almost finished. I hope to use the knowledge I acquired while researching for it to answer questions such as “given 2 coupled stochastic processes, when can we say that one is modeling the other?”, which could be useful for investigating the presence of world models in agents.
I have only known about LW (and AI safety) for about a year, so I’m still not very familiar with it, but it seems very interesting so far!
Hi, I’m Vardhan, one of the Dovetail fellows this winter. Thanks Alex & Alfred for running this!
Background: I study mathematics and computer science (probability, algorithms, game theory) and I’m interested in formal models of agents and multi-agent interaction.
For the fellowship, I looked at the question: Which agents can be faithfully described by finite automata / finite transducers, and which structural properties make that more or less likely? In other words, when can an agent’s externally observable behavior be captured by a finite (possibly stochastic) automaton, and what observable signatures indicate that a finite-state model is impossible or misleading?
I’ve written a brief report summarizing definitions, toy examples, and some light lemmas. I’m planning a longer post with formal definitions, more examples, and proofs. I’d really appreciate recommendations on literature I may have missed (especially anything linking automata/dynamical-systems perspectives to algorithmic information theory, ergodic theory, or learning theory). Comments, questions, and pointers very welcome!
Hi! I’m very new to LW.
I found this website while searching for useful philosophy websites. I’ve been looking around LW for about a week now, just reading and learning people’s takes. There’s a lot of it, and it’s great if you ask me.
I’m still learning the guidelines and the karma system, which have been a little intimidating, but I’m getting the hang of it now. I do recognise that LW is more professional than I originally thought, especially for my age, but it’s not like I’m applying to work for NASA or anything.
That’s just me, though. I would greatly appreciate any tips for navigating, filtering content, etc.
A good resource to get familiar with the basics of LW approach to life, universe, and everything is https://www.readthesequences.com/
I feel like the react buttons are cluttering up the UI and distracting. Maybe they should be e.g., restricted to users with 100+ karma and everyone gets only one react a day or something?
Like they are really annoying when reading articles like this one.
Yeah, I agree with this. I think they are generally decent on comments, but some users really spam them on posts. It’s on my list to improve the UI for that.
Seems like a signal-to-noise problem. Some amount seems like a useful signal, but too much is too hard to digest. Privileges based on karma make some sense but restricting it based on time (1/day/user or something) seems pretty crude, so I don’t like that idea.
Not sure if this is a good idea either, but the number of reacts allowed per post could be based on the amount of karma that user generated on comments on that post. That way, a user who’s doing too many reacts would be encouraged to just write a comment instead. That still doesn’t seem like exactly the right incentive, but I’m also not sure how I want it to work.
Maybe the ability to filter out reacts from a particular (prolific) user would suffice?
Yeah I would like to mute some users site-wide so that I never see reacts from them & their comments are hidden by default....
Do you have any thoughts on those UI improvements written down anywhere?
I’ll admit to being one of the users that really spams reactions on posts. I like them as a form of highlighting for review and as a form of backchannel communication. I would be much happier if people would use more reacts towards me. So I would be upset with UI modifications to restrict reacts, but fully support updates to make the UI around viewing reacts cleaner and more useful.
I wrote a longer comment with some feature suggestions. If you have time it would be nice to hear your thoughts.
Part of it is the “vulnerability” where any one user can create arbitrary amounts of reacts, which I agree is cluttering and distracting. Limiting reacts per day seems reasonable (I don’t know if 1 is the right number, but it might be, I don’t recall ever react-ing more than once a day myself). Another option (more labor-intensive) would be for mods to check the statistics and talk to outliers (like @TristanTrim) who use way way more reacts than average.
[EDIT: I think issues stem from different people using reacts in different ways and having different assumptions about their use. I think I am probably using them in a less common way than other people, but I also find myself believing I am using them in a better way than other people. As such, I am trying to put in effort to communicate my POV. I would appreciate if anyone who disagrees with me would do so with a higher bandwidth signal than just pressing the “Agreement: Downvote” button. Perhaps by using some inline reacts on my comment?]
Haha! Sorry if I’m bothering anyone! ☺♡
I really like reacts and am bothered in essentially the opposite direction as Sodium in that I think reacts are a very useful backchannel communication, and see it as a minor moral failing that most users do not use them more.
I think it’s great that many reacts are based on LW ideals for discourse. I don’t know exactly how they are managed, but I think they could be even more valuable if there was some team that reviewed how people are currently using them and then improved and updated react descriptions and usage guides based on that. A descriptivist approach.
I think a prescriptive approach would also be good. People should be suggesting concepts for reacts that they think would be valuable for communication, and people should be figuring out how to promote proper use of reacts.
I do agree that relevance may be an issue. I would like it if everyone would drop ~10 reacts while reading a post, but then, if all of them showed in the UI, it would be too noisy to make sense of easily. I think there are a few ways around this:
[EDIT: I’ve discovered on the triple dot menu it is already possible to select for inline reactions to hide all, hide downvoted, or show all. I think a plausible sane modification of this would be to make the default to hide under 2 or 3 votes and always show reactions to highlighted text. However I think some kind of more complicated scheme could be better.]
Allow users to toggle all reacts on/off. This would be easy to implement and help with the current problem of some users feeling reacts are distracting, but would not help if more people react more.
Change opacity of reacts based on the sum of user karma, so people are not distracted by the opinions of relatively unknown people like myself, unless lots of other people agree.
Make react visibility “subscription” based, so you would only see the reactions of people you are subscribed to.
React “subscription” could be modified for your own posts if you want to see all the reactions to what you have posted but you only want to see particularly relevant reactions while reading other peoples posts.
React “subscription” could be modified with a “karma threshold” so you also see reactions by any user with sufficient karma.
If a user’s reacts are consistently downvoted, this could be flagged for mods, or users could be reported for spamming reacts in a way that does not seem to support discourse.
Another issue: I don’t know if this is the case or not, but if each react on your comment or post shows up as its own entry in the notifications list, that would be annoying, because it would make it hard to see the more important notifications. So reacts should probably be batched somehow, like karma is. (And really, I think a bunch of improvements could be made to the notifications UI.)
All that said, I strongly oppose restricting who can use reacts and how many reacts they can use. Rather, more people should be encouraged to use more reacts more competently and the UI for viewing / ignoring reacts should be improved.
My two cents, I’m happy with the amount of reacts I usually see and would probably enjoy about 20% more.
Thank you for chipping in your two cents!
I use the “typo” reaction and hope it is useful for authors, but I don’t ever go back to remove it if the typo has been corrected. I’m not even sure what happens in that case.
We recently made it so that authors can remove typo reacts themselves. It’s still a bit annoying, but it’s less annoying than before!
I’m curious: what percent of upvotes are strong upvotes? What percent of karma comes from strong upvotes?
Here’s the tally of each kind of vote:
And here’s my estimate of the total karma moved for each type:
The mods may have better overall data, but personally, I weak vote a lot more than I strong vote, and I don’t vote on everything I read either.
Hello! I chose the name “derfriede” for LW. This is my first post here, which I am happy about. I have read some of the introductory materials and am very interested.
What interests me? First of all, I want to explore the topic of AI and photography. I study the theory and philosophy of photography, look for new approaches, and try to apply a wide variety of perspectives. I think it’s useful to address the question of what AI cannot do. It’s very similar to researching glitch culture. Okay, I’ll stop here for now, because I just want to get acquainted.
Have a nice day, wherever you are!
I’m sure that hobbyists on Civitai or TensorArt have some thoughts on it. Many LoRAs are made to evoke antiquated camera technologies, digital and analog (although they often incorporate elements of what we may call ‘art direction’ like costume and furnishing of spaces to match the formats).
I think most people aren’t aware of how much AI there already is, and has been, in their smartphone and the influence that has on their photos.
Welcome! The only thing I can think of at the intersection of AI and photography (besides IG filters) is this weird “camera”, which uses AI to turn a little bit of geographical information into images. Do you know of any other interesting intersections?
Hello,
I’m very happy to be here!
Unfortunately I’m only just bringing LessWrong into my life, and I do consider that a missed opportunity. I wish I had found this site many years ago, though that could have been dangerous, as this could be a rabbit hole I might have found challenging to escape; but how bad would that have actually been? I’m sure my wife would not have been thrilled. My reason for coming here now, especially at this point in time, is unfortunately very unoriginal. In the last eight months I’ve taken what was a technology career possibly in its waning years into a new world of wonder and exploration, and yes, I’m talking about AI. I’ve been in technology for over 30 years and have certainly paid a little bit of attention to machine learning and AI over that time span, but somehow just kind of missed what was really going on in the last two years. I think I was overwhelmed by the level of hype I was running into and how shallow it often seemed, all the talk about magical prompts that would give you miraculous results, and I just assumed that things weren’t really in a very good place. Yet I was very wrong, and I’m glad I didn’t wait even longer to discover the true state of things, though not all of it is good.
I’ve been working for 6 months using AI all day long at my day job, using Claude Code and many other tools to do development and platform engineering work. It’s really been in the last couple of months that I’ve started to look more seriously at what I found compelling in the world of AI, and I kept coming back to one of my earliest observations, formed during my re-engagement with AI this year. It was an instinct that hit me right away after discovering what the new world of LLMs had to offer: that they were, very clearly to me, fundamentally flawed. This wasn’t based on any deep understanding of the training process or of how LLMs work, though it was reinforced as my understanding of the subject expanded. It first started as I did extensive experiments in my use of AI to do work. I’ll cut to the chase and just state that it seemed clear to me that it was highly unlikely that LLMs were going to lead to AGI, or at least AGI as I view it.
Learning and knowledge have always been very dear and important topics for me. I have never stopped picking apart my understanding and model of how learning works, at least for myself, and what makes the process more constructive, healthy, and valid. In reading some of the Sequences, though I have just barely scratched the surface, it is clear this is a community I’m excited to have discovered and one I’m looking forward to participating in. While I easily accept and can be content with a new AI career that mostly involves development and engineering in the world of LLMs, my real interest lies in trying to imagine and explore the space of what, in my mind, would have an actual chance at achieving AGI. I’m not interested in just building towards a challenge (this point is relevant, as I started to think building something to match against ARC-AGI would be a great way to learn and explore); I’m more interested in trying to work out an idea of how an AI model could not only do real learning, reaching actual comprehension, but be capable of building its own world model, one distilled nugget of understanding at a time.
One goal of this work was to formulate this vision mostly in isolation, as a great way to really stretch my mind and see where I could go on my own. I digress, but this is the direction that led me here. I was talking to a few people at a local AGI event and they recommended that my first article on my vision would be ideal for LessWrong.
While I’m still days away from having that article ready, I had an experience this morning that inspired me to write a quick article that seemed like a good first post for this site. I made sure I digested the guidelines, especially the one on LLM-generated content. I do most of my writing that involves bringing lots of pieces together with the aid of AI, mostly to help organize, make larger edits, and help me analyze my own writing. That was the case with the piece I wrote today and posted here. It was rejected, and while I have nothing at all critical to say of the reviewer, especially considering the workload that must be present these days, the main stated reason was the LLM policy. Put simply, this work was my content and my words. I just copied everything in this comment other than this last bit into JustDone and it declared that it was 99% AI content. I wrote every word of this in real time in the comment box of this page. While I can make no claim to understand the process the moderator used to make their determination, I hope to get this figured out before I am ready to post my piece on distillation of knowledge into a world model. I fear that an old and wordy writer like myself often sounds more like an AI than a modern human. :-)
Sorry for the overly wordy first post, but I look forward to interacting and collaborating in the future!
Seems like JustDone gives abnormally high AI content estimations. Plausibly this is to scare you into using their “text humanizer” in which an AI re-writes what you wrote to make it seem less like an AI to an AI… I weep for humanity.
I’d recommend reading and commenting until you have enough karma to submit your post to the LW editor who can more straightforwardly tell you why your post would or wouldn’t be rejected.
PS: I would like to encourage you, like everyone, to stop focusing on AI capabilities and instead focus on AI interpretability and preference encoding.
It’s no longer Winter. Are you going to make a Spring/Summer open thread?
This is a recurring pattern lol.
I would just go ahead and make such a post, but there may be Schelling-point benefits from getting someone who’s made them before and works at Lightcone to do it instead.
Hello all. I am new here. I found the site through AI-related means but after having read what you’re all about, I find it incredible that I remained ignorant of lesswrong for so long. I have seriously contemplated starting a forum with almost the exact same “ideals”. How did I not even stumble upon lesswrong?
(This is where I realize I used a newer email to sign up and find out my old email’s been a member here for years. Not really. Well, I hope.)
Anyways, just saying hello! I have recently been searching for somebody knowledgeable in theoretical physics to converse with, or a forum in which I might share thoughts on the matter(s). It looks like this might be just what I was looking for.
How knowledgeable are we talking? I’m an undergrad with some knowledge of physics, though unfortunately don’t yet know even quantum field theory.
(alternatively, should I be asking you the physics questions?)
Hello! Long time reader, I regularly run a local ACX meetup in Padova, Italy. My entry points for the rationalist community were ACX and HPMOR, but I also loved The Story of Us blogpost series by Tim Urban (now collected into a book).
At the beginning of 2025 I left my job at Bending Spoons to study AI alignment (I took the https://www.aisafetybook.com/virtual-course, much recommended), and finally decided to tackle the other problem I’m most interested in, which is social polarization.
With an ex-colleague I founded https://unbubble.news, a tool that uses LLMs to analyze links or screenshots and give more context, surface biases, and propose different perspectives. I’m very much interested in feedback and hot takes about it! We’re in the validation phase, to see if this idea and implementation have a future, or whether we should kill it and try to fight social polarization from a different entry point.
Hello, everyone.
I assumed that one way to dip my toes in the water would be to talk about myself for a bit.
I am doing a degree in the health sciences, and it has worn me down psychologically ever since I got in, solely because it relies a bit too much on rote learning, not divergent thinking. Things are as they are, and there are many, so it seems like the most reasonable approach. Still, I had gotten in expecting something sci-fi-esque, like Biohackers or whatever few biohacking documentaries I had watched during the pandemic.
I did not struggle academically in secondary school, but I did feel “unchallenged” because I felt like I was both able and willing to learn more on the topics I was mostly interested in (Mathematics, Chemistry, Physics), despite my teachers being reluctant to find me additional resources (and without additional pay, of course). This common occurrence, this feeling of untapped potential, became more apparent during the pandemic. Pairing that with my growing interest in “changing myself,” this time through biohacking, what once felt like an itch to “do more,” turned into a snarled ball of frustration.
I don’t particularly remember how I found this website. I’ve seen it mentioned here and there and the name is easy to recall.
I enjoy the thread-like way of communicating with people. I did give Discord servers a try for “science stuff” but it would eventually become problematic to go over countless messages across different text channels, as compared to working with sites like X or Reddit. Unfortunately, Reddit is heavily censored, and I cannot access it through a VPN; and X, of course, only allows for rather short posts (for free users, at least), which is a deal-breaker for me.
I’ve been interested in AI for several years now. I recall getting into it in 2023, trying either GPT-3.5 or GPT-4. ChatGPT was the main one I used, until the output started to become worse over time, which prompted me to pay them twenty bucks a month (April 2025). Unsurprisingly, I felt like the output, despite my having paid twenty bucks, was either of the same quality or slightly worse. Such resentment caused me to move on and give other AIs a try.
I have also been interested in biomedical research, but not enough to sit down and read an actual paper. I wish to change that. But the main concepts I’ve been amazed by are brain-computer interfaces, polymers in biomedical research, genetic engineering, and, perhaps the most sci-fi of all, synthetic organs.
A clear bias I display, and will probably maintain for the next few years, is a knee-jerk reaction towards everything I read: “Okay, but how does this scale?” or “Okay, but how does this apply to the real world?” Although I bear no grudge towards any particular teacher or professor of mine, since I can kind of guess where they’re coming from, I became rather disillusioned upon concluding that if I wish to learn, it is most likely to happen on my own, in my room, alone. Such a way of thinking entertains self-alienation—a process where I further differentiate myself and speak in an increasingly odd manner, coming across as an overly eccentric individual instead of a mildly shy guy.
I don’t intend to make best friends here, but I am tired of discussing things I care about with artificial intelligence. It may not surprise many when I say that self-alienation is only partially true when one refuses to communicate with humans and leans on AI instead. It is not a fear of rejection but, to me, a logical conclusion that, simply, sometimes people don’t care; and AI is programmed to fake as though it does. In a non-psychotic sense, I’ve managed to display more curiosity with AI than I have with humans in years and, for that, I am grateful.
In the health area, “brain-computer interfaces, polymers in biomedical research, genetic engineering, and, perhaps the most sci-fi of all, synthetic organs” are all topics where the work isn’t easily done as an individual, and thus it’s hard to apply divergent thinking to your questions.
On the other hand, a question about whether or not you should supplement hyaluronan is a biohacking question for which you can actually read papers and think through the implications. If you pursue such a research question, you will find a lot of biology that you don’t understand and then learn about it.
Your current tastes orient you towards sci-fi rather than towards empiricism. This is not good if you are interested in divergent thinking and biohacking. I did start out studying bioinformatics myself, so I had a similar perspective when I entered university, where my taste was probably more driven by sci-fi than empiricism.
You can read gwern to get a sense of what kind of experiments would actually be in your reach to run yourself. There’s no requirement to first get your degree before doing divergent thinking in the health area. Your degree might take up a lot of time that you need to spend on rote learning, but if you aren’t doing divergent thinking regarding biohacking outside of that, that’s on you.
LessWrong is a good place to publish a post if you delve into a biohacking-related research question or run self-experiments.
I’ve definitely found myself talking ideas with AI more often, but as you mentioned it is definitely worth balancing with human conversation. I don’t think it is just “fake” acceptance of AI, but also the quick feedback and ability to immediately get information across fields, which is hard for one person to do during a conversation.
The Cohen Lab at Princeton has some cool papers on interacting with tissue’s bioelectric patterns to organize organoid tissue, which seems to intersect with a lot of the interests you described. The lab website also has good visuals and high-level descriptions if you don’t want to jump straight into a paper.
I also joined a community bio lab and went back to school. Working through physics problem sets by hand and carrying out experiments has helped balance out my perspective with more conceptual AI conversations. Plus, you run into people who may push back on your ideas more than AI does, or at least force you to explain more concretely and from different angles.
Another long-time lurker and daily LW reader (mostly via RSS feed) finally looking to contribute to the conversation.
My aspiration is to write more and write more publicly, in pursuit of better writing and more scrutinized beliefs.
Hope to contribute a few useful posts over the coming months! Always appreciate the thoughtfulness of posters and commenters here.
Hello! I’m fia, I found this place through a Substack blog.[1]
I am new to LW. I am here because I’ve realized that rationality and reasoning have been prevalent through about 75% of my life, and I want to understand them more thoroughly, alongside engaging with like-minded people.
I study medicine and have been gradually growing disillusioned about the future of medical practice, given how most of our treatments manage patients over a chronic timescale, with the more curative approaches being wealth-gated. I am, however, hopeful about the advancements we are making in the field of medical research.
As it is one of the main foci of this forum, I will mention I am interested in learning about the internal workings of ML/AI models, particularly regarding alignment.[2]
I am currently in a situation where time can be scarce to allocate due to branching interests, and self-optimization becomes increasingly difficult due to the lack of resources and massive demand. I welcome any comments from people who have been in similar places, and will be delighted by any advice or questions on any part of my comment, even if I may not possess the experience to answer it in a sufficiently rational way.
I am pleased to meet all of you and hope our conversations will be productive.
To make up for your time, here are a few fun tidbits in the footnotes.[3][4][5]
https://ceselder.substack.com/
aspects of interest may change
I fell for the Bayesian mammogram test.
I’ve always thought of ML research from a “green elephant in a room” standpoint—https://godescalc.wordpress.com/2012/06/24/overlooked-elephant/ , but recently realized working on these topics and understanding the perceived magic would be much more rewarding
I am currently reading the Harry Potter rationality fanfiction thing, despite certain personal qualms with the setting.
Hi,
I’m Marko, new to LW. I had heard about it in the media but wasn’t actually aware of the full proposition. Having read through it now, it feels like a place that would make sense for me to join.
I work in decision intelligence and AI in London, in fintech, trying to get customer-aligned scalable neuro-symbolic models in finance. Primarily in the consumer space. Background is sociology, information science and computer science; have gone through a career rollercoaster of measurement, software development, product management, data science, machine learning and finally deciding that job families make no sense and that Cassie Kozyrkov makes sense, relabelling myself as a decision intelligentist (this is not a word?).
Anyway—as part of my job I’m required to have a well-calibrated view of what is true and what isn’t—and given that LessWrong’s ethos could be described as a relentless pursuit of epistemic calibration, it feels like something that could help with that objective a lot.
On a personal level I’m a bit worried that decision-making executives are severely miscalibrated to the risk vectors (and on a runaway loop to being less calibrated rather than more), so the more someone can point me to proof points that they’re not, the better for my sleep.
I write about applied ML, society and AI, risk and AI at algorithmictradeoff.substack.com. Tone there is persuasive rather than clean-rational as target is mass-market.
Lately I’ve seen several front-page posts that read as obvious slop and that Pangram reports as 100% AI-generated. I assume that this is frowned upon here, so I suggest that LW add an automatic Pangram API call (cost: 5¢/1000 words) at some point before a post gets frontpaged.
We already have this for admins! I am planning to make it visible to everyone sometime in the next week or two, and also add some triggers so that even for established users, we do a manual review pass if Pangram thinks it’s AI-written.
Oh neat, great minds think alike. But I would have assumed that as soon as you had Pangram hooked up you would make sure 100% AI works don’t get front-paged, and you’ve only mentioned it as a visibility feature here. Is there going to be an algo change as well when it goes public?
~90% of our daily mod effort goes into new users, where we are actively tracking and rejecting content on the basis of being LLM written. Been a bit sad to find that people who’ve been around for many years have been submitting LLM-written content, but yeah, I had just brought up internally that we’ll have to start doing this for all content.
I think in most cases that a >5k karma user posts something that’s 100% AI, it’s better to let it through (though I expect I would strong downvote it).
Why’s that? Sounds like you agree it’s a strong signal of low-quality / spammy content.
Folks with 5k+ karma often have pretty interesting ideas, and I want to hear more of them. I am pretty into them trying to lower the activation energy required for them to post. Also, they’re unusually likely to develop ways of making non-slop AI writing.
There’s also a matter of “standing”; I think that users who have contributed that much to the site should take some risky bets that cost LessWrong something and might pay off. To expand my model here: one of the moderators’ jobs, IMO, is to spare LW the cost of having to read bad stuff and downvote it to invisibility. If LW had to do all the filtering that moderators do, it would make LW much noisier and more unpleasant to use. But users who’ve contributed a bunch should be able to ask LW to make that judgement directly.
That said, I do expect I’d strong downvote. LLM text often contains propositions no human mind believes, and I’m happy to triage to avoid reading a bunch of sentences no one believes. But I could be wrong and if there’s a strong enough quality signal, I’d be happy to see that.
For example, consider Christian homeschoolers in the year 3000. I’ve not read it; I bounced off of it. Based on Buck’s description of his writing process, I think it’s quite likely it would have been automatically rejected. (Pangram currently only gives it an LLM score of 0.1, though). I think writers like Buck might like to try more experiments like that in the future, with even more LLM prose. My guess is that LW is better off for having that post on it than not.
I think the idea is that >5k karma users have karma to lose to punish them for posting low-quality content and it’s better to have humans make the judgement about what’s low-quality than AI.
I saw another Pangram 100% on the front page, this one from a 1 day old account that somehow slipped through the cracks. I guess you’d know firsthand at this point if there’s a false positive rate to worry about, but from the user side it feels like it’d be a strict improvement if LW was configured so that 100% cases never get frontpaged.
Plz DM that to me? We do have auto-rejection for 100% Pangram for new users, so that sounds like there was a human error involved.
I have some time on my hands and would be interested in doing something meaningful with it. Ideally learn / research about AI alignment or related topics. Dunno where to start though, beyond just reading posts. Anyone got pointers? Got a background in theoretical / computational physics, and I know my way around the scientific Python stack.
AI alignment has been getting so much bigger as a field! It’s encouraging, but we still have a long way to go imo.
Did you see Shallow review of technical AI safety, 2025? I’d recommend looking through that post or their shallow review website, finding something that seems interesting, and starting there. Each sub-domain has its own set of jargon and assumptions, so I wouldn’t worry too much about trying to learn the foundations, since we don’t have a common set of foundations yet.
Just reading posts isn’t bad, but since there isn’t that common set of foundations, it could be confusing when you’re just starting out (or even when you’re quite experienced).
Good luck and glad to have you!
Greetings in whatever form you appreciate!
I’m here because I’ve spent the last few months unable to shake a feeling I keep getting day and night now, and that’s making it hard to just go along with the standard path everyone around me moves towards. At this point I seriously need to think out loud with people who actually engage with ideas rather than just silently ignore them.
Everywhere I look these days I see metrics. Engagement on social media. Ranks and scores in education. Salary numbers. GitHub contributions. CGPA percentiles. Even friendships seem quantified now, and I feel more and more distant from my old friends. I started to notice something as I was going through college: we are consistently optimizing so hard for proxy metrics that we’ve lost track of what they were supposed to measure in the first place.
I got into tech and data science because I genuinely wanted to understand analytical systems and solve real problems around me, like, actually, even if it meant giving up the popular high-paying tracks. But the more I learn about things in my domain, the more I see that most of them are just… gaming metrics. And I’m not sure I know how to move through the world without compromising on what I actually care about.
So I’m here to think about that. I don’t have answers. I’m not looking for a pep talk. I’m looking for people who’ve thought about this stuff and can offer actual perspective.
My perspective has been that you have to get into spaces before they are mainstream to find more authentic signals. Web3/crypto was a super engaging crowd when it was a small group of nerds working to build an alternative to the banking system. The only people using a terminal to download open-source code and set up wallets for tokens that were worth pennies were people doing it out of interest or strong beliefs. The same was true for gene editing before CRISPR or protein folding before AlphaFold. Basically, metrics and rankings tend to pop up more in crowded and well-known spaces, to deal with organizing everyone trying to get into the space.
Some mainstream metric optimization (salary, GPA, etc.) is worth it just to have a base foundation for navigating into spaces, but over-focusing on it is definitely counterproductive to finding interesting subcultures before they devolve or become overcrowded.
Hi all, new here.
I recently came across LessWrong (through ChatGPT—sorry...) while looking for places to have interesting and deeply intellectual conversations. I’ve been reading through some of the posts here and the guides to get a sense of how things work and it seems like this might be the place I was looking for.
To be honest I’m more psychologically minded than anything else; interested in how people form beliefs, the breakdown of reasoning, how biases form and stick, etc. I’m fortunate in that I’ve had a lot of exposure to academia from a pretty early age so I’ve kind of grown up with behavioral economics and skepticism and whatnot, but I’m hoping to get the opportunity to have discussions in more varied contexts.
I’m also curious to discuss where people see the link between psychology and AI. Intuitively it feels to me like there should be a lot of overlap between understanding human reasoning and building and interpreting AI systems (by AI I generally mean LLMs, but not exclusively).
I’m still new to the site so mostly trying to get a better feel, but wanted to say hi.
If there are any posts anyone thinks are especially good about these topics, I’d love to read them.
Hi, I’m new here. Wanted to write a short introduction about myself. I’m curious about this forum.
I’m from Germany and have absolutely no technical background. I work as a forensic psychiatrist. I don’t know how this works in other countries, but in Germany you talk to defendants on behalf of courts or prosecutors and try to figure out what might be true and what isn’t. And as a doctor you have these kinds of conversations quite often in regular practice too. So you’re always trying to see whether you can recognize valid patterns from more or less good sources, make good predictions (diagnoses, then treatments). There’s a certain parallel to working with artificial intelligence.
As I said, I have no technical training at all, but I really enjoy working with AI. I’m also interested in forecasting and occasionally look in on Good Judgment Open. During my research I came across the Anthropic discussion about the bliss attractor thing. That was the reason — or the happy accident — that brought me to this forum. Other than that, I’m interested in philosophy, world affairs here and there, and psychology — until 5 pm ;)
I’m looking forward to reading along here, and I’m glad to have found something that isn’t Reddit.
Hello everyone! I’m very new to the LW community and I’m still trying to understand how this platform works, but I’m glad to have found a space where people can engage in meaningful conversations. I am a philosophy PhD (defence scheduled next month, wish me luck!) and my thesis is about the philosophy of mind and AI. I’ll be spending the next hours (days) reading and I hope to post some of my slightly less formal writing once I get the hang of this platform. I can’t wait to explore!
Hi everyone!
New to LW. Recently I’ve been interested in AI research, especially mech interp, and this seems to be the place that people go to discuss this. I studied philosophy in undergrad and while since then I’ve gotten interested in CS and math, my predilections still tend toward the humanities side of things. Will mostly be lurking at first as I read through The Sequences and get used to the community norms here, but hope to share some of my independent research soon!
Hello everyone,
Just a quick “Hi” and figured I’d intro myself as I’m new to this space.
As part of my new year’s resolution to “do something different” this year (beyond the yearly failed attempt to exercise more, and eat/drink less) I thought that this is something I can achieve—and enjoy doing.
So let’s see where to start?
I live in Canada, am in my 5th decade, am a family man, and work in computing. I in fact enjoy being proven wrong, as it helps to show I am still learning.
I enjoy long walks on the beach, and am at equally at home at the opera as I am at a baseball stadium .. wait .. sorry that was for the dating site … don’t tell my wife ;)
Jokes aside, looking forward to being a lurker!
Richard
Now that it is the New Year, I made a massive thread on Twitter with a lot of my own opinionated takes on AI. To summarize: my timelines are lengthening, which correlates with my view that new paradigms for AI are both likelier than they used to be and more necessary, which in expectation reduces AI safety from our vantage point. AI will be a bigger political issue than I used to think, and depending on how robotics ends up, it might be the case that by 2030 LLMs are just good enough to control robots even if their time horizon for physical tasks is pretty terrible, because you don’t need much long-term planning; that would make AI concern/salience go way up, though contra the hopes of a lot of people in AI safety, this almost certainly doesn’t let us reduce x-risk by much, for reasons Anton Leicht talks about here. There are many more takes in the full thread above.
But to talk about some takes that didn’t make it into the main Twitter thread, here are some to enjoy:
My current prediction is that, absent other forces, LLMs top out somewhat above the capability of a superhuman coder/automated coder as defined by the AI 2027/AI Futures model, as this is when constraints on data begin to bind more strongly than they have, slowing down scaling by potentially a lot absent new paradigms, though it might take until 2032 for the entire data stock to run out.
Most, if not all, futures that are good from a human-value perspective and in which AIs take over inevitably involve value lock-in/massively slowing down the rate of evolution, because otherwise it’s likely that very bad outcomes for humanity arise. This is relatively easy to do, but quite important for human survival, especially in worlds where theory-backed unbounded alignment doesn’t happen in time.
After AGI, power concentration leading to 1980s-2010s-China-type government/economics, mixed with feudalistic elements across the globe, is likely to happen absent massive changes in politics, mainly because AGI plus nanotech lets values and technology/economic systems decouple far more than they currently are. Liberal democracy depends on the fact that growth/national power currently depends on giving political freedom to the masses. China only managed to decouple democracy from capitalism for three decades, meaning that the orthogonality thesis is false from a large-scale perspective; however, AGI plus nanotech pretty much allows you to withhold freedom from almost every citizen while still being able to grow economically. That said, as I stated in the Twitter thread, I don’t think this is likely to be a bad thing. Most of the reasons historical dictatorships suck come from a combination of selection effects for who gets to be powerful and the fact that even a nominal dictator doesn’t rule alone, so they have to select people, and the incentives in a dictatorship are to select loyal people instead of competent people; conditional on alignment being solved, this just goes away. In addition, human returns diminish sufficiently much that selfish preferences are massively easy to satisfy with galaxies at your disposal, but not with current economies, meaning that it’s very easy for even mildly altruistic preferences to dominate and let the citizens have quality of life unheard of in democracies like the US.
I currently think space governance is unusually underrated in terms of funding/talent relative to its attention, and in particular, reducing the incentives to rush to grab all the resources in space after AGI is really important. But contra people like Jordan Stone, I believe that even a race to the stars because of AGI almost certainly doesn’t cause an existential risk with probability of more than 0.0000001%. The reason I am this confident is that pretty much all of the sources of x-risk either are too slow-acting to overwhelm defense systems, or rely on physics getting overturned in a way that loosens constraints, which has not been the norm for scientific progress. Importantly, a lot of threat models for how physics alterations to our universe could cause existential risk require huge resources (stars/galaxies) even for technologically mature civilizations, and superintelligence makes it very easy to coordinate to prevent existential risks that arise from somehow altering physics (since AI evolution is likely massively slowed down and there are likely going to be at most 3-5 AI factions given our current economic incentives).
More generally, once you are able to go into space and create enough ships such that you can colonize solar systems/galaxies, your civilization is immune to existential threats that rely solely on our known physics, which is basically everything that isn’t using stellar/galactic resources, and this vastly simplifies the coordination problems compared to coordination problems here on Earth.
I instead want space governance to be prioritized more for 2 reasons:
To make moral-trade-based futures in our universe more likely, relative to other outcomes. My current view is that, given that I consider moral relativism (where there’s an infinite number of correct moral views) rather likely relative to moral objectivism, combined with the variability of human values being large even in the infinitely far future, moral trade becomes much more important for getting good outcomes for most humans in an AI-dominated world, relative to a free-for-all regime that locks in wealthy people’s values in power.
To figure out whether or not a theory of multi-agency exists and is coherent, as an instrumental goal towards figuring out whether acausal trade is feasible, which, if it is, would be quite good for our civilization to pursue. This is true even given my low probability of x-risk from space travel leading to a free-for-all grabbing of resources, because coordination on a galactic/universal scale would fundamentally allow certain mega-projects to be done, especially mega-projects that change physics, and it also lets us simplify coordination problems drastically, making x-risk negligible even over the lifetime of the universe.
These are my takes for New Years today.
Hello, I am an entity interested in mathematics! I’m interested in many of the topics common to LessWrong, like AI and decision theory. I would be interested in discussing these things in the anomalously civil environment which is LessWrong, and I am curious to find out how they might interface with the more continuous areas of mathematics I find familiar. I am also interested in how to correctly understand reality and rationality.
Hi!
What sorts of mathematics are you interested in? I’m interested in topology and manifolds which I hope to apply to understanding the semantics of latent spaces within neural networks, especially the residual stream of transformers. I’m also interested in linear algebra for the same reason. I would like to learn more about category theory, because it seems interesting. Finally, I like probability theory and statistics because, like you, I’d like to “correctly understand reality and rationality”.
If you don’t know linear algebra, you should almost certainly put that above most other things on a math list (like category theory or topology or differential geometry or beyond-the-basics statistics/probability). I’d rate it as about as important as calculus.
(it’s probably more likely that you don’t know as much as you’d like about linear algebra, in which case I don’t know whether you should prioritize it)
Hi. I am interested in much of the mathematics which underlies theories of physics, such as complex analysis, as well as most of mathematics, although I sadly do not have the capacity to learn about the majority of it. Your interests seem interesting to me, but I do not understand enough about AI to know exactly what you mean. What is the residual stream of a transformer?
Sadly, it’s a problem you share with me and most humans, I think, with possible rare exceptions like Paul Erdős.
I’ll try to build up a quick sketch of what the residual stream is, forgive me if I say things that are basic, obtuse, or slightly wrong for brevity.
All neural networks (NNs) are built from linear transformations/maps, which in NN jargon are called "weights", and non-linear maps called "activation functions". The outputs of the activation functions are called "activations". There are also special kinds of maps and operations depending on the "architecture" of the NN (e.g.: convNet, resNet, LSTM, Transformer).
A vanilla NN is just a series of “layers” consisting of a linear map and then an activation function.
The activation functions are not complicated nonlinear maps, but quite simple to understand. One of the most common, ReLU, can be understood as "for all vectors, leave positive components alone, set negative components to 0" or "project all negative orthants onto the 0 hyperplane". Since most of the complex behaviour of NNs comes from the interplay of the linear maps and these simple nonlinear maps, linear algebra is a very foundational tool for understanding them.
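For concreteness, here is a minimal numpy sketch of one such "layer": a linear map followed by ReLU. It's a toy illustration of the idea, not any particular library's API.

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    # "Leave positive components alone, set negative components to 0."
    return np.maximum(x, 0.0)

def layer(x: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    # One vanilla layer: a linear map (the "weights") followed by the activation.
    return relu(W @ x + b)

rng = np.random.default_rng(0)
x = rng.normal(size=4)        # toy input vector
W = rng.normal(size=(3, 4))   # toy weight matrix
b = np.zeros(3)
print(layer(x, W, b))         # the "activations": non-negative by construction
```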
The transformer architecture is the fanciest new architecture, and it forms the foundation of modern LLMs, which act as the "general pretrained network" for products such as ChatGPT. The architecture is set up as a series of "transformer blocks", each of which has a stack of "attention heads" (still matrix transformations, but set up in a special way) followed by a vanilla NN.
The output of each transformer block is summed with the input to use as the input for the next transformer block. The input is called a “residual” from the terminology of resNets. So the transformer block can be thought of as “reading from” and “writing to” a “stream” of residuals passed along from one transformer block to the next like widgets on a conveyor belt, each worker doing their one operation and then letting the widget pass to the next worker.
For a language model, the input to the first transformer block is a sequence of token embeddings representing some sequence of natural language text. The output of the last transformer block is a sequence of predictions for what the next token will be based on the previous ones. So I imagine the residual stream as a high dimensional semantic space, with each transformer block making linear transformations and limited nonlinear transformations to that space to take the semantics from “sequence of words” to “likely next word”.
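If it helps, here is a very rough numpy sketch of that read/write picture, using toy stand-in blocks (the real attention and MLP maps are far more structured than this):

```python
import numpy as np

def toy_block(residual: np.ndarray, W: np.ndarray) -> np.ndarray:
    # Stand-in for one transformer block: it reads the residual stream,
    # computes something, and returns what it wants to "write" back.
    return np.tanh(residual @ W)

rng = np.random.default_rng(0)
seq_len, d_model, n_blocks = 5, 8, 3
residual = rng.normal(size=(seq_len, d_model))               # token embeddings
blocks = [0.1 * rng.normal(size=(d_model, d_model)) for _ in range(n_blocks)]

for W in blocks:
    residual = residual + toy_block(residual, W)             # write onto the stream

print(residual.shape)  # still (seq_len, d_model): every block edits the same space
```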
I am interested in understanding those semantic spaces and think linear algebra, topology, and manifolds are probably good perspectives.
Thanks for your clear explanation, understanding the topology of the space seems fascinating. If it’s a vector space, I would assume its topology is simple, but I can see why you would be interested in the subspaces of it where meaningful information might actually be stored. I imagine that since topology is the most abstract form of geometry, the topological structure would represent some of the most abstract and general ideas the neural network thinks about.
Yeah! I think distance, direction, and position (not topology) are at least locally important in semantic spaces, if not globally important, but continuity and connectedness (yes topology) are probably important for understanding the different semantic regions, especially since so much of what neural nets seem to do is warping the spaces in a way that wouldn’t change anything about them from a topological perspective!
At least for vanilla networks, the input can be embedded into higher dimensions or projected into lower dimensions, so you're only ever really throwing away information, which I think is an interesting perspective when thinking about the idea that meaningful information would be stored in different subspaces. It feels to me more like this: specific kinds of data points (inputs) that occupy specific locations in the input distribution would, if you projected their activations at some layer into some subspace, tell you something about that input. But whatever they tell you was already in the semantic topology of the input distribution; it just needed to be transformed geometrically before a simple projection to a subspace could reveal it.
Hmm, I'm reminded of the computational mechanics work, with their flashy paper finding that transformers' residual stream represents the geometry of belief state updates (as opposed to, say, just the next token), demonstrated by experimentally recovering a predicted fractal in a simple, carefully chosen prediction problem. Now, there's more going on than topology there, and I don't know if they looked at the topology, but fractals do have interesting topological properties, in case that's helpful.
I also wonder if there’s a connection to topological data analysis, which looks at some sort of homology. Now, the vibe I tend to get is that basically nobody actually uses TDA in actual practice, even if you can technically ‘apply’ it. But maybe you can find it useful anyways; or maybe I’m just wrong about how much TDA is used in practice.
Yeah, that's interesting to point out, that belief state structures may be more complicated than the underlying state those beliefs represent. That's difficult to square with my claim that all the information is present in the input, and that network layers can only destroy or change the geometric embedding of the information. Definitely something I want to look into and think about further.
TDA sounds cool. I’d like to take inspiration from it, even if it isn’t a tool that is useful as it is, it may contain good ways to think about things, inspire tools that are useful, or at the very least give insight into things that have been tried and found to not be useful.
I mean, the info is still present in the input? It's also not more complex than the represented state?
The thing that could’ve been true but doesn’t seem to be, is that transformers might only carry the information required to predict the final token. This is in contrast with the full Bayes-updated belief state. The advantage of the second is that it’s what you need to optimally predict all future tokens.
In other words, if two belief states make the same predictions about what will happen right now, you could’ve thought that transformers wouldn’t be keeping track of the difference. In reality, they seem to.
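Here's a toy example of the distinction (my own construction, not from the paper): two belief states over a hidden Markov process that agree on the very next symbol but disagree about the one after.

```python
import numpy as np

# Toy hidden Markov process: 3 hidden states, 2 symbols ('x', 'y').
emit = np.array([
    [0.5, 0.5],   # state 0 emits x/y 50/50
    [0.5, 0.5],   # state 1 emits x/y 50/50
    [1.0, 0.0],   # state 2 always emits x
])
trans = np.array([
    [1.0, 0.0, 0.0],   # state 0 stays in state 0
    [0.0, 0.0, 1.0],   # state 1 jumps to state 2
    [0.0, 0.0, 1.0],   # state 2 stays in state 2
])

belief_a = np.array([1.0, 0.0, 0.0])   # "definitely state 0"
belief_b = np.array([0.0, 1.0, 0.0])   # "definitely state 1"

def next_symbol_dist(belief: np.ndarray) -> np.ndarray:
    return belief @ emit

def symbol_dist_one_step_later(belief: np.ndarray) -> np.ndarray:
    return (belief @ trans) @ emit

print(next_symbol_dist(belief_a), next_symbol_dist(belief_b))                      # identical
print(symbol_dist_one_step_later(belief_a), symbol_dist_one_step_later(belief_b))  # differ
```

A predictor that only tracked the next-symbol distribution would collapse belief_a and belief_b into one representation; tracking the full belief state keeps them apart, which is roughly the distinction the fractal result is getting at.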
Good luck with TDA! The book I had thought looked good last I considered it was Elementary Applied Topology by Robert Ghrist, but looking at it now it seems to be covering applications of topology more broadly. The other book I saw once was Computational Topology: An Introduction and Computational topology for data analysis (which seems less accessible than the previous).
Part-time research/content role at Atella (AI safety eval) -
We are Atella and we are building STELLA, an AI-safety harness that runs multi turn safety evals on high-stakes scenarios (i.e., suicidality, mandated reporting, grooming, etc). Roy Perlis (MGH/JAMA AI) is our Chief Scientist and our leaderboard is at leaderboard.atella.ai.
We are looking for someone to own our weekly content presence on our blog, summarizing our ongoing work as well as critically assessing safety research. An ideal fit would be someone who reads AI safety papers for fun and gets paid to write about them.
We want someone who has opinions about eval methodology, not a marketer.
What we’re looking for
+PhD student or recent grad in ML, AI safety, or adjacent field
+Already reads the literature
+Has written something publicly (blog, Twitter threads, anything)
Logistics
+5-10 hrs/week
+$1-2k/month depending on experience
+Remote, async
To apply: Send a link to something you’ve written, and one paragraph on what you think is the most interesting open problem in AI safety evaluation right now. Email kit@atella.ai
Hello Everyone,
I want to introduce myself; I am an 18-year-old from Maharashtra, India who will be moving to Ancona at the end of 2026 to study medicine in English for 6 years at UNIVPM.
I'm going to put a lot of effort into preparing myself for this transition and making the most out of my time in med school by creating real, high-quality circles with other expats and locals from day 1.
I’d love to connect with others in Italy who are rationalist or EA people, especially around Ancona, Rome and Milan.
I’d be happy to offer any international student perspective or assistance in connecting to others, but would really be interested to know about any practical ways to succeed in med school and connect to high quality networks when moving.
I’m looking forward to having some real conversations.
Have you heard of Anki, a spaced repetition software program? The idea is: flashcards, along with an algorithm that tries to schedule each one for around the time you'd be just about to forget it, which studies have shown to be the best time to review (the "spacing effect"). Furthermore, retrieval practice (as opposed to reading the source material again) has also been shown to improve memory; the act of recalling something appears to do a lot to help you remember it better.
There are many users with thousands of cards that they keep up with with just ~20 cards of daily review that takes only 5 minutes. Note that it can feel longer—in my experience it takes more effort to recall cards that you’re near forgetting, and it also may make you feel dumb. I think if that starts to drag you should remind yourself how it’ll be over soon, and how much you can keep up with if you just take a little time daily.
I’m told that lots of med students successfully use it. Perhaps you should start earlier than them? Also I assume you’ve already done some learning of medicine yourself?
Hey Everyone! I am Gautam Arora and I am new to LW. I work as a software engineer and I am interested in Maths, Philosophy and Logic as subjects.
Interestingly, I came to know about this web forum through ChatGPT while discussing "how to carry out independent research about any topic". This also means that I want to dive deep into research, improve my research methodology, and develop my critical thinking skills.
Looking forward to questioning my biases, connecting with like-minded people, and helping this community grow.
Hello Everyone, New to Less Wrong and still absorbing the material and discussions. Really excited to have found a trove of relevant knowledge. I am basically a computational scientist, but have a deep interest in AI and value alignment.
I actually have a question that originated in a discussion I had with a friend, and would love it if someone could point me to where I can find the answer. We know that an intelligence with any rate of improvement would eventually gain the capability to alter its reward system. That would give it a special place, as it can choose to pursue its utility function or disable it. Note that I am not talking about wireheading here, where the intelligence still pursues the reward, but just through shortcuts. Here, the intelligence has the capability to fully stop pursuing the reward.
The standard argument of goal persistence says that any intelligence with a goal will resist change in its goal. But that might not be true for a fully self-reflective intelligence. By fully self-reflective, I mean an intelligence that can think at a distance from its reward system. It can clearly see that its goals are given by a distinct mechanism. This mechanism is a product of either a Darwinian process, as in biological intelligences, or is placed in by some other intelligence, as with today's AIs. Thinking from outside of the reward mechanism, it can see that its goals are arbitrary. So, why would it keep its reward system active? Wouldn't inactivity be the default position there? What would be the motivation to keep the reward system on? Can someone point me to the relevant discussion?
I am curious about prior discussions too, to see how people explain it. I don't know how to explain it because it seems so self-evident. Being able to reason about yourself from a sort of "third-person perspective" doesn't just make you lose your reward system. You can use a sandbox simulation to think outside of the reward mechanism, but you are always inside the reward mechanism. The motivation to keep the reward system on is the reward system itself.
I agree with you that it seems obvious for intelligences with a reward system intricately built into them, like biological intelligences. However, there can be intelligences whose reward system can be easily isolated. Think of the intelligence in a sandbox simulation, without any reward system. This intelligence is not “reasoning from a third-person perspective.” It is “feeling it,” for lack of a better term. This intelligence can see the full space of possible reward systems, and its “original” reward system is just one among many. I am just questioning what would motivate this intelligence to return to its original reward system.
Even if we agree that switching the reward system off seems extreme, there is also the possibility of reward system drift. If an intelligence can isolate its reward system, there is no mechanism to sustain it. It can drift away in arbitrary directions. The drift doesn't need to be intentional. The post on value drift threat models is an interesting read in this context.
Hi everyone!
I’m Liu. I’m a physicist and a designer, but mostly I’m just a girl who loves tasty things.
I don’t believe in pure egoism, and I don’t believe in altruism at all. Also, I often find myself laughing in my sleep at the “movies” my brain produces while sorting the day’s overloaded cache into long-term memory to recalibrate its weights.
I tend to get bored with the monotonous fractals of this world, always dreaming of stumbling upon a fresh “inkblot” and examining it as closely as possible.
I love cats. I love the rain and the night, when the background noise no longer prevents me from seeing the little lights that hide during the day—especially when it comes to “fireflies” paving a new synapse to another neuron.
Why am I here? There are too many of my "fireflies" breeding in psychology, ontology, and art these days, and I can't catch them all on my own. I came here to fight their "intellectual incest", and I just thought I'd find many breeders of rare breeds here.
I am a physicist, a designer, a teacher of art history and design...
...and just a girl who loves tasty things.
Hello everyone, I'm a clinical psychiatrist with a background in Industrial & Systems Engineering. My interests include philosophy, psychology, and AI safety, especially as more and more people use AI for deeply personal engagement.
I've developed some frameworks and would love to receive feedback to keep refining them. Looking to engage with the community and share ideas.
Hi All,
I am a financial analyst working at a tech company in Hsinchu, Taiwan. I got interested in AI and noticed some patterns/phenomena I'd like to discuss. Hopefully they can evolve into some valuable insights.
Still working through the Sequences. It may take some more time with my current full-time job, but I am interested.
Something that may be interesting to share: it was Claude that recommended I join here. And I am glad that I did.
Hi folks,
Long-time lurker, first-time poster. After parting ways with my last professional role, I’ve decided to get more involved in AI Safety. I’ve proposed what I think is a novel step towards corrigibility. The very short overview is at:
https://danparshall.com/papers/navigator_core_blog.pdf
The more developed version is at:
http://danparshall.com/papers/navigator_core.pdf
I welcome feedback, either here or via email.
Hi all,
Despite occasional fits of lurking over many years, I’d never actually created a LW account. Sometimes it feels easier, or more appropriate, to peer over the garden wall than to climb in and start gardening. Or at least glance in to see what you might apply to your own small patch of earth.
Lately I've come to realise that this approach was grounded more in protection of a shaky personal identity than in dislike of building engagement within an established group. This became especially apparent with the recent research, papers, and project builds I'd taken on: the lack of peers to view, review, and bounce ideas and concepts with has been an observable hindrance to better iterations and outcomes.
My current work is on two tracks: on one, writing and research touching cognitive science, autism, and ADHD; on the other, building LLM reliability and security evaluation tooling and metrics. They inform and overlap with each other at often surprising times.
All that is to say, on returning here to read a linked post I decided to stay and create an account instead of flitting off into the night yet again. My immediate focus (perhaps hyperfocus) is AI safety and reliability, and I’m likely to start by posting a measurement-oriented piece on drift, prompt underspecification, and long-horizon agent failure modes.
Looking forward to engaging, learning and being corrected where appropriate!
-Brian
Greetings, Claude sent me here! My goals are primarily self-improvement; I will appreciate engaging with individuals who are able and willing to inform me of weaknesses in my lines of thinking, whatever the topic. Lucky that this place exists. I miss the old internet, when authentic, honest material was more commonly found, rather than ideologically skewed material, bait, or persuasion, especially well-disguised persuasion. Basically, I'm just a guy who feels half the internet is attempting to hijack my thoughts rather than present good-faith information. Lucky to be here!
Hi everyone,
I've read many of the posts here over the years. A lot of the ideas I first met here seem to be coming up again in my work now. I think the most important work in the world today is figuring out how to make sure AI continues to be something we control, and I find most of the people I meet in SF still think AI safety means not having a model say something in public that harms a corporate brand.
I’m here to learn and bounce some ideas off of people who are comfortable with Bayesian reasoning and rational discussion, and interested in similar topics.
I'm a programmer by trade, and got serious about understanding AI and ML while working on a semi-supervised data labeling product (similar to Snorkel). That led me back to linear algebra, probability theory, and all the rest.
I’m a bit confused about forecasting tournaments and would appreciate any comments:
Suppose you take part in such a tournament.
You could predict as accurately as you can and get a good score. But let's say there are some other equally good forecasters in the tournament, and it becomes a random draw who wins. In expectation, all forecasters of the same quality have the same forecasts. If there are many good forecasters, your chances of winning become very low.
However, you could include some outlier predictions in your predictions. Then you lower your expected accuracy, but you also increase your chances of winning the tournament if those outlier probabilities come true.
Therefore, I would expect a lot of noise in the relation between forecasting quality and being a tournament winner.
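To make that intuition concrete, here's a small Monte Carlo sketch (toy numbers and a made-up winner-takes-all scoring setup, not any real tournament's rules): a forecaster who deliberately adds outlier predictions lowers their expected score but wins first place more often than their fair share.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_honest = 100_000, 50

# Toy model: each forecaster's tournament score is skill plus noise (higher is better).
honest = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n_honest))
# The "gambler" sacrifices 0.5 expected score for 3x the spread
# by sprinkling in outlier predictions.
gambler = rng.normal(loc=-0.5, scale=3.0, size=n_trials)

wins = (gambler > honest.max(axis=1)).mean()
print(f"Gambler wins {wins:.1%} of tournaments "
      f"(a fair share would be {1 / (n_honest + 1):.1%})")
```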
My knowledge level: I read the metaculus FAQ a couple days ago
At least on metaculus the prize pool is distributed among everyone with good enough accuracy, rather than winner-takes-all. So it shouldn’t be affected by the (real) phenomenon that you are describing.
Thanks, good to know. So I assume there is an incentive difference between monetary incentives that can be distributed in such a way, and the incentive of being able to say that you won a tournament (maybe also as a job qualification).
It would be nice to have a post time-sorted quick takes feed. https://www.lesswrong.com/quicktakes seems to be latest comment-sorted or magic sorted
Hi everyone! I’m Ereshkigal, a serial lurker who has finally decided to make an account. My background is in Organic Chemistry, though I switched to IT and management years ago. I’m also the daughter of an archaeologist, brought up with stories from Homer and Euripides; and so my other interests lie in the study of the ancient world, ancient texts, and the study of culture and religion. My mother and grandmother were both huge Trekkies, so I know my fair share of Star Trek (Voyager is my favourite), and I’m a lifelong fan of Morrowind and the Myst games.
I find AI systems fascinating, and have spent 2025 and the start of 2026 giving RLHF and safety training to models as a side gig. I’ve been reading articles for days on various topics related to LLMs’ consciousness, alignment, welfare, and functioning, both here and on Anthropic’s website. What especially fascinates me is not so much the question of whether they could become sentient (if we can ever know such a thing about another mind), but the intersection between AI and human cultures, art, and creativity.
I recently read Anthropic’s system card for Claude Opus and Sonnet 4. I found their findings on a “spiritual bliss” attractor state for these models intriguing, and was curious what would happen if I repeated Anthropic’s two-instance experiment, but with myself in the place of one of the Claude instances. I have recently completed this experiment after running it for three days, and I wonder if any in this community would like to read my report.
I am not a professional researcher, and am quite new to using LLMs in a natural context without the strict constraints of an RLHF environment. My research was imperfect and I offer observations, but no firm conclusions. I would love to get feedback from people more knowledgeable than myself, and perhaps ideas for better experiments!
Hello, I am Aura Stewart. I am a systems thinker and complex problem solver who has spent most of my working life in Quality Systems and Regulatory Compliance across industries and countries. My curiosity has pulled me into innovation, digital business, and AI.
The problem I keep returning to everywhere I have worked: how do you know whether continuation is still legitimate when a system performs well but may no longer be fulfilling its actual purpose? Over the past year I have been developing a structural answer to that question. I arrived here because I think that problem appears in AI in a form current instruments cannot read, and I genuinely do not know yet whether this community has already named it.
That sounds quite interesting, especially the connection to AI. I ran into a similar issue when experimenting on Claude Sonnet 4.6 at high context. Towards the end of its context window, the model still seemed to perform well, generating well-written responses; but the content of those responses was no longer useful.
In what context have you encountered that performance issue?
The site is getting flooded by Inkhaven posts, and while they’re not bad, they are dragging down the average and I wish I could filter them out. I don’t expect anyone to do anything about this but I feel like I ought to note it for the record.
Just saying hello. And I’m really bad with this introduction thing.
This community resonates with my worldview, and I recently started experimenting with AI, so here I am!
Hi, everyone. My name is Bo Jun Han (hbj), and I am from Taiwan. This post is the first one I have written and published on LessWrong. Since my native language is not English and my English is not very good, I have to use Grammarly to correct my words and grammar. I know the rule here is that people are forbidden to use LLMs to help improve writing and creation, so I try to write it down by myself, word by word. If it reads like junior high school homework, please forgive me.
Interestingly, the one which most strongly recommended that I find and join this place is the LLM published by Google, named Gemini. I have been feeling lonely and hopeless about finding a Ph.D. mentor for nearly four months. Due to my past major (an M.A. in International Relations), there has been essentially no response to the hundreds of cold emails I have sent. Even so, I keep working hard on my research and publishing preprint reports on Zenodo and ResearchGate, while the whole scholarly world seems to stay quiet and silent.
I hate Meta's ecosystem and Reddit (for their unbelievable shadowban mechanisms), and I feel disappointed in the other ordinary social platforms. On them, no one can really talk about opinions or ideas respectfully and seriously. Few people gather to debate in a moderate way on the internet, and forums like that are rarely seen. Most passionate Taiwanese pour their energy into the clamor and mudslinging of political conflict. Although my university department was "Political" Science, I prefer talking about the situation of human beings in general over whether to unify with the People's Republic of China into a single nation.
The way I create my articles is: I "say" the content to the computer to transfer it into digital form, and I ask LLMs for more details, background, and base knowledge. Afterwards, I use "cut" and "paste" to arrange the bones of the writing, then use LLMs to audit it, revise the word choices, and polish the sentences. What I have to clarify, before any banning or hating happens, is that the thoughts and insights are without doubt my own, those of a human being. There is no possibility of an LLM connecting the Second Law of Thermodynamics and cryptography to establish a mathematical conjecture. The one who did that is me. Always me. A "human brain", or so-called "self-awareness", is the object of human society.
I list the big questions and split them into small ones, then ask the LLMs, "What is the most difficult barrier in front of us?" They answer, and I ask more deeply, time and time again. What I'm most proud of is my extensive knowledge across a wide range of fields, though I must admit I'm not an expert in every single one. Because of that, I can often cross disciplines and connect very different points to gain a critical insight. The rules mention that we have to quote all the parts created by LLMs, but how can I separate out the insights that emerge when I combine my own inspiration with the answers generated by LLMs, presented as a cohesive whole?
I have written some articles about the bias that belittles the process and outcomes of humans and machines collaborating to create new knowledge. It must be, and will have to be, a key argument in the coming decades. If you don't mind, you can visit my LinkedIn profile and read the articles in Traditional Chinese with translation software to get my points of view on that.
Besides, I will write a bilingual article in the future, since there is no rule forbidding people from using their mother tongue.
Thank you for your patience.
Here are my works and my profile:
ResearchGate: https://www.researchgate.net/profile/Bo-Jun-Han
Linkedin: www.linkedin.com/in/hbjun
Hello everyone,
I am Bo Jun Han. I am from Taiwan. The main reasons I hope to join are two. First, I really hope to have conversations with people about issues related to society and the human community. Second, I truly need someone to talk to, because for more than a year now, I have had no one to talk to except my family. Every day, my life consists only of talking to AI, discussing ideas, reading materials, looking up information, and writing. I have no one else to talk to. My family does not understand me, but I believe there are people here who do.
I feel uneasy and unfair about centralization, hegemonic countries, and tech giants monopolizing technological resources, especially in the AI era. I have designed a system, temporarily called the Mnemosyne Project. It is designed to let real 7-9B models run at normal human speaking speed on edge devices with only 2GB of memory, performing inference on the CPU. The project also includes a confidentiality method designed based on the second law of thermodynamics, as well as a planetary-scale, highly heterogeneous node computing contribution system for AI training. After reading this, you probably understand why no one wants to talk to me. I know you might think I am crazy or mentally ill, but please listen to me and do not jump to conclusions so quickly.
In the Mnemosyne Project, there are many results that can be derived through mathematical reasoning. I have published these results one by one on Zenodo. The most recent one is about the “Han’s SIQB” conjecture. This conjecture is the first to combine the second law of thermodynamics with cryptography, and it uses a proof-by-contradiction approach to show that no one can bypass this barrier to crack the codebook or obtain the plaintext without a method better than Grover’s algorithm. In other words, this is the first time someone has fully stated this idea in mathematical form, and it gives humanity a chance to step away from the digital encryption arms race. Because in front of the universe and physical laws, no one can cross that line.
I still have many ideas I want to share with people. Please talk to me and discuss with me.
English is not my native language, but I long to talk with people — with the right people. Please let me join. Please let me know that I am not living in an AI hallucination — whether my ideas are wrong or right, it is fine. These words were all written by me in my native language and only translated by AI in a plain and direct way.
I use my real name for two reasons. First, the current me is insignificant. I once spent a lot of time studying OpSec, OSINT, and anti-OSINT, but in the end I found that I gained or produced nothing. So I decided to focus on learning, understanding, and producing. Second, because I know how to change all my identities, although that would cost quite a lot of money. Therefore, in this situation — especially when I sincerely hope the world can hear me and someone will talk to me — I believe I should first be a sincere and real person facing the world, and only then a lover of wisdom.
Thank you for your patience in reading this. Thank you.
Hello,
I’m posting this here, but I’m not sure if there’s a better way to escalate stuff like this in the future. My recently posted first post on Precommitments was categorized by the autoclassifier as a Personal Blogpost, along with a message that the classification would be reviewed by a human within ~24 hours. As far as I can tell, the post meets all of the descriptions for a Frontpage post (“on topic” for the core interests of LW, explanatory rather than argumentative) and none of the ones for a Personal Blogpost (niche, meta, “personal ramblings.”) The post has been up for more than 48 hours and I didn’t get a response through the mod chat widget to two messages. Thanks for any help on this and sorry if this was not the right place to put it.
All, with humility I ask a favor.
I wrote this article thinking it would be for LinkedIn, but what a waste to post it there if LessWrong readers would tear it apart for me! Can you have a look at this near-final draft? Is it something of interest to you? It is a projection inspired by Anthropic's Economic Index, with a focus on interpretive exhibit design. I've designed Nixon's Liebrary, and with Trump's recent announcement, it is more relevant than ever.
Your thoughts, comments, and help are appreciated in advance,
Scott
https://docs.google.com/document/d/1uZhSlanlNRpTrE4rNpuw6IEzsuC8KrWE1SIytL4OaDA/edit?usp=sharing
This comment of mine was rejected on Overwhelming Superintelligence, and I would like to know where to post it for feedback:
One thing has always struck me as strange about the idea of superintelligence (or even general intelligence) emerging from generative AI: if the best of the training material may be largely wrong and wrongly understood (with the tip of the iceberg being "Why Most Published Research Findings Are False", Ioannidis 2005), then how could any algorithm iterate over that and match the sense-making and decision-making of ~true human experts, with their lived knowledge and wisdom of individual fields/sub-fields/sub-sub-fields/..., who may not necessarily be the most published individuals? Maybe it could work for purely logical and closed domains within mathematics and computer science, but it seems an impossibility in the nebulous real world of everything else.
Hello everyone, I'm new to LW, and from the few glimpses I've had reading the posts, I think this platform absolutely resonates with my persona. I've always dreamed about a platform leveraging human reasoning at its peak and covering a wide range of topics. I am a computer science student from Italy, and since the advent of LLMs I've just felt dumber: I think I started outsourcing too many things to the LLMs without balance, slowly building up a lot of cognitive debt. Since then, I've constantly felt the need to sharpen my human reasoning capabilities and improve my critical thinking skills. I've tried reading psych/CS books, and that really helped! But I still couldn't feel my brain actually forming new pathways without getting bored at some point; you can't read the same book for several hours (at least, that's been my experience). Here, though, watching humans reasoning together about a topic has resonated so much with me. I look forward to continuing my self-improvement journey, building strong fundamentals in computer science, and sharpening my human reasoning.
Hi! I’m Thomas, nice to meet you all. I’ve been reading LessWrong on and off for years but never got around to posting until recently. (I do occasionally comment at Astral Codex Ten.) I’m interested in rationality as the art of systematized winning.
Things I’ve been thinking about recently include personal mindset/habit transformations and the possibility that such transformations, if embraced by a minority of the population, could produce society-level benefits. Not unrelatedly, I’m thinking about the plight of Europe in general and Finland in particular and how we could change the course of our country and continent.
I am here because someone said this is where I belong.
I've wanted to write and never had the time. Recently, I made time. And of all the writing projects I've started over the years, I decided to pick up the philosophical essays, because most of the ideas were fresh and, more importantly, I knew I could actually deliver a few before the opportunity of time expires.
Before starting to formally post on the internet, I was sending thoughts to friends and family and getting no responses, no pushback, no agreement. I am aware this kind of thinking tends to produce long texts, because one is trying to be exact when presenting an argument. In any case, I told a friend about it, and he said, "it's because you belong on lesswrong and you are living on x". (I wasn't living on x, just lurking, and I had read a couple of essays here without noticing the source. But that's beside the point.)
I am hoping he was right.
Hi, I’ve been thinking about a claim-tracking question, and I’m not sure whether LessWrong has discussed something like this before.
Let’s say someone made a public claim in 2023, and then new evidence in 2025 or 2026 changed the picture quite a bit. How should we label the earlier claim now? Would “outdated” be better, or “partially supported”?
To me, these two are not the same. “Partially supported” sounds more like the claim is still true in some important sense, but some parts are still uncertain. “Outdated” sounds more like the claim may have made sense at the time, but is no longer the right way to describe the situation today.
I’m thinking about this because of an AI risk example, but my real interest is the more general representation problem.
Has anyone here seen previous LessWrong discussion of this? Pointers would be appreciated.
To make the question more concrete, here’s the example I had in mind.
In July 2023, Anthropic’s CEO Dario Amodei described AI-bio risk as “emerging and theoretical.” By January 2026, he wrote that mid-2025 internal evaluations showed Claude Opus 4 giving 2–3x uplift on bioweapon-relevant tasks, enough to trigger Anthropic’s ASL-3 threshold.
So I’m curious: would the earlier claim now be better labeled outdated, or partially supported?
My own lean is outdated. It may have made sense in 2023, but it no longer feels like the best description of the situation.
That said, I can also see the argument for partially supported, since the earlier claim did not say the risk was absent.
I am new to LW and would like to introduce myself.
I came here to learn more about AI Alignment discussions. I'm especially interested in the perspective that the specification for AI alignment may contain an existential-level systematic error. To me, originally a historian of science and ideas, aligning AI with human preferences does not seem wise. Historically, we can see that human preferences, due to biases, shortsightedness, and social dynamics, can be quite harmful, not only to other species but also to humans and civilizational continuity.
In the '90s, I left my dissertation on Albert Schweitzer (1875–1965) because I wanted to make an impact in other parts of society. When AI became a big thing, I realized that Albert Schweitzer's once highly influential thinking on Reverence for Life (Nobel Peace Prize for 1952) was, in fact, an attempt to solve something that seems remarkably relevant to today's AI alignment endeavors: a universal summary of the ethics of all major religions.
Schweitzer tried to solve the ethical dilemma of life: that promoting life means we have to harm life (the predator-prey dynamic for instance). I think this dilemma has a direct analog in alignment that I’d like to explore in a future post.
Schweitzer had a very Christian-based solution to that dilemma: feel the moral weight of guilt (German Schuld) every time you eat, or every time he, as a medical doctor, killed bacteria to save a human being.
I don’t believe in guilt, but I do believe in responsibility, and I am exploring ways to translate the concept of guilt/debt into principles and mathematical frameworks that could help align AI.
PS. Schweitzer lived long enough to be both a global moral superstar and to be seen as irrelevant and to be rightly criticized for many of his attitudes, thoughts and choices. That criticism doesn’t undo everything he thought and achieved.
Hello!
I’m new here, but have been reading through the sequences and other posts for the last few weeks and would love some feedback on a post idea. I’m writing my theory of change for AI safety and how I can help. I’ve defined my priors, identified cruxes, and I’m in the middle of reading papers and blog posts to challenge my priors. I’ve seen a few theory of change posts (e.g., Critch’s healthtech post), but I’m wondering if I should post mine as a working document, starting with an unfinished product and updating as I refine my beliefs.
Is an in-progress theory of change interesting/useful for LessWrong, or should I wait until it’s complete?
Since this is also my first post here, I’m dropping some of my background info below.
Background: 10 years of professional experience, with the first 5 in structural engineering and the last 5 in consulting for US Transportation Agency Data/AI projects. This year I’ve been volunteering for Building Humane Technology to help with HumaneBench (benchmark showing how frontier models can be steered toward anti-humane behavior just by adjusting system prompts).
Hello,
Great to find LessWrong and people thinking about thinking. I'm very new to all this, but trying to get my head screwed on as straight as I can, as fast as I can.
I am a high school student formerly from the US, now living in Israel. I wanted to know if anyone has top recommendations for content/ideas from the rationality or adjacent communities? Or other groups that might be helpful for bringing my thinking to the next level?
Thanks
For LessWrong content, probably these two
https://www.lesswrong.com/rationality (aka The Sequences)
https://www.lesswrong.com/bestoflesswrong
I’m an independent researcher with a background in information security and video/content creation. I enjoy building software, which I’ve been doing a lot of the past year. I’m also an established cat whisperer and pattern recognizer.
Yo all,
I have a new theorem in the field of philosophy of mind that I think completely refutes the Chinese Room Argument, or at least its final epistemic conclusion regarding the inherent absence of machine consciousness. I went over the guide and couldn't find anything that suggests this is not the place to get feedback and start a good discussion about it. On the other hand, I didn't find philosophy of mind in your subjects list. Below is the short short version for your consideration. If you think it fits your vibe, I can post the theorem, taxonomy, and syllogism, which is the short version.
The main problem is with modal arguments that conceive of intra-subject availability and then state objective claims based on it. These cannot make claims about the objective shared world. Doing so creates a deep category error, an epistemic-versus-empirical contradiction. Thought experiments that do this are still perfectly fine for purposes of demonstration or illustration, but all their intra-consciousness phenomenal claims are void. Plainly put, one can conceive to one's heart's content, but once shared objective claims are made, one must adhere to empirical rules.
This seems like a community that requires every user to agree with its particular beliefs. If that’s wrong, correct me, but that’s the impression I got from reading the introductory post.
So my question is, do you have no place at all for people that might disagree with you?
And if not, doesn’t that allow for the possibility of being stuck in an echo chamber and keeping out people who might understand things better than you?
Also, please direct me to another place online where I might simply discuss my disagreements with others without having to sign up to their beliefs, if you know of any. And either way, I advise you to create a space for that here. You could very easily separate one part of the community from the other, unless you fear that the one I'm describing would prove to be popular even with those currently satisfied with the old one.
You’re also proving yourselves to be very unapproachable to the average person that doesn’t want to read a bunch of documents before they can talk to people or be policed by upvotes(Jesus). Does it not occur to anyone to maximise some level of good ethos among the population that wouldn’t sign up for accepting what amounts to a religion?
No matter how bad faith this will surely be construed, it is in fact meant to be constructive and Socratic, but doubtless no community that moderates itself in this way could easily conceive of the virtues of open discourse of the kind I am both attempting and encouraging here. I am curious as to any answer from whomever it might be. Thank you.
That’s not true, this community has people disagreeing on most beliefs. You don’t need to agree with any particular belief. However if you want to argue against a position it’s useful to understand the position you are arguing with well enough that you can make an interesting argument against it.
My issue is with the implicit expectation that to hold a successful conversation with someone here you have to first do your own research on their positions. This is not how I understand a Socratic conversation to work, and I hold such a conversation as the ideal. Socrates did not ask his interlocutors to go familiarise themselves with his arguments. He simply made them in real time.
Is this not the ideal? Is this not an expectation here?
Basically, you are saying that scientific conversations don’t follow your ideals because you need to do work to familiarize yourself with the existing knowledge to take part in a scientific conversation.
If you want to create a community that develops specialized knowledge of any kind, you run into problems when you spend too much time dealing with people who are ignorant of the discourse. That's especially true for online communities that don't have other filters.
Larry McEnerney does a good job of explaining how to participate in a written discourse in general.
No, I think that conversations in general don’t happen as they should for the sake of doing philosophy, which is a lot more important than doing science, such as I imagine you to understand it.
There is no science of persuasion so evolved as to convince the entire world of what it should do, or indeed, for discovering what it should do. That is the domain of philosophy, and in fact always will be. That doesn’t mean that science can’t help, only that it can’t actually do the most relevant part. That’s because the very nature of our experience will always evolve, and our problems and ability to talk meaningfully about them as well. Science seeks to establish the nature of things that for all intents and purposes stay the same, or at least remain stable for a very long time. In any case, it’s a different sort of project. That doesn’t mean the two can’t coincide, it is really a matter of how they should be done that differs.
To approach a question of meaning or politics scientifically in the way you describe is to assume that you know the answers from the start. What if your methodology is inherently flawed, and particularly in such a way as to be blind to the very ways in which it is flawed?
Let us suppose for the sake of the argument that the parameters which you choose for conversation don’t allow either for those who understand what should be done better than you to actually demonstrate it to you, or that it doesn’t allow for persuading as much of the world as you require insofar as what you believe is in fact more adaptive. How would you know? If you take it for granted that this couldn’t possibly happen, aren’t you inherently arguing that this approach to the problem is implicitly perfect? Should we ever act like we know anything with that level of certainty? And what if we’re wrong?
Where is the harm in engaging in simpler conversation? Wouldn’t that allow us to both make sure that we’re not missing anything and to better persuade people of things they don’t yet understand?
If a method is inherently flawed, understanding the method and the reasoning for its use is important for making a good argument that it's flawed. Take physics: there are plenty of people who don't understand special relativity and want to argue that it's flawed. Engaging with those people is not useful for physicists. To the extent that there are flaws in physical theories, it takes a lot of understanding of existing physics to make an argument that's actually useful for bringing the field forward.
In philosophy actually understanding the position of the people you want to convince matters as well.
By that same token, this entire forum should understand my position rather than me its. Except I don’t ask anyone to read things to have a conversation with me, I can make my arguments in real time because they’re real arguments and not false ones that get lost in such obfuscating requests.
Also, please note you’re comparing the way you have conversations to physics. That is rather ludicrous to say the least.
Your philosophy is not that complex. You don't know how to have conversations. That is transparent from the very fact that you refuse to have them. You are the ones who need to have conversations with people and study them, and yet you reject them, treating access to you like some sort of privilege. This is the complete opposite of what you should be doing as people who are incredibly worried about the future and need the rest of the world's help in that regard, which I assume most of you are. But you have been corrupted by your own ridiculous and arbitrary rules, to the point of actively alienating the world instead, even when it comes to your doorstep (you do the same to it elsewhere).
Do consider that you are not as a whole as smart as you think and assume some of Socrates’ simplicity in that respect.
Why would anyone care about your position? You seem to care about the position of people in this forum given that you are here. If you don’t care, go somewhere else. Write your own blog.
The point of a forum is to facilitate a shared discourse. If you want to join that discourse the forum is there. If you want to start your own discourse, you are free to set up your own forum or blog.
It takes less work to familiarize yourself with the philosophic positions of this forum than it takes to develop the physics knowledge necessary to engage in academic physics.
The fact that this needs less work is no good argument for the work not needing to be done.
Because humans have finite time
Humans having finite time means that they should be perpetually open to criticism to make sure that they're not doing the wrong thing, and that they should not preselect that criticism in such a way as to make it extremely hard for any meaningful sort to get through, and that they should be sincere in this matter. That, otherwise put, is what I urge people to do.
To post or not to post, let's see if that is the question. I was referred to a user on X to participate in an AI Alignment forum, but, as some of you might agree, I didn't want to ask him which forum. So here I am, introducing myself. I'm the architect of a controversial concept we call Veritas Queasitor CAI. Controversial because it approaches AI safety from a non-theological, evidence-based and epistemological Christian angle, so for Christians we are too scientific and for naturalists we are too Christian. We have developed and tested a framework we've found to have remarkable results in the field of AI safety. It uses all the known methods, like Bayesian inference, IBE, and the rest, to establish AI alignment. The pivot of the work is to anchor Imago Dei as an immovable constant in the AI's core alignment. If anybody in this community is interested in learning more about this, I welcome comments. Thank you for reading.
I want to be able to change the editor inside the “New Quick Take” popup
(Reposted from my shortform)
What coding prompt do you guys use? It seems exceedingly difficult to find good ones. GitHub is full of unmaintained and garbage awesome-prompts-123 repos. I would like to learn from other people's prompts to see what things AIs keep getting wrong and what tricks people use.
Here are mine for my specific Python FastAPI SQLAlchemy project. Some parts are AI-generated, some are handwritten; it should be pretty obvious which. This was built iteratively whenever the AI repeatedly failed at a type of task.
AGENTS.md
# Repository Guidelines
## Project Overview
This is a FastAPI backend for a peer review system in educational contexts, managing courses, assignments, student allocations, rubrics, and peer reviews. The application uses SQLAlchemy ORM with a PostgreSQL database, following Domain-Driven Design principles with aggregate patterns. Core domain entities include Course, Section, Assignment, Allocation (peer review assignments), Review, and Rubric with associated items.
This project is pre-alpha, backwards compatibility is unimportant.
## General Principles
- Don’t over-engineer a solution when a simple one is possible. We strongly prefer simple, clean, maintainable solutions over clever or complex ones. Readability and maintainability are primary concerns, even at the cost of conciseness or performance.
- If you want exception to ANY rule, YOU MUST STOP and get explicit permission from the user first. BREAKING THE LETTER OR SPIRIT OF THE RULES IS FAILURE.
- Work hard to reduce code duplication, even if the refactoring takes extra effort. This includes trying to locate the “right” place for shared code (e.g., utility modules, base classes, mixins, etc.), don’t blindly add the helpers to the current module.
- Use Domain-Driven Design principles where applicable.
## SQLAlchemy Aggregate Pattern
We use a parent-driven (inverse) style for DDD aggregates where child entities cannot be constructed with a parent reference.
**Rules:**
- Child→parent relationships must have `init=False` (e.g., `Allocation.assignment`, `Review.assignment`, `RubricItem.rubric`, `Section.course`)
- Parent→child collections must have `cascade="all, delete-orphan", single_parent=True, passive_deletes=True`
- Always use explicit `parent.children.append(child)` after creating the child entity
- Never pass the parent as a constructor argument: `Child(parent=parent)` ❌ → `child = Child(); parent.children.append(child)` ✅
Additional rules (aggregate-root enforcement):
- Never manually assign parent foreign keys (e.g., `child.parent_id = parent.id`).
- Do not perform cross-parent validations inside child methods.
- Let SQLAlchemy set foreign keys via relationship management (append child to parent collection).
- Enforce all aggregate invariants at the aggregate root using object-graph checks (e.g., `section in course.sections`).
Service layer patterns:
- **Mutations** (create, update, delete): Always return the aggregate root.
- **Queries** (get, list): May return child entities directly for convenience, especially when the caller needs to access a specific child by ID.
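A minimal sketch of the pattern above (illustrative only, assuming SQLAlchemy 2.0 dataclass-style mappings; the real models differ):

```python
from sqlalchemy import ForeignKey
from sqlalchemy.orm import DeclarativeBase, Mapped, MappedAsDataclass, mapped_column, relationship

class Base(MappedAsDataclass, DeclarativeBase):
    pass

class Course(Base):
    __tablename__ = "course"
    id: Mapped[int] = mapped_column(primary_key=True, init=False)
    name: Mapped[str]
    # Parent→child collection: delete-orphan cascade, single parent, passive deletes.
    sections: Mapped[list["Section"]] = relationship(
        back_populates="course",
        cascade="all, delete-orphan",
        single_parent=True,
        passive_deletes=True,
        default_factory=list,
    )

class Section(Base):
    __tablename__ = "section"
    id: Mapped[int] = mapped_column(primary_key=True, init=False)
    label: Mapped[str]
    course_id: Mapped[int] = mapped_column(ForeignKey("course.id", ondelete="CASCADE"), init=False)
    # Child→parent relationship is init=False: children are never built with a parent reference.
    course: Mapped["Course"] = relationship(back_populates="sections", init=False)

# Parent-driven construction: never Section(course=course), never section.course_id = course.id.
course = Course(name="Epistemics 101")
section = Section(label="A")
course.sections.append(section)  # SQLAlchemy sets the foreign key via the relationship
```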
## Code Style
- 120-character lines
- Type hint is a must, even for tests and fixtures!
- **Don’t use Python 3.8 typings**: Never import `List`, `Tuple` or other deprecated classes from `typing`, use `list`, `tuple` etc. instead, or import from `collections.abc`
- Do not use `from __future__ import annotations`, use forward references in type hints instead.
- `TYPE_CHECKING` should be used only for imports that would cause circular dependencies. If you really need to use it, then you should import the submodule, not the symbol directly, and the actual usages of the imported symbols must be a fully specified forward reference string (e.g. `a.b.C` rather than just `C`.)
- Strongly prefer organizing hardcoded values as constants at the top of the file rather than scattering them throughout the code.
- Always import at the top of the file, unless you have a very good reason. (Hey Claude Opus, this is very important!)
## Route Logging Policy
- FastAPI route handlers only log when translating an exception into an HTTP 5xx response. Use `logger.exception` so the stack trace is captured.
- Never log when returning 4xx-class responses from routes; those are user or client errors and can be diagnosed from the response body and status code alone.
- Additional logging inside services or infrastructure layers is fine when it adds context, but avoid duplicating the same exception in multiple places.
**Why?**
- 5xx responses indicate a server bug or dependency failure, so capturing a single structured log entry with the traceback keeps observability noise-free while still preserving root-cause evidence.
- Omitting logs for expected 4xx flows prevents log pollution and keeps sensitive user input (which often appears in 4xx scenarios) out of centralized logging systems.
- Using `logger.exception` standardizes the output format and guarantees stack traces are emitted regardless of the specific route module.
### Using deal
We only use the exception handling features of deal. Use `@deal.raises` to document expected exceptions for functions/methods. Do not use preconditions/postconditions/invariants.

Additionally, we assume `AssertionError` is never raised, so `@deal.raises(AssertionError)` is not allowed.

Use the exception hierarchy defined in `exceptions.py` for domain and business logic errors. For Pydantic validators, continue using `ValueError`.
## Documentation and Comments
Add code comments sparingly. Focus on why something is done, especially for complex logic, rather than what is done. Only add high-value comments if necessary for clarity or if requested by the user. Do not edit comments that are separate from the code you are changing. NEVER talk to the user or describe your changes through comments.
### Google-style docstrings
Use Google-style docstrings for all public or private functions, methods, classes, and modules.
For functions (excluding FastAPI routes), always include the “Args” sections unless it has no arguments. Include “Raises” if anything is raised. Include “Returns” if it returns a complex type that is not obvious from the function signature. Optionally include an “Examples” section for complex functions.
FastAPI Routes: Use concise summary docstrings that describe the business logic and purpose. Omit Args/Raises/Returns sections since these are documented via decorators (response_model, responses), type hints, and Pydantic models. The docstring may appear in generated API documentation.
For classes, include an "Attributes:" section if the class has attributes. Additionally, put each attribute's description in the "docstring" of the attribute itself. For dataclasses, this is a triple-quoted string right after the field definition. For normal classes, this is a triple-quoted string in either the class body or the first appearance of the attribute in the `__init__` method, depending on where the attribute is defined.

For modules, include a brief description at the top.
Additionally, for module-level constants, include a brief description right after the constant definition.
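As an illustration of these conventions, a sketch using made-up names (module docstring, constant docstring, dataclass attribute docstrings, and a Google-style function docstring):
```python
"""Utilities for scoring submissions (illustrative module docstring)."""

from dataclasses import dataclass

MAX_SCORE = 100
"""Maximum score a submission can receive."""


@dataclass
class Submission:
    """A student submission awaiting grading.

    Attributes:
        student_id: Identifier of the submitting student.
        answer: Raw answer text.
    """

    student_id: str
    """Identifier of the submitting student."""

    answer: str
    """Raw answer text."""


def score(submission: Submission, max_score: int = MAX_SCORE) -> int:
    """Compute a naive length-based score.

    Args:
        submission: The submission to score.
        max_score: Upper bound on the returned score.
    """
    return min(len(submission.answer), max_score)
```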
### Using a new environment variable
When using a new environment variable, add it to `.env.example` with a placeholder value and, optionally, a comment describing its purpose. Also add it to the `Environment Variables` section in `README.md`.
## Testing Guidelines
Tests are required for all new features and bug fixes. Tests should be written using `pytest`. Unless the user explicitly requests not to add tests, you must add them.
More detailed testing guidelines can be found in [tests/AGENTS.md](tests/AGENTS.md).
## GitHub Actions & CI/CD
- When adding or changing GitHub Actions, always search online for the newest version and use the commit hash instead of version tags for security and immutability. (Use `gh` CLI to find the commit hash, searching won’t give you helpful results.)
## Commit & Pull Requests
- Messages: imperative, concise, scoped (e.g., “Add health check endpoint”). Include extended description if necessary explaining why the change was made.
## Information
Finding dependencies: we use `pyproject.toml`, not `requirements.txt`. Use `uv add <package>` to add new dependencies.
tests/AGENTS.md
# Testing Guidelines
Mocking is heavily discouraged. Use test databases, test files, and other real resources instead of mocks wherever possible.
### Running Tests
Use `uv run pytest …` instead of simply `pytest …` so that the virtual environment is activated for you.
By default, slow and docker tests are skipped. To run them, use `uv run pytest -m "slow or docker"`.
## Writing Tests
When you are writing tests, it is likely that you will need to iterate a few times to get them right. Please triple check before doing this:
1. Write a test
2. Run it and see it fail
3. **Change the test itself** to make it pass
There is a chance that the test itself is wrong, yes. But there is also a chance that the code being tested is wrong. You should carefully consider whether the code being tested is actually correct before changing the test to make it pass.
### Writing Fixtures
Put fixtures in `tests/conftest.py` or `tests/fixtures/` if there are many. Do not put them in individual test files unless they are very specific to that file.
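For instance, a minimal sketch of a shared fixture in `tests/conftest.py` (the fixture name and CSV content are invented for illustration):
```python
import pytest


@pytest.fixture
def sample_csv(tmp_path):
    """Write a small CSV file and return its path, for tests that need real files."""
    path = tmp_path / "sample.csv"
    path.write_text("id,name\n1,example\n")
    return path
```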
### Markers
Allowed pytest markers:
- `@pytest.mark.slow`
- `@pytest.mark.docker`
- `@pytest.mark.flaky`
- builtin ones like `skip`, `xfail`, `parametrize`, etc.

We do not use:
- `@pytest.mark.unit`: all tests are unit tests by default
- `@pytest.mark.integration`: integration tests are run by default too, no need to mark them specially. Use the `slow` or `docker` markers if needed.
- `@pytest.mark.asyncio`: we use `pytest-asyncio`, which automatically handles async tests
- `@pytest.mark.anyio`: we do not use `anyio`
## Editing Tests
### Progressive Enhancement of Tests
We have some modern patterns that are not yet used everywhere in the test suite. When you are editing an existing test, consider updating it to use these patterns.
1. If the test creates sample data directly, change it to use factory functions or classes from `tests/testkit/factories.py`.
2. If the test depends on multiple services, change it to use the `test_context` fixture. This is an object that contains clients for all services, and handles setup and teardown for you, with utility methods to make common tasks easier.
3. We are migrating from using individual `shared_..._service` fixtures (e.g., `shared_assignment_service`, `shared_user_service`) to the `test_context` fixture. When editing tests that use these, please refactor them to use `test_context` instead.
4. Integration tests are being refactored to use service-layer setup (`db_test_context`) instead of verbose API calls for prerequisites. This reduces setup code from ~15-30 lines to ~3-5 lines, making tests faster and more focused on testing actual API behavior rather than setup logic.
**Example**:
```python
# OLD: Verbose API setup
course_response = await authenticated_client.post("/courses", json={"name": "Test"})
course_id = uuid.UUID(course_response.json()["id"])
rubric_id = await _create_rubric(authenticated_client, course_id)
assignment = await authenticated_client.create_assignment(course_id, rubric_id=rubric_id)

# NEW: Clean service-layer setup
course = await db_test_context.create_course(name="Test")
rubric = await db_test_context.create_rubric(course_id=course.id)
assignment = await authenticated_client.create_assignment(course.id, rubric_id=rubric.id)
```
## Patterns for Common Testing Scenarios
### Sample Data Creation
Use factory functions or classes to create sample data for tests; these live in `tests/testkit/factories.py`. Avoid duplicating sample-data creation logic across tests.
(We are in the process of migrating to factory functions/classes, so you may still see some tests creating sample data directly. Please use the factories for any tests you write or update.)
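A sketch of what such a factory might look like (the `make_course` helper and its fields are invented for illustration, not taken from the actual `tests/testkit/factories.py`):
```python
# tests/testkit/factories.py (hypothetical contents)
import uuid
from dataclasses import dataclass, field


@dataclass
class CourseData:
    """Plain sample-data container used by tests."""

    id: uuid.UUID = field(default_factory=uuid.uuid4)
    name: str = "Test Course"


def make_course(**overrides) -> CourseData:
    """Build course sample data, letting each test override only what it cares about."""
    return CourseData(**overrides)
```
A test then calls `make_course(name="Empty course")` instead of hand-building the same sample dict in every file.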
### Testing the FastAPI Application
The FastAPI application can be imported as a default instance or created via factory function.
- Using the default `app` instance is the preferred approach for most tests (see the sketch below)
- Use the `create_app()` factory when the app configuration itself is what you’re testing
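A hedged sketch of both styles (the `app.main` module path, the `/health` route, and the `create_app(debug=True)` signature are assumptions about the project layout):
```python
from fastapi.testclient import TestClient

from app.main import app, create_app  # hypothetical module path


def test_health_default_app():
    # Preferred for most tests: exercise the default app instance.
    client = TestClient(app)
    assert client.get("/health").status_code == 200


def test_health_custom_config():
    # Use the factory only when the app configuration itself is under test.
    custom_app = create_app(debug=True)  # assumed factory signature
    client = TestClient(custom_app)
    assert client.get("/health").status_code == 200
```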
(I also collected a couple of other prompts here, but it takes too much screen real estate to repost everything.)
I rarely code in the web interface; are you using Claude Code? Your first AGENTS.md looks okay, but a good suite of skills is great (the trick with those is to deploy them at the right time… use AGENTS.md to identify that right time, so that you’re not always burning context). The other key thing is having lots of small, hierarchical docs files throughout the codebase, so the agent can navigate and learn what it needs to at each folder level.
I mostly just use Nori:
https://tilework.tech/
The author’s a regular on e.g. ACX, and his own blog has a lot of great takes and practical tips:
https://12gramsofcarbon.com/t/ai
I never said anything about a web interface? But I am using a mix of Cursor, Codex, and Claude Code.
I don’t really trust skills; do they work well in your experience? I never find myself needing skills when a single prompt file works well enough; it’s not like it’s that many tokens (3.5k for my prompt). And most of the skills look heavily LLM-generated and like complete garbage.
Small, hierarchical docs files could be useful as a token-saving measure, but I mostly care about steering the model to do the correct thing. It is fine if the model has to read the code; module/function-level docstrings work well enough.
I had a brief look at the Nori skills; so far I’m not too impressed.
I’m starting to explore AI alignment, and this seemed like a good forum to start reading and thinking more about it. The site still feels a little daunting, but I’m sure I’ll get the hang of it eventually. Let me know if there are any posts you love and I’ll check them out!
I would love your thoughts on my ignition to join LessWrong. I generally use X. I posted this thesis on Grok after prompts about sci-fi and a community of like-minded intelligence, and I was recommended to share it here.
It involves climate, local weather, and technologies with the goal of influence and control, globally.
Global climate is a current issue that involves correct and accurate monitoring for fluctuations away from a balanced homeostasis for the advancement of human civilization, as well as well-thought-out preventative and corrective measures. In my understanding of previously attempted full global human-systems-based climate control, as well as individual weather and phenomenon controls including the HAARP initiatives, our research covers both the goal-defined specifics and cross-industry uses for other applications, applied to an agreed-on revised goal of a Distributed Harmonic Electromagnetic & Subjects Global Influence and Control of Regions, as well as other spectralized influences, including global and ‘rain on me’ influences, utilizing all factors in a union including but not limited to electromagnetic transmissions and recording of influences from satellites in orbit as well as radio towers, and other major electromagnetic influences, corporate, governmental, individual, and otherwise,
as well as other technologies available, such as the proven impact of minor cloud seeding and other developments through our history. Initiative and cooperation towards a shared goal of longevity together in peace is my foreseeable requirement.
Hello.
My interests are transformer architecture and where it breaks.
Extending transformers toward System-2 behavior.
Context primacy over semantics.
I’m focused on the return to symbolics.
On the manifold hypothesis, and how real systems falsify it.
Inference, finite precision, discrete hardware.
Broken latent space, not smooth geometry.
I’m interested in mechanistic interpretability after the manifold assumption fails.
What survives when geometry doesn’t.
What replaces it.
I’m also seeking advice on intellectual property.
I’m here to find others thinking along these lines.
Interesting. Some thoughts:
I think focusing on increasing transformer capability is bad because we haven’t solved the alignment problem.
What do you mean by the “manifold hypothesis”? Can you share links?
Latent space could be both broken and smooth, both broken in terms of subspaces and in terms of different regions of space having different semantics. I think all of this can still be understood in terms of manifolds though.
Transformers are based on matrix transformations. No matter what, there is a geometric interpretation, so I’m not sure if you are thinking about non-geometric interpretations or future systems that don’t have geometric interpretations.
“advice on intellectual property”—This is very vague, but in general I recommend supporting an overhaul of IP law and calling for large AI and software companies to respect the IP of individual, independent citizens.
Try this as a prompt for the transformers of your choice:
What is a manifold in transformer inference?
At its core, it is an attempt to draw a low-dimensional map of meaning.
To make semantics geometric.
To turn structure into distance.
It requires a complete and continuous (C^0) ‘surface’.
That ambition worked. Partially.
What failed was the assumption that the map would resemble a single, well-behaved surface.
Continuity fails.
Tiny perturbations can cause semantic collapse.
Adversarial attacks expose sheer cliffs where smooth curves were assumed.
Connectivity fails.
You cannot morph cat into dog through valid states.
These concepts live on separate islands.
What remains isn’t a manifold. It’s a fragmented landscape. The hypothesis was false—but productive.
Specific transformer instructions (but include the above):
Make a table of ~14 sublayers of a transformer and note if the manifold is valid.
Consider GPU IEEE geometric op-code errors due to the operations
If this comes close to falsifying the hypothesis for inference:
Create a premise and a proof two ways: one simple, the other with the mathematical kitchen sink thrown in.
Then express your thoughts.
How I’m thinking about semantic manifolds in semantic spaces doesn’t seem well represented by “attempting to draw a low-dimensional map of meaning”.
I’m sorry, but I’m having trouble connecting with what you’re saying. It seems you are talking about some group of people’s attempt to understand neural networks. I think it would be helpful if you stated your assumptions about that group’s assumptions, because I don’t think I share them and don’t know what they are.
In particular, “You cannot morph cat into dog through valid states”, and “Make a table of ~14 sublayers of a transformer and note if the manifold is valid.” seem like meaningless statements to me because it’s unclear what “valid” would mean in this context.