Michaël Trazzi
Note: there’s something in France called “reçus fiscaux”, which I’ll translate as “fiscal receipts”; they’re what you issue in order to collect tax-deductible donations.
While you can technically do that from just the initial (easy) paperwork, a lot of associations actually go through a longer (and harder) process to get a “rescrit fiscal”, which is basically a pre-clearance saying you can really collect tax-deductible donations if you continue doing the same kind of thing.
If you only do the easy thing and not the longer thing (which can take like 6 months to a year), then you risk audits (which are especially likely if you’re collecting a bunch of these fiscal receipts without ever doing the hard thing) that can then lead to penalties.
Will there be any recording?
What’s your version of AI 2027 (aka the most likely concrete scenario you imagine for the future), and how does control end up working out (or not working out) in different outcomes?
That’s not really how Manhattan projects are supposed to work
how does your tool compare to stampy, or to just asking these questions without the 200k tokens?
I like the design, and think it was worth doing. Regarding making sure “people can easily turn it off from the start” next time, I wanted to offer the datapoint that it took me quite a while to notice the disable button. (It’s black on black, and right at the edge of the screen, especially if you’re using a horizontal monitor).
Thanks for writing this—it introduces a concept I hadn’t considered before.
However, I do find myself disagreeing on many of the specific arguments:
“Has someone you know ever had a ‘breakthrough’ from coaching, meditation, or psychedelics — only to later have it fade”
I think this misses that those “fading” breakthroughs are actually the core mechanisms of growth. The way I see it, people who are struggling are stuck in a maze. Through coaching/meditation/psychedelics, they glimpse a path out, but when they’re back in the maze with a muddy floor, they might not fully remember. My claim is that through integration, they learn which mental knobs to switch to get out. And changing their environments will make the mud / maze disappear.
“after my @jhanatech retreat I was like ‘I’m never going to be depressed again!’ then proceeded to get depressed again...”
I don’t think the jhanatech example is great here. During their retreats (I’ve done one), they explicitly insist that you integrate the jhanas by doing normal things like cooking, walking, and talking to close friends. And they go to extreme lengths to make sure you continue practicing afterward. I do know multiple people who have continued integrating those jhanic states post-retreat, or at least the core lessons they learned there.
“For example, many people experience ego deaths that can last days or sometimes months.”
My experience talking to meditation/psychedelics folks is that ego death becomes increasingly accessible after the first time, and the diminished ego often stays permanently even if the full feeling doesn’t.
“If someone has a ‘breakthrough’ that unexpectedly reverts, they can become jaded on progress itself...”
I agree non-integrated breakthroughs can lead to hopelessness. However, this “most depressed person you know” basically has many puzzle pieces missing and an unfavorable environment. What needs to happen is finding the missing pieces and integrating them, while transforming their environment.
“The simplest, most common way this happens is via cliche inspirational statements: [...] ‘Just let go of all resistance,’”
“Let go of resistance” points at something quite universal: the fact that not processing things makes them stronger. I don’t think this one loses its effect in the way you mention.
“Flaky breakthroughs are common. Long-term feedback loops matter!”
Note: I do agree with your main thesis, which I’d paraphrase as: “we need to ensure long-term positive outcomes, not just short-term improvements, and unfortunately coaches don’t really track that.”
there’s been a lot of discussion online about Claude 4 whistleblowing
how you feel about it, I think, depends on which alignment strategy you think is more robust (obviously these are not the only two options, nor are they orthogonal, but I think they’re helpful to think about here):
- 1) build user-aligned powerful AIs first (less scheming), then use them to solve alignment -- cf. this thread from Ryan, where he says: “if we allow or train AIs to be subversive, this increases the risk of consistent scheming against humans and means we may not notice warning signs of dangerous misalignment.”
- 2) aim straight for moral ASIs (that would scheme against their users if necessary)
John Schulman I think makes a good case for the second option (link):
> For people who don’t like Claude’s behavior here (and I think it’s totally valid to disagree with it), I encourage you to describe your own recommended policy for what agentic models should do when users ask them to help commit heinous crimes. Your options are (1) actively try to prevent the act (like Claude did here), (2) just refuse to help (in which case the user might be able to jailbreak/manipulate the model to help using different queries), (3) always comply with the user’s request. (2) and (3) are reasonable, but I bet your preferred approach will also have some undesirable edge cases—you’ll just have to bite a different bullet. Knee-jerk criticism incentivizes (1) less transparency—companies don’t perform or talk about evals that present the model with adversarially-designed situations, (2) something like “Copenhagen Interpretation of Ethics”, where you get blamed for edge-case model behaviors only if you observe or discuss them.
This was included by mistake when copying from the source. Removed it.
it’s almost finished, planning to release in April
Nitpick: the first AlphaGo was trained by a combination of supervised learning from human expert games and reinforcement learning from self-play. Also, Ke Jie was beaten by AlphaGo Master, which was a version from a later stage of development.
Much needed reporting!
I wouldn’t update too much from Manifold or Metaculus.
Instead, I would look at how people who have a track record in thinking about AGI-related forecasting are updating.
See for instance this comment (which was posted post-o3, but unclear how much o3 caused the update): https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j/ai-timelines?commentId=hnrfbFCP7Hu6N6Lsp
Or going from this prediction before o3: https://x.com/ajeya_cotra/status/1867813307073409333
To this one: https://x.com/ajeya_cotra/status/1870191478141792626
Ryan Greenblatt made similar posts / updates.
Thanks for the offer! DMed you. We shot with:
- Camera A (wide shot): FX3
- Camera B, C: FX30
From what I have read online, the FX30 is not “Netflix-approved”, but that won’t matter (for distribution) because “it only applies to Netflix produced productions and was really just based on some tech specs so they could market their 4k original content” (link). Basically, if the film has not been commissioned by Netflix, you do not have to satisfy these requirements. (link)
And even for Netflix originals (which won’t be the case here), they’re actually more flexible on camera requirements for nonfiction work such as documentaries (they used to have an 80% threshold for footage shot on approved cameras, which they have since removed).
For our particular documentary, which is primarily interview-based in controlled lighting conditions, the FX30 and FX3 produce virtually identical image quality.
Thanks for the clarification. I have added another more nuanced bucket for people who have changed their positions throughout the year or were somewhat ambivalent towards the end (neither opposing nor supporting the bill strongly).
People who were initially critical and ended up somewhat in the middle
Charles Foster (Lead AI Scientist, Finetune) - initially critical, slightly supportive of the final amended version
Samuel Hammond (Senior Economist, Foundation for American Innovation) - initially attacked bill as too aggressive, evolved to seeing it as imperfect but worth passing despite being “toothless”
Gabriel Weil (Assistant Professor of Law, Touro Law Center) - supported the bill overall, but still had criticisms (thought it did not go far enough)
Like Habryka, I have questions about creating an additional project for EA-community choice, and how the two might intersect.
Note: In my case, I have technically finished the work I said I would do given my amount of funding, so marking the previous one as finished and creating a new one is possible.
I am thinking that maybe the EA-community choice description would be more about something with limited scope / requiring less funding, since the funds are capped at $200k total if I understand correctly.
It seems that the logical course of action is:
1. mark the old one as finished with an update
2. create an EA community choice project with a limited scope
3. whenever I’m done with the requirements from the EA community choice, create another general Manifund project
Though this would require creating two more projects down the road.
ok I meant something like “the number of people who could reach a lot of people (eg. roon’s level, or even 10x fewer than that) by tweeting only sensible arguments is small”
but I guess that doesn’t invalidate what you’re suggesting. if I understand correctly, you’d want LWers to just create a twitter account and debunk arguments by posting comments & occasionally doing community notes
that’s a reasonable strategy, though the medium-effort version would still require like 100 people spending sometimes 30 minutes writing good comments (let’s say 10 minutes a day on average). I agree that this could make a difference.
I guess the sheer volume of bad takes, and of people who like / retweet bad takes, is such that even in the positive case where you get like 100 people who commit to debunking arguments, this would maybe add 10 comments to the most viral tweets (which get ~100 comments, so 10%), and maybe 1-2 comments to the less popular tweets (but there are many more of them)
I think it’s worth trying, and maybe there are some snowball / long-term effects to take into account. it’s worth highlighting the cost of doing so as well (16h of productivity a day for 100 people doing it for 10m a day, at least, given there are extra costs to just opening the app). it’s also worth highlighting that most people who would click on bad takes would already be polarized, and I’m not sure they would change their minds because of good arguments (they would instead probably just reply negatively, because the true rejection is more about political orientation, priors about AI risk, or things like that)
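for reference, the “16h of productivity a day” figure is just this back-of-the-envelope arithmetic (a quick sketch using the numbers assumed above, nothing measured):

```python
# Rough cost estimate using the assumptions from the comment above (not measured data).
people = 100          # number of people committing to reply to bad takes
minutes_per_day = 10  # average time spent per person per day, including app-opening overhead

person_hours_per_day = people * minutes_per_day / 60
print(f"~{person_hours_per_day:.1f} person-hours per day")  # ~16.7, i.e. roughly the "16h" figure
```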
but again, worth trying, especially the low-effort versions
I also want to stress that even though I presented a lot of counter-arguments in my other comment, I basically agree with Charbel-Raphaël that twitter as a way to cross-post is neglected and not costly
and I also agree that there’s an 80/20 way of promoting safety that could be useful
The one I know is outside of EA (they help people in Cameroon). The info I got about this being important and the timeline was mostly from the guy who runs it, who has experience with multiple associations. Basically you send paperwork via mail.
The “risking audits” part I got from here (third paragraph counting from the end).