dschwarz

Karma: 424

A Guide For LLM-Assisted Web Research

nikos, dschwarz, Lawrence Phillips and FutureSearch

26 Jun 2025 18:39 UTC

46 points

3 comments · 7 min read · LW link

dschwarz 20 May 2025 16:36 UTC
2 points
2
in reply to: Thomas Kwa’s comment on: Thomas Kwa’s Shortform
This expanded list is great, but is still conspicuously missing white-collar work. Software was already the basis for the trend, so the only new one here that seems to give clear information on human labor impacts would be tesla_fsd.

(And even there replacing human drivers with AI drivers doesn’t seem like it would change much for humanity, compared to lawyers/doctors/accountants/sales/etc.)

Is it the case that for most non-software white-collar work, agents can only do ~10-20 human-minute tasks with any reliability, so the doubling time is hard to measure?

Superhuman Coders in AI 2027 - Not So Fast

dschwarz and FutureSearch

1 May 2025 18:56 UTC

67 points

0 comments · 5 min read · LW link

dschwarz 26 Dec 2024 4:25 UTC
1 point
0
on: Growing Up is Hard
9 years since the last comment—I’m interested in how this argument interacts with GPT-4 class LLMs, and “scale is all you need”.

Sure, LLMs are not evolved in the same way as biological systems, so the path towards smarter LLMs isn’t fragile in the way brains are described in this article, where maybe the first augmentation works, but the second leads to psychosis.
But LLMs are trained on writing done by biological systems with intelligence that was evolved with constraints.
So what does this say about the ability to scale up training on this human data in an attempt to reach superhuman intelligence?

dschwarz 12 Sep 2024 18:56 UTC
4 points
1
in reply to: habryka’s comment on: Contra papers claiming superhuman AI forecasting
Thank you for the careful look into data leakage in the other thread! Some of your findings were subtle, and these are very important details.

dschwarz 12 Sep 2024 18:34 UTC
8 points
0
on: AI forecasting bots incoming
Instead of writing a long comment, we wrote a separate post that, like @habryka and Daniel Halawi did, looks into this carefully. We re-read all 4 papers making these misleading claims this year and show our findings on how they’re falling short.

https://www.lesswrong.com/posts/uGkRcHqatmPkvpGLq/contra-papers-claiming-superhuman-ai-forecasting

Contra papers claiming superhuman AI forecasting

nikos, Peter Mühlbacher, Lawrence Phillips and dschwarz

12 Sep 2024 18:10 UTC

182 points

16 comments · 7 min read · LW link

Unit economics of LLM APIs

dschwarz, nikos, Lawrence Phillips and kotrfa

27 Aug 2024 16:51 UTC

43 points

0 comments · 2 min read · LW link

dschwarz 11 Jul 2024 17:17 UTC
1 point
0
in reply to: RobertM’s comment on: [EAForum xpost] A breakdown of OpenAI’s revenue
Good point. For this public report, we manually checked all the data points that were included here. FutureSearch threw out many other unreliable data points it couldn’t corroborate; that’s a core part of what it does.

The sources linked here are low quality data brokers due to a bug—there is a higher quality data source corroborating it, but FutureSearch doesn’t cite the higher quality one.

We’re working on fixing this, and identifying all primary vs. secondary sources.

dschwarz 11 Jul 2024 17:14 UTC
3 points
0
in reply to: kave’s comment on: [EAForum xpost] A breakdown of OpenAI’s revenue
All of the research was done by FutureSearch, so AI, with a few exceptions, such as https://app.futuresearch.ai/reports/3Li1?nodeId=MIw9, where it couldn’t infer good team/enterprise ratios from analogous products where numbers were reliable. Estimating ChatGPT Teams subscribers was the hardest part, requiring the most judgment.

Most of the final words in the report were written or revised by humans. We put a high quality bar on this to publish it publicly, and did more human intervention than normal.

[EAForum xpost] A breakdown of OpenAI’s revenue

dschwarz and Lawrence Phillips

10 Jul 2024 18:09 UTC

57 points

5 comments · 1 min read · LW link

(forum.effectivealtruism.org)

dschwarz 4 Apr 2024 14:01 UTC
3 points
0
in reply to: Seth Herd’s comment on: [EA xpost] The Rationale-Shaped Hole At The Heart Of Forecasting
(Responded to the version of this on the EA Forum post.)

[EA xpost] The Rationale-Shaped Hole At The Heart Of Forecasting

dschwarz

2 Apr 2024 17:40 UTC

22 points

2 comments · 2 min read · LW link

(forum.effectivealtruism.org)

dschwarz 6 Nov 2023 14:19 UTC
5 points
3
on: Are language models good at making predictions?
Great post!
“Manifold markets that were resolved after GPT-4’s current knowledge cutoff of Jan 1, 2022”

Were you able to verify that newer knowledge didn’t bleed in? Anecdotally GPT-4 can report various different cutoff dates, depending on the API. And there is anecdotal evidence that GPT-4-0314 occasionally knows about major world events after its training window, presumably from RLHF?

This could explain the better scores on politics than science.

dschwarz 15 Nov 2022 23:32 UTC
1 point
0
on: Some research ideas in forecasting
Nice post! I’ll throw another signal boost for the Metaculus hackathon that OP links, since this is the first time Metaculus is sharing their whole 1M db of individual forecasts (not just the db of questions & resolutions which is already available). You have to apply to get access though. I’ll link it again even though OP already did: https://metaculus.medium.com/announcing-metaculuss-million-predictions-hackathon-91c2dfa3f39

There are nice cash prizes too.
As the OP writes, I think most of the ideas here would be valid entries in the hackathon, though the emphasis is on forecast aggregation & methods for scoring individuals. I’m particularly interested in the decay-of-predictions idea. I don’t think we know how well predictions age, or what the right strategy is for updating your predictions on long-running questions.

Metaculus is seeking Software Engineers

dschwarz

5 Nov 2022 0:42 UTC

18 points

0 comments · 1 min read · LW link

(apply.workable.com)

dschwarz 13 Apr 2011 2:49 UTC
2 points
0
in reply to: [deleted]’s comment on: Human errors, human values
I have to respectfully disagree with your position. Kant’s point, and the point of similar people who make the sweeping universalizations that you dislike, is that it is only in such idealized circumstances that we can make rational decisions. What makes a decision good or bad is whether it would be the decision rational people would endorse in a perfect society.

The trouble is not moving from our flawed world to an ideal world. The trouble is taking the lesson we’ve learned from considering the ideal world and applying it to the flawed world. Kant’s program is widely considered to be a failure because it fails to provide real guidelines for the real world.

Basically, my point is that asking the Rawlsian “Would you prefer to live in a society where people do X” is valid. However, one may answer that question with “yes” and still rationally refrain from doing X. So your general point, that local and concrete decisions rule the day, still stands. Personally, though, I try to approach local and concrete decisions the way that Rawls does.
