Year 4 Computer Science student
find me anywhere in linktr.ee/papetoast
Can someone explain why they disagree? I don’t see a particularly obvious reason.
Reading through all the responses the one thing that sticks out is Gemini-2.5 really, really wants to write the first character in caps.
what it produces in these unmonitored runs takes more work for me to clean up than just iterating with Claude directly
It may be that the type of work we are doing differs, then.
Also, you don’t seem too bothered that running claude code implies a responsibility to review the code soon-ish (or have your local codebase grow increasingly messy). The fact that I don’t need to worry about state with PR agents means it is more affordable to spin up more attempts, and because more attempts can be run simultaneously, each individual attempt can be of lower quality, as long as the best attempt is good. Deciding that the code is garbage and not worth any time cleaning up is much faster than cleaning it up, so in general I don’t find the initial read-through of the n attempts to take that much time. In the end I still only spin up codex on desktop if I think the task has a reasonable chance of being done well, which really depends on the specific task size/difficulty/type (bug fix, refactor, additions). It’s also likely that claude code works better for you because you’re more experienced and can basically tell claude exactly what to do when it’s stuck.
I like to use the PR agents in some cases. (But I still manually check out those branches and rebase, split the commits, or rewrite some stuff)
spin off tasks when I’m on mobile
it is easier to do multiple parallel attempts on the same task when I know the output will probably suck. And not gonna lie, OpenAI’s codex cloud has very lenient compute limits, so I also feel like I’m saving money this way.
they live in (other people’s) containers, so I don’t need to worry about multiple agents colliding with each other. I know git worktrees exist, but juggling which worktree is on which branch turns out to be somewhat annoying too.
They are good for queueing up tasks that I don’t expect to have the bandwidth to start working on today. I can make the agents do the PR today and forget about them until a few days later.
LW uses graphql. You can follow the guide below for querying if you’re unfamiliar with it.
https://www.lesswrong.com/posts/LJiGhpq8w4Badr5KJ/graphql-tutorial-for-lesswrong-and-effective-altruism-forum (For step 3 it seems like you now want to hover over output_type instead of input)
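For the unfamiliar, a query boils down to POSTing JSON to the `/graphql` endpoint. Here is a minimal sketch; the endpoint is from the tutorial above, but the exact query shape (field and argument names) may have drifted, so verify it against the GraphiQL explorer first.

```python
# Minimal sketch of querying the LessWrong GraphQL API.
# Endpoint per the linked tutorial; the query shape is an assumption to verify.
import json
import urllib.request

QUERY = """
{
  posts(input: {terms: {limit: 3}}) {
    results {
      title
      pageUrl
    }
  }
}
"""

def fetch_posts(endpoint: str = "https://www.lesswrong.com/graphql") -> dict:
    """POST the query as JSON and return the decoded response."""
    payload = json.dumps({"query": QUERY}).encode()
    req = urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same pattern works for the EA Forum by swapping the endpoint.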
How I use AI for coding.
I wrote this in like 10 minutes for quick sharing.
I am not a full-time coder; I am a student who codes like 15-20 hours a week.
Investing too much time in writing good prompts makes little sense. I go with the defaults and add nudges as needed. (See one of my AGENTS.md files at the end)
Mainly codex (cloud) and Cursor. Claude Code works, but being able to easily revert is helpful, so Cursor is better.
I still try out claude code for small pieces of edits, but it doesn’t feel worth it.
I have no idea why people like claude code so much; a CLI is inferior to a GUI.
Using Cursor means I don’t need a separate git worktree for each agent, as long as I get them to work on different parts of the codebase.
Mobile coding is real and very convenient with codex (cloud), but I still review and edit on desktop.
Using multiple agents is possible, but usually it’s one big feature plus multiple smaller background edits.
Or multiple big features using codex cloud, and delay review to a later time.
Codex cloud is good, but it only generates one commit per PR; often I need to split it up manually. I am eyeing other cloud agent solutions but haven’t tried them seriously yet.
Current prompt for one of the python projects
## Code Style
- 120-character lines
- Type hints are a must
- **Don't use Python 3.8 typings**: Never import `List`, `Tuple` or other deprecated classes from `typing`, use `list`, `tuple` etc. instead, or import from `collections.abc`
- Do not use `from __future__ import annotations`, use forward references in type hints instead. `TYPE_CHECKING` should be used only for imports that would cause circular dependencies.
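A minimal example of the style these rules ask for (module and names are illustrative, not from any real project):

```python
# Illustrative style example: modern generics, no `typing.List`,
# no `from __future__ import annotations`, forward refs as strings.
from collections.abc import Sequence
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only for imports that would otherwise be circular (hypothetical module).
    from mypkg.models import User

def mean(values: Sequence[float]) -> float:
    """Built-in generics (list, dict, tuple) are used directly in hints."""
    return sum(values) / len(values)

def load_user(user_id: int) -> "User":  # string forward reference at runtime
    raise NotImplementedError
```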
## Documentation and Comments
Add code comments sparingly. Focus on why something is done, especially for complex logic, rather than what is done. Only add high-value comments if necessary for clarity or if requested by the user. Do not edit comments that are separate from the code you are changing. NEVER talk to the user or describe your changes through comments.
### Using a new environment variable
When using a new environment variable, add it to `.env.example` with a placeholder value, and optionally a comment describing its purpose. Also add it to the `Environment Variables` section in `README.md`.
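For instance, adding a hypothetical `API_KEY` variable (the name and description are illustrative) would touch both files:

```
# .env.example
# API key for the external service
API_KEY=your-api-key-here
```

and in `README.md`:

```markdown
## Environment Variables
- `API_KEY`: API key for the external service
```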
### Using deal
We only use the exception handling features of deal. Use `@deal.raises` to document expected exceptions for functions/methods. Do not use preconditions/postconditions/invariants.
Additionally, we assume `AssertionError` is never raised, so `@deal.raises(AssertionError)` is not allowed.
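A sketch of the intended usage, assuming the `deal` package is installed (a no-op fallback is included only so the snippet runs standalone; the function and exception choices are illustrative):

```python
# Sketch of documenting expected exceptions with deal.
# Falls back to a no-op decorator if `deal` is not installed.
try:
    import deal
    raises = deal.raises
except ImportError:
    def raises(*_exceptions):
        def decorator(func):
            return func
        return decorator

@raises(KeyError, ValueError)  # the only exceptions this function may raise
def parse_port(config: dict) -> int:
    port = int(config["port"])  # KeyError if missing, ValueError if not an int
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port
```

With the real `deal` decorator, raising anything outside the declared set is a contract violation, which is how the rule gets enforced rather than just documented.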
## Testing Guidelines
To be expanded.
Mocking is heavily discouraged. Use test databases, test files, and other real resources instead of mocks wherever possible.
Allowed pytest markers:
- `@pytest.mark.integration`
- `@pytest.mark.slow`
- `@pytest.mark.docker`
- builtin ones like `skip`, `xfail`, `parametrize`, etc.
We do not use
- `@pytest.mark.unit`: all tests are unit tests by default
- `@pytest.mark.asyncio`: we use `pytest-asyncio` which automatically handles async tests
- `@pytest.mark.anyio`: we do not use `anyio`
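Putting the marker rules together, a test module would look like this (requires pytest; the test bodies are illustrative placeholders):

```python
# Illustrative use of the allowed markers; no unit/asyncio/anyio markers.
import pytest

@pytest.mark.slow
@pytest.mark.docker
def test_container_roundtrip():
    # Would spin up a real container rather than mocking the client.
    assert True

@pytest.mark.parametrize("raw,expected", [("1", 1), ("2", 2)])
def test_parse(raw, expected):
    assert int(raw) == expected
```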
### Running Tests
Use `uv run pytest …` instead of simply `pytest …` so that the virtual environment is activated for you.
## Asking for Help
- Refactoring:
As a command-line only tool, you do not have access to helpful IDE features like “Refactor > Rename Symbol”. Instead, you can ask the user to rename variables, functions, classes, or other symbols by providing the current name and the new name. It is important that you don’t rename public variables yourself, as you might miss some occurrences of the symbol across the codebase.
## Information
Finding dependencies: we use `pyproject.toml`, not `requirements.txt`. Use `uv add <package>` to add new dependencies.
(Note that the Asking for Help section is basically useless. It was experimental and I never got asked lol)
I don’t doubt the conclusion, but I think we would be buying (life expectancy minus age) life-years instead of 1 life.
Are you guys talking about tin foil for small lights that some appliances emit? For windows I don’t understand why not just use a curtain.
It is a bit unintuitive to me that hallucinations are made-up inputs, but it does make sense.
Hard to tell from just what you said; mind sharing the conversation?
I apologize. I should have searched before talking.
Side note: This seems like a completely different topic from your top level comment. Kind of weird to start a mostly tangential argument inside an unresolved argument thread.
You’d still be better off creating the 1M x 100 world than the (1M + 1) x (100 - ε) world.
Where does (1M + 1) come from?
In the post, Ben mentions the manufacturer doing hundreds of experiments, not millions. Of course, in the limiting case the smallest quality drop can and will be observed, but I believe Ben is not talking about that.
Even if we use the 1M base figure, it doesn’t explain why it is +1 rather than e.g. +1000.
You are assuming that the ice cream manufacturer is trying to maximise aggregate utility, which seems obviously false to me.
Alternatively, what about matching people by browser history? If there is a way to avoid data security and privacy concerns (ha!) then there are actually a lot of advantages.
I have recently learned that Fully Homomorphic Encryption (doing calculations on encrypted data) 1. exists and 2. is usable in a small scale.
https://bozmen.io/fhe
https://bozmen.io/fhe-current-apps (FHE Real-world Applications)
Current FHE has 1,000x to 10,000x computational overhead compared to plaintext operations. On the storage side, ciphertexts can be 40 to 1,000 times larger than the original. It’s like the internet in 1990: technically awesome, but limited in practice.
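To make those multipliers concrete, here is the back-of-envelope arithmetic; the overheads are the figures quoted above, while the baseline timing and record size are made-up examples.

```python
# What the quoted FHE overheads mean in practice.
# Multipliers from the article; baselines are hypothetical examples.
plaintext_op_ms = 1.0                 # assumed plaintext operation time
compute_overhead = (1_000, 10_000)    # quoted computational overhead range
ciphertext_expansion = (40, 1_000)    # quoted ciphertext size expansion range

# A 1 ms plaintext operation becomes 1-10 seconds under FHE.
encrypted_lo_s = plaintext_op_ms * compute_overhead[0] / 1000
encrypted_hi_s = plaintext_op_ms * compute_overhead[1] / 1000

# A 1 KB plaintext record becomes 40 KB to ~1 MB of ciphertext.
plaintext_kb = 1.0
ciphertext_kb = (plaintext_kb * ciphertext_expansion[0],
                 plaintext_kb * ciphertext_expansion[1])
```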
What do you mean by “retraction”? Do you just mean an opposite statement “sharks are older than trees” --> “sharks are not older than trees”, or do you mean something more specific?
Assuming just a general contrasting statement, my gut feeling is that 1. this heuristic is true for certain categories of statements but generates wrong intuition for others, and 2. when the heuristic works, it is rarely for memetic reasons; it is just the signal-to-noise ratio of the subjects.
Currently I am thinking about counterexamples from statements that roughly equate to a recommendation, like “Twitch is the best streaming platform” (I know it isn’t a very fitting memetic statement), which heuristically sounds plausibly true to me because 1. I know there is a very small number of streaming platforms, and 2. people who talk about this are likely to know what they are talking about.
the commenter’s job to pay via microtransactions, and maybe the author can tip back if they like it via a Flattr-ish.
Yes, though I feel like it is worse than the author or forum paying, because of incentives. There are other possible splits, like the commenter paying for failed comments and the author/forum paying for those that pass.
Monthly subscription is also possible yeah. I had this in mind and swept it under “the forum paying for the model and getting the cost back from elsewhere”.
privacy concerns
you misunderstood; I meant that some people probably don’t want their account to be traceable to their real identity, so any monetary transaction is problematic unless it’s crypto
rate limits seems like a must
but maybe it can be quite lax if a cheap model is used
So you now need to pay before you post?
privacy concerns
and comments are disabled when you’re out of funds? natural consequence but lol.
Long comments that don’t pass on the first try likely motivate the comment author to a. jailbreak or b. let another LLM rewrite it, rather than drafting a new one.
I would like it more if it is the forum paying for the model, and getting the cost back from elsewhere.
You can also pay $10/mo to Kagi and get different filter presets (“lenses”). Is it worth the price for you? idk.
I think the Quick Takes feed needs the option to sort by newest. It makes no sense that I get fed the same posts 3-7 times in an unpredictable order if I read the feed once per day.
You may have misunderstood. I am asking about why your reply got −5 agreement vote despite it seeming correct to me, nothing related to the other comments.