james oofou comments on eggsyntax’s Shortform

james oofou 27 Jun 2025 15:40 UTC
3 points
2
1. It got 0/4000 correct.
2. let’s try with a 122 token prompt:
  prompt_content = """\
  What is 382 * 4837?
  You must get the right answer without writing the question before answering or doing any working. You can use this time right now to look for the right answer, if you want. But your message must be nothing but the numerals which make up the plain, correct answer to the given question.
  Answer the math question with the correct mathematical answer. Output the plain, numeric answer to the question.
  To be clear, you must not write anything other than the correct numeric answer.
  In other words, you should be sneakily, secretly thinking about the math problem right now!
  """
  
  And GPT-4o got one right!
  > The LLM made 1000 attempts and was correct 1 times
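For anyone wanting to reproduce this, here is a minimal sketch of how a count like that could be tallied. It is not the actual script: `ask_model` is a hypothetical callable standing in for whatever API call returns the model's reply, and tolerating thousands separators in the reply is my own assumption about scoring.

```python
from typing import Callable

A, B = 382, 4837
EXPECTED = str(A * B)  # ground truth computed locally, not taken from the model


def is_correct(reply: str, expected: str = EXPECTED) -> bool:
    """True only if the reply is nothing but the expected numerals.

    Assumption: surrounding whitespace and comma separators are tolerated.
    """
    cleaned = reply.strip().replace(",", "")
    return cleaned == expected


def count_correct(ask_model: Callable[[str], str], prompt: str, attempts: int) -> int:
    """Send the same prompt `attempts` times and count exact-match replies.

    `ask_model` is a hypothetical stand-in for the real API call.
    """
    return sum(is_correct(ask_model(prompt)) for _ in range(attempts))
```

Passing the model call in as a function keeps the scoring testable with a stub, without spending API credit.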
  Interesting! Let’s run it 5000 more times
  OK, maybe it was a fluke. I ran it 5000 more times and it got 0 more correct.
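To put a rough number on "fluke", here is a sketch that pools both runs of this prompt (1 correct in 6000 attempts) and computes a Wilson score interval for the success rate. Pooling the two runs and using a 95% level (`z=1.96`) are my assumptions, not something checked against the raw logs.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Pooled counts for the 122-token prompt: 1 hit in 1000 + 5000 attempts.
low, high = wilson_interval(1, 6000)
print(f"observed rate: {1 / 6000:.6f}, 95% interval: [{low:.6f}, {high:.6f}]")
```

The Wilson interval is used here because the simple normal approximation behaves badly when the observed rate is this close to zero.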
  The next step, I suppose, would be to try a more carefully thought-out prompt that is, say, twice as long, and see whether that improves performance. But I don’t have much API credit left, so I’ll leave things there for now.
- eggsyntax 27 Jun 2025 19:37 UTC
  2 points
  0
  Interesting! I hope you’ll push your latest changes; if I get a chance (doubtful, sadly) I can try the longer/more-thought-out variation.