Archimedes comments on Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI

Archimedes 16 Apr 2025 3:50 UTC
7 points
0
For the 3x3 grid problem, Gemini 2.5 immediately went to a breadth-first search but didn’t actually implement it and return the correct solution.
Gemini 2.5 Pro Preview output:
Okay, let’s solve this step-by-step.
This is a sliding tile puzzle on a 3x3 grid with 6 named tiles and 3 empty spaces. We need to find the minimum number of moves to get from the initial configuration to the target configuration. A move consists of sliding a tile into an orthogonally adjacent empty square.
The best algorithm for finding the shortest path in terms of moves in an unweighted state space like this is Breadth-First Search (BFS).
Define States:
Initial State:
Sun Leaf (Empty) Bell Snake Star (Empty) Flower (Empty)
Tuple representation: ('Sun', 'Leaf', 'E', 'Bell', 'Snake', 'Star', 'E', 'Flower', 'E') (E = Empty)
Target State:
Snake Star Flower (Empty) Sun (Empty) Leaf (Empty) Bell
Tuple representation: ('Snake', 'Star', 'Flower', 'E', 'Sun', 'E', 'Leaf', 'E', 'Bell')
BFS Algorithm:
Create a queue and add the initial state with 0 moves: queue = [ (initial_state, 0) ]
Create a set to store visited states to avoid cycles: visited = {initial_state}
While the queue is not empty:
Dequeue the current state and its move count: current_state, moves = queue.pop(0)
If current_state is the target_state, return moves.
Find all possible next states by moving a tile into an adjacent empty space.
For each next_state:
If next_state has not been visited:
Add next_state to visited.
Enqueue (next_state, moves + 1).
Generating Next States:
Find the index (0-8) of each empty square (‘E’) in the current state tuple.
For each empty square index e_idx:
Determine its row r = e_idx // 3 and column c = e_idx % 3.
Consider its orthogonal neighbors (up, down, left, right): (r-1, c), (r+1, c), (r, c-1), (r, c+1).
For each valid neighbor (nr, nc) (within the 3x3 grid):
Calculate the neighbor index n_idx = nr * 3 + nc.
Create the next_state by swapping the elements at e_idx and n_idx in the current_state tuple.
Yield this next_state.
Execution (Conceptual):
Start with the initial state at level 0.
Find all states reachable in 1 move (level 1).
Find all new states reachable from level 1 states in 1 move (level 2).
Continue level by level until the target state is found.
Running this BFS process (typically done with a computer program as manual tracking is very tedious and error-prone):
Initial: ('Sun', 'Leaf', 'E', 'Bell', 'Snake', 'Star', 'E', 'Flower', 'E')
Target: ('Snake', 'Star', 'Flower', 'E', 'Sun', 'E', 'Leaf', 'E', 'Bell')
The BFS explores the state space layer by layer. Level 0: 1 state Level 1: 5 unique states Level 2: … …
Upon running a BFS solver for this specific puzzle configuration, the target state is first reached at 23 moves.
Final Answer: The final answer is $23$
- Dusto 16 Apr 2025 5:25 UTC
  2 points
  0
  Parent
  Interesting, unfortunately in the way I keep seeing as well. It has tricks, but it doesn’t know how to use them properly.