Expectations for Gemini: hopefully not a big deal

Introduction

My goal is to register and share my expectations, and to hear others’ opinions, on the relative performance of Gemini vs. GPT-4.

My expectations

The jump in capabilities from GPT-4 to Gemini will likely not be as large as the jump from GPT-3 to GPT-4.

Gemini could bring surprises by being more agentic than GPT-4: better at planning and at longer-horizon tasks. But this is likely difficult to achieve; otherwise, strong LLM agents would already be generating buzz.

Comparison

From GPT-3 to GPT-4

  • Scaling Factor: roughly x100 more compute than GPT-3.

  • Optimization: Chinchilla scaling laws (adapted for MoE) rather than OpenAI/Kaplan scaling laws (see the sketch after this list).

  • MoE Over Dense: Utilizes Mixture of Experts (MoE) instead of dense layers.

  • Data Quality: Likely higher-quality data, though this is uncertain.

  • Image Generation: Not publicly released, possibly due to subpar performance or safety risks.

  • Tool use is added during finetuning.

  • Algorithmic Gains: roughly 3 years of algorithmic progress between GPT-3 and GPT-4.

  • GPT-4 may already employ process-based feedback.

  • GPT-4 aimed for training compute efficiency; it was not designed to be commercially deployed at scale.
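
As a minimal sketch of what “Chinchilla over Kaplan” means here (rule-of-thumb relations for dense transformers; how they transfer to MoE is an assumption on my part):

```latex
% Chinchilla-style rule of thumb for a dense transformer:
% C = training compute (FLOP), N = parameters, D = training tokens.
C \approx 6\,N\,D, \qquad
N_{\mathrm{opt}} \propto C^{0.5}, \qquad
D_{\mathrm{opt}} \propto C^{0.5}, \qquad
D_{\mathrm{opt}} \approx 20\,N_{\mathrm{opt}}
```

Under Kaplan-style laws, parameters grow much faster with compute than tokens do; following Chinchilla instead means training a smaller model on many more tokens for the same compute budget.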

GPT-4 to Gemini

  • Scaling Factor: roughly x5 (up to x20) more compute than GPT-4 (see the rough arithmetic after this list).

  • Supercomputer Constraint: No existing supercomputer could feasibly provide x100 more compute than was used for GPT-4 (not certain, but likely).

  • Multimodal: maybe image, audio, speech.

  • Data Efficiency: Possibly better-quality data (e.g., Google Books) and fewer epochs.

  • Tool use could be added during either pretraining or finetuning.

  • Algorithmic Gains: roughly 1 year of algorithmic progress between GPT-4 and Gemini.

  • Gemini more likely aims for inference efficiency, given Google’s intention to deploy it extensively, perhaps at the cost of training efficiency.

  • Gemini may be trained to be more agentic and better at planning (“GPT-4 + AlphaGo”).
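
To make the scaling-factor bullet concrete, here is a rough, hedged sketch in Python. The GPT-4 compute figure is a commonly cited outside estimate rather than a confirmed number, the Chinchilla rule of thumb is the dense-model version sketched in the previous section, and the helper name chinchilla_optimal is mine; Gemini’s actual design (MoE, multimodality, inference-efficiency goals) could change these numbers substantially.

```python
# Rough arithmetic for "x5 (up to x20) more compute than GPT-4".
# ASSUMPTION: GPT-4's training compute is taken as ~2e25 FLOP, a commonly
# cited outside estimate, not a confirmed figure.
GPT4_TRAIN_FLOP = 2e25  # assumed, for illustration only


def chinchilla_optimal(compute_flop: float) -> tuple[float, float]:
    """Dense-model rule of thumb: C ~ 6*N*D with D ~ 20*N.

    Solving gives N = sqrt(C / 120) and D = 20*N. Ignores MoE, data
    quality, and any inference-efficiency trade-offs.
    """
    n_params = (compute_flop / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens


for factor in (5, 20):
    compute = factor * GPT4_TRAIN_FLOP
    n_params, n_tokens = chinchilla_optimal(compute)
    print(f"x{factor:>2}: ~{compute:.0e} FLOP -> compute-optimal "
          f"~{n_params:.1e} params, ~{n_tokens:.1e} tokens")
```

On these assumptions, x5–x20 more compute than GPT-4 lands around 1e26–4e26 FLOP, i.e., a noticeably smaller step than the roughly x100 jump from GPT-3 to GPT-4.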



Note: I drafted this before news of Gemini’s release and capabilities, but did not finish writing it in time. Since then, there have been some reports of Gemini being roughly at the level of GPT-4.