The Accidental Breakthrough: How OpenAI Discovered the ‘Holy Grail’ of ChatGPT by Eric Schmidt

Dec 17, 2024 · 3 min read

Eric Schmidt, the former CEO of Google and one of the most influential figures in the tech world, recently shed light on one of the most unexpected breakthroughs in artificial intelligence: the creation of ChatGPT. Schmidt, who led Google during its rise into a tech giant, explains how OpenAI stumbled upon the game-changing effect of Reinforcement Learning from Human Feedback (RLHF) and, without realizing it at first, unlocked far more of the model’s potential than anyone expected.

For entrepreneurs, innovators, and tech enthusiasts, this story serves as a powerful reminder: Sometimes, even the biggest breakthroughs come as surprises.

Photo: The Diary Of A CEO (YouTube)

The Discovery That No One Saw Coming

When OpenAI was working on GPT-3, a model based on the transformer (a foundational AI architecture invented at Google), something unexpected happened. The team at OpenAI decided to use a technique that, at the time, seemed almost like an afterthought:

Reinforcement Learning from Human Feedback (RLHF).

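In simple terms, RLHF lets an AI model improve by learning from human judgments. Late in the training process, humans act as evaluators, comparing two AI responses to the same prompt and selecting the better one. Those preferences are typically used to train a reward model that scores responses, and the AI is then fine-tuned with reinforcement learning to produce answers that score well. The loop can be repeated as new feedback comes in, so the answers keep improving.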
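
To make the comparison step concrete, here is a minimal sketch of how pairwise human preferences can be turned into a scoring function. It is not OpenAI’s implementation: it assumes a tiny linear reward model, a Bradley-Terry style logistic loss, and two made-up features per response (helpfulness and rudeness). Every name and number below is hypothetical, and the reinforcement-learning step that would follow is left out.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def reward(weights, features):
    """Score a response as a weighted sum of its (hypothetical) features."""
    return sum(w * f for w, f in zip(weights, features))


def train_reward_model(comparisons, n_features, lr=0.5, epochs=200):
    """Fit weights so that human-preferred responses score higher.

    comparisons: list of (chosen_features, rejected_features) pairs,
    one pair per human judgment.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for chosen, rejected in comparisons:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # Gradient step on the loss -log(sigmoid(margin)):
            # the smaller the margin, the larger the correction.
            scale = 1.0 - sigmoid(margin)
            for i in range(n_features):
                weights[i] += lr * scale * (chosen[i] - rejected[i])
    return weights


# Toy, invented data: each response is (helpfulness, rudeness).
# In every pair, the first entry is the one the human evaluator picked.
comparisons = [
    ([0.9, 0.1], [0.4, 0.2]),
    ([0.7, 0.0], [0.8, 0.9]),
    ([0.6, 0.2], [0.1, 0.1]),
]
weights = train_reward_model(comparisons, n_features=2)

# Use the learned reward to rank new candidate responses.
candidates = {
    "polite and useful": [0.8, 0.1],
    "useful but rude": [0.8, 0.8],
    "vague": [0.2, 0.1],
}
best = max(candidates, key=lambda name: reward(weights, candidates[name]))
print("learned weights:", weights)
print("highest-scoring candidate:", best)
```

In a real system the reward model would be a neural network scoring full text, and its scores would guide further fine-tuning of the language model rather than just ranking a few candidates.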

According to Schmidt:

“They sort of casually started to use humans to make it better… And all of a sudden, they had this huge success disaster.”

A “Holy Crap” Moment for AI

Schmidt humorously imagines the OpenAI team sitting around on a Thursday night, turning on GPT-3 with RLHF implemented, and suddenly realizing what they had:

“Holy crap, look how good this thing is.”

What makes this story remarkable is that OpenAI didn’t immediately grasp the scale of their achievement. They were simultaneously working on GPT-4 and saw RLHF as a small addition, not the “holy grail” it turned out to be.

But the results spoke for themselves. ChatGPT, which launched on GPT-3.5 and was later upgraded with GPT-4, quickly became a global sensation. It reached an estimated 100 million users within two months, making it the fastest-growing consumer application on record at the time.

Why RLHF Was a Game-Changer

To understand why RLHF is such a breakthrough, consider this: AI systems are typically trained on massive datasets, but they often struggle to align with human preferences and deliver answers that “feel right.” RLHF fixes this by combining machine learning with human judgment.

It’s like training a chef to cook a dish. Instead of just following a recipe (the dataset), the chef learns to adjust flavors and presentation based on customer feedback until the meal is perfect.

The Hidden Lesson: Breakthroughs Are Not Always Obvious

Schmidt’s insight into OpenAI’s journey highlights a truth that’s often overlooked in innovation: even the most brilliant minds don’t always recognize their success right away. OpenAI’s RLHF wasn’t a carefully planned masterstroke; it was an experiment that turned into a world-changing discovery.

This mirrors other famous accidental breakthroughs:

Post-it Notes: A 3M scientist accidentally invented a weak adhesive, which was initially seen as a failure. It later became the core of one of 3M’s most successful products.
Penicillin: Alexander Fleming discovered the first antibiotic in 1928 when he noticed mold killing bacteria in his lab, a “mistake” that revolutionized medicine.
Slack: The workplace messaging giant grew out of an internal chat tool built for a failed online game, then pivoted into a multibillion-dollar product.

These stories remind us that innovation often happens when experimentation meets curiosity.

What Entrepreneurs Can Learn from This

If you’re building a startup, launching a product, or innovating in any field, here’s what you can take away from OpenAI’s RLHF story:

Experiment Relentlessly: Sometimes the smallest tweaks or experiments lead to massive breakthroughs. Never dismiss an idea as too small to matter.
Learn from Humans: Just as RLHF relies on human feedback, businesses thrive when they listen to their customers. The closer you align with real human needs, the better your product will be.
Be Ready for the Unexpected: OpenAI’s team didn’t know how powerful RLHF would be until they turned it on. Be open to surprises, and when they happen, double down.

Final Thoughts: The Accidental Genius of OpenAI

Eric Schmidt’s perspective on OpenAI’s success story is a testament to the unpredictable nature of innovation. RLHF’s impact wasn’t part of a grand plan. It was a genuine surprise that changed the course of AI.

For anyone working to create the “next big thing,” the message is clear: Experiment boldly, listen to feedback, and embrace the surprises along the way. You might just discover your own “holy grail.”

Source: Ex Google CEO: AI Can Create Deadly Viruses! If We See This, We Must Turn Off AI!