🌐 Learning Path

Open Source AI

How Llama, DeepSeek, and Mistral proved you don't need billions to build great models. The movement that democratized frontier AI.

📖 ~25 min read
📍 15 key events
📅 Jul 2023 - Dec 2025

The Closed Model Era

In early 2023, frontier AI was a walled garden. GPT-4 sat behind OpenAI's API. Claude required Anthropic access. Google's best models were internal only. The message was clear: building great AI required billions of dollars and thousands of GPUs.

Researchers could study papers, but not weights. Startups could use APIs, but not customize models. The AI revolution was happening, but most of the world could only watch through a paywall.

"Open source is the path to widespread AI benefits. Closed development concentrates power in the hands of a few."

- Mark Zuckerberg, July 2023

The Llama That Changed Everything

July 18, 2023 🔥

Llama 2 Open-Sourced

Meta released Llama 2 (7B, 13B, and 70B parameters) under a community license permitting both research and commercial use (with restrictions for companies above 700 million monthly active users). For the first time, a model competitive with GPT-3.5 was freely available to anyone.

So What?

For practitioners: Llama 2 meant you could run a capable LLM on your own infrastructure. No API costs, no data leaving your servers, full control over fine-tuning. The era of "build vs. buy" began.

Llama 2 sparked an explosion of innovation. Within months, the Hugging Face model hub was flooded with fine-tuned variants. Mistral, a Paris-based startup founded by ex-DeepMind and ex-Meta researchers, released Mistral 7B, a model that punched far above its weight class.

The open-source community discovered that smaller, well-trained models could compete with giants. Quantization techniques let 70B models run on consumer GPUs. The "local LLM" movement was born.
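Why does quantization matter so much here? A back-of-envelope sketch makes the memory arithmetic concrete (weights only; KV cache and activations add overhead on top):

```python
def model_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB (ignores KV cache and activations)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 70B model in fp16 needs ~140 GB of weights -- far beyond any consumer GPU.
print(round(model_memory_gb(70, 16)))   # 140
# Quantized to 4 bits per weight it shrinks to ~35 GB, within reach of
# a pair of 24 GB consumer cards (or one card with CPU offload).
print(round(model_memory_gb(70, 4)))    # 35
```

This 4x reduction (at a modest quality cost) is what moved 70B-class models from datacenter racks onto enthusiast hardware.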

Llama 3: Closing the Gap

April 18, 2024

Llama 3 Released

Meta released Llama 3 (8B and 70B) with improved performance and an 8K context window. The gap between open and closed models narrowed significantly: Llama 3 70B approached GPT-4 on many benchmarks.

So What?

For practitioners: Llama 3 was "good enough" for most production use cases. Companies could now build products on open weights without sacrificing quality. The business case for closed APIs weakened.

The Sputnik Moment: DeepSeek R1

January 20, 2025 🔥

DeepSeek R1: Open-Source Reasoning

Chinese lab DeepSeek released R1, an open-source reasoning model matching OpenAI's o1. Training cost: reportedly around $6 million, a fraction of the assumed $100M+ required for frontier models. The weights were immediately available on Hugging Face.

So What?

For practitioners: R1 proved that reasoning capabilities, previously thought to require massive compute, could be achieved efficiently. You could now run a model that "thinks" on your own hardware. The monopoly on reasoning was broken.

The market reaction was immediate and brutal. Nvidia lost nearly $600 billion in market cap in a single day as investors questioned whether expensive AI infrastructure was truly necessary. The "Sputnik moment" comparison spread through tech media.

DeepSeek's success came from efficiency innovations: better training data curation, clever architecture choices, and a focus on reasoning-specific optimization. They proved that brute-force scaling wasn't the only path to frontier performance.

Llama 4: The MoE Revolution

April 5, 2025 🔥

Llama 4 Family Release

Meta released Llama 4, its first open-weight model family to combine a Mixture-of-Experts (MoE) architecture with native multimodality. Scout, Maverick, and the planned Behemoth variants offered unprecedented capability at various scales.

So What?

For practitioners: MoE architecture means you only activate a fraction of parameters per token: massive capability with manageable compute. Llama 4 Scout runs on a single H100 (quantized) while offering frontier performance.
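The "fraction of parameters per token" idea is easy to see in miniature. The sketch below is a toy top-k router, not Llama 4's actual architecture (all shapes and names are illustrative):

```python
import numpy as np

def moe_forward(x, expert_weights, router_weights, top_k=2):
    """Toy Mixture-of-Experts layer: route one token to its top_k experts.

    x:              (d,) one token's hidden state
    expert_weights: (n_experts, d, d) one dense matrix per expert
    router_weights: (n_experts, d) router projection
    """
    logits = router_weights @ x                      # (n_experts,) router scores
    top = np.argsort(logits)[-top_k:]                # indices of chosen experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                             # softmax over chosen experts
    # Only the top_k expert matrices are multiplied; the rest stay idle,
    # which is why active compute is a fraction of total parameters.
    y = sum(g * (expert_weights[i] @ x) for g, i in zip(gates, top))
    return y, top

rng = np.random.default_rng(0)
d, n_experts = 8, 16
y, used = moe_forward(rng.normal(size=d),
                      rng.normal(size=(n_experts, d, d)),
                      rng.normal(size=(n_experts, d)))
print(f"experts used per token: {len(used)} of {n_experts}")
```

With 2 of 16 experts active, this layer touches roughly 1/8 of its expert parameters per token, which is the efficiency trade MoE models make at scale.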

April 5, 2025

Llama 4 Scout: 10M Token Context

Llama 4 Scout offered an unprecedented 10 million token context window while fitting on a single Nvidia H100 GPU with quantization. This enabled entirely new use cases: analyzing entire codebases, processing book-length documents, maintaining extensive conversation histories.

So What?

For practitioners: 10M context means you can fit an entire codebase, documentation set, or research corpus in a single prompt. RAG becomes optional for many use cases, and the "context window problem" is largely solved.
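Will your codebase actually fit? A rough budgeting script helps; the 4-characters-per-token heuristic is an approximation (real tokenizers vary by language and content):

```python
import os

CHARS_PER_TOKEN = 4  # rough heuristic for code and English text

def estimate_tokens(root, exts=(".py", ".md", ".txt")):
    """Walk a directory tree and roughly estimate its token count."""
    total_chars = 0
    for dirpath, _, files in os.walk(root):
        for name in files:
            if name.endswith(exts):
                total_chars += os.path.getsize(os.path.join(dirpath, name))
    return total_chars // CHARS_PER_TOKEN

tokens = estimate_tokens(".")
print(f"~{tokens:,} tokens; fits in a 10M context: {tokens < 10_000_000}")
```

At ~4 chars/token, 10M tokens is on the order of 40 MB of source text, which comfortably covers most single repositories.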

The Open-Closed Gap: 2025

| Capability | Best Open Model | Best Closed Model | Gap |
| --- | --- | --- | --- |
| General Reasoning | DeepSeek R1 | o1 | ~Equal |
| Code Generation | Llama 4 Maverick | Claude Opus 4.5 | ~5% |
| Context Length | Llama 4 Scout (10M) | Gemini 1.5 (2M) | Open leads |
| Multilingual | DeepSeek V3 | GPT-5.1 | ~Equal |
| Multimodal | Llama 4 | Gemini 3 | ~10% |

By late 2025, the gap between open and closed models had effectively closed on most benchmarks. In some areas-particularly context length-open models actually led. The narrative shifted from "can open source compete?" to "why pay for closed APIs?"

The Global Open Source Ecosystem

Open-source AI became a global movement. Beyond Meta and DeepSeek, key players emerged:

Throughout 2025

Global Open Source Labs

Mistral AI (France) continued releasing efficient models that punched above their weight. Alibaba's Qwen models dominated Chinese-language tasks. TII's Falcon (UAE) targeted Arabic markets. AI2's OLMo provided fully transparent training for researchers.

So What?

For practitioners: Open source is no longer a single company's effort. Multiple competitive options exist for different use cases, languages, and regions. The ecosystem is robust and self-sustaining.

The Full Timeline

March 2023
Llama 1 Leaked
Meta's original Llama leaked online, sparking open-source interest
July 2023
Llama 2 Released
First officially open commercial-use LLM at scale
September 2023
Mistral 7B
French startup releases surprisingly capable small model
December 2023
Mixtral 8x7B
First open MoE model demonstrates efficiency gains
April 2024
Llama 3
Meta closes gap with GPT-4 class performance
May 2024
DeepSeek V2
Chinese lab demonstrates efficient training methods
January 2025
DeepSeek R1
Open-source reasoning matches o1, ~$600B Nvidia drop
April 2025
Llama 4 Family
First open MoE + multimodal + 10M context model
April 2025
TII Falcon 3
UAE releases Arabic-optimized open models
July 2025
AI2 OLMo 2
Fully transparent training for research community
October 2025
Llama Federal Approval
US government approves Llama for federal use
November 2025
DeepSeek V3 Preview
China continues pushing open-source frontier
December 2025
DeepSeek IMO Gold
Open-source model achieves math olympiad gold

Key Takeaways for Practitioners

What This Means For You

Open weights now match closed APIs on most benchmarks, so evaluate an open model first before committing to a paid API. Running models on your own infrastructure eliminates per-call costs and keeps data in-house, at the price of operational work. Efficiency innovations (MoE architectures, quantization, better data curation) matter more than raw scale. And the ecosystem is multi-vendor: Meta, DeepSeek, Mistral, Qwen, Falcon, and OLMo each serve different use cases, languages, and license needs.

Continue Learning

Rise of AI Agents

From chatbots to autonomous workers. How Claude Computer Use, ChatGPT Agent, and MCP are enabling AI to take action.
