Chapter 1 · Introduction to Building AI Applications with Foundation Models
In one minute
AI exploded because of scale. Bigger models trained on more data became so capable that they unlocked a flood of new applications, and because only a few organizations can afford to train them, those models are now sold as a service. That combination (more demand + lower barrier to entry) created a new discipline: AI engineering. This chapter explains where foundation models came from, what they're good (and not good) at, whether you should build an AI app at all, and how the new "AI stack" differs from traditional ML.
From language models to foundation models
- A language model encodes statistical information about language: in essence, how likely a token (for example, a word) is to appear in a given context.
- The basic unit is a token (a character, word, or word-piece like -tion). Breaking text into tokens is tokenization. Rough rule: ~100 tokens ≈ 75 English words.
- Two flavors:
- Masked language models (e.g., BERT) fill in blanks using context from both sides. Good for classification, sentiment, understanding.
- Autoregressive language models predict the next token using only what came before. These power today's generative AI (GPT, Gemini, etc.).
- Self-supervision is the breakthrough. Instead of needing humans to label data, the model learns by predicting the next token in ordinary text, so the text supplies its own labels. Any large body of text, including much of the public internet, becomes usable training data, which is what let models grow to today's scale.
- Foundation model = a model trained on massive data that can be adapted to many tasks. Once models handled images, audio, and other modalities in addition to text, they became large multimodal models (LMMs). "Foundation" captures both the broad capability and the role as a base you build on.
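To make the ideas above concrete, here is a minimal sketch of an autoregressive language model trained by self-supervision. It is a toy bigram counter, not how real foundation models work: the whitespace `tokenize` function, the tiny `corpus`, and all function names are assumptions made for illustration. It shows how raw text provides its own training labels and how the token-to-word rule of thumb can be applied.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace splitting stands in for a real tokenizer,
    # which would usually produce word-pieces rather than whole words.
    return text.lower().split()


def train_bigram_model(corpus):
    # Self-supervision: each token serves as the "label" for the token
    # before it, so no human annotation is needed.
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = tokenize(sentence)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, context_token):
    # Turn raw counts into a probability distribution over next tokens.
    following = counts.get(context_token, Counter())
    total = sum(following.values())
    if total == 0:
        return {}
    return {tok: n / total for tok, n in following.items()}


def estimate_words(num_tokens, words_per_token=0.75):
    # Rule of thumb from the text: ~100 tokens is about 75 English words.
    return num_tokens * words_per_token


# Hypothetical toy corpus; real models train on vastly more text.
corpus = [
    "my favorite color is blue",
    "my favorite color is green",
    "my favorite food is pizza",
    "the sky is blue",
]
model = train_bigram_model(corpus)
probs = next_token_probs(model, "is")
print(probs)                 # distribution over tokens seen after "is"
print(estimate_words(100))   # rough word count for 100 tokens
```

This model only looks one token back. Modern autoregressive models condition on the whole preceding context with a neural network, but the training signal is the same: predict the next token from text that nobody had to label.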
Why "foundation"?
You no longer build a model from scratch; you build on top of one, the way you build a house on a foundation. The same base model can be adapted to translation, coding, summarizing, customer support, and thousands of other tasks.
What foundation models are good at (use cases)
The chapter surveys successful patterns across consumer and enterprise:
- Coding: writing, completing, and debugging code.
- Image & video production: generation and editing.
- Writing & summarization: drafting, rewriting, condensing.
- Education: tutoring, explanations, practice.
- Conversational bots: assistants and copilots.
- Information aggregation: search, Q&A, research help.
- Data organization & workflow automation.
The lesson isn't the list itself; it's learning to spot where AI fits: tasks that are open-ended, language- or perception-heavy, and tolerant of imperfect output (with a human in the loop where stakes are high).
Should you even build it?
A crucial, often-skipped question. The book frames it around a few checks:
- Do you need AI at all? Many problems are solved better by simple rules or existing software.
- What's the role of AI? Is it critical or complementary? Reactive (responds to requests) or proactive? Does it need to be perfect, or is "mostly right" fine?
- Build vs. buy vs. wait? Could a better model next quarter solve this for free? Is this core to your business or a commodity?
- What's the cost of being wrong, and who catches the mistakes?
Reframe "should I build it?"
The risk isn't only building the wrong thing; it's also failing to build while competitors do. The point is to make the decision deliberately, weighing value, defensibility, and the cost of mistakes.
The new AI stack: what changed vs. ML engineering
AI engineering grew out of ML engineering, so many principles carry over (experimentation, evaluation, optimization for speed/cost). But the emphasis shifts:
- Less training models from scratch, feature engineering, and labeling.
- More prompt engineering, context construction (RAG), and lightweight adaptation (parameter-efficient finetuning).
- The hard parts move toward evaluation and product design.
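As a rough illustration of prompt engineering plus context construction, the sketch below retrieves the most relevant snippets for a question and places them in the prompt. It is a simplification under stated assumptions: word overlap stands in for a real retriever (such as embedding search), and the `documents` list, the prompt wording, and all function names are hypothetical. The resulting string would then be sent to whichever model you use.

```python
import re


def words(text):
    # Lowercased word set; a crude stand-in for a real embedding or index.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    # Rank documents by how many words they share with the query.
    query_words = words(query)
    scored = [(len(query_words & words(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query, documents, k=2):
    # Context construction: put the retrieved text in front of the question.
    context = retrieve(query, documents, k)
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\n"
        "Answer:"
    )


# Hypothetical knowledge base for a support assistant.
documents = [
    "Refunds are accepted within 30 days of purchase.",
    "Standard shipping takes 5 business days.",
    "Our support line is open on weekdays.",
]
prompt = build_prompt("How many days do refunds take?", documents, k=1)
print(prompt)
```

Note that no model weights change here. The adaptation happens entirely in what the model is shown, which is why evaluation of retrieval quality and prompt wording becomes a central part of the work.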
The AI stack has three rough layers:
- Application development: prompts, context, evaluation, interfaces (where most AI engineers work).
- Model development: training, finetuning, dataset engineering, inference optimization.
- Infrastructure: serving, compute, data, and monitoring.
Takeaways
- Scale is the story of modern AI: bigger models → more capabilities → more applications.
- Self-supervision removed the labeling bottleneck and made foundation models possible.
- AI engineering = adapting pre-built models, not training from scratch.
- Always ask "should I build this?" before "how do I build this?"
- Most foundational ML practices still apply; the balance of effort shifts toward prompting, context, evaluation, and product.