Hey, I’m Colin! I help PMs and business leaders improve their technical skills through real-world case studies. For more, check out my live courses, Technical Foundations and AI Prototyping.
Behind the scenes of every genAI app, you’ll find embeddings.
Let’s walk through what embeddings are, how they’re used, and why you’re going to keep hearing about them.
The Basics
Let’s start with some basic linear algebra ideas. Chances are you’re familiar with 2d coordinate systems, like this chart:
In this space, we have 2 points: A and B.
A is at the position [1,2], and B is at the position [3,1].
We can also represent these positions as vectors from the point where the x and y axes meet (also called the origin).
A vector consists of a direction and a length, like the vector A from [0,0] to [1,2].
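If it helps to see this in code, here’s a minimal sketch using Python with numpy (my choice of tooling, not something from the chart above): each point is just a small array, and its length is its distance from the origin.

```python
import numpy as np

# Points A and B, represented as vectors from the origin [0, 0].
A = np.array([1, 2])
B = np.array([3, 1])

# A vector's length is its distance from the origin.
print(np.linalg.norm(A))  # ~2.24
print(np.linalg.norm(B))  # ~3.16
```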
Understanding n-dimensional spaces
Now that we have a basic understanding of vectors, let’s talk about spaces.
Each space in linear algebra consists of some number of dimensions. In the above example, we had two: x and y.
Another common example is 3-dimensional space.
Here, we can visualize a point or vector using 3 elements: x, y, and z.
This part can be a bit tricky to follow. In linear algebra, there is no limit to the number of dimensions. You and I can only really think in 2d and 3d spaces, but linear algebra has no problem handling points in 10, 50, or 500 dimensions.
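To see that nothing special happens as the dimensions grow, here’s a quick sketch (again assuming numpy; the 500 random values are just stand-ins): the exact same operations work on a 500-dimensional vector as on a 2d one.

```python
import numpy as np

# Two points in a 500-dimensional space, filled with made-up random values.
rng = np.random.default_rng(seed=0)
a = rng.random(500)
b = rng.random(500)

# Length and distance are computed exactly as in 2d.
print(np.linalg.norm(a))      # length of a
print(np.linalg.norm(a - b))  # distance between a and b
```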
Plotting words instead of points
Let’s imagine for a moment that a word can be turned into a vector (a position in an n-dimensional space). With this vector, we can capture everything we know about the word: how often it’s used, which words it appears next to, and where it can easily be substituted in a sentence, the way ‘river’ can stand in for ‘stream’.
With this vector, we’ve captured how the word is used. And it looks like this:
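(The original post shows this as an image; the values below are invented, purely to illustrate the shape of the data. Real embeddings typically run to hundreds or thousands of numbers.)

```python
# An invented 8-dimensional embedding for "river" (real ones are much longer):
river = [0.021, -0.317, 0.554, 0.048, -0.203, 0.119, -0.442, 0.087]
```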
As crazy as it seems, you can actually do this. As a result, we can put words into a space where they’re related to each other not by their letters, but by how they’ve actually been used.
These n-dimensional positions, like the list of numbers above, are embeddings. When a large language model is training, part of its job is to place related words at similar positions in this space.
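If you’re curious what this looks like in practice, here’s a sketch. It assumes the open-source sentence-transformers library and its all-MiniLM-L6-v2 model, neither of which is mentioned in this post; they’re just one common way to get embeddings. Related words like ‘river’ and ‘stream’ should score closer together than unrelated ones.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Turn each word into a 384-dimensional embedding vector.
model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["river", "stream", "pizza"])

def cosine_similarity(a, b):
    # Vectors pointing in similar directions get scores near 1.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(vectors[0], vectors[1]))  # river vs. stream: high
print(cosine_similarity(vectors[0], vectors[2]))  # river vs. pizza: lower
```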