Embeddings & similarity
Turning meaning into vectors you can search by nearest neighbour.
An embedding turns a piece of text into a list of numbers — a point in a high-dimensional space — positioned so that things with similar meaning land near each other. It's the machinery behind semantic search, recommendations, and the retrieval half of RAG. Click around the map below.
Nearest to king: prince (57%), man (54%), princess (53%). Click any word to recompute — similar meanings sit close together.
Real models embed each piece of text into hundreds or thousands of dimensions; this is a 2D projection so it fits on screen. The key idea survives the squashing: the closer two points are, the more similar their meaning, and retrieval (RAG) works by finding the vectors nearest to your question.
Distance is meaning
Notice how the words cluster: animals near animals, code near code. Nobody told the model about these categories; it learned them from how words are used. To “search by meaning,” you embed the query and find the nearest points. That is the core of retrieval.
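To make "embed the query and find the nearest points" concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the names TOY_VECTORS, cosine_similarity, and nearest are invented for illustration; a real system would get its vectors from an embedding model and would usually use a vector index rather than a full scan. Cosine similarity is one common way to score closeness, assumed here rather than taken from the map above.

```python
import math

# Hand-made 3-dimensional "embeddings", invented for this example only.
# Real models produce hundreds or thousands of dimensions.
TOY_VECTORS = {
    "cat":    [0.9, 0.1, 0.0],
    "dog":    [0.8, 0.2, 0.1],
    "horse":  [0.7, 0.3, 0.0],
    "python": [0.1, 0.9, 0.2],
    "rust":   [0.0, 0.8, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, vectors, k=2, exclude=()):
    """Return the k (word, score) pairs most similar to query_vector."""
    scored = [
        (cosine_similarity(query_vector, vec), word)
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(word, score) for score, word in scored[:k]]


# Use the vector for "cat" as the query and skip the query word itself.
neighbours = nearest(TOY_VECTORS["cat"], TOY_VECTORS, k=2, exclude={"cat"})
```

With these toy values the two nearest neighbours of "cat" are "dog" and then "horse", while the programming-language words score far lower, which mirrors the clustering described above.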
Directions mean things too
Hit the king − man + woman button. Because relationships tend to be encoded as roughly consistent directions, you can do arithmetic on meaning: subtract “man,” add “woman,” and the royal title comes along for the ride, so the result lands near the female counterpart of “king.” Real embeddings do this in hundreds of dimensions; this map is just a flattened shadow of it.
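The arithmetic can be sketched in a few lines of plain Python. The two-dimensional vectors below are made up so that the first axis loosely stands for "royalty" and the second for "gender"; real embeddings have no such labelled axes, and the names TOY, cosine_similarity, and analogy are hypothetical. Skipping the three input words when searching is a common convention, assumed here.

```python
import math

# Invented 2-D vectors: axis 0 is roughly "royalty", axis 1 roughly "gender".
# Real embedding dimensions are not individually interpretable like this.
TOY = {
    "king":     [1.0, 1.0],
    "queen":    [1.0, -1.0],
    "prince":   [0.6, 1.0],
    "princess": [0.6, -1.0],
    "man":      [0.0, 1.0],
    "woman":    [0.0, -1.0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Return the word nearest to a - b + c, skipping the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (word for word in vectors if word not in (a, b, c))
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))


# king - man + woman
answer = analogy("king", "man", "woman", TOY)
```

On this toy data the answer is "queen", and the same direction carries "prince" to "princess". With real embeddings the match is approximate: the target vector rarely lands exactly on a word, so you take the nearest one.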