Beginner’s Guide to
Vector Databases
AI by Hand ✍
Prof. Tom Yeh
Roadmap
Database
+Vector Database
Retrieval
Dot Product
Word Embedding
Sentence Embedding
Search
Transformer
Q/A
AI by Hand ✍ 2024 © Tom Yeh 2
Database
Beginner’s Guide to Vector Databases - AI by Hand ✍
Fun fact
There are ___________________ million dogs in the world!
How to create a table?
SQL:
__________ TABLE _________
( id __________,
name __________________,
size _________,
pop _________)
id name size pop
How to insert a record?
SQL:
_____________ INTO animals
_____________ (1, 'dog', 2, 900)
id name size pop
1 dog 2 900
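The two statements above can be tried out with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module and an in-memory database, neither of which comes from the slides; the column types are borrowed from the vector database slide that follows.

```python
import sqlite3

# In-memory database, used here only for illustration (an assumption).
conn = sqlite3.connect(":memory:")

# Create the table with the four columns shown on the slide.
conn.execute(
    "CREATE TABLE animals (id INT, name VARCHAR(10), size INT, pop INT)"
)

# Insert one record. Text values must be quoted in SQL, so the values
# are passed as parameters instead of being pasted into the statement.
conn.execute("INSERT INTO animals VALUES (?, ?, ?, ?)", (1, "dog", 2, 900))

rows = conn.execute("SELECT id, name, size, pop FROM animals").fetchall()
print(rows)  # [(1, 'dog', 2, 900)]
```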
Vector Database
How to create a vector database?
SQL: CREATE TABLE animals
( id INT,
name VARCHAR(10),
size INT,
pop INT,
emb _________________ NOT NULL )
id name size pop
How to insert a record with a vector?
SQL:
INSERT INTO animals
VALUES (1, 'dog', 2, 900, ______________)
id name size pop emb
1 dog 2 900
Retrieval
Which record is relevant to the query “cat”?
Query: cat, emb = [1, 2, 0]
id name size pop emb
1 dog 2 900 [2, 1, 0]
2 bat 1 10000 [0, 1, 2]
Draw distance vs similarity
distance similarity
Distance vs similarity on a scale of 1 to 5
asc or desc similarity
asc or desc distance
How to retrieve by similarity? (dot product)
________ name, emb<___>[__,__,__] AS score
FROM animals
________ BY ______ ASC | DESC ;
How to retrieve by distance? (Euclidean)
SELECT name, emb<->[1, 2, 0] AS score
FROM animals
ORDER BY score ASC;
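The difference between the two retrieval styles is mostly the sort order: a similarity score is sorted from high to low, a distance from low to high. A minimal sketch in plain Python, using the dog and bat records and the query vector [1, 2, 0] from the slides; the helper names dot and euclidean are made up for illustration.

```python
import math

# Records from the retrieval slide: name -> embedding.
animals = {"dog": [2, 1, 0], "bat": [0, 1, 2]}
query = [1, 2, 0]  # embedding of the query "cat"

def dot(a, b):
    """Dot product: multiply element by element, then sum."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Similarity: higher is better, so sort in descending order.
by_similarity = sorted(animals, key=lambda n: dot(animals[n], query), reverse=True)

# Distance: lower is better, so sort in ascending order.
by_distance = sorted(animals, key=lambda n: euclidean(animals[n], query))

print(by_similarity)  # ['dog', 'bat']
print(by_distance)    # ['dog', 'bat']
```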
Dot Product
How to compute dot product?
Example:
1 2 3
* * *
2 2 0
= = =
2 4 0   ∑ = 6
Result: 6

dog 2 1 0
    * * *
cat 1 2 0
    = = =
            ∑ =
Result:
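The worked example can be reproduced with a few lines of Python. This is only a sketch; the function name dot_product is made up for illustration.

```python
def dot_product(a, b):
    """Return the element-wise products and their sum."""
    products = [x * y for x, y in zip(a, b)]
    return products, sum(products)

products, result = dot_product([1, 2, 3], [2, 2, 0])
print(products, result)  # [2, 4, 0] 6
```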
How to compute dot product using matrix
multiplication?
Example:
row vector [2 2 0] times column vector [1 2 3] = 6
cat as a row vector [1 2 0] times dog as a column vector [2 1 0] = ___
How to compute dot product with multiple
vectors?
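Example:
row vector [2 2 0] times the matrix with columns [1 2 3] and [1 1 1] = [6 4]
cat as a row vector [1 2 0] times the matrix with columns dog [2 1 0] and bat [0 1 2] = [4 ___]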
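Stacking several vectors as the columns of a matrix gives all the dot products in one multiplication. A minimal sketch of the worked example in plain Python follows; the helper row_times_matrix is a made-up name, and the matrix is stored as a list of rows.

```python
def row_times_matrix(row, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    n_cols = len(matrix[0])
    return [
        sum(row[i] * matrix[i][j] for i in range(len(row)))
        for j in range(n_cols)
    ]

# Columns of the matrix are the vectors [1, 2, 3] and [1, 1, 1].
matrix = [[1, 1],
          [2, 1],
          [3, 1]]

scores = row_times_matrix([2, 2, 0], matrix)
print(scores)  # [6, 4]
```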
Word Embedding
Where are dog, cat and bat in the “name”
space?
Which embedding is better?
Embedding 1     Embedding 2
dog cat bat     dog cat bat
2   1   0       2   0   1
1   2   1       1   1   0
0   0   2       0   2   2
Which embedding is better?
Desired dot product similarity (H = high, L = low):
    dog cat bat
dog     H   L
cat H       L
bat L   L

Embedding 1     Embedding 2
dog cat bat     dog cat bat
2   1   0       2   0   1
1   2   1       1   1   0
0   0   2       0   2   2

Embedding 1     Embedding 2
dog 2 1 0       dog 2 1 0
cat 1 2 0       cat 0 1 2
bat 0 1 2       bat 1 0 2
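One way to compare the two embeddings is to compute every pairwise dot product and check it against the desired pattern (dog and cat high, everything involving bat low). A minimal sketch, with made-up helper names:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def similarity_matrix(emb):
    """Pairwise dot products between all named vectors."""
    names = list(emb)
    return {a: {b: dot(emb[a], emb[b]) for b in names} for a in names}

embedding_1 = {"dog": [2, 1, 0], "cat": [1, 2, 0], "bat": [0, 1, 2]}
embedding_2 = {"dog": [2, 1, 0], "cat": [0, 1, 2], "bat": [1, 0, 2]}

sim_1 = similarity_matrix(embedding_1)
sim_2 = similarity_matrix(embedding_2)

for label, sim in (("Embedding 1", sim_1), ("Embedding 2", sim_2)):
    print(label,
          "dog-cat:", sim["dog"]["cat"],
          "dog-bat:", sim["dog"]["bat"],
          "cat-bat:", sim["cat"]["bat"])
# Embedding 1 dog-cat: 4 dog-bat: 1 cat-bat: 2
# Embedding 2 dog-cat: 1 dog-bat: 2 cat-bat: 4
```

With these numbers, Embedding 1 gives dog and cat the highest off-diagonal score, which matches the desired pattern; Embedding 2 instead puts cat closest to bat.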
Sentence Embedding
How to embed sentences?
id comment user emb
1 How are you? John ?
2 Who are you? Mary ?
“How are you” → word embedding vectors
how are you
    a   an  the  how  why  who what  are   is   am   be  was  you   we    I they  she   he  she   me  him  her
    0   -1    0    1    0    1    0    0   -1    1    0    0    0    3    1    0   -1    0    0    0   -1    0
    2    0    2    0    0    0   -1    1    0    0    0    2    1    0    2    0    2    0    0    2    0    0
   -1    0   -1    1    2    0    0    1    0    1   -1    0    0   -1    0    3    0    0   -1    0    2   -1
    0    1    0    0    1    0    1    0    1    0    1   -2    0    0    0    1    0    1    0    1    0    1
Word vectors → Sentence vector
Method 1: Concatenate
how are you
1 0 0
0 1 1
1 1 0
0 0 0
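Concatenation simply places the word vectors end to end, so three 4-dimensional word vectors become one 12-dimensional sentence vector. A minimal sketch, assuming the vectors are joined in word order:

```python
# Word vectors read off the columns above.
word_vectors = {
    "how": [1, 0, 1, 0],
    "are": [0, 1, 1, 0],
    "you": [0, 1, 0, 0],
}

def concatenate(words, vectors):
    """Join the word vectors end to end, in word order."""
    sentence = []
    for word in words:
        sentence.extend(vectors[word])
    return sentence

sentence_vector = concatenate(["how", "are", "you"], word_vectors)
print(sentence_vector)       # [1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
print(len(sentence_vector))  # 12
```

Note that the length grows with the number of words, so sentences of different lengths would not fit in the same fixed-size emb column.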
Word vectors → Sentence vector
Method 2: Average
how are you
1 0 0
0 1 1
1 1 0
0 0 0
id comment user emb
1 How are you? John
2 Who are you? Mary
“Who are you” → word embedding vectors
    a   an  the  how  why  who what  are   is   am   be  was  you   we    I they  she   he  she   me  him  her
    0   -1    0    1    0    1    0    0   -1    1    0    0    0    3    1    0   -1    0    0    0   -1    0
    2    0    2    0    0    0   -1    1    0    0    0    2    1    0    2    0    2    0    0    2    0    0
   -1    0   -1    1    2    0    0    1    0    1   -1    0    0   -1    0    3    0    0   -1    0    2   -1
    0    1    0    0    1    0    1    0    1    0    1   -2    0    0    0    1    0    1    0    1    0    1
who are you
 _   0   0
 _   1   1
 _   1   0
 _   0   0
Word vectors → Sentence vector
Method 2: Average
who are you
1 0 0
0 1 1
0 1 0
0 0 0
id comment user emb
1 How are you? John [1/3, 2/3, 2/3, 0]
2 Who are you? Mary
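The emb value already filled in for record 1 can be checked with a short sketch of the averaging method. It uses Python's fractions module so the result prints as exact thirds; the helper name average is made up for illustration.

```python
from fractions import Fraction

# Word vectors for "how are you", read off the word embedding table.
how = [1, 0, 1, 0]
are = [0, 1, 1, 0]
you = [0, 1, 0, 0]

def average(vectors):
    """Average the vectors position by position."""
    n = len(vectors)
    return [Fraction(sum(column), n) for column in zip(*vectors)]

emb = average([how, are, you])
print([str(value) for value in emb])  # ['1/3', '2/3', '2/3', '0']
```

Unlike concatenation, the averaged vector keeps the same length as a single word vector no matter how many words the sentence has.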
How to query by SQL?
________ comment, emb<___>[__,__,__,__] AS score
FROM posts
ORDER BY ______ ASC | DESC ;
How to query using a high-level API?
query = Query(post_index)
._________(post)
._________(relevance_space.text, Param("________"))
app.query(query, _________ = "who are you?" )
Source: Superlinked.com
Search
K-Nearest Neighbor, K=3, Dot-Product
Database
ID 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
emb
Query 9 -8 9 9 0 3 1 -6 0 11 3 13 -2 6 15 -9 7 6 -5 8
{ max | min }
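A minimal sketch of the selection step, assuming the numbers in the Query row are the dot product scores of the query against records 1 to 20. With a similarity score, the K nearest neighbors are the K largest values.

```python
import heapq

# Dot product scores for IDs 1..20, copied from the Query row.
scores = [9, -8, 9, 9, 0, 3, 1, -6, 0, 11, 3, 13, -2, 6, 15, -9, 7, 6, -5, 8]

# Pair each score with its 1-based ID and keep the 3 largest scores.
top3 = heapq.nlargest(3, enumerate(scores, start=1), key=lambda pair: pair[1])
print(top3)  # [(15, 15), (12, 13), (10, 11)]
```

Each pair is (ID, score), so the three nearest neighbors here are IDs 15, 12 and 10.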
K-Nearest Neighbor, K=3, L2
Database
ID 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
emb
Query 6 8 9 9 1 10 0 9 12 15 2 13 12 6 15 9 7 6 5 8
{ max | min }
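For L2, the same sketch applies with the selection flipped, again assuming the numbers in the Query row are the distances from the query to records 1 to 20. With a distance, the K nearest neighbors are the K smallest values.

```python
import heapq

# L2 distances for IDs 1..20, copied from the Query row.
distances = [6, 8, 9, 9, 1, 10, 0, 9, 12, 15, 2, 13, 12, 6, 15, 9, 7, 6, 5, 8]

# Pair each distance with its 1-based ID and keep the 3 smallest distances.
nearest3 = heapq.nsmallest(3, enumerate(distances, start=1), key=lambda pair: pair[1])
print(nearest3)  # [(7, 0), (5, 1), (11, 2)]
```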
Transformer
How to use a Transformer to get a sentence embedding
vector?
Word Embedding Vectors → Sentence Embedding Vector
1 0 0
0 1 1
1 1 0
0 0 0
How to combine across positions?
Position weights:
1
0
1
Word embedding vectors:
1 0 0
0 1 1
1 1 0
0 0 0
How to combine across positions?
Position weights:
1 0
0 1
1 1
Word embedding vectors | combined across positions:
1 0 0 | 1 _
0 1 1 | 1 _
1 1 0 | 1 _
0 0 0 | 0 _
How to combine across positions?
Position weights:
1 0 0
0 1 0
1 1 1
Word embedding vectors | combined across positions:
1 0 0 | 1 0 _
0 1 1 | 1 2 _
1 1 0 | 1 1 _
0 0 0 | 0 0 _
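Combining across positions amounts to multiplying the word embedding matrix (one column per word) by a weight matrix on the right, so each output column is a weighted mix of the word columns. A minimal sketch in plain Python; the names X, W and matmul are made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Word embedding vectors: 4 features (rows) x 3 positions (columns).
X = [[1, 0, 0],
     [0, 1, 1],
     [1, 1, 0],
     [0, 0, 0]]

# Position weights: each column says how to mix the 3 positions.
W = [[1, 0, 0],
     [0, 1, 0],
     [1, 1, 1]]

Z = matmul(X, W)
for row in Z:
    print(row)
# [1, 0, 0]
# [1, 2, 1]
# [1, 1, 0]
# [0, 0, 0]
```

The first two columns of the result, (1, 1, 1, 0) and (0, 2, 1, 0), agree with the values already filled in on the slide.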
How to combine across features?
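Word embedding vectors:
1 0 0
0 1 1
1 1 0
0 0 0
Row of ones (for the bias):
1 1 1
Feature weights, with the bias in the last column:
1 0 -1 0 1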
How to combine across features?
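Word embedding vectors:
1 0 0
0 1 1
1 1 0
0 0 0
Row of ones (for the bias):
1 1 1
Feature weights, with the bias in the last column | combined across features:
1 0 -1 0 1 | 1 0 1
0 1  1 0 0 | _ _ _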
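Combining across features is a multiplication on the left: each row of the feature weights mixes the four feature rows of every word column. The sketch below reads the extra row of ones and the fifth weight column as a bias term, which is an interpretation of the layout rather than something the slide states; the names X, F and matmul are made up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Word embedding vectors: 4 features (rows) x 3 positions (columns).
X = [[1, 0, 0],
     [0, 1, 1],
     [1, 1, 0],
     [0, 0, 0]]

# Append a row of ones so the last weight column acts as a bias (assumption).
X_with_ones = X + [[1, 1, 1]]

# Feature weights: 4 weights plus 1 bias per output row.
F = [[1, 0, -1, 0, 1],
     [0, 1, 1, 0, 0]]

Y = matmul(F, X_with_ones)
print(Y)  # [[1, 0, 1], [1, 2, 1]]
```

The first output row, (1, 0, 1), agrees with the values already filled in on the slide.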
How to combine across positions and
features?
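Position weights:
1 0 0
0 1 0
1 1 1
Word embedding vectors | combined across positions:
1 0 0 | _ 0 0
0 1 1 | _ 2 1
1 1 0 | _ 1 0
0 0 0 | _ 0 0
Row of ones (for the bias):
1 1 1
Feature weights, with the bias in the last column:
1 0 -1 0 1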
How to use a Transformer to get a sentence embedding
vector?
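Word Embedding Vectors → Sentence Embedding Vector
Position weights:
1 0 0
0 1 0
1 1 1
Word embedding vectors | combined across positions:
1 0 0 | _ 0 0
0 1 1 | _ 2 1
1 1 0 | _ 1 0
0 0 0 | _ 0 0
Row of ones (for the bias):
1 1 1
Feature weights, with the bias in the last column | combined across features:
1 0 -1 0 1 | _ _ _
0 1  1 0 0 | 2 3 1
0 0  0 1 1 | 1 1 1
0 0  1 1 0 | 1 1 0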
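The two steps can be chained: first mix across positions, then mix across features. The sketch below reproduces the numbers on this slide in plain Python, treating the row of ones and the last weight column as a bias (an interpretation of the layout). It is a simplified picture only; the names X, W, F and matmul are made up, and how the final matrix is reduced to a single sentence vector is not shown here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Word embedding vectors: 4 features (rows) x 3 positions (columns).
X = [[1, 0, 0],
     [0, 1, 1],
     [1, 1, 0],
     [0, 0, 0]]

# Step 1: combine across positions (multiply on the right).
W = [[1, 0, 0],
     [0, 1, 0],
     [1, 1, 1]]
Z = matmul(X, W)

# Step 2: combine across features (multiply on the left),
# with a row of ones appended so the last weight column is a bias.
F = [[1, 0, -1, 0, 1],
     [0, 1, 1, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 0, 1, 1, 0]]
Y = matmul(F, Z + [[1, 1, 1]])

for row in Y:
    print(row)
# [1, 0, 1]
# [2, 3, 1]
# [1, 1, 1]
# [1, 1, 0]
```

The last three rows agree with the values already filled in on the slide.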
Q/A