How to optimize vector similarity search
I use Prisma as my ORM and run the search as a raw query.
My table has these columns:
titleEmbedded Unsupported("vector")?
descriptionEmbedded Unsupported("vector")?
locationEmbedded Unsupported("vector")?
Each of my vectors has 3072 dimensions (from an OpenAI embedding model).
From what I can see, an HNSW index supports at most 2,000 dimensions.
I have more than 100k rows.
How can I optimize my query? (Right now it's pretty slow.)
2 Replies
optimistic-gold (OP) • 13mo ago
foreign-sapphire • 13mo ago
The reason an index on the vector type is limited to 2,000 dimensions is the 8 kB size of a Postgres page.
How slow is your query?
4 bytes per float x 2,000 dimensions = 8,000 bytes, which just fits in one page.
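To make that arithmetic concrete, here is a rough sketch of the size check. The helper names are made up for illustration, and the calculation ignores the small per-value header and per-page overhead, so treat it as an approximation rather than the exact rule Postgres applies.

```typescript
// Default Postgres page size (8 kB).
const PAGE_BYTES = 8192;

// Approximate payload of one embedding: one fixed-size number per dimension.
// Simplification: header bytes and page overhead are not counted.
function payloadBytes(dimensions: number, bytesPerElement: number): number {
  return dimensions * bytesPerElement;
}

// True when the payload of a single embedding fits in one page.
function fitsInPage(dimensions: number, bytesPerElement: number): boolean {
  return payloadBytes(dimensions, bytesPerElement) <= PAGE_BYTES;
}

// payloadBytes(2000, 4) -> 8000  (fits in 8192)
// payloadBytes(3072, 4) -> 12288 (does not fit, hence no index on 3072 float4 dimensions)
```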
However, you can use the halfvec type instead, which uses 2 bytes per value, so an index can fit 4,000 dimensions.
The loss in recall is insignificant according to tests I ran in the past. Plus, you save half the storage.
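For the Prisma setup in the question, one way to try this is an HNSW expression index that casts the existing column to halfvec, so the column itself can stay as vector. The sketch below is not tested against your schema and makes several assumptions: the table name "Listing" and the id column are hypothetical, cosine distance is a guess at what you want, and halfvec needs a pgvector version that includes it (0.7.0 or later, as far as I know). The RawClient interface only mirrors the two Prisma raw-query methods used, so the snippet type-checks on its own.

```typescript
// Minimal shape of the Prisma methods used here; a real PrismaClient
// instance should satisfy it.
interface RawClient {
  $executeRawUnsafe(sql: string): Promise<number>;
  $queryRawUnsafe<T = unknown>(sql: string, ...values: unknown[]): Promise<T>;
}

const DIMS = 3072;                 // dimension count from the question
const TABLE = '"Listing"';         // hypothetical table name
const COLUMN = '"titleEmbedded"';  // column from the question's schema

// Expression index: only the index stores half-precision values.
// halfvec_cosine_ops assumes cosine distance (an assumption).
function createIndexSql(): string {
  return (
    `CREATE INDEX IF NOT EXISTS listing_title_hnsw ON ${TABLE} ` +
    `USING hnsw ((${COLUMN}::halfvec(${DIMS})) halfvec_cosine_ops)`
  );
}

// The ORDER BY expression has to match the indexed expression,
// otherwise the planner cannot use the index.
function searchSql(): string {
  return (
    `SELECT id FROM ${TABLE} ` +
    `ORDER BY ${COLUMN}::halfvec(${DIMS}) <=> $1::halfvec(${DIMS}) LIMIT $2`
  );
}

// pgvector's text input format: "[v1,v2,...]".
function toVectorLiteral(values: number[]): string {
  return `[${values.join(",")}]`;
}

// Run once, for example from a migration script.
async function createIndex(db: RawClient): Promise<void> {
  await db.$executeRawUnsafe(createIndexSql());
}

// Nearest-neighbour search on the title embedding.
async function searchByTitle(
  db: RawClient,
  queryEmbedding: number[],
  limit = 10,
): Promise<{ id: number }[]> {
  if (queryEmbedding.length !== DIMS) {
    throw new Error(`expected ${DIMS} dimensions, got ${queryEmbedding.length}`);
  }
  return db.$queryRawUnsafe<{ id: number }[]>(
    searchSql(),
    toVectorLiteral(queryEmbedding),
    limit,
  );
}
```

You would need one such index per embedding column you search on. Checking the query with EXPLAIN ANALYZE should show whether the index is actually being used.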