What techniques should I use to cache

What techniques should I use to cache the data read from the vector index using a Worker? What about KV?
4 Replies
garvitg · 3mo ago
KV could potentially be a good candidate. You can refer to https://developers.cloudflare.com/workers/platform/storage-options/ to evaluate your options. The Vectorize service already caches data, so I am not sure how much benefit you'd get from implementing your own cache, but the link above should help you weigh the trade-offs.
Cloudflare Docs: Choose a data or storage product. Storage and database options available on Cloudflare's developer platform.
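For illustration, here is a minimal sketch of what a KV cache in front of Vectorize could look like inside a Worker. The binding names (VECTOR_INDEX, SEARCH_CACHE, AI), the embedding model, and the TTL are assumptions, not anything from this thread; adjust them to your own wrangler configuration and stack:

```ts
// Minimal sketch: KV as a query-result cache in front of Vectorize.
// Assumed bindings (names are placeholders): VECTOR_INDEX (Vectorize),
// SEARCH_CACHE (KV), AI (Workers AI, used here only to embed the query).
export interface Env {
  VECTOR_INDEX: VectorizeIndex;
  SEARCH_CACHE: KVNamespace;
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    const query = url.searchParams.get("q") ?? "";
    const refresh = url.searchParams.get("refresh") === "1";

    // One KV key per user query; hashing keeps the key short and stable.
    const cacheKey = `search:${await sha256(query)}`;

    if (!refresh) {
      const cached = await env.SEARCH_CACHE.get(cacheKey, "json");
      if (cached !== null) return Response.json(cached);
    }

    // Cache miss (or user-requested refresh): embed the query and hit Vectorize.
    const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
    const matches = await env.VECTOR_INDEX.query(embedding.data[0], { topK: 10 });

    // Store the result with a TTL; the refresh flag lets users bypass stale entries.
    await env.SEARCH_CACHE.put(cacheKey, JSON.stringify(matches), {
      expirationTtl: 60 * 60 * 24, // 1 day; tune to how often your index changes
    });

    return Response.json(matches);
  },
};

async function sha256(text: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(text));
  return [...new Uint8Array(digest)].map((b) => b.toString(16).padStart(2, "0")).join("");
}
```

The cache key here is derived only from the query text; if results depend on the user or on filters, those would need to be folded into the key as well.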
thipperz · 3w ago
Hey Garvit, can you explain a bit more what "The Vectorize service already caches data" means? I'm currently using KV to cache search results for my app, one KV key per user query, with a very high TTL and an option for the user to refresh results. In my app, new vectors are being inserted constantly, so how would that default cache mechanism work in this case? I'm wondering whether my KV cache layer is necessary at all, but at the same time I remember that setting it up heavily reduced my Vectorize costs.
garvitg · 3w ago
The vectorize service already caches data
We have implemented a multi-tier caching strategy internally based on the Vectorize storage layer design to optimize read latencies. You can read more about the implementation on the Vectorize blog: https://blog.cloudflare.com/building-vectorize-a-distributed-vector-database-on-cloudflare-developer-platform/#query-latency-optimization. Relevant passages:
As a distributed database keeping its data state on blob storage, Vectorize’s latency is primarily driven by the fetch of index data, and relies heavily on Cloudflare’s network of caches as well as individual server RAM cache to keep latency low.

Because Vectorize data is snapshot versioned, (see Eventual consistency and snapshot versioning above), each version of the index data is immutable and thus highly cacheable, increasing the latency benefits Vectorize gets from relying on Cloudflare’s cache infrastructure.
The Cloudflare Blog: Building Vectorize, a distributed vector database, on Cloudflare's developer platform. Vectorize was recently upgraded and made generally available, now supporting indexes of up to 5 million vectors, delivering faster responses, with lower pricing and a free tier. This post dives deep into how we built Vectorize to enable these improvements.
garvitg · 3w ago
a very high TTL while providing the user the option to refresh results
 If this is what your desired experience is, then it makes sense to use something like KV as a caching layer in front of Vectorize. 
In my app, there are new vectors being inserted constantly. So, how would that default cache mechanism work in this case?
 We have built cache invalidation mechanisms that respond to changes in an index's state and ensure that Vectorize always returns up-to-date results. However, any sort of caching is less effective when it requires frequent invalidation in response to a continuous stream of writes. To make the best use of Vectorize's caching optimizations, we recommend batching upserts, inserts, and deletes: fewer mutation requests with larger batch sizes (Vectorize supports up to 5000 vectors in a single batch). Refer to the best practices page in our documentation: https://developers.cloudflare.com/vectorize/best-practices/insert-vectors/#improve-write-throughput. Please note that even if you are building your own caching layer, frequent updates mean frequent invalidations or out-of-date results, especially with a high TTL. Cache configuration can be tricky and often depends heavily on usage patterns.
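To illustrate the batching recommendation, here is a rough sketch of upserting in large batches rather than issuing one write per vector. The 5000-vector figure comes from the message above; VECTOR_INDEX is the same placeholder binding name used in the earlier sketch:

```ts
// Rough sketch: batch writes to Vectorize instead of issuing one
// mutation request per vector. VECTOR_INDEX is a placeholder binding name.
export interface Env {
  VECTOR_INDEX: VectorizeIndex;
}

const MAX_BATCH = 5000; // per the batch limit mentioned above

export async function upsertInBatches(env: Env, vectors: VectorizeVector[]): Promise<void> {
  // Fewer, larger mutation requests keep Vectorize's internal caching effective.
  for (let i = 0; i < vectors.length; i += MAX_BATCH) {
    await env.VECTOR_INDEX.upsert(vectors.slice(i, i + MAX_BATCH));
  }
}
```

Collecting writes somewhere (for example in a queue consumer) and flushing them through a helper like this keeps the number of mutation requests low even when new vectors arrive constantly.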
