Cloudflare Developers•7mo ago

Quick question: For Vectorize

Quick question: For Vectorize billing calculation, is the total count of vectors used? Or do namespace and metadata filtering reduce the number of billable Queried Vector Dimensions for that request? I have multiple tenants with different sets of vectors, so I'm looking for an efficient way to manage these. Creating one index per tenant would work, but it seems to complicate the bindings a bit. https://developers.cloudflare.com/vectorize/platform/pricing/

5 Replies

garvitg•7mo ago

Hey @felix_m. Thanks for reaching out. As of now, the computation of billing metrics in Vectorize makes use of the total count of vectors in the index. Query costs also depend on the size of the index because the search operation gets more expensive as the size of the index grows. The computation does not depend on the query filters as of now.

felix_mOP•7mo ago

Hi @garvitg, thanks for your response. How about the namespace? https://developers.cloudflare.com/vectorize/best-practices/insert-vectors/#namespaces It seems to be applied before the search query, so ideally I'd be able to insert separate projects into different namespaces.

When a namespace is specified in a query operation, only vectors within that namespace are used for the search. Namespace filtering is applied before vector search, increasing the precision of the matched results.

Alternatively, I could create a separate index for each project/tenant, but I'm not sure if/how Workers can dynamically connect to these resources without the binding configuration in the wrangler.jsonc file (I'm fairly new to the Workers ecosystem).

garvitg•7mo ago

For Vectorize, the namespace is considered a special metadata field and it behaves like a metadata field for billing metrics (ie: it has no impact on the computation). So queries with namespace filters would still consider the total size of the index for the computation of billing metrics. The reason we do not consider filters (or other query params such as top_k) in the computation of query costs is to reduce complexity in the metrics' formulae. This makes them easier to understand for customers, and Vectorize costs are still below market averages with filters not considered in the computation of billing metrics. You are correct in your assessment that workers cannot bind to an index dynamically as of now. We do plan to ship dynamic Vectorize indexes, but we do not have a timeline for that just yet.

felix_mOP•7mo ago

Thank you for the clarifications. This billing formula compounds very quickly for cases where data is segmented. Take a basic use case of 1000 projects, with 1000 vectors and 1000 requests each. The request count would be 1000*1000*1000*1000 = n^4, while on separate indexes it'd just be n^3, making the bill ~1000 times higher. It's hard to justify this architecture decision since it doesn't scale at all. It'd be great to have a decent workaround until dynamic index binding becomes available. Would programmatically creating one Worker per index be viable? What would you recommend for my use case (many self-contained projects, each needing its own vector store)? I assume storing my optimized vector sets in static JSON files on R2 (and computing the kNN in a Worker) could be a decent temporary option as well.
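For the last idea in this message (keeping each project's vectors in a JSON file and computing nearest neighbours inside a Worker), a brute-force cosine-similarity search could look roughly like the sketch below. The `StoredVector` shape and the `knn` helper are hypothetical names chosen for illustration, and fetching the JSON from R2 is left out.

```typescript
// Hypothetical shape of one entry in a per-project JSON file.
interface StoredVector {
  id: string;
  values: number[];
}

interface Match {
  id: string;
  score: number;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero-length vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force kNN: score every stored vector, sort by score, keep the top k.
function knn(query: number[], vectors: StoredVector[], k: number): Match[] {
  return vectors
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Each query touches every stored vector, so the work grows with vectors × dimensions. That is likely acceptable for sets of around 1000 vectors, but it is not a substitute for an index at larger sizes.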

garvitg•7mo ago

That is why the queried vector dimensions are calculated as (queried vectors + stored vectors) * dimensions. So if you have an index with 1000 vectors, 1000 dimensions, and 1000 requests per month, the queried vector dimensions would be (1000 + 1000) * 1000 = 2,000,000, which comes to $0.02 per month. The summation logic helps the cost model scale more gradually. But we understand that there may be customers who require a larger number of indexes, and that is why we plan to ship the ability to build and bind indexes dynamically. The requirements of that project are currently under investigation. In the meantime, if you still need to work with a large number of indexes, we would recommend using the Vectorize REST API instead of bindings. The REST API has feature parity with the operations available via Worker bindings: https://developers.cloudflare.com/api/resources/vectorize/ and these endpoints can be invoked from a Worker too. The only consideration is that request latencies would be higher with the REST API than with bindings.
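To make the arithmetic concrete, here is a small sketch of the formula from this reply applied to the 1000-project scenario raised earlier. The rate of $0.01 per million queried vector dimensions is an assumption inferred from the $0.02 example in this reply, so check it against the pricing page. The sketch takes "queried vectors" to mean the number of query requests, and it ignores stored-dimension charges and any included allowances.

```typescript
// Assumed rate, inferred from the example above (2,000,000 dimensions -> $0.02).
const USD_PER_MILLION_QUERIED_DIMENSIONS = 0.01;

// Formula as stated in the reply: (queried vectors + stored vectors) * dimensions.
function queriedVectorDimensions(
  queries: number,
  storedVectors: number,
  dimensions: number,
): number {
  return (queries + storedVectors) * dimensions;
}

function queryCostUsd(queriedDimensions: number): number {
  return (queriedDimensions / 1e6) * USD_PER_MILLION_QUERIED_DIMENSIONS;
}

// The example from the reply: 1000 vectors, 1000 dimensions, 1000 requests.
const singleIndex = queriedVectorDimensions(1000, 1000, 1000);

// 1000 projects, each with 1000 vectors and 1000 requests per month.
const projects = 1000;

// Layout A: everything in one shared index.
const sharedIndex = queriedVectorDimensions(projects * 1000, projects * 1000, 1000);

// Layout B: one index per project, summed over all projects.
const separateIndexes = projects * queriedVectorDimensions(1000, 1000, 1000);

console.log({
  singleIndex,
  singleIndexUsd: queryCostUsd(singleIndex),
  sharedIndex,
  separateIndexes,
});
```

Under this reading of the formula, both layouts add up to 2,000,000,000 queried vector dimensions per month, because the stored-vector term is added once per billing computation rather than multiplied by the request count.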

Cloudflare API | Vectorize

Interact with Cloudflare's products and services via the Cloudflare API
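A rough sketch of the REST route suggested above: a Worker picks a per-tenant index by name at request time and queries it with `fetch`. The endpoint path (`/vectorize/v2/indexes/{index_name}/query`) and the `vector`/`topK` body fields are written from memory of the API reference and should be confirmed there. The `tenant-<id>` naming convention and the helper names are hypothetical.

```typescript
// Plain description of an HTTP call, so it can be inspected before sending.
interface QueryCall {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical naming convention: one index per tenant.
function indexNameForTenant(tenantId: string): string {
  return `tenant-${tenantId}`;
}

function buildQueryCall(
  accountId: string,
  apiToken: string,
  tenantId: string,
  vector: number[],
  topK: number,
): QueryCall {
  const indexName = indexNameForTenant(tenantId);
  // Assumed v2 query endpoint; verify the path against the API reference.
  const url =
    `https://api.cloudflare.com/client/v4/accounts/${accountId}` +
    `/vectorize/v2/indexes/${indexName}/query`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ vector, topK }),
    },
  };
}

// Sends the query; the response is returned unparsed beyond JSON decoding.
async function queryTenantIndex(
  accountId: string,
  apiToken: string,
  tenantId: string,
  vector: number[],
  topK: number,
): Promise<unknown> {
  const { url, init } = buildQueryCall(accountId, apiToken, tenantId, vector, topK);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Vectorize query failed with status ${response.status}`);
  }
  return response.json();
}
```

In a Worker, the account ID and API token would typically come from environment variables or secrets rather than being hard-coded. As noted in the reply, each call is an extra HTTP round trip, so expect higher latency than with a binding.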
