Hello, quick question: I get really slow response times in Vectorize, like seconds per request. My code is very simple, so nothing else should be slowing it down. I'm in Greece btw, so not sure if that plays a role. I get the same speeds when the Worker is deployed.
@Kingsley Michael I removed your message as this is the channel for #vectorize, please use #off-topic for anything unrelated to Cloudflare products
The limits page says that there's a maximum of 5,000,000 vectors per index, but I'm getting a 4002 error when attempting to upsert past 240,000. any idea why or things I should change?
EDIT: RESOLVED, I was using V1 instead of V2
TypeError: vectors.map is not a function
What does this mean when it's triggered from this:
await vectorIndex.insert({
  id: String(data[0][dataId]),
  values: vectorArray,
  metadata
});
EDIT:
I tested it on a small function:
I got the same error: vectors.map is not a function...
@yevgen Any help would be appreciated. We're a high-tier client and we want to use Vectorize across our app, but we're hitting vectors.map is not a function even on simple calls.
Does this happen just locally or also when deployed?
Both
This wasn't an issue before, it suddenly became an issue. I also created a fresh new worker and the issue still happened.
Hi @CedricHadjian. Thanks for reaching out! I tweaked the code you shared a bit and got it working. This is a code block that works as expected.
Summary of the fixes:
1. I enclosed the object being passed to the Vectorize insert operation in square brackets. This is needed because the insert function requires its input to be an array of vectors.
2. I converted the id from 1 to '1'. This is needed because Vectorize vector identifiers must be strings.
Feel free to reach out if you need additional support for Vectorize!
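For reference, the two fixes can be sketched like this. The Vectorize binding is stubbed out here so the snippet is self-contained; in a real Worker, vectorIndex would be the env binding, and data, dataId, vectorArray, and metadata are hypothetical stand-ins matching the original snippet:

```javascript
// Stub mimicking the relevant shape of Vectorize's insert(): it maps
// over its input, so passing a bare object (no .map) produces the
// "vectors.map is not a function" error from the thread.
const vectorIndex = {
  async insert(vectors) {
    return { ids: vectors.map((v) => v.id) };
  },
};

// Hypothetical values standing in for the original snippet's variables.
const data = [{ docId: 1 }];
const dataId = "docId";
const vectorArray = [0.1, 0.2, 0.3];
const metadata = { source: "example" };

// Fix 1: wrap the vector object in square brackets (insert takes an array).
// Fix 2: make sure the id is a string.
const insertPromise = vectorIndex.insert([
  { id: String(data[0][dataId]), values: vectorArray, metadata },
]);

insertPromise.then((result) => {
  console.log(result.ids); // [ "1" ]
});
```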
Thank you so much. I spent 6 hours trying to figure out what the issue was, and it turns out to be the brackets. I also remember trying to stringify the id, but it still never worked.
Happens to the best of us, I am glad we could help you out with this! Something that helps us when we develop Worker code is to set up the Worker locally as an npm project. Then you can load the project into your preferred IDE, which would almost certainly display relevant packages and method signatures from node_modules. This project code can also be pushed to your preferred version control system if it is a collaborative project. You can refer to https://developers.cloudflare.com/vectorize/get-started/embeddings/ for an example of setting up the project from scratch.
Thank you. I had it working on another test Worker that I was playing around with, and when it came to production I got stuck on this because I didn't notice the brackets.
Is it possible to insert 5,000 vectors at the same time using the insert function? As far as I know it's possible through an embeddings.ndjson file, but would 5,000 work, or is there another limit?
The current batch size limit is 1000 vectors, when you insert/upsert using a Worker: https://developers.cloudflare.com/vectorize/platform/limits/. A higher limit of 5000 vectors is available for inserts/upserts via the HTTP API.
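Given the 1,000-vector limit per Worker insert/upsert, one way to handle larger sets is to chunk them into batches. A minimal sketch, assuming vectorIndex would be a Vectorize binding in a real Worker (stubbed here so the snippet is self-contained):

```javascript
// Split a large vector list into batches under the per-call limit,
// then insert batch by batch.
const BATCH_SIZE = 1000; // Worker insert/upsert limit per the docs above

function toBatches(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

async function insertAll(index, vectors) {
  let inserted = 0;
  for (const batch of toBatches(vectors, BATCH_SIZE)) {
    await batch.length && index.insert(batch); // each call stays under the limit
    inserted += batch.length;
  }
  return inserted;
}

// Stub standing in for the real Vectorize binding.
const stubIndex = {
  calls: 0,
  async insert(batch) {
    this.calls += 1;
  },
};

const vectors = Array.from({ length: 5000 }, (_, i) => ({
  id: String(i),
  values: [0.1, 0.2, 0.3],
}));

insertAll(stubIndex, vectors).then((n) => {
  console.log(n, stubIndex.calls); // 5000 vectors across 5 calls
});
```

Note that each `insert` call still counts against your request limits, so for very large backfills the ndjson bulk path via the HTTP API may be the better fit.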
is it possible to load stuff into a prod vectorize index from the local machine? or is the data running in the miniflare / simulator only?
Hi everyone. I wonder if someone could help me fill some gaps in my knowledge relating to using Vectorize (and vectorisation generally) to help with AI data analysis on potentially large datasets.
Without Vectorize, I can send data to an LLM for analysis/categorisation etc., but I'm of course limited by tokens as to how much data I can send. I have done some reading and I understand that getting my data into a Vectorize DB, then querying that as a preliminary step to sending data off to the LLM for analysis, might be the answer here.
But I'm hazy on the exact steps/rationale. Specifically, how does this help limit the amount of data I ultimately send to the LLM? Is the idea that Vectorize reduces it to a subset, so duplicates or near-duplicates are merged/omitted, or...? (Sorry for the long message - just looking for some high-level guidance!)
So the idea with RAG is this:
- you embed all your documents and store them in a vector DB (vectorize or otherwise)
- when a user question comes in, you embed the user question
- you do a search in the vector DB for the "top n items", where n is 3, or 5, or 10, or higher depending on your use-case
- you give the results to the LLM along with the query
The idea is that, because embeddings of related items are similar, when you query the vector DB with the embedding of the user question, you're likely to get back things in the vector DB related to the user question. Then you give that to the LLM to actually answer the question based on the gathered knowledge.

Thank you, that's really helpful. So this model depends on a particular question as a filter, and is less useful if, say, I simply want to run sentiment analysis on 100,000 user messages. Right?
So you can definitely run sentiment analysis using an LLM, but you wouldn't really need a RAG pipeline for that unless your sentiment depends on some external knowledge you want to give the LLM.
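For intuition, the "top n items" retrieval step described above is just a nearest-neighbor search over embeddings. A toy, self-contained sketch using cosine similarity over an in-memory list (this is the operation a vector DB like Vectorize performs for you at scale; the document ids and 3-dimensional vectors here are made up for illustration):

```javascript
// Rank stored embeddings by cosine similarity to the query embedding
// and keep the best n. Real embeddings have hundreds of dimensions;
// 3 are used here for readability.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topN(store, queryVec, n) {
  return store
    .map((item) => ({ ...item, score: cosine(item.values, queryVec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, n);
}

// Hypothetical stored document embeddings.
const store = [
  { id: "refunds", values: [0.9, 0.1, 0.0] },
  { id: "shipping", values: [0.1, 0.9, 0.1] },
  { id: "returns", values: [0.8, 0.2, 0.1] },
];

// Embedding of the user question (pretend it came from an embedding model).
const query = [0.85, 0.15, 0.05];

const matches = topN(store, query, 2).map((m) => m.id);
console.log(matches); // [ "refunds", "returns" ]
```

The matched documents are then pasted into the LLM prompt alongside the user's question, which is how RAG keeps the token count bounded regardless of how large the corpus is.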