any plans to increase limits of non beta model? i mean having no limits

LLogan Grasby We're expanding the list of models for generating embeddings and taking requests...

S

scottoOP•5/9/24, 2:22 PM

any multilingual embedding model?

LLogan Grasby Temperature is supported on most LLMs as an input and additional parameters are ...

J

Julian•5/9/24, 2:44 PM

Thanks. Do you have a link to an example of setting temp?

P

Puliczek•5/9/24, 4:51 PM

Is it possible to use llama-3 function calling on workers ai?

R

rob•5/9/24, 8:23 PM

not right now

LLogan Grasby Only PEFT trained loras are compatible. See https://huggingface.co/docs/peft/en/...

E

element14•5/9/24, 11:02 PM

adapter_config.json


{
"alpha_pattern": {},
"auto_mapping": null,
"base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.2",

"task_type": "CAUSAL_LM",
"model_type": "mistral",
"use_dora": false,
"use_rslora": false
}

LLogan Grasby Only PEFT trained loras are compatible. See https://huggingface.co/docs/peft/en/...

E

element14•5/9/24, 11:02 PM

Thanks for the suggestion.
I have followed the tutorial and re-trained the Mistral model using the suggested Jupyter notebook
https://github.com/huggingface/autotrain-advanced/blob/main/colabs/AutoTrain_LLM.ipynb

My `adapter_config.json` is now as shown in the message above.

Do you notice anything wrong? Because now, when I try to upload my `adapter_model.safetensors`, I receive a new error from wrangler:

✘ [ERROR] 🚨 Couldn't upload file: A request to the Cloudflare API (/accounts/1111122223334444/ai/finetunes/6a4a4a4a4a4a4a4a4-a5aa5a5a-aaaaaa/finetune-assets) failed. FILE_PARSE_ERROR: 'file' should be of valid safetensors type [code: 1000], quiting...


So to recap my tests:

The adapter previously trained using mlx_lm (https://github.com/ml-explore/mlx-examples, https://huggingface.co/docs/hub/en/mlx) is accepted during the fine-tune upload/creation process, but it generates the error
```
InferenceUpstreamError: ERROR 3028: Unknown internal error
```
when I try to run an inference.
The adapter trained using autotrain from Hugging Face is not accepted during the fine-tune upload/creation process, giving me the error described above.

Is there any other required parameter (like `rank r <=8` or `quantization = None`) that is not specified in the documentation?
Thanks again for the support


Sscotto any plans to increase limits of non beta model? i mean having no limits

S

scottoOP•5/10/24, 12:41 AM

?

Sscotto ?

I

Isaac McFadyen•5/10/24, 1:11 AM

Not sure about plans but one of the people on the Workers AI team, Michelle, has previously said to reach out to her if you are looking for higher limits: https://canary.discord.com/channels/595317990191398933/1138522314594582578/1237041407982698598

Eelement14 Thanks for the suggestion. I have followed the tutorial and re-trained the mistr...

E

element14•5/10/24, 9:40 AM

I would also like to know if there is any REST API endpoint (or wrangler command) to DELETE a previously created finetune, as after all my (unsuccessful) tests I have a long list of non-working finetunes.
From the documentation I see there are only methods to list or create.
Thanks

W

Wouter J•5/10/24, 11:46 AM

Hi, is it possible to give width/height to text-to-image? I see that Stability AI's SDXL, for example, supports 1216x832. Basically I need to generate portraits and landscapes.

Eelement14 Thanks for the suggestion. I have followed the tutorial and re-trained the mistr...

E

element14•5/10/24, 11:47 AM

Found the solution thanks to @pshek
Be sure to use only `--target_modules q_proj,v_proj` as target modules with autotrain
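A quick way to check an adapter before uploading is to inspect the `target_modules` field that PEFT writes into `adapter_config.json`. The sketch below assumes, based only on the fix reported above and not on official documentation, that `q_proj` and `v_proj` are the only accepted modules. The helper name `unexpectedTargetModules` is hypothetical.

```typescript
// Subset of PEFT's adapter_config.json that this check reads.
interface AdapterConfig {
  target_modules?: string[];
}

// Assumption taken from the message above, not from an official requirements list.
const ALLOWED_TARGET_MODULES = new Set(["q_proj", "v_proj"]);

// Hypothetical helper: returns every target module outside the allowed set.
function unexpectedTargetModules(config: AdapterConfig): string[] {
  return (config.target_modules ?? []).filter(
    (name) => !ALLOWED_TARGET_MODULES.has(name),
  );
}
```

An empty result means the adapter only targets the two modules named above; anything else lists the modules that would need to be dropped before re-training.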

WWouter J Hi, is it possible to give width/height to text-to-image? I see for stability ai...

R

rob•5/10/24, 1:23 PM

not yet but team said params are on the way for some models

A

a5000•5/10/24, 3:17 PM

Hey there. I'm new to the channel. Testing Workers AI. Should I expect `@cf/meta/llama-3-8b-instruct` to be broken/unavailable? I can't get a response, but `@cf/meta/llama-2-7b-chat-fp16` works fine

Aa5000 Hey there. I'm new to the channel. Testing Workers AI. Should I expect `@cf/meta...

C

Chaika•5/10/24, 3:17 PM

https://www.cloudflarestatus.com/incidents/8kql553z8g0p

Issues with Workers AI inference

C

Chaika•5/10/24, 3:18 PM

it's broken right now but you should normally expect it to be working

A

a5000•5/10/24, 3:19 PM

Ah gotcha! Should have checked there. Thanks for the prompt response. I noticed the broken one wasn't showing in the `Active Models` filter

Aa5000 Ah gotcha! Should have checked there. Thanks for the prompt response. I noticed ...

C

Chaika•5/10/24, 3:19 PM

afaik the active models is just supposed to show the models you are using, and is limited to billed models/non-beta

CChaika afaik the active models is just supposed to show the models you are using, and i...

A

a5000•5/10/24, 3:20 PM

ok great

CChaika https://www.cloudflarestatus.com/incidents/8kql553z8g0p

F

falex•5/10/24, 3:37 PM

I'm seeing the same error, 'InferenceUpstreamError', when calling the llama-3-8b model.

C

Chaika•5/10/24, 3:38 PM

yea, I've seen it for the last few hours or so, they've got that incident open, hopefully fixed soon

冰

冰淇淋•5/10/24, 5:35 PM

Hello, I'm sorry to interrupt.
I have a problem using the Whisper model.

[wrangler:err] InferenceUpstreamError: AiError: undefined: ERROR 3001: Unknown internal error
    at Ai.run (cloudflare-internal:ai-api:66:23)
    at async Object.fetch (file:///C:/Users/Fathan/PetProject/cloudflare-demo/src/index.ts:15:21)
    at async jsonError (file:///C:/Users/Fathan/PetProject/cloudflare-demo/node_modules/wrangler/templates/middleware/middleware-miniflare3-json-error.ts:22:10)
    at async drainBody (file:///C:/Users/Fathan/PetProject/cloudflare-demo/node_modules/wrangler/templates/middleware/middleware-ensure-req-body-drained.ts:5:10)
[wrangler:inf] POST / 500 Internal Server Error (2467ms)


冰

冰淇淋•5/10/24, 5:36 PM

full wrangle index.ts code

冰

冰淇淋•5/10/24, 5:36 PM

any idea how to fix it? Thank you

R

Rob M.•5/10/24, 8:53 PM

Hey guys

What is the best way to calculate neuron usage at runtime in a Worker?

LLogan Grasby The media recorder API will output Webm or other file types depending on the bro...

I

Isaac McFadyen•5/12/24, 9:24 PM

https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder/MediaRecorder#mimetype
You should be able to specify the codecs the browser uses, and as long as they're supported it'll use them rather than the defaults.
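As a rough sketch of that suggestion: pick the first container/codec string the browser reports as supported, then pass it as `mimeType` when constructing the recorder. The candidate strings and the helper name `pickMimeType` are illustrative assumptions; the support check is injected so the function can run outside a browser.

```typescript
// Hypothetical helper: returns the first candidate the predicate accepts,
// or undefined when none is supported.
function pickMimeType(
  candidates: string[],
  isSupported: (mimeType: string) => boolean,
): string | undefined {
  return candidates.find((candidate) => isSupported(candidate));
}

// In a browser, the predicate would be MediaRecorder.isTypeSupported:
//
//   const mimeType = pickMimeType(
//     ["audio/webm;codecs=opus", "audio/mp4"],   // illustrative candidates
//     (t) => MediaRecorder.isTypeSupported(t),
//   );
//   const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
```

Falling back to the browser default when nothing matches keeps recording working, at the cost of not knowing the output format in advance.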

Eelement14 Thanks for the suggestion. I have followed the tutorial and re-trained the mistr...

J

jameskraus•5/13/24, 12:24 AM

@element14, did you ever figure this out? I'm getting this same `FILE_PARSE_ERROR` with my autotrained finetune

S

Sam White•5/13/24, 4:22 AM

Should Workers be compatible with transformers.js yet? I've tried setting it up, hoping to cache everything in KV after the first request! But I can't seem to pull the wasm files. Docs say you should be able to load from the jsdelivr CDN instead of trying to pull from /public, which I thought would work for Workers, but I'm getting `Error: no available backend found. ERR: [wasm] RuntimeError: Aborted(both async and sync fetching of the wasm failed)`. But the other files (config, onnx) are loading up fine. @Xenova I hope it's still okay to tag you! I'm so excited to get your library up and running!

SSam White Should workers be compatible with transformers.js yet? I've tried setting it up,...

I

Isaac McFadyen•5/13/24, 4:25 AM

If you're trying to load WASM at runtime from JSDelivr that won't work in Workers.

I

Isaac McFadyen•5/13/24, 4:25 AM

Workers intentionally don't support dynamically loading WASM for security reasons, it must be uploaded at deploy time.

I

Isaac McFadyen•5/13/24, 4:25 AM

See https://developers.cloudflare.com/workers/runtime-apis/web-standards/#javascript-standards (sorry, wrong link, updated)

IIsaac McFadyen Workers intentionally don't support dynamically loading WASM for security reason...

S

Sam White•5/13/24, 4:32 AM

I see. Damn. Thank you for the reply!

R

rachelswany•5/13/24, 7:02 AM

hey guys... I am loving the CF-hosted Workers AI inference and all the models! I have a few questions:
1/ what is the pricing for beta models?
2/ what does it mean that a model is in "beta"? CF won't pull the plug on the model, I assume?
3/ are beta models suitable for staging-type situations? i.e. not exactly prod, but close

A

Andus•5/13/24, 8:30 AM

Hi. When I reach my Workers Free limit, will Cloudflare start the paid pricing automatically, or do I need to enable it myself? If it starts automatically, can I turn it off somehow?

R

rob•5/13/24, 12:13 PM

Beta models are free rn afaik

J

James•5/13/24, 1:42 PM

It's quite disappointing to see the AI team's approach to backwards compatibility thus far, going against all of the great paradigms the Workers runtime has enforced for years.

Some examples:

https://github.com/cloudflare/workerd/pull/2044
- it took me quite some time to argue the point about this being a breaking change, despite pushback from multiple team members and saying the risk was "non-existent". Turns out, after this was merged, it broke things and had to be partially reverted
https://github.com/cloudflare/workerd/pull/2103#discussion_r1598483980
- another recent example that was just merged. I truly hope this doesn't break anyone's applications when it rolls out

As a long-time Workers user, I'm a massive fan of their compat guarantees as outlined at https://blog.cloudflare.com/backwards-compatibility-in-cloudflare-workers, and it seems the AI team isn't taking this very seriously for a GA product. There's precedent for other bindings implementing breaking changes behind compat flags, too: https://github.com/cloudflare/workerd/pull/1297

A change to the Workers Runtime must never break an application that is live in production.

.

.neurorotic.•5/13/24, 4:21 PM

The docs seem out of date: there is no "Account details section contains your Account ID" section under Workers and Pages -> Overview https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/#find-account-id-workers-and-pages
The only way I found it was by creating a domain (even though I did not intend to register any with CF), and then I finally noticed that the account_id appears in the site URL, as in dash.cloudflare.com/{ACCOUNT_ID}/

This could be much clearer for first-time users.

Rrob Beta models are free rn afaik

R

rachelswany•5/13/24, 4:47 PM

Is there a known limit?

M

Mrinank•5/13/24, 6:05 PM

const stream = await retrivalChain.stream({
input: 'what is hello world',
});

return new Response(stream, {
    headers: {
    'content-type': 'text/event-stream',
    'Access-Control-Allow-Origin': '*',
    },
});


Is this the correct way to send streams from a Cloudflare Worker?
I'm getting a TypeScript error:

Argument of type 'IterableReadableStream<{ context: Document<Record<string, any>>[]; answer: string; } & { [key: string]: unknown; }>' is not assignable to parameter of type 'BodyInit | null | undefined'.
  Type 'IterableReadableStream<{ context: Document<Record<string, any>>[]; answer: string; } & { [key: string]: unknown; }>' is not assignable to type 'ReadableStream<Uint8Array>'.
    The types returned by 'getReader()' are incompatible between these types.
      Type 'ReadableStreamDefaultReader<{ context: Document<Record<string, any>>[]; answer: string; } & { [key: string]: unknown; }>' is not assignable to type 'ReadableStreamDefaultReader<Uint8Array>'.
        Type '{ context: Document<Record<string, any>>[]; answer: string; } & { [key: string]: unknown; }' is missing the following properties from type 'Uint8Array': BYTES_PER_ELEMENT, buffer, byteLength, byteOffset, and 27 more.ts(2345)
const stream: IterableReadableStream<{
    context: Document<Record<string, any>>[];
    answer: string;
} & {
    [key: string]: unknown;
}>


MMrinank ``` const stream = await retrivalChain.stream({ input: 'what is hello world', })...

M

Mrinank•5/13/24, 10:48 PM

i am using import { createRetrievalChain } from 'langchain/chains/retrieval';
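The type error above says that `Response` expects a stream of bytes (`Uint8Array`), while the chain yields plain objects. One possible workaround, sketched here under the assumptions that each chunk is JSON-serializable and that the client reads server-sent events, is to encode every chunk before handing the stream to `Response`. The helper name `toSseBytes` is hypothetical and is not part of LangChain or Workers.

```typescript
// Hypothetical helper: turns a stream of JSON-serializable objects into a
// byte stream in server-sent-events framing ("data: <json>\n\n" per chunk).
function toSseBytes<T>(source: ReadableStream<T>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return source.pipeThrough(
    new TransformStream<T, Uint8Array>({
      transform(chunk, controller) {
        // Serialize each object and encode it as UTF-8 bytes.
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(chunk)}\n\n`));
      },
    }),
  );
}
```

With this in place, the handler would return `new Response(toSseBytes(stream), { headers })` instead of passing `stream` directly. Whether the whole object or only its `answer` field should be sent is a design choice left to the application.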

S

scottoOP•5/14/24, 8:55 PM

Should Whisper work with the webm format? The OpenAI docs say it's supported, but testing against the API from Cloudflare I see an error: Invalid or incomplete input for the model: model returned: Invalid audio input: F

C

codebam•5/14/24, 10:40 PM

is there a way to wrap `env.AI.run()` in `ctx.passThroughOnException` while still saving the result?

C

codebam•5/14/24, 10:44 PM

I don't actually understand how `passThroughOnException` works. I just want my worker to respond properly when the models time out

J

James•5/14/24, 11:17 PM

you'd probably want to just:

try {
    await env.AI.run(...)
} catch (e) {
    return new Response('handle errors here')
}


and then consider retrying on errors if it makes sense.
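A minimal sketch of the retry idea, assuming a fixed number of attempts with exponential backoff is acceptable for the use case. The helper name `withRetry` and its default values are illustrative assumptions, not part of the Workers AI API.

```typescript
// Hypothetical helper: runs fn up to `attempts` times, waiting
// baseDelayMs, 2*baseDelayMs, 4*baseDelayMs, ... between failed attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e;
      // Sleep before the next attempt, but not after the final failure.
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

Inside a Worker this could wrap the call as `await withRetry(() => env.AI.run(model, inputs))`, with the surrounding `try`/`catch` still returning an error response once the retries are exhausted.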

J

James•5/14/24, 11:18 PM

`passThroughOnException` is more intended for when you have an origin behind the worker

J

James•5/14/24, 11:18 PM

so that rather than showing an error page, it'll just send the request as-is to your origin. This is often referred to as a "fail-open" design
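To illustrate the fail-open pattern described above, here is a small sketch. It uses a minimal stand-in interface for the execution context so it can run outside the Workers runtime; the `boom` query parameter is a made-up trigger for a simulated failure.

```typescript
// Minimal stand-in for the part of the Workers execution context used here.
interface FailOpenContext {
  passThroughOnException(): void;
}

const handler = {
  async fetch(request: Request, _env: unknown, ctx: FailOpenContext): Promise<Response> {
    // Opt in early: if anything below throws an unhandled exception, the
    // runtime forwards the original request to the origin instead of
    // returning an error page.
    ctx.passThroughOnException();

    const url = new URL(request.url);
    if (url.searchParams.has("boom")) {
      // Simulated failure (hypothetical trigger, for illustration only).
      throw new Error("simulated failure");
    }
    return new Response("handled by the Worker");
  },
};
```

In a real Worker the object would be the module's default export. Without an origin behind the Worker there is nothing to fall back to, which is why `try`/`catch` is the better fit for the model time-out case.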

JJames you'd probably want to just: ```js try { await env.AI.run(...) } catch (e) {...

C

codebam•5/14/24, 11:34 PM

oh okay. I'll try this, thank you

S

Shanmukeshwar•5/15/24, 6:54 AM

Workers AI is very slow, avg: 10 sec

S

Shanmukeshwar•5/15/24, 6:54 AM

Any tweaks we can do here?
