Is it possible to run HuggingFaceTB/SmolLM2-135M-Instruct on workersai? - Cloudflare Developers

© 2026 Hedgehog Software, LLC

More

Communities Docs About Terms Privacy

mounty

Mmounty Is it possible to run HuggingFaceTB/SmolLM2-135M-Instruct on workersai?

Isaac McFadyen

I

Isaac McFadyen•4/28/25, 4:10 PM

?workers-ai-models TLDR: no.

SuperHelpflare

S

SuperHelpflare•4/28/25, 4:10 PM

Workers AI currently only supports popular open-source models provided by the Cloudflare team, as well as your own LoRAs that can be applied on top of the Cloudflare-provided models. You cannot currently upload your own models or use a model from HuggingFace. See the documentation for the list of Cloudflare-provided models: https://developers.cloudflare.com/workers-ai/models/

ByteBeast

B

ByteBeast•4/29/25, 5:14 AM

I have build an agent using agent package but I am stuck at one thing, I want to get URL params or query params in tools of MCP server like agentId so I dont have to pass agentId in tools props (paramsSchema).

nick

N

nick•4/30/25, 4:20 AM

For the text-to-image models such as flux-1-schnell the current output is simply:

{ image: string }

{ image: string }

is there a way to have it output the usage data, similar to the

{ usage: { prompt_tokens: number, completion_tokens: number, total_tokens: number }}

{ usage: { prompt_tokens: number, completion_tokens: number, total_tokens: number }}

output for text generation models? Based on the pricing, I'd assume it would be something like

{ usage: { tiles: number, steps: number }}

{ usage: { tiles: number, steps: number }}

? Or does this just need to be calculated on each usage on our end using the default step of 4 if none is provided as input, and measuring the output image size?

Regarding output image size. Is there or will there be a way to specify the desired image output resolution? The image size i'm getting back is 1024x1024, so four 512x512 tiles. Just want to confirm this will always be the case?

Cloudflare Docs

FLUX.1 [schnell] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions.

antim8

A

antim8•4/30/25, 12:54 PM

It seems like some models are not available in the worker types. For example https://developers.cloudflare.com/workers-ai/models/llama-3.1-8b-instruct-fast/

or did i miss something?

Cloudflare Docs

llama-3.1-8b-instruct-fast

[Fast version] The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common indus...

antim8

A

antim8•4/30/25, 12:58 PM

those are available in the code. Seems like there are a bunch of model missing

Rabid

R

Rabid•4/30/25, 7:12 PM

Anyone have a simple example for scoring NSFW images? I tried resnet-50 with a very NWFW image, and it just adds labels like "bath towel"

mr.niko.la

M

mr.niko.la•4/30/25, 8:35 PM

[ai]
binding = "AI"

[ai]
binding = "AI"

However, the

401 Unauthorized

401 Unauthorized

errors in my logs indicate that the authentication for Workers AI is failing, despite the correct binding. This suggests issues with account permissions, missing API tokens, or a misconfiguration in the Workers AI setup.

Do i need to add any token for

whisper turbo large v3

whisper turbo large v3

in the

vars

vars

section of my

wrangler.toml

wrangler.toml

file?

mr.niko.la

M

mr.niko.la•4/30/25, 8:50 PM

WORKERS_AI_TOKEN = "gdhd"

WORKERS_AI_TOKEN = "gdhd"

. i suppose i need to add the

workers ai

workers ai

api token too?

mr.niko.la

Mmr.niko.la `WORKERS_AI_TOKEN = "gdhd"`. i suppose i need to add the `workers ai` api token ...

Isaac McFadyen

I

Isaac McFadyen•4/30/25, 9:11 PM

You shouldn't need a token when you're using the binding, but how are you calling the model?

Isaac McFadyen

I

Isaac McFadyen•4/30/25, 9:11 PM

The binding is separate from the API, and if you're using the binding you shouldn't be calling the API at all

Isaac McFadyen

IIsaac McFadyen You shouldn't need a token when you're using the binding, but how are you callin...

mr.niko.la

M

mr.niko.la•4/30/25, 11:25 PM

Hi Isaac, thanks for the clarification! I'm using the AI binding to call the @cf/openai/whisper-large-v3-turbo model via env.AI.run in my Worker. Here’s a sample:

          try {
            console.log(`[BYPASS] Attempt 3: Using AI binding with raw bytes`);
            const aiTranscription = await this.env.AI.run('@cf/openai/whisper-large-v3-turbo', {
              audio: [...audioBytes], // Convert Uint8Array to regular array
              language: 'en',
            });

          try {
            console.log(`[BYPASS] Attempt 3: Using AI binding with raw bytes`);
            const aiTranscription = await this.env.AI.run('@cf/openai/whisper-large-v3-turbo', {
              audio: [...audioBytes], // Convert Uint8Array to regular array
              language: 'en',
            });

          try {
            console.log(`[BYPASS] Attempt 2: Using AI binding with base64`);
            const aiTranscription = await this.env.AI.run('@cf/openai/whisper-large-v3-turbo', {
              audio: base64Audio,
              language: 'en',
            });

          try {
            console.log(`[BYPASS] Attempt 2: Using AI binding with base64`);
            const aiTranscription = await this.env.AI.run('@cf/openai/whisper-large-v3-turbo', {
              audio: base64Audio,
              language: 'en',
            });

Despite the binding in wrangler.toml ([ai] binding = "AI"), I’m getting 401 Unauthorized errors. Could this be an account permission issue or a binding misconfiguration? Any suggestions to debug?

mr.niko.la

Mmr.niko.la Hi Isaac, thanks for the clarification! I'm using the AI binding to call the @cf...

Isaac McFadyen

I

Isaac McFadyen•4/30/25, 11:36 PM

Is this in a Durable Object? Why are you using the

this

this

there?

Isaac McFadyen

IIsaac McFadyen Is this in a Durable Object? Why are you using the `this` there?

mr.niko.la

M

mr.niko.la•4/30/25, 11:56 PM

My

wrangler.toml

wrangler.toml

defines

AUDIO_PROCESSOR

AUDIO_PROCESSOR

as a Durable Object, and the code uses this.env.AI,

# Define the Durable Objects
[[durable_objects.bindings]]
name = "AUDIO_PROCESSOR"
class_name = "AudioProcessor"

# External DO binding for LLM worker
[[durable_objects.bindings]]
name = "LLM_PROCESSOR"
class_name = "LlmProcessor"
script_name = "llm"

# Add queue consumers with explicit handler functions
[[queues.consumers]]

[vars]


# Define the Durable Object class
[[migrations]]
tag = "v1"
new_classes = ["AudioProcessor"]

[[kv_namespaces]]


[[d1_databases]]


[[queues.consumers]]


[[queues.producers]]


[ai]
binding = "AI"

[observability]
enabled = true
head_sampling_rate = 1

# Define the Durable Objects
[[durable_objects.bindings]]
name = "AUDIO_PROCESSOR"
class_name = "AudioProcessor"

# External DO binding for LLM worker
[[durable_objects.bindings]]
name = "LLM_PROCESSOR"
class_name = "LlmProcessor"
script_name = "llm"

# Add queue consumers with explicit handler functions
[[queues.consumers]]

[vars]


# Define the Durable Object class
[[migrations]]
tag = "v1"
new_classes = ["AudioProcessor"]

[[kv_namespaces]]


[[d1_databases]]


[[queues.consumers]]


[[queues.producers]]


[ai]
binding = "AI"

[observability]
enabled = true
head_sampling_rate = 1

i removed the other bindings that were not relevant....

The sample code provided for Typsescipt uses the api only
https://developers.cloudflare.com/workers-ai/models/whisper-large-v3-turbo/

So im kinda confused and vibing use groq and claude to find the right way to do it thx

Cloudflare Docs

whisper-large-v3-turbo

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.

acusti

A

acusti•5/1/25, 2:19 AM

i’ve been using workers AI via the binding from my worker for many months and it‘s been great. overall, very happy with it, thanks to all who are putting in the work to make it work so well.

however, there is a frustrating DX paper cut, which is that when running the worker locally in dev, i have to authenticate at least once a day. it gets triggered by invoking

env.AI.run(…)

env.AI.run(…)

, and i get an authentication error in the console and it opens my default web browser to the cloudflare dashboard’s grant permissions UI (or sometimes crashes the process and i just run

wrangler login

wrangler login

manually to do it). is this normal and expected?

mr.niko.la

Mmr.niko.la My `wrangler.toml` defines `AUDIO_PROCESSOR` as a Durable Object, and the code u...

mr.niko.la

M

mr.niko.la•5/1/25, 6:57 AM

Capacity error, will retry: 3040: Capacity temporarily exceeded, please try again

Capacity error, will retry: 3040: Capacity temporarily exceeded, please try again

cf whisper worker ai. @Isaac | AS213339 is this common?

Isaac McFadyen

IIsaac McFadyen Is this in a Durable Object? Why are you using the `this` there?

mr.niko.la

M

mr.niko.la•5/1/25, 7:13 AM

export class AudioProcessor {
  constructor(state, env) {
    this.state = state;
    this.env = env;
  }

export class AudioProcessor {
  constructor(state, env) {
    this.state = state;
    this.env = env;
  }

TechOnyx

T

TechOnyx•5/1/25, 8:19 AM

I recently wanted to integrate AI into my site, so I go into Cloudflare Dash > AI > Choose Workers API and create it, and then I deleted it immediately because I wanted to use the Rest API, but I can’t see that screen where you choose amongst worker and rest api, it keeps redirecting me to ai/usage

TechOnyx

T

TechOnyx•5/1/25, 8:24 AM

Please help me, I'm locked out from the AI section, I'm unable to get my api keys and all that stuff

TechOnyx

T

TechOnyx•5/1/25, 8:25 AM

I'm stuck on /usage page

rob

R

rob•5/2/25, 2:44 PM

How do we know what models r the best for what use case. / modalities ? Is there like a matrix somewhere

rob

Rrob How do we know what models r the best for what use case. / modalities ? Is there...

Isaac McFadyen

I

Isaac McFadyen•5/2/25, 8:26 PM

Just a matter of testing with your specific use-case

Isaac McFadyen

I

Isaac McFadyen•5/2/25, 8:26 PM

Specialty models like Qwen Coder will generally be better at their specific specialties than a non-specific model llike LLaMa

Keshav G

K

Keshav G•5/3/25, 8:08 AM

hello all
i am using runwithtools to build a rag chat app. the problem is the output message being outputted is very short, and the runwithtools doesn't respect the max_tokens. how to fix this ? kindly help

import { runWithTools } from '@cloudflare/ai-utils';
const model = '@cf/meta/llama-3.3-70b-instruct-fp8-fast';
const response = await runWithTools(
                    env.AI,
                    model,
                    {
                        messages: chatmessages,
                        tools: function_tools,
                        max_tokens: 8192,
                        temperature: 0.6,
                    },
                    { strictValidation: true, maxRecursiveToolRuns: 1, verbose: true, streamFinalResponse: true }
                );

import { runWithTools } from '@cloudflare/ai-utils';
const model = '@cf/meta/llama-3.3-70b-instruct-fp8-fast';
const response = await runWithTools(
                    env.AI,
                    model,
                    {
                        messages: chatmessages,
                        tools: function_tools,
                        max_tokens: 8192,
                        temperature: 0.6,
                    },
                    { strictValidation: true, maxRecursiveToolRuns: 1, verbose: true, streamFinalResponse: true }
                );

Keshav G

K

Keshav G•5/3/25, 2:23 PM

ok i sorted this issue, created a new package with these additional parameters.

now how can i get the final response and the previous tool response, to store as context for the ongoing conversation ?

acusti

Aacusti i’ve been using workers AI via the binding from my worker for many months and it...

Viktor

V

Viktor•5/3/25, 5:52 PM

I logged in once, and that's it.
Sometimes it asks me which account I want to use (my login has multiple accounts connected to it), but that's like once in a week.

Keshav G

KKeshav G ok i sorted this issue, created a new package with these additional parameters. ...

Viktor

V

Viktor•5/3/25, 6:01 PM

I honestly love everything about Cloudflare AI, except for their

ai-utils

ai-utils

. I spend a lot of time troubleshooting it, it tracks token usage wrong (returns only the last ones), calls

AI.run

AI.run

more often than necessary, etc. See #Is module @cloudflare/ai-utils stable? > I just wrote the function calling in my own code, works way better, and doesn't inflict double usage consumption. And I'd recommend you to do so, too, tbh

Mark

M

Mark•5/4/25, 2:54 AM

Does anyone still having index issue with AutoRag ?

Mark

MMark Does anyone still having index issue with AutoRag ?

iammac

I

iammac•5/4/25, 7:19 AM

Thread at #autorag

Nikhil Kachare

N

Nikhil Kachare•5/5/25, 9:59 AM

We recently transitioned from using the OpenAI Whisper API to Workers AI, using the @cf/openai/whisper-large-v3-turbo model. The experience has been excellent so far. One significant difference we've observed is with the language parameter: it defaults to 'en', unlike OpenAI, which automatically detects the language. Is this a feature missing from Workers AI, or am I overlooking something?

Viktor

VViktor I honestly love everything about Cloudflare AI, **except** for their `ai-utils`....

Keshav G

K

Keshav G•5/5/25, 7:35 PM

yes, i have modified the function to return messages array for use in context later.
i have also published the changes as a npm package

Keshav G

K

Keshav G•5/5/25, 7:38 PM

i dont know, why the development is so ignored.. also the models are very weak
now way better open source models are there like the qwen, r1, even the llama 405b. cloudflare needs to up the game asap

Aiden

A

Aiden•5/6/25, 7:23 AM

When to support qwen3

Aiden

A

Aiden•5/6/25, 7:27 AM

@Cloudflare

When to support qwen3

Aiden

AAiden @Cloudflare When to support qwen3

Isaac McFadyen

I

Isaac McFadyen•5/6/25, 2:34 PM

?pings

SuperHelpflare

S

SuperHelpflare•5/6/25, 2:34 PM

Please do not ping community members for non-moderation reasons. Doing so will not solve your issue faster and will make people less likely to want to help you.

Isaac McFadyen

I

Isaac McFadyen•5/6/25, 2:35 PM

That is a bot but the point stands, do not try to ping people please

cipherfunk

C

cipherfunk•5/7/25, 1:14 AM

Does the Cloudflare implementation of MeloTTS support any voice other than the default? Any language code I have tried returns the same voice. Is lang: 'EN-AU' the correct formatting?

matti

M

matti•5/8/25, 10:05 AM

Hello! as of today all my local dev queries to cloudflare ai return the following error message, but it works in production. Has anyone noticed a similar thing?

matti

M

matti•5/8/25, 10:05 AM

X [ERROR] InferenceUpstreamError: 10000: Authentication error

at Ai._parseError (cloudflare-internal:ai-api:107:24)
at async Ai.run (cloudflare-internal:ai-api:82:19)

blindChicken

B

blindChicken•5/8/25, 3:36 PM

I'm getting Error: 8001: Invalid inputError: 8001: Invalid input when I try to call tools using @cf/meta/llama-4-scout-17b-16e-instruct@cf/meta/llama-4-scout-17b-16e-instruct. If I call the model without tools it works fines, but if add the tools array it throws the error. Here is my tools array:

[
    {
        "name":"applyFilter",
        "description":"Function that applies the filter...",
        "parameters":{
            "type":"object",
            "properties":{
                "query":{
                    "type":"object",
                    "description":"The query object..."
                },
                "filterDesc":{
                    "type":"string",
                    "description":"Description, for the user, of the filter..."
                }
            },
            "required":["query","filterDesc"]
        }
    }
]

[
    {
        "name":"applyFilter",
        "description":"Function that applies the filter...",
        "parameters":{
            "type":"object",
            "properties":{
                "query":{
                    "type":"object",
                    "description":"The query object..."
                },
                "filterDesc":{
                    "type":"string",
                    "description":"Description, for the user, of the filter..."
                }
            },
            "required":["query","filterDesc"]
        }
    }
]

Here is the call to the model: await AI.run( model, { messages: messages, tools: tools });await AI.run( model, { messages: messages, tools: tools });

blindChicken

B

blindChicken•5/8/25, 3:59 PM

If I use @cf/meta/llama-3.3-70b-instruct-fp8-fast@cf/meta/llama-3.3-70b-instruct-fp8-fast instead of llama-4-scoutllama-4-scout the tool call works, but then the models responds incorrectly. Has there been a change in how tools are call between llama-3 and llama-4? Should I use FunctionsFunctions instead.

blindChicken

B

blindChicken•5/8/25, 4:06 PM

If I change it to call functions instead of tools, I get this cryptic error:

Error: 5006: Error: oneOf at '/' not met, 0 matches: required properties at '/' are 'prompt', Type mismatch of '/messages/0/content', 'array' not in 'string', Type mismatch of '/messages/1/content', 'array' not in 'string', required properties at '/functions/0' are 'name,code'

Error: 5006: Error: oneOf at '/' not met, 0 matches: required properties at '/' are 'prompt', Type mismatch of '/messages/0/content', 'array' not in 'string', Type mismatch of '/messages/1/content', 'array' not in 'string', required properties at '/functions/0' are 'name,code'

bscofield

B

bscofield•5/8/25, 7:12 PM

Howdy! I'm getting an AiError: 3043: Internal server errorAiError: 3043: Internal server error on some calls -- has anyone successfully debugged those?

blindChicken

B

blindChicken•5/8/25, 8:18 PM

I found a solution to my problem above, instead of using tool calling, I used response_formatresponse_format and as result I get response as JSON that can be easily parsed and then used to call my functions. It basically provides the same result, with the exception of allowing the model to select the tool, but one could easily set a prop in the JSON for function selection. It still sucks that tool calling appears to be broken.

bscofield

Bbscofield Howdy! I'm getting an `AiError: 3043: Internal server error` on some calls -- ha...

matti

M

matti•5/8/25, 10:24 PM

I've been getting that too, I think something is off today

matti

Mmatti I've been getting that too, I think something is off today

bscofield

B

bscofield•5/8/25, 11:05 PM

I got different results when I changed models -- maybe worth a shot for you?

matti

M

matti•5/9/25, 7:56 AM

I know yesterday I wasn't able to develop locally, I was getting auth errors all day long. That seems fixed. And I was also getting some AiError 3043 in prod on one certain flow.

dust

D

dust•5/9/25, 8:14 AM

hi guys. I have an issue fetching the api with 401 error. i fetched with vanilla js. i am not sure this is serve side issue or client side but the browser said the error401 is coming from the server. I ran the python code with the same token and works well. i have allowed the control origin but still error. hope anyone can help.

const API_BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{myID}/ai/run/"
const API_AUTH_TOKEN = "{myTOKEN}" //process.env.API_AUTH_TOKEN;
const model = "@cf/meta/llama-2-7b-chat-int8"
const headers = {
        'Authorization':`Bearer ${API_AUTH_TOKEN}`, 
        //'Content-type':'application/type',
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "GET,HEAD,POST,OPTIONS",
        "Access-Control-Max-Age": "86400"
    }

    if (!API_BASE_URL || !API_AUTH_TOKEN) {
        throw new Error('API credential is wrong or not configured from Github Action')
    }

    const inputs = [
        {'role':'system', 'content':systemPrompt},
        {'role':'user', 'content':userPrompt},
    ]

    const payload = {
        message: inputs
    }

    try {
        console.log("Requesting to LLM...")
        const response = await fetch(`${API_BASE_URL}${model}`, {
            method: 'POST', 
            headers:headers, 
            body: JSON.stringify(payload), 
            mode: 'no-cors'
        }
        );

        if (!response.ok) {
            throw new Error (`Error request from LLM: ${response.status}`)
        }
        console.log("Requesting completed. Waiting for output...")
        const output = await response.json(); // output
        console.log(output)
    }
    catch (error) {
        console.log("API Error", error);
        throw error;
    }

const API_BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{myID}/ai/run/"
const API_AUTH_TOKEN = "{myTOKEN}" //process.env.API_AUTH_TOKEN;
const model = "@cf/meta/llama-2-7b-chat-int8"
const headers = {
        'Authorization':`Bearer ${API_AUTH_TOKEN}`, 
        //'Content-type':'application/type',
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "GET,HEAD,POST,OPTIONS",
        "Access-Control-Max-Age": "86400"
    }

    if (!API_BASE_URL || !API_AUTH_TOKEN) {
        throw new Error('API credential is wrong or not configured from Github Action')
    }

    const inputs = [
        {'role':'system', 'content':systemPrompt},
        {'role':'user', 'content':userPrompt},
    ]

    const payload = {
        message: inputs
    }

    try {
        console.log("Requesting to LLM...")
        const response = await fetch(`${API_BASE_URL}${model}`, {
            method: 'POST', 
            headers:headers, 
            body: JSON.stringify(payload), 
            mode: 'no-cors'
        }
        );

        if (!response.ok) {
            throw new Error (`Error request from LLM: ${response.status}`)
        }
        console.log("Requesting completed. Waiting for output...")
        const output = await response.json(); // output
        console.log(output)
    }
    catch (error) {
        console.log("API Error", error);
        throw error;
    }

          try {
            console.log(`[BYPASS] Attempt 3: Using AI binding with raw bytes`);
            const aiTranscription = await this.env.AI.run('@cf/openai/whisper-large-v3-turbo', {
              audio: [...audioBytes], // Convert Uint8Array to regular array
              language: 'en',
            });

          try {
            console.log(`[BYPASS] Attempt 3: Using AI binding with raw bytes`);
            const aiTranscription = await this.env.AI.run('@cf/openai/whisper-large-v3-turbo', {
              audio: [...audioBytes], // Convert Uint8Array to regular array
              language: 'en',
            });

          try {
            console.log(`[BYPASS] Attempt 2: Using AI binding with base64`);
            const aiTranscription = await this.env.AI.run('@cf/openai/whisper-large-v3-turbo', {
              audio: base64Audio,
              language: 'en',
            });

          try {
            console.log(`[BYPASS] Attempt 2: Using AI binding with base64`);
            const aiTranscription = await this.env.AI.run('@cf/openai/whisper-large-v3-turbo', {
              audio: base64Audio,
              language: 'en',
            });

# Define the Durable Objects
[[durable_objects.bindings]]
name = "AUDIO_PROCESSOR"
class_name = "AudioProcessor"

# External DO binding for LLM worker
[[durable_objects.bindings]]
name = "LLM_PROCESSOR"
class_name = "LlmProcessor"
script_name = "llm"

# Add queue consumers with explicit handler functions
[[queues.consumers]]

[vars]


# Define the Durable Object class
[[migrations]]
tag = "v1"
new_classes = ["AudioProcessor"]

[[kv_namespaces]]


[[d1_databases]]


[[queues.consumers]]


[[queues.producers]]


[ai]
binding = "AI"

[observability]
enabled = true
head_sampling_rate = 1

# Define the Durable Objects
[[durable_objects.bindings]]
name = "AUDIO_PROCESSOR"
class_name = "AudioProcessor"

# External DO binding for LLM worker
[[durable_objects.bindings]]
name = "LLM_PROCESSOR"
class_name = "LlmProcessor"
script_name = "llm"

# Add queue consumers with explicit handler functions
[[queues.consumers]]

[vars]


# Define the Durable Object class
[[migrations]]
tag = "v1"
new_classes = ["AudioProcessor"]

[[kv_namespaces]]


[[d1_databases]]


[[queues.consumers]]


[[queues.producers]]


[ai]
binding = "AI"

[observability]
enabled = true
head_sampling_rate = 1

import { runWithTools } from '@cloudflare/ai-utils';
const model = '@cf/meta/llama-3.3-70b-instruct-fp8-fast';
const response = await runWithTools(
                    env.AI,
                    model,
                    {
                        messages: chatmessages,
                        tools: function_tools,
                        max_tokens: 8192,
                        temperature: 0.6,
                    },
                    { strictValidation: true, maxRecursiveToolRuns: 1, verbose: true, streamFinalResponse: true }
                );

import { runWithTools } from '@cloudflare/ai-utils';
const model = '@cf/meta/llama-3.3-70b-instruct-fp8-fast';
const response = await runWithTools(
                    env.AI,
                    model,
                    {
                        messages: chatmessages,
                        tools: function_tools,
                        max_tokens: 8192,
                        temperature: 0.6,
                    },
                    { strictValidation: true, maxRecursiveToolRuns: 1, verbose: true, streamFinalResponse: true }
                );

[
    {
        "name":"applyFilter",
        "description":"Function that applies the filter...",
        "parameters":{
            "type":"object",
            "properties":{
                "query":{
                    "type":"object",
                    "description":"The query object..."
                },
                "filterDesc":{
                    "type":"string",
                    "description":"Description, for the user, of the filter..."
                }
            },
            "required":["query","filterDesc"]
        }
    }
]

[
    {
        "name":"applyFilter",
        "description":"Function that applies the filter...",
        "parameters":{
            "type":"object",
            "properties":{
                "query":{
                    "type":"object",
                    "description":"The query object..."
                },
                "filterDesc":{
                    "type":"string",
                    "description":"Description, for the user, of the filter..."
                }
            },
            "required":["query","filterDesc"]
        }
    }
]

Error: 5006: Error: oneOf at '/' not met, 0 matches: required properties at '/' are 'prompt', Type mismatch of '/messages/0/content', 'array' not in 'string', Type mismatch of '/messages/1/content', 'array' not in 'string', required properties at '/functions/0' are 'name,code'

Error: 5006: Error: oneOf at '/' not met, 0 matches: required properties at '/' are 'prompt', Type mismatch of '/messages/0/content', 'array' not in 'string', Type mismatch of '/messages/1/content', 'array' not in 'string', required properties at '/functions/0' are 'name,code'

const API_BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{myID}/ai/run/"
const API_AUTH_TOKEN = "{myTOKEN}" //process.env.API_AUTH_TOKEN;
const model = "@cf/meta/llama-2-7b-chat-int8"
const headers = {
        'Authorization':`Bearer ${API_AUTH_TOKEN}`, 
        //'Content-type':'application/type',
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "GET,HEAD,POST,OPTIONS",
        "Access-Control-Max-Age": "86400"
    }

    if (!API_BASE_URL || !API_AUTH_TOKEN) {
        throw new Error('API credential is wrong or not configured from Github Action')
    }

    const inputs = [
        {'role':'system', 'content':systemPrompt},
        {'role':'user', 'content':userPrompt},
    ]

    const payload = {
        message: inputs
    }

    try {
        console.log("Requesting to LLM...")
        const response = await fetch(`${API_BASE_URL}${model}`, {
            method: 'POST', 
            headers:headers, 
            body: JSON.stringify(payload), 
            mode: 'no-cors'
        }
        );

        if (!response.ok) {
            throw new Error (`Error request from LLM: ${response.status}`)
        }
        console.log("Requesting completed. Waiting for output...")
        const output = await response.json(); // output
        console.log(output)
    }
    catch (error) {
        console.log("API Error", error);
        throw error;
    }

const API_BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{myID}/ai/run/"
const API_AUTH_TOKEN = "{myTOKEN}" //process.env.API_AUTH_TOKEN;
const model = "@cf/meta/llama-2-7b-chat-int8"
const headers = {
        'Authorization':`Bearer ${API_AUTH_TOKEN}`, 
        //'Content-type':'application/type',
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "GET,HEAD,POST,OPTIONS",
        "Access-Control-Max-Age": "86400"
    }

    if (!API_BASE_URL || !API_AUTH_TOKEN) {
        throw new Error('API credential is wrong or not configured from Github Action')
    }

    const inputs = [
        {'role':'system', 'content':systemPrompt},
        {'role':'user', 'content':userPrompt},
    ]

    const payload = {
        message: inputs
    }

    try {
        console.log("Requesting to LLM...")
        const response = await fetch(`${API_BASE_URL}${model}`, {
            method: 'POST', 
            headers:headers, 
            body: JSON.stringify(payload), 
            mode: 'no-cors'
        }
        );

        if (!response.ok) {
            throw new Error (`Error request from LLM: ${response.status}`)
        }
        console.log("Requesting completed. Waiting for output...")
        const output = await response.json(); // output
        console.log(output)
    }
    catch (error) {
        console.log("API Error", error);
        throw error;
    }