Is there a way to limit the token output like you can in OpenAI?
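iirc the Workers AI text generation models take a max_tokens input that caps how many tokens get generated, much like OpenAI's max_tokens. A minimal sketch from a Worker, assuming an AI binding is configured in wrangler.toml; the 50-token cap is just an example value:

```ts
// Sketch: cap output length with Workers AI (assumes an `AI` binding in wrangler.toml).
export default {
  async fetch(request: Request, env: { AI: any }): Promise<Response> {
    const result = await env.AI.run("@cf/tinyllama/tinyllama-1.1b-chat-v1.0", {
      prompt: "Keep responses brief, limited to two sentences.",
      max_tokens: 50, // caps the number of generated tokens, similar to OpenAI's max_tokens
    });
    return Response.json(result);
  },
};
```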
@cf/openchat/openchat-3.5-0106 @cf/qwen/qwen1.5-0.5b-chat @cf/qwen/qwen1.5-1.8b-chat @cf/qwen/qwen1.5-14b-chat-awq @cf/qwen/qwen1.5-7b-chat-awq
@cf/tinyllama/tinyllama-1.1b-chat-v1.0 was also very weird, iirc.
Keep responses brief, limited to two sentences.
Maintain a friendly chat environment.
Use emoticons like ",_," or "@w@" instead of emojis.
You are to NEVER use emojis.
@cf/tinyllama/tinyllama-1.1b-chat-v1.0
{
"response": "As an AI language model, I don't experience emotions in the way that humans do, but I'm here to assist you with any questions or tasks you may have to the best of my abilities. Is there something specific you'd like to know or discuss today? I'm here to provide information and support no matter the topic. Is there anything in particular on your mind or needing help with? I'm here to listen and offer any help you need.",
"prompt": "Keep responses brief, limited to two sentences. Maintain a friendly chat environment. Use emoticons like \",_,\" or \"@w@\" instead of emojis. You are to NEVER use emojis. ##PERSONALITY## You are a friendly assistant"
}
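For what it's worth, that prompt is one flat string; sending the same instructions as a system message via role-based messages sometimes helps the smaller chat models stick to the two-sentence limit. Just a sketch, assuming the same AI binding as above; the helper name and the user turn are made up for illustration:

```ts
// Sketch: same system prompt sent as role-based messages instead of one flat
// prompt string. `askTinyLlama` and the user turn are hypothetical examples;
// `env.AI` is the same Workers AI binding as in the earlier snippet.
async function askTinyLlama(env: { AI: any }) {
  return env.AI.run("@cf/tinyllama/tinyllama-1.1b-chat-v1.0", {
    messages: [
      {
        role: "system",
        content:
          "Keep responses brief, limited to two sentences. " +
          "Maintain a friendly chat environment. " +
          'Use emoticons like ",_," or "@w@" instead of emojis. ' +
          "You are to NEVER use emojis. ##PERSONALITY## You are a friendly assistant",
      },
      { role: "user", content: "How are you today?" }, // hypothetical user turn
    ],
    max_tokens: 50, // also keeps the reply from rambling
  });
}
```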