Total token limit stuck at 131
I use vLLM and set max model length to 8000, but the output is only 131 tokens (input + output combined), even though I have set max tokens to 2048. I tried with two models and the result is the same.
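For context, a minimal offline vLLM sketch of where the two limits live (model name and prompt are placeholders): max_model_len caps prompt plus completion for the engine, while max_tokens in SamplingParams caps only the generated completion, so if the request's max_tokens is ignored or never reaches the engine, a much smaller default can truncate the output.

```python
from vllm import LLM, SamplingParams

# Engine-level cap: prompt + completion together may not exceed max_model_len.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", max_model_len=8000)

# Request-level cap: max_tokens limits only the generated completion.
params = SamplingParams(max_tokens=2048, temperature=0.7)

outputs = llm.generate(["Explain why a generation might stop early."], params)
print(len(outputs[0].outputs[0].token_ids), "tokens generated")
print(outputs[0].outputs[0].text)
```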


8 Replies
Unknown User•6mo ago
Message Not Public
Does RunPod do JSON schema validation?
Why did that invalid JSON not cause an error?
Unknown User•6mo ago
Message Not Public
It should return a 4xx error
(if they do validation at all)
Unknown User•6mo ago
Message Not Public
Essentially:
4xx: it's your fault
5xx: "uh oh, I messed up" (server error)
3xx: go somewhere else
Unknown User•6mo ago
Message Not Public
That's unfortunate.
A pydantic schema and JSON validation would be nice.
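For what it's worth, a rough sketch of what request-side validation with pydantic could look like (the field names and handler are hypothetical, not RunPod's actual schema): malformed or out-of-range fields come back as a 400 instead of being silently dropped.

```python
from pydantic import BaseModel, Field, ValidationError

class GenerationRequest(BaseModel):
    # Hypothetical request schema; field names are illustrative only.
    prompt: str
    max_tokens: int = Field(default=2048, ge=1, le=8000)

def handle(raw_body: dict) -> tuple[int, dict]:
    """Return (status_code, body); reject schema violations with a 4xx."""
    try:
        req = GenerationRequest(**raw_body)
    except ValidationError as err:
        return 400, {"error": err.errors()}
    return 200, {"accepted": req.model_dump()}

# An invalid payload (max_tokens is not an integer) now fails loudly:
print(handle({"prompt": "hi", "max_tokens": "lots"}))
```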