trying to use glm-4.6v (from zai provider) for image and video understanding

quick question about media understanding config

i'm trying to use glm-4.6v (from zai provider) for image and video understanding. i already have zai configured and working for text with the ZAI_API_KEY

is there a way to set it up as a media.models entry directly using the zai provider? or do i need to use the cli fallback approach with the zai command?

{
"provider": "zai",
"model": "glm-4.6v",
"capabilities": ["image", "video"]
}

or does it have to be:
{
"type": "cli",
"command": "zai",
"args": ["-m", "glm-4.6v", "{{MediaPath}}"],
"capabilities": ["image", "video"]
}

also wondering if anyone has gotten glm vision models working for media understanding, and what the exact config looks like

thanks 🦞

Solution
Great question! For the ZAI provider with glm-4.6v, here's what you need to know:

Direct Provider Approach (if supported):
{
  "provider": "zai",
  "model": "glm-4.6v",
  "capabilities": ["image", "video"]
}
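If you're not sure which form is accepted, one option is to list both entries, direct provider first and the CLI command second. This is only a sketch: it assumes the `media.models` list mentioned in the question accepts mixed entry types and tries them in order, which this thread does not confirm, and the enclosing `media` key may be nested differently in your config. The two entries themselves are copied unchanged from the question.

```json
{
  "media": {
    "models": [
      {
        "provider": "zai",
        "model": "glm-4.6v",
        "capabilities": ["image", "video"]
      },
      {
        "type": "cli",
        "command": "zai",
        "args": ["-m", "glm-4.6v", "{{MediaPath}}"],
        "capabilities": ["image", "video"]
      }
    ]
  }
}
```

If the direct entry is rejected or ignored, that points to the provider not supporting media directly, in which case the CLI entry alone is the one to keep.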


CLI Fallback (if provider doesn't support media directly):
{
  "type": "cli",
  "command": "zai",