Theo's Typesafe Cult•7mo ago

Is DeepSeek-R1T-Chimera, like Grok 3 Mini, in the most attractive quadrant?

DeepSeek-R1T-Chimera merges the intelligence of R1 with the token efficiency of V3 and might be similar to Grok 3 Mini (https://x.com/tngtech/status/1916284566127444468). Theo wrote that Grok 3 Mini is the only model with both a "cost to run intelligence index" below 128 and an "artificial analysis intelligence index" above 56 (https://x.com/theo/status/1920949723017412699). R1T has gotten a lot of traction, e.g. on Reddit (https://www.reddit.com/r/JanitorAI_Official/comments/1k96qbs/new_deepseek_r1t_chimera_through_openrounter/) and Hugging Face (https://huggingface.co/tngtech/DeepSeek-R1T-Chimera), so how about incorporating it into T3 Chat?
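For anyone who wants to check a model against that quadrant once numbers are available, here is a minimal sketch of the test as described in the post. The two thresholds (128 and 56) come from the post; the function name, constant names, and the example scores are my own placeholders and not real results for any model.

```python
# Thresholds as quoted in the post above.
COST_INDEX_MAX = 128          # "cost to run intelligence index" must be below this
INTELLIGENCE_INDEX_MIN = 56   # "artificial analysis intelligence index" must be above this


def in_attractive_quadrant(cost_index: float, intelligence_index: float) -> bool:
    """Return True if a model is both cheap enough and smart enough (hypothetical helper)."""
    return cost_index < COST_INDEX_MAX and intelligence_index > INTELLIGENCE_INDEX_MIN


if __name__ == "__main__":
    # Placeholder scores for illustration only, not measurements of a real model.
    example_cost, example_intelligence = 100, 60
    print(in_attractive_quadrant(example_cost, example_intelligence))
```

Both comparisons are strict, so a model sitting exactly on either threshold would not count as being in the quadrant.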

TNG Technology Consulting GmbH (@tngtech) on X

Today we release DeepSeek-R1T-Chimera, an open weights model adding R1 reasoning to @deepseek_ai V3-0324 with a novel construction method. In benchmarks, it appears to be as smart as R1 but much faster, using 40% fewer output tokens. The Chimera is a child LLM, using V3s

Theo - t3.gg (@theo) on X

This chart is breaking my brain. When you compare cost against score, the ONLY model in the green is Grok 3 Mini.

[Mature Content] From the JanitorAI_Official community on Reddit: [...

Posted by imowlekk - 273 votes and 257 comments

4 Replies

zitter•7mo ago

Comparing the graphs, it looks like it would have to sit to the left of DeepSeek R1 in cost at the same quality, but are there any benchmarks?

TNGDKOP•7mo ago

Benchmarks have not been released yet. The vibe check is: as intelligent as R1, but with a shorter thinking process: https://www.reddit.com/r/JanitorAI_Official/comments/1k96qbs/new_deepseek_r1t_chimera_through_openrounter/

zitter•7mo ago

So do we have to wait, or can't someone just run the benchmarks?

TNGDKOP•6mo ago

Are there any plans to make this model available? It has overtaken R1 on Chutes (4.5 billion input tokens per day are being processed): https://chutes.ai/app/chute/aef797d4-f375-5beb-9986-3ad245947469?tab=stats

Chutes

Run Anything, Decentralised.
