PauseAI · 8mo ago
96 replies
hurt-tomato

TakeOverBench

🔎Research
I've been thinking about making a TakeOverBench: a benchmark that tests for capabilities relevant to takeover.

Now, we (PauseAI + ERO) are developing a website that combines existing benchmarks and links them to concrete takeover threat models.

(In the past, this topic was about an EU tender, which explains most of the first comments.)