"Can an AI do your job? " website
Basically a website where you enter a task description/... and an agent tries to accomplish it. The user can see each step of the agent's reasoning and the actions it takes.
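A rough sketch of what the backend loop could look like, just to make the idea concrete. Everything here is hypothetical: `Step`, `stub_policy` and `run_task` are made-up names, and the policy is a hard-coded stand-in for a real model call. The point is only that each step carries its reasoning and its action, so the site can show them to the user as they happen.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass
class Step:
    """One visible step of the agent: what it thought and what it did."""
    index: int
    reasoning: str
    action: str
    observation: str


def stub_policy(task: str, history: List[Step]) -> Tuple[str, str]:
    """Hypothetical stand-in for a real model call.

    Returns (reasoning, action). It "searches" twice and then finishes.
    """
    if len(history) < 2:
        return (f"Need more information for: {task}", f"search({task!r})")
    return ("Enough information gathered.", "finish")


def run_task(
    task: str,
    policy: Callable[[str, List[Step]], Tuple[str, str]] = stub_policy,
    max_steps: int = 5,
) -> Iterator[Step]:
    """Yield steps one at a time so a web page could stream them."""
    history: List[Step] = []
    for i in range(1, max_steps + 1):
        reasoning, action = policy(task, history)
        # No real tools are called here; the observation is a placeholder.
        observation = "done" if action == "finish" else f"(stub result for {action})"
        step = Step(i, reasoning, action, observation)
        history.append(step)
        yield step
        if action == "finish":
            break


if __name__ == "__main__":
    for step in run_task("summarize my inbox"):
        print(f"[{step.index}] thought: {step.reasoning}")
        print(f"    action: {step.action} -> {step.observation}")
```

`max_steps` is there because of the cost issue: a hard cap on steps per task is the simplest way to bound what a single submission can spend.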
Goal 1: let people see what agents can already do and how good they are at reasoning and taking actions. Even though agents are not great yet, many programmers were very surprised when they ran agents with verbose=True. Devin has video demos, but those are still targeted at software developers. I don't know of any other resource that lets the general public see what agents can do.
Goal 2: a bit of a long shot, but eventually the site could be useful for gathering task descriptions for METR. They are already outsourcing this, and a website might help collect a wider variety of tasks.
Problem: the up-front costs, since running agents is not cheap. Maybe an MVP can be run with a small pool of people to check whether it pays off the investment.