Evaluation of an Agentic Application as a part of my BSc research
Just about to finish my bachelor's thesis (10 years since my last exam, fml) and I'm thinking of taking it a step further.
It would be nice to give the research some tangible results like time saved, accuracy and general satisfaction.
I'm looking for NGOs or small teams that would run my agent for about 1-2 weeks and report back.
Where should I look for people who would do this?
I'm open to suggestions on how to do this safely, how to track usage, or any other feedback.
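On the tracking side, this is roughly what I have in mind: append one JSON line per completed goal so time saved and success rate can be computed afterwards. Nothing like this exists in the repo yet; the file name, fields, and function below are placeholders.

```typescript
// usage-log.ts (placeholder name): per-run usage tracking sketch.
// Each completed goal is appended as one JSON line, so the file can later be
// aggregated into time-saved and success-rate numbers for the thesis.
import { appendFile } from "node:fs/promises";

interface RunRecord {
  runId: string;       // unique id for this agent run
  goal: string;        // the single goal the agent executed
  startedAt: string;   // ISO timestamp when the run started
  finishedAt: string;  // ISO timestamp when the run finished
  toolCalls: number;   // how many MCP tool calls were made
  success: boolean;    // reported by the tester or a post-run check
  notes?: string;      // free-form tester feedback
}

export async function logRun(record: RunRecord, path = "usage.jsonl"): Promise<void> {
  // One JSON object per line keeps the log append-only and trivial to parse later.
  await appendFile(path, JSON.stringify(record) + "\n", "utf8");
}
```

The idea would be to call logRun() once right after the agent finishes its goal, with timestamps captured around the run, and aggregate the JSONL file at the end of the 1-2 week test window.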
Current state:
- Ollama-powered agent runs in the terminal, executes one goal, then shuts off
- the models are run locally (gpt-oss:20b works well)
- written in TypeScript, using @modelcontextprotocol for the client, server, and transport
- uses the @modelcontextprotocol filesystem MCP server to access the local filesystem (rough wiring sketch after this list)
- uses a custom MCP server to connect to BunnyCDN
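For context, the client/transport wiring is roughly what the TypeScript MCP SDK gives you out of the box: spawn the filesystem server over stdio and hand its tools to the LLM. The package entry points and the workspace directory below are assumptions for illustration, not code copied from the repo.

```typescript
// Rough sketch of an MCP client that spawns the filesystem server over stdio.
// Entry points and the allowed directory ("./workspace") are assumptions.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  const client = new Client({ name: "uplink-agent", version: "0.1.0" });

  // Spawn the filesystem MCP server as a child process; the last argument is the
  // only directory the server may touch (a useful safety boundary for testers).
  const transport = new StdioClientTransport({
    command: "npx",
    args: ["-y", "@modelcontextprotocol/server-filesystem", "./workspace"],
  });

  await client.connect(transport);

  // List the tools the server exposes; the agent loop would forward these to the LLM
  // and dispatch its tool calls via client.callTool({ name, arguments }).
  const { tools } = await client.listTools();
  console.log(tools.map((t) => t.name));

  await client.close();
}

main().catch(console.error);
```

The custom BunnyCDN server attaches through the same Client API, just with whatever transport it uses.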
Todos:
- implement the AI SDK to run external LLMs (users create and use their own API keys; see the sketch after this list)
- containerize the agent
- add MCP servers that match the requirements of a given project
- add a UI or expose an endpoint instead of making people work in the terminal
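For the external-LLM todo, the rough shape with the Vercel AI SDK would be something like the following; the provider and model here are only examples, and the key always comes from the tester's environment, never from the repo.

```typescript
// Hypothetical sketch of running an external model via the Vercel AI SDK.
// Provider and model are placeholders; the tester supplies their own API key.
import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";

export async function runExternalModel(prompt: string): Promise<string> {
  // The tester brings their own key via an environment variable.
  const apiKey = process.env.OPENAI_API_KEY;
  if (!apiKey) throw new Error("Set OPENAI_API_KEY to use an external model.");

  const openai = createOpenAI({ apiKey });

  const { text } = await generateText({
    model: openai("gpt-4o-mini"),
    prompt,
  });
  return text;
}
```

The idea would be to keep the local Ollama path as the default and only switch to an external provider when a key is present.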
Github repo:
https://github.com/gsicvj/uplink