Hi, new here, relatively new to LLMs as well. VERY new to remote computing.
I've got a llama-cpp python script that I've been tinkering with on my laptop, and I'd like to rent a remote GPU to speed up text generation.
I haven't found a tutorial on how to do what I really want. I'd like to be able to simply hit 'run' from VS Code and have the LLM set up on a remote GPU, so that I can send my prompts over and receive the generated text back.
I'm testing a system with an unconventional prompting setup, so I can't just use an existing webUI, and I'd prefer to develop from my IDE rather than in Jupyter. Anyone got any tips or could point me in the right direction?
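Roughly what I'm imagining, to be concrete: llama-cpp-python ships an OpenAI-compatible server (`python -m llama_cpp.server --model model.gguf`) that I could run on the rented GPU box, and my laptop script would just POST prompts to it. A minimal sketch of the client side, assuming that server is running and reachable at the placeholder address `SERVER_URL`:

```python
import json
import urllib.request

# Placeholder address for the rented GPU box - replace with the real
# host/port once the server is up (default port for llama_cpp.server is 8000).
SERVER_URL = "http://your-gpu-host:8000"


def build_completion_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build the JSON body for the OpenAI-style /v1/completions endpoint."""
    return {"prompt": prompt, "max_tokens": max_tokens}


def complete(prompt: str) -> str:
    """Send a prompt to the remote server and return the generated text."""
    body = json.dumps(build_completion_request(prompt)).encode()
    req = urllib.request.Request(
        f"{SERVER_URL}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # The server returns an OpenAI-style response with a "choices" list.
    return data["choices"][0]["text"]


# Usage (needs the remote server running):
#   text = complete("Hello from my laptop:")
```

That way the custom prompting logic stays in my local script and only the generation call goes over the network. Is this the sane approach, or do people usually just SSH into the box and run everything there?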