Python dependencies - install from GitHub repo

The following syntax at the top of my python script fails to deploy with: pip._internal.exceptions.DistributionNotFound: No matching distribution found for maxpanda-python-sdk
# extra_requirements:
# maxpanda-python-sdk @ git+https://git@github.com/renovoenergy/maxpanda_python_sdk.git
This syntax works from the command line and requirements.txt file locally. Is this possible in Windmill?
17 Replies
rubenf•4mo ago
Can you remove maxpanda-python-sdk @ from it and try?
Ross (arterial.dev)•4mo ago
@rubenf removing the maxpanda-python-sdk @ allows the lockfile to generate but when I try to run the script I get ERROR: Invalid requirement: '@':
--- PIP INSTALL ---
maxpanda-python-sdk @ git+https://git@github.com/renovoenergy/maxpanda_python_sdk.git is being installed for the first time.
It will be cached for all ulterior uses.
/usr/local/bin/python3 -m pip install -v maxpanda-python-sdk @ git+https://git@github.com/renovoenergy/maxpanda_python_sdk.git -I -t /tmp/windmill/cache/pip/maxpanda-python-sdk@git+httpsgit@github.comrenovoenergymaxpanda_python_sdk.git --no-cache --no-color --no-deps --isolated --no-warn-conflicts --disable-pip-version-check
Using pip 23.1.2 from /usr/local/lib/python3.11/site-packages/pip (python 3.11)
ERROR: Invalid requirement: '@'
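One possible reading of this error, offered as an assumption rather than a confirmed diagnosis of how the worker builds its command: the logged pip command shows the requirement unquoted, and a shell-style split of that string produces three separate arguments, one of which is a bare @. The sketch below uses only the standard library's shlex module to show the difference between the unquoted and quoted forms; the variable names are illustrative.

```python
import shlex

# Requirement string exactly as it appears in the log above.
requirement = (
    "maxpanda-python-sdk @ "
    "git+https://git@github.com/renovoenergy/maxpanda_python_sdk.git"
)

# Unquoted: a shell-style split turns the requirement into three arguments,
# so an installer would see "@" as a requirement of its own.
unquoted_args = shlex.split(f"pip install {requirement}")

# Quoted: the whole requirement stays together as a single argument.
quoted_args = shlex.split(f"pip install {shlex.quote(requirement)}")

print(unquoted_args[2:])
print(quoted_args[2:])
```

If this is what happens, it would match Ross's observation that the lockfile entry itself is valid while the install step rejects it.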
rubenf•4mo ago
I think you still have it in your file somehow; it wouldn't get that @ by itself.
Ross (arterial.dev)•4mo ago
@rubenf Here is a recording of a new script creation which reproduces the issue:
rubenf•4mo ago
Thanks, will look into it
Ross (arterial.dev)•4mo ago
@rubenf Lockfile generation is producing a valid lockfile syntax. It seems the command that bootstraps the job environment isn't able to handle a valid entry in the lockfile. I'm struggling to find a workaround. I tried uploading the sdk to f/sdks/maxpanda_python_sdk and import f.sdk.maxpanda_python_sdk as needed but the automatic requirement parsing is also breaking for the individual sdk files.
rubenf•4mo ago
Understood but I won't be able to help before I investigate
Ross (arterial.dev)•4mo ago
Ok thanks for your help!
djm•4mo ago
I spent the last few days looking at this for our private PyPI package. I was not able to get it working with PIP_EXTRA_INDEX_URL; I might need your help with that one, @rubenf. I will put together a minimal reproducible example soon. I was able to get it working in our self-hosted environment like so:
* Use #requirements: to specify the exact Git SSH URL with the version tagged using @:
#requirements:
#git+ssh://git@github.com/Canteen-Australia/canteen-common-py@v0.5.3

from canteen.dbricks import DBricks

def main(
    notebook_path: str,
):
    DBricks.start_notebook(notebook_path)
    print(f"Starting {notebook_path}")
* Add Init Script to worker which adds github.com to known_hosts for SSH:
touch /root/.ssh/known_hosts
ssh-keyscan -t rsa github.com >> /root/.ssh/known_hosts
* Mount on the workers the private SSH key authorised to access the Git repo:
windmill_worker:
  ...
  volumes:
    ...
    # mount the GitHub SSH Identity
    - ./gh_identity:/root/.ssh/id_rsa
This seems to have issues when re-using the cached resolution on another worker; I would imagine INIT_SCRIPT runs on both workers so I'm not sure why the host key would fail to verify
Collecting canteen-common@ git+ssh://git@github.com/Canteen-Australia/canteen-common-py@v0.5.3
Cloning ssh://****@github.com/Canteen-Australia/canteen-common-py (to revision v0.5.3) to /tmp/pip-install-dzscmz_k/canteen-common_eccee4f9a9ad4f719bf9ecf23204270d
Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/Canteen-Australia/canteen-common-py' /tmp/pip-install-dzscmz_k/canteen-common_eccee4f9a9ad4f719bf9ecf23204270d

Host key verification failed.
Scaling to just one worker fixes this; I also had to run delete from pip_resolution_cache; on the postgres container.
rubenf•4mo ago
Looks like you would need to run a command prior to the pip install, this one:
ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
You can try adding it to the init script but otherwise we would support it in EE
djm•4mo ago
Thanks for getting back to me. I was looking at the EE pricing today with my teammate. I asked you previously about non-profits, but the pricing page doesn't seem clear. Does Pro(NfP) = EE?
I already have the ssh-keyscan command from step 2 above in the init script. For some reason it works on the first worker, but as soon as the script is allocated to another worker it fails. Re-running repeatedly until the original worker is chosen again allows it to succeed.
rubenf•4mo ago
Yes, Pro(NfP) = EE. Are you sure it's (/root/.ssh) and not (/.ssh)?
djm•4mo ago
hmm well it works on one, you mean try with ~/.ssh?
rubenf•4mo ago
I would play with bash scripts rather than python scripts to see which ssh/pip commands work and which do not, and maybe print known_hosts to understand what's happening.
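A minimal sketch of that kind of check, written in Python only to match the script earlier in the thread (rubenf suggests bash, which would work just as well). It reports which worker ran it, what the home directory resolves to, and whether either candidate known_hosts file contains an unhashed github.com entry. The function name known_hosts_has and the shape of the returned dict are illustrative assumptions, and the script only inspects files; it does not run ssh or pip.

```python
import socket
from pathlib import Path


def known_hosts_has(host: str, path: Path) -> bool:
    """Return True if `path` is readable and has an unhashed entry for `host`."""
    try:
        text = path.read_text()
    except OSError:
        # Missing file, unreadable file, or a directory: treat as "no entry".
        return False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The first field is a comma-separated list of host names.
        if host in line.split()[0].split(","):
            return True
    return False


def main():
    home = Path.home()
    # The two locations discussed in the thread: ~/.ssh and /root/.ssh.
    candidates = [home / ".ssh" / "known_hosts", Path("/root/.ssh/known_hosts")]
    return {
        "worker": socket.gethostname(),
        "home": str(home),
        "known_hosts": {str(p): known_hosts_has("github.com", p) for p in candidates},
    }
```

Running it several times until it lands on each worker would show whether the init script wrote the host key to the location that the job's user actually reads.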
djm•4mo ago
Thanks, will be in touch re: licensing 🙂
rubenf•4mo ago
And I wonder if it would not be easier for you guys to set up a proper PyPI repo, which would allow a proper versioning scheme.
djm•4mo ago
We do have one, but with PIP_EXTRA_INDEX_URL we were getting a very strange issue where it would try to install our repo URL as a package. That is the reproduction I am talking about above.