Coder.com•6mo ago
Emircan

JupyterLab how-to

I used
module "jupyterlab" {
  count    = data.coder_workspace.me.start_count
  source   = "registry.coder.com/modules/jupyterlab/coder"
  version  = "1.0.30"
  agent_id = coder_agent.main.id
}
Not sure if it creates a conda env, but I can't install other packages. Also, I want to use my GPU in that Coder workspace. I use Kubernetes (k3s), and my pods are ready:
kube-amd-gpu amd-gpu-operator-gpu-operator-charts-controller-manager-56vj26n 1/1 Running 36 (26m ago) 12d
kube-amd-gpu amd-gpu-operator-kmm-controller-7555dfd458-jm79v 1/1 Running 38 (26m ago) 12d
kube-amd-gpu amd-gpu-operator-kmm-webhook-server-8549795656-dftmj 1/1 Running 23 (26m ago) 12d
kube-amd-gpu amd-gpu-operator-node-feature-discovery-gc-76ddd7ff65-5m8kj 1/1 Running 23 (26m ago) 12d
kube-amd-gpu amd-gpu-operator-node-feature-discovery-master-75649cc887-lhmb8 1/1 Running 23 (26m ago) 12d
kube-amd-gpu amd-gpu-operator-node-feature-discovery-worker-sd99w 1/1 Running 28 (26m ago) 12d
I checked the docs and articles but found none. Having some for this would be great. Thank you.
15 Replies
Emircan (OP)•6mo ago
GitHub
[Feature]: RDNA 2 Support · Issue #154 · ROCm/gpu-operator
Suggestion Description I have 6800xt but it do not detect my card. Name: rocminfo Namespace: default Priority: 0 Service Account: default Node: <none> Labels: <none> Annotations: <no...
Emircan (OP)•6mo ago
Sample pod with an AMD GPU:
apiVersion: v1
kind: Pod
metadata:
  name: amd-smi
spec:
  containers:
    - image: docker.io/rocm/pytorch:latest
      name: amd-smi
      command: ["/bin/bash"]
      args: ["-c", "amd-smi version && amd-smi monitor -ptum"]
      resources:
        limits:
          amd.com/gpu: 1
        requests:
          amd.com/gpu: 1
  restartPolicy: Never
Phorcys•6mo ago
It doesn't seem to create a conda env, no. So either you'll have to set up a virtualenv yourself, or use !sudo apt install python3-<package> in your notebook instead.
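For the virtualenv route, a minimal sketch using only Python's standard library could look like the following. The environment path and kernel name are hypothetical, and the two returned commands assume network access to install ipykernel; none of this is part of the JupyterLab module itself.

```python
import sys
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    venv.EnvBuilder(with_pip=with_pip).create(str(env_dir))
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def kernel_setup_commands(python, kernel_name):
    """Commands that would expose the environment as a Jupyter kernel.

    They are returned rather than executed so they can be reviewed first.
    """
    return [
        [str(python), "-m", "pip", "install", "ipykernel"],
        [str(python), "-m", "ipykernel", "install", "--user", "--name", kernel_name],
    ]


# Example (hypothetical path and kernel name):
#   import subprocess
#   py = create_env(Path.home() / "envs" / "gpu")
#   for cmd in kernel_setup_commands(py, "gpu"):
#       subprocess.run(cmd, check=True)
```

After the kernel is registered, it should show up in the JupyterLab launcher, and packages installed into that environment stay separate from the system Python.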
Emircan (OP)•6mo ago
Okay, what about GPU passthrough? Or, let's say, GPU scheduling in the Kubernetes world 🙂 I have a Proxmox background.
Emircan (OP)•6mo ago
After checking the scripts, I see pipx does the job:
!pipx install tensorflow
!pipx inject jupyterlab tensorflow
Emircan (OP)•6mo ago
resources {
  requests = {
    "cpu"         = "250m"
    "memory"      = "512Mi"
    "amd.com/gpu" = 1
  }
  limits = {
    "cpu"         = "${data.coder_parameter.cpu.value}"
    "memory"      = "${data.coder_parameter.memory.value}Gi"
    "amd.com/gpu" = 1
  }
}
Not sure, but it could be a driver issue.
Emircan (OP)•6mo ago
[image attachment, no description]
matifali•6mo ago
Usually the expectation is that the environment (workspace) is pre-configured with GPU access and any other tools, e.g. PyTorch. The module only starts (and optionally installs) JupyterLab in the workspace.
Phorcys•6mo ago
(@emircanerkul)
Emircan (OP)•6mo ago
I couldn't get it working, @Phorcys. I already installed the module and also specified the GPU labels, as I did in other pods outside Coder, but I'm getting that error. On the second run I only see CPUs, not the GPU.
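One way to confirm what TensorFlow actually sees from inside the notebook is to list its physical devices. This is a small sketch, assuming TensorFlow is importable from the kernel that runs it; the helper name is made up for illustration.

```python
import importlib.util


def tensorflow_gpus():
    """Return the names of GPUs TensorFlow can see, or None if it is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf  # imported lazily; it is a heavy dependency

    return [device.name for device in tf.config.list_physical_devices("GPU")]
```

An empty list means TensorFlow is running on CPU only, which matches the "only CPUs" symptom described above.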
Phorcys•6mo ago
would you be able to share your base image?
Emircan (OP)•6mo ago
Sure, here is all the tf code.
Emircan (OP)•6mo ago
codercom/enterprise-base:ubuntu
Phorcys•6mo ago
Hey, sorry for the late reply. Have you installed the AMD GPU drivers on your Kubernetes node(s)? If not, please install them and check via rocm-smi. If yes, then I believe the container also needs to be compatible, so maybe try using the rocm/dev-ubuntu-22.04:latest base image instead.
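A quick way to tell whether the container was actually handed a GPU is to look for the device nodes ROCm uses, /dev/kfd and /dev/dri/renderD*. The sketch below only checks that they exist; the function name and the dev_root parameter are illustrative, and it does not replace rocm-smi.

```python
import glob
import os


def rocm_device_nodes(dev_root="/dev"):
    """Report whether the device nodes ROCm relies on are visible under dev_root."""
    kfd = os.path.join(dev_root, "kfd")
    render_nodes = sorted(glob.glob(os.path.join(dev_root, "dri", "renderD*")))
    return {"kfd": os.path.exists(kfd), "render_nodes": render_nodes}
```

If kfd is False or render_nodes is empty inside the workspace pod, the GPU resource was probably not attached to the container, regardless of which base image is used.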
Emircan (OP)•6mo ago
No worries, and thank you. It's just side research and not urgent. I already installed all the AMD components, and I thought labeling would make everything work automatically. I assumed Kubernetes node feature discovery would handle it, but that might not be enough. I probably need to merge codercom/enterprise-base:ubuntu with rocm/dev-ubuntu-22.04:latest, because replacing it didn't work.
kube-amd-gpu amd-gpu-operator-gpu-operator-charts-controller-manager-56ldnkx 1/1 Running 11 (2m47s ago) 12d
kube-amd-gpu amd-gpu-operator-kmm-controller-7555dfd458-4fsql 1/1 Running 11 (2m47s ago) 12d
kube-amd-gpu amd-gpu-operator-kmm-webhook-server-8549795656-4t5p5 1/1 Running 10 (2m47s ago) 12d
kube-amd-gpu amd-gpu-operator-node-feature-discovery-gc-76ddd7ff65-5m8kj 1/1 Running 33 (2m47s ago) 25d
kube-amd-gpu amd-gpu-operator-node-feature-discovery-master-75649cc887-lhmb8 1/1 Running 33 (2m47s ago) 25d
kube-amd-gpu amd-gpu-operator-node-feature-discovery-worker-sd99w 1/1 Running 39 (2m47s ago) 25d
kube-amd-gpu default-deviceconfig-device-plugin-xgwqv 1/1 Running 9 (2m47s ago) 12d
kube-amd-gpu default-deviceconfig-metrics-exporter-sgz8d 1/1 Running 68 (2m47s ago) 12d
kube-amd-gpu default-deviceconfig-node-labeller-bks4g 1/1 Running 9 (2m47s ago) 12d
kube-amd-gpu default-deviceconfig-test-runner-qbfkn
Learning a lot more today.
- Enabled the registry: https://docs.k3s.io/installation/registry-mirror
- Forked your Coder Docker repo
- Based it on FROM rocm/dev-ubuntu-24.04:latest
- Built via Docker, saved the image, and imported it via k3s ctr images import: https://www.geekandi.com/2023/02/17/import-docker-image-into-k3s/
Tested it and all looks good, but JupyterLab with TensorFlow still gives the same error. I'll try my luck with https://hub.docker.com/r/rocm/tensorflow/tags
k exec coder-dd716dea-c5a1-4351-87c6-9d4c1efa7aa2-797b7674db-z4xnh -n coder -- sudo rocminfo
*******
Agent 2
*******
Name: gfx1030
Uuid: GPU-df8c88475109d004
Marketing Name: AMD Radeon RX 6800 XT
...
I built it with TensorFlow (22 GB image size 🤯) but I'm still getting the same issue. I think it might be related to user group access: https://discord.com/channels/747933592273027093/1370758361108447272
Of course, things don't always go well: https://github.com/pypa/pipx/issues/1635
Here is the PR: https://github.com/coder/images/pull/296 (it's quite a niche area and could be a waste of build credits, but it's there in case you want to add it; I still want to test it more).
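If group access is the suspect, it can be checked directly: on a typical ROCm host the device nodes are owned by the render or video group, and a user outside that group cannot open them even when sudo rocminfo works. A minimal sketch, assuming a Linux container and the standard /dev/kfd path; the helper is hypothetical.

```python
import grp
import os


def device_access(path="/dev/kfd"):
    """Summarize who owns a device node and whether this process can open it."""
    if not os.path.exists(path):
        return {"exists": False}
    gid = os.stat(path).st_gid
    try:
        group = grp.getgrgid(gid).gr_name
    except KeyError:
        group = str(gid)  # the gid has no name inside this container
    my_gids = set(os.getgroups()) | {os.getegid()}
    return {
        "exists": True,
        "group": group,
        "in_group": gid in my_gids,
        "read_write": os.access(path, os.R_OK | os.W_OK),
    }
```

If in_group is False and read_write is False for the workspace user, adding that user to the owning group in the image would be the next thing to try.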
