Error running install scripts after deployment.
Workspace Template: Custom (based on deeplearning template https://github.com/matifali/coder-templates/tree/main/deeplearning)
Issue:
- Deployment failed for a workspace using my custom template, which is derived from the standard deeplearning template. The issue started today after I restarted an existing workspace. Previously, new deployments and restarts worked without issues.
- I restarted this workspace because I wanted to mount another location in the user_data folder. After the restart, the workspace failed to install extensions and I could not connect to code-server in the browser. Terminal and VS Code Desktop work fine. Any new deployment or restart of my other workspaces also fails to connect to code-server.
- Possibly unrelated, but worth mentioning: this workspace had been restarted repeatedly to work around NVIDIA driver problems (a separate bug that I also haven't found a fix for). Normally, after a restart the workspace can see the GPU again. That issue is described in https://github.com/coder/coder/issues/4037 and https://github.com/coder/coder/discussions/4038, but the solutions suggested there don't seem to fix it. Any help is appreciated.
Expected Behavior:
- Deployment should succeed as before, even after restarting.
Actual Behavior:
- Deployment fails post-restart. Other non-restarted deployments using the same template remain unaffected.
Steps to Reproduce:
- Use a workspace based on the custom template (derived from deeplearning).
- Spawn the workspace.
Here is my template config file if it helps:
GitHub
GPU in Docker container becomes unavailable if not used for sometim...
Using a Docker-based template that has access to GPU, If we leave the workspace in a running state and do not use the GPU actively. GPU becomes unavailable after some time. (DL) atif@workspace:~$ n...
3 Replies
Hi, what folder did you mount?
Sorry for the late response, I haven't been online on Discord. I mounted a folder on the host machine into a subfolder of the parent folder that is set up as the mount for this Coder workspace, as you can see in the code below. The folder I mounted manually on the host is bound to a subfolder of the data folder. I suspect that is not the cause, however: I removed the mount and tried spinning up different containers with and without it, and both fail. This host machine also runs other services, so I think some Docker network configuration is causing the problem.
# users data directory
volumes {
  container_path = "/home/coder/data/"
  host_path      = "/sda1/vuhuy/NTQ_Hub/coder/deployment/coder-data/user_data/${data.coder_parameter.folder_loc.value}/"
  read_only      = false
}
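For comparison, here is a minimal sketch of declaring the extra location as its own bind mount in the template, rather than mounting it manually on the host underneath the already-mounted data folder. It assumes both blocks sit inside the template's existing docker_container resource. Every path in it is a hypothetical placeholder, not a value from the actual setup.

```hcl
# Sketch only: all host and container paths below are hypothetical placeholders.

# users data directory (same shape as the block in the template)
volumes {
  container_path = "/home/coder/data/"
  host_path      = "/srv/coder/user_data/"
  read_only      = false
}

# extra location, declared explicitly so that it does not depend on a
# manual mount made on the host under the data directory
volumes {
  container_path = "/home/coder/data/extra/"
  host_path      = "/mnt/extra/"
  read_only      = false
}
```

With both mounts declared in the template, the mount setup is visible in one place and is recreated the same way on every workspace restart, which may make it easier to rule the mount in or out as the cause.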
Hey, apologies for the delay, I forgot to reply. Were you able to figure this out?