Immich • 3mo ago
OOF

Error when trying to use hardware acceleration for Machine Learning. GTX 1070 - cuda steps

I think I got it working, but GPU usage was only about 10% while CPU usage was 80% during smart search. I updated the GPU drivers and ran docker compose down. Upon docker compose up -d I get this error code:
24 Replies
Immich • 3mo ago
:wave: Hey @OOF,

Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich :immich:.

References
- Container Logs: docker compose logs docs
- Container Status: docker ps -a docs
- Reverse Proxy: https://immich.app/docs/administration/reverse-proxy
- Code Formatting: https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAKGXDEHE263BCAYEGFJA

Checklist
I have...
1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
2. :ballot_box_with_check: read applicable release notes.
3. :ballot_box_with_check: reviewed the FAQs for known issues.
4. :ballot_box_with_check: reviewed Github for known issues.
5. :ballot_box_with_check: tried accessing Immich via local ip (without a custom reverse proxy).
6. :ballot_box_with_check: uploaded the relevant information (see below).
7. :ballot_box_with_check: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable

(an item can be marked as "complete" by reacting with the appropriate number)

Information
In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:
- Your docker-compose.yml and .env files.
- Logs from all the containers and their status (see above).
- All the troubleshooting steps you've tried so far.
- Any recent changes you've made to Immich or your system.
- Details about your system (both software/OS and hardware).
- Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
- The version of the Immich server, mobile app, and other relevant pieces.
- Any other information that you think might be relevant.

Please paste files and logs with proper code formatting, and especially avoid blurry screenshots. Without the right information we can't work out what the problem is. Help us help you ;)

If this ticket can be closed you can use the /close command, and re-open it later if needed.

Successfully submitted, a tag has been added to inform contributors. :white_check_mark:
Mraedis • 3mo ago
What OS is this?
OOF (OP) • 3mo ago
Windows 11
Mraedis • 3mo ago
Which NVIDIA driver are you using, @OOF? Can you post the output of nvidia-smi on both Windows and WSL?
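
For readers following along, here is a minimal Python sketch of one way to capture the driver version so the Windows and WSL results can be compared side by side. It assumes `nvidia-smi` is on the PATH in both environments; the function names `parse_driver_query` and `query_driver` are made up for this example and are not part of Immich or any NVIDIA tooling.

```python
import csv
import io
import subprocess


def parse_driver_query(csv_text):
    """Turn 'name, driver_version' CSV rows into (name, version) tuples."""
    rows = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    # Blank lines come back as empty rows, so skip anything too short.
    return [(row[0].strip(), row[1].strip()) for row in rows if len(row) >= 2]


def query_driver():
    """Ask nvidia-smi for GPU name and driver version (needs an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_driver_query(result.stdout)
```

Running `query_driver()` once from a Windows Python and once from a Python inside WSL should report the same driver version if WSL is picking up the Windows driver; a mismatch or an error in one of the two would be worth posting in the thread.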
OOF (OP) • 3mo ago
[Image attachment, no description]
OOF (OP) • 3mo ago
[Image attachment, no description]
OOF (OP) • 3mo ago
It is my understanding that I do not need the NVIDIA toolkit with WSL2. I checked my card and it is supported.
Mraedis • 3mo ago
Yeah just trying to figure it out
OOF (OP) • 3mo ago
Forgot to add this from the machine learning docker container:
[W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 2 Memcpy nodes are added to the graph main_graph for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
Mraedis • 3mo ago
I see some users with the same problem, specifically with driver 572.16.
OOF (OP) • 3mo ago
I have just done docker compose down and then docker compose up -d, and no error has occurred in the terminal or in the docker container, but GPU usage is no more than 20% with CPU up to 50%.
Mraedis • 3mo ago
Try running any HWA/ML task like extract metadata
OOF (OP) • 3mo ago
with 100 concurrent tasks on smart search
Mraedis • 3mo ago
Well, no wonder your CPU is dying if you do 100 concurrency.
OOF (OP) • 3mo ago
Was just checking, and the GPU usage is the same for both 100 and 2. I may just have some fundamental misunderstanding of how hw accel is supposed to work, and I have no idea why the error has gone 💀
Mraedis • 3mo ago
There is always overhead with tasks. If you swamp the CPU with overhead it will just bog down everything, including the feed to the GPU. Default concurrency is 2. Maybe try a little less "WOW HIGH NUMBER" and just up it to 4... 8... and see where the maximum efficiency is.
OOF (OP) • 3mo ago
OK, getting CPU 25% with GPU 15% at concurrency 4, and CPU 40% with GPU 20% at concurrency 8. I personally have no idea if those numbers are reasonable, I was just expecting more GPU utilisation tbh. I really should have done a before and after with the hardware acceleration disabled and then enabled.
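
A before/after comparison like the one mentioned here could be sketched roughly as follows: sample GPU utilization for a fixed window while a job runs, then average it for each concurrency setting. This is only an illustration under assumptions. It reads the first GPU only, assumes `nvidia-smi` is on the PATH, and the helper names (`parse_sample`, `mean_utilization`, `sample_gpu`) are invented for the example.

```python
import subprocess
import time


def parse_sample(line):
    """Parse one 'utilization, memory_used' line, e.g. '15, 1024'."""
    util, mem = (field.strip() for field in line.split(","))
    # int() raises ValueError if the driver reports '[N/A]' for a field.
    return int(util), int(mem)


def mean_utilization(samples):
    """Average the GPU utilization (percent) over a list of samples."""
    if not samples:
        return 0.0
    return sum(util for util, _ in samples) / len(samples)


def sample_gpu(duration_s=30, interval_s=1.0):
    """Poll nvidia-smi for duration_s seconds and return the parsed samples."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        out = subprocess.run(
            [
                "nvidia-smi",
                "--query-gpu=utilization.gpu,memory.used",
                "--format=csv,noheader,nounits",
            ],
            capture_output=True,
            text=True,
            check=True,
        ).stdout
        samples.append(parse_sample(out.splitlines()[0]))  # first GPU only
        time.sleep(interval_s)
    return samples
```

Calling `mean_utilization(sample_gpu())` once per concurrency setting (and once with hardware acceleration disabled) gives numbers that are easier to compare than a glance at a task manager. Note that utilization percentage alone does not say how fast the job finishes, so timing the job as well is worthwhile.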
sogan • 3mo ago
It sounds like it's working fine now. Storage is often a bottleneck. Make sure the thumbnails are on an SSD and definitely not being accessed through a network share. If the processor is slow, that can also contribute to low GPU utilization.
OOF (OP) • 3mo ago
If unspecified in the .env, will they go to the UPLOAD_LOCATION? Defos not a network share. All stored locally, on an HDD. Ryzen 7 3700X with GTX 1070, I feel should be fine?
Mraedis • 3mo ago
HDD is definitely a bottleneck
OOF (OP) • 3mo ago
Would creating a mount for the thumbnails on my SSD and leaving the bulk storage on the HDD be viable?
Mraedis • 3mo ago
Thumbs and database 😛 You'll need to actually move them though.
OOF (OP) • 3mo ago
cheers 👍
Immich • 3mo ago
This thread has been closed. To re-open, use the button below.
