Best --num_cpu_threads_per_process value for accelerate launch?
Hi guys, I'm trying to do some LoRA training on a serverless endpoint, and I wonder how many CPU cores are available with the different GPU types. Is there a specification for that somewhere? And/or what values do you use? My first tests ran on a single thread, but I would love to maximize performance.
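
In case it helps frame the question, here is a minimal sketch of how the core count could be checked from inside the worker itself. It assumes a Linux container and uses only the Python standard library. The helper names `usable_cpu_count` and `threads_per_process` are made up for illustration, and the "divide cores evenly across processes" rule is just a starting guess, not an official recommendation. Note that it does not account for cgroup CPU quotas, so the real limit on a serverless worker may be lower than what it prints.

```python
import os


def usable_cpu_count() -> int:
    """Return the number of CPU cores this process is allowed to run on."""
    # os.sched_getaffinity exists on Linux only; it respects CPU pinning (cpusets).
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback: total logical cores on the host (can overcount inside containers).
    return os.cpu_count() or 1


def threads_per_process(num_processes: int = 1) -> int:
    """Split the usable cores evenly across training processes (at least 1 each)."""
    return max(1, usable_cpu_count() // max(1, num_processes))


if __name__ == "__main__":
    print(f"usable cores: {usable_cpu_count()}")
    print(f"suggested: --num_cpu_threads_per_process {threads_per_process(1)}")
```

The printed value could then be passed along the lines of `accelerate launch --num_cpu_threads_per_process <N> train.py`, where `train.py` stands in for the actual training script.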