For example, I am planning to use serverless endpoints. If I choose global data centers, a user in Europe would likely get faster image transfer and GPU access, but loading a model of about 12GB takes a long time. If I instead use a network volume in a Canadian data center, the European user would have to send images to Canada, which would be slower for both image transfer and GPU access. However, since I can choose a 48GB GPU with high availability there, GPU access might not be that slow. If the time saved on loading the 12GB model outweighs the extra time spent on image transfer and GPU access, does that mean the second option is the better fit for my globally targeted app?
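
To make the comparison concrete, here is a rough sketch of how I would estimate the expected time per request for each option. Every number below is a placeholder I made up for illustration, not a measurement, and the `Option` fields and `expected_latency` helper are hypothetical names. It also assumes the model only has to be loaded on a cold start, so the load time is weighted by how often a request hits a cold worker.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    transfer_s: float       # time to send the image to the data center
    model_load_s: float     # time to load the ~12GB model on a cold start
    inference_s: float      # GPU time per request
    cold_start_rate: float  # fraction of requests that hit a cold worker


def expected_latency(opt: Option) -> float:
    """Expected end-to-end seconds per request under the assumptions above."""
    return opt.transfer_s + opt.cold_start_rate * opt.model_load_s + opt.inference_s


# Placeholder values only; replace them with timings measured on real endpoints.
global_dc = Option("global, no network volume",
                   transfer_s=0.3, model_load_s=60.0,
                   inference_s=2.0, cold_start_rate=0.2)
canada_nv = Option("Canada, network volume",
                   transfer_s=1.0, model_load_s=10.0,
                   inference_s=2.0, cold_start_rate=0.2)

for opt in (global_dc, canada_nv):
    print(f"{opt.name}: {expected_latency(opt):.1f}s expected per request")
```

With these made-up inputs the second option comes out ahead, but the result depends on the cold start rate: if workers stay warm most of the time, the model load time matters much less and the shorter transfer path could win instead.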