@jax editing your endpoint doesn't apply the changes on its own. When you change the GPU tier, you have to scale your workers down to zero and back up again for the change to take effect.
It happens sometimes. I have an A1111 endpoint that usually works fine with 24GB VRAM, but I occasionally get OOM errors on some requests, so I also had to upgrade to 48GB.