Dear support team, I'd like to know how CPU time limits are enforced, particularly on the free tier. Empirically, an endpoint I am benchmarking responds successfully despite taking 96.4ms at p50.
Previous discussions [1] stated that there is some leeway in the limit, but from the information I found on the website and the community Discord, it is still unclear how strict the limits are and when they are enforced. I also found [2] from 2018, in which @kenton describes how CPU limiting works in practice, but that answer does not match the data I am getting empirically.
When should I expect a worker that exceeds the 10ms CPU limit at p50 to be rate-limited?
Additionally, why are the median and p50 values reported differently (100ms and 96.4ms, respectively)?