C
C#•3mo ago
devhl

Slow HttpClient responses

I have a program which polls an API. It usually works great, but lately, I've been seeing many HttpClient requests take up to 30 seconds despite my timeouts. Here is how I configured the client. And here is the library where these methods to configure Polly reside https://github.com/devhl-labs/CocApi/blob/47f7e1ec097fdfd7f50e60436d7a0fe96d87f721/src/CocApi.Test/Program.cs#L62
options.AddCocApiHttpClients(
builder: builder => builder
.AddRetryPolicy(context.Configuration.GetValue<int>("CocApi:Rest:HttpClient:Retries")) // 1
.AddTimeoutPolicy(TimeSpan.FromMilliseconds(context.Configuration.GetValue<long>("CocApi:Rest:HttpClient:Timeout"))) // 1.5 seconds
.AddCircuitBreakerPolicy(
context.Configuration.GetValue<int>("CocApi:Rest:HttpClient:HandledEventsAllowedBeforeBreaking"), // 20
TimeSpan.FromSeconds(context.Configuration.GetValue<int>("CocApi:Rest:HttpClient:DurationOfBreak"))) // 30 seconds
.ConfigurePrimaryHttpMessageHandler(sp => new HttpClientHandler
{
// this property is important if you query the api very fast
MaxConnectionsPerServer = context.Configuration.GetValue<int>("CocApi:Rest:HttpClient:MaxConnectionsPerServer"), // 100
CookieContainer = sp.GetRequiredService<CookieContainer>().Value
})
);
options.AddCocApiHttpClients(
builder: builder => builder
.AddRetryPolicy(context.Configuration.GetValue<int>("CocApi:Rest:HttpClient:Retries")) // 1
.AddTimeoutPolicy(TimeSpan.FromMilliseconds(context.Configuration.GetValue<long>("CocApi:Rest:HttpClient:Timeout"))) // 1.5 seconds
.AddCircuitBreakerPolicy(
context.Configuration.GetValue<int>("CocApi:Rest:HttpClient:HandledEventsAllowedBeforeBreaking"), // 20
TimeSpan.FromSeconds(context.Configuration.GetValue<int>("CocApi:Rest:HttpClient:DurationOfBreak"))) // 30 seconds
.ConfigurePrimaryHttpMessageHandler(sp => new HttpClientHandler
{
// this property is important if you query the api very fast
MaxConnectionsPerServer = context.Configuration.GetValue<int>("CocApi:Rest:HttpClient:MaxConnectionsPerServer"), // 100
CookieContainer = sp.GetRequiredService<CookieContainer>().Value
})
);
And here is the request itself. After the request, I log the difference between the current DateTime and the requestedAtLocalVar, which is showing me up to 30 seconds. It normally shows less than a second.
DateTime requestedAtLocalVar = DateTime.UtcNow;

using (HttpResponseMessage httpResponseMessageLocalVar = await HttpClient.SendAsync(httpRequestMessageLocalVar, cancellationToken).ConfigureAwait(false))
{
string responseContentLocalVar = await httpResponseMessageLocalVar.Content.ReadAsStringAsync(cancellationToken).ConfigureAwait(false);

....
DateTime requestedAtLocalVar = DateTime.UtcNow;

using (HttpResponseMessage httpResponseMessageLocalVar = await HttpClient.SendAsync(httpRequestMessageLocalVar, cancellationToken).ConfigureAwait(false))
{
string responseContentLocalVar = await httpResponseMessageLocalVar.Content.ReadAsStringAsync(cancellationToken).ConfigureAwait(false);

....
12 Replies
devhl
devhl•3mo ago
The only thing I can think is that maybe the timeout wont apply to the HttpContent#ReadAsStringAsync I've now flushed my dns cache and restarted the modem, router, and computer, but it made no effect.
mtreit
mtreit•3mo ago
It's usually better to use Stopwatch to insert timing into your code rather than using DateTime subtraction, although if it's 30 seconds it probably doesn't matter. If you have a reliable repro, I would time both awaits in the shown call and verify which one is being slow. I would also make sure you aren't exhausting the thread pool somehow (although if the code is properly async everywhere, that should be rather unlikely. But it's always possible something is going haywire.) Logging on the server side (assuming you control the API?) to see when it actually receives the request can also help.
devhl
devhl•3mo ago
It is the SendAsync that is taking the time.
mtreit
mtreit•3mo ago
I'd probably take a wireshark trace
devhl
devhl•3mo ago
I don't think that helps. The timeout should throw on slow requests. I see some timing out, but I also see requests going far longer than the timeout.
mtreit
mtreit•3mo ago
It might help narrow down at what part in the entire network communication it is being slow. Like how long it takes the server to respond, etc. I've seen all sorts of strange network issues, like TLS handshake losing packets and retries taking a long time. What API is the code calling? Is it your own API?
devhl
devhl•3mo ago
Not mine, it is the clash of clans api. Looks like I may be getting spurious retransmissions when the reponse times start going up
mtreit
mtreit•3mo ago
Do you have retry logic? Like with exponential backoff? That could explain it. Server is responding with "too busy" or whatever and your retry policy kicks in. That's why a network trace would help narrow down, see exactly what's happening on the wire.
devhl
devhl•3mo ago
I have retry but there is no backoff. If the server were too busy, I'd still expect to see tons of errors. Instead, I see my request time creep up, and somtimes hault. I do get some of the Polly timeout exceptions, but many requests are OK but taking longer than I want to wait., Also, while these slow downs happen, my cpu is 100% I think I will try putting a timeout on the client itself to see if I can mitigate this.
mtreit
mtreit•3mo ago
If your CPU is 100% you should figure out why 🙂 Could be the requests are just never getting any CPU time to execute the continuations after the async call. If your CPU is saturated. Since your async calls to the remote server should not be using CPU time at all.
devhl
devhl•3mo ago
I guess that would answer why the timeouts don't always trigger. The problem was not the api calls. After the call is made, there is a table that gets updated full of aggregate statistics. Updating the aggregate data was too slow. I removed that bit of code and it is back to working again. Thanks for the help. !resolved
mtreit
mtreit•3mo ago
🎉