Thread safety when mutating an object
I have two IAsyncEnumerables I want to join into one dictionary of result objects without buffering them into memory, which means building the result objects as they come from the IAsyncEnumerables.
Will this be thread safe and work correctly or do I need to ensure better thread safety?
That depends entirely on whether the thread which runs ProcessAsync is something like a UI thread, which has a SynchronizationContext installed on it, which posts messages back to a dedicated thread's message queue
(also, note, you can do await Task.WhenAll(namesConsumer, agesConsumer) and save yourself an await)
Ah forgot to mention... dotnet 8 we service
"we"?
web*
Rest API web server
So, you might have two threads calling GetOrAdd at the same time. But I've just noticed it's a ConcurrentDictionary so that should be fine
The guarantees made by GetOrAdd ensure that you can't get a situation where two threads call GetOrAdd at the same time with the same ID, and GetOrAdd ends up returning different objects to both
I think you'd be better off with AddOrUpdate anyway: that stops you setting Name/Age twice if you insert the object
So the update fn will mutate the object and return itself?
And also is it safe to mutate an object from different threads like what's happening here?
Yes and yes. The two threads are mutating different fields, and not reading the other field, so it's OK
And will it be OK no matter the field type inside?
If one thread was making decisions based on whether the other thread had written the other field, that would be racy, potentially
But it's safe for one thread to be manipulating one field, and another thread to be manipulating another field, as long as they just limit themselves to "their" field
And just to make sure: is GetOrAdd wrong here, or is it just that AddOrUpdate is better?
The field type might matter if you have two different threads doing stuff to a single field at the same time. But since each field is only interacted with by a single thread, you can't have two things happening to a single field at the same time
GetOrAdd is fine. Just in the case where you insert an object, you set e.g. Name twice, as you do new Result { Name = nameResult.Name } then result.Name = nameResult.Name
Ah got it makes sense
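For reference, a minimal sketch of the AddOrUpdate variant discussed above. The original ProcessAsync code is not shown in this thread, so the record types NameResult and AgeResult, the int key, the class name Joiner and the shape of Result are all assumptions made for illustration. Each consumer only ever writes "its" property, which is what makes the concurrent mutation safe.

```csharp
#nullable enable
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical types: the thread only mentions Result.Name, Result.Age and nameResult.Name.
public sealed class Result
{
    public string? Name { get; set; }
    public int? Age { get; set; }
}

public sealed record NameResult(int Id, string Name);
public sealed record AgeResult(int Id, int Age);

public static class Joiner
{
    public static async Task<ConcurrentDictionary<int, Result>> ProcessAsync(
        IAsyncEnumerable<NameResult> names,
        IAsyncEnumerable<AgeResult> ages)
    {
        var results = new ConcurrentDictionary<int, Result>();

        // Only this consumer touches Result.Name.
        async Task ConsumeNamesAsync()
        {
            await foreach (var n in names)
            {
                results.AddOrUpdate(
                    n.Id,
                    // Add path: create the object with Name already set (set once, not twice).
                    (_, item) => new Result { Name = item.Name },
                    // Update path: mutate the existing object and return that same instance.
                    // The factories may run more than once under contention, so keep them idempotent.
                    (_, existing, item) => { existing.Name = item.Name; return existing; },
                    n);
            }
        }

        // Only this consumer touches Result.Age.
        async Task ConsumeAgesAsync()
        {
            await foreach (var a in ages)
            {
                results.AddOrUpdate(
                    a.Id,
                    (_, item) => new Result { Age = item.Age },
                    (_, existing, item) => { existing.Age = item.Age; return existing; },
                    a);
            }
        }

        await Task.WhenAll(ConsumeNamesAsync(), ConsumeAgesAsync());
        return results;
    }
}
```

Whichever consumer sees an ID first takes the add path, and the other takes the update path on the same object, so neither property is assigned twice for a single input item.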
This is better or just a stylistic preference?
Slightly better. With two awaits, a thread has to do a bit of work once the first await completes, just to start the second await
Interesting, I was pretty sure WhenAll does an await in a loop or similar
Good to know
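To make the comparison concrete, here is a small sketch of the two ways of waiting for both consumers. The class and method names are hypothetical; namesConsumer and agesConsumer stand for the two already-started tasks mentioned earlier. As far as I know, Task.WhenAll does not await each task in turn; it registers a completion callback on each task and completes once all of them have finished.

```csharp
using System.Threading.Tasks;

public static class AwaitBoth
{
    // Two awaits: when namesConsumer completes, the method resumes
    // only to suspend again immediately on agesConsumer.
    public static async Task TwoAwaitsAsync(Task namesConsumer, Task agesConsumer)
    {
        await namesConsumer;
        await agesConsumer;
    }

    // One await: the method resumes a single time, after both tasks have completed.
    public static async Task WhenAllAsync(Task namesConsumer, Task agesConsumer)
    {
        await Task.WhenAll(namesConsumer, agesConsumer);
    }
}
```

In both versions the two tasks are already running before the first await, so the difference is only in how many times the waiting method has to be resumed.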
@canton7 Anyway thanks a lot!
@canton7 btw is it a valid approach for the problem or would you do it differently?
I've been trying to think of a better way and tbh I'm struggling. I was wondering whether there's something in System.Linq.AsyncEnumerable, but anything I can come up with is probably slower and less obvious than your approach
It might be quicker to just:
It means that you can't do both names and ages in parallel, and you need to buffer all the names in memory, but you do cut out the overhead of a ConcurrentDictionary, so it might be quicker? You'd need to test it
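The snippet that followed "It might be quicker to just:" is not preserved here. Going only by the description above (names first, then ages, plain dictionary instead of ConcurrentDictionary), it might have looked roughly like the sketch below. All type and method names are hypothetical, and whether it is actually faster would need benchmarking.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical types, mirroring the Name/Age fields mentioned in the thread.
public sealed class Result
{
    public string? Name { get; set; }
    public int? Age { get; set; }
}

public sealed record NameResult(int Id, string Name);
public sealed record AgeResult(int Id, int Age);

public static class SequentialJoiner
{
    public static async Task<Dictionary<int, Result>> ProcessSequentiallyAsync(
        IAsyncEnumerable<NameResult> names,
        IAsyncEnumerable<AgeResult> ages)
    {
        // Only one consumer runs at a time, so a plain Dictionary is enough.
        var results = new Dictionary<int, Result>();

        // First pass: read every name before touching the ages.
        await foreach (var n in names)
        {
            results[n.Id] = new Result { Name = n.Name };
        }

        // Second pass: attach ages, creating an entry if no name was seen for that ID.
        await foreach (var a in ages)
        {
            if (!results.TryGetValue(a.Id, out var result))
            {
                result = new Result();
                results[a.Id] = result;
            }
            result.Age = a.Age;
        }

        return results;
    }
}
```

The trade-off is the one stated above: the ages stream is not started until the names stream has been fully consumed, in exchange for dropping the ConcurrentDictionary overhead.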
Ok got it thanks... I doubt it, as the IAsyncEnumerable can return 100K items and the objects have more properties than what's shown here
Still worth the benchmark though
Fair!