was going to test, but it seems wildly unstable...
(For reference, it started here: https://discord.com/channels/143867839282020352/169726586931773440/1358781562841333932)
thanks :)
And yeah, filesystem caches will play havoc with your testing if you're not careful
Result: 2m17s
Result: 1m24s
Now run the first again 😛
i wonder if i can somehow make the directory a tmpfs to bypass caching?
sure
1m27
Yeah there we go
i probably should be testing without filesystem cache though, as that's the most likely case :p
I'm not sure how you can get rid of it tbh -- it tends to be an OS thing
echo 3 > /proc/sys/vm/drop_caches
is what im going to try
oh yeah, its hella slow now
not going to do any intermediate flushes, just once before starting my test
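For anyone wanting to reproduce the "flush once, then time it" setup, here is a rough sketch, not the exact script used in this thread. `BENCH_CMD` and `drop_caches_once` are made-up names for illustration; the only thing taken from the chat is the write of `3` to `/proc/sys/vm/drop_caches`. The `sync` beforehand is an addition, on the assumption that you want dirty pages written out first, since drop_caches only discards clean ones.

```shell
#!/bin/sh
# Sketch: drop the Linux page cache once, then time a single benchmark run.
# BENCH_CMD is a hypothetical placeholder for whatever is being measured.
BENCH_CMD="${BENCH_CMD:-true}"
DROP_FILE=/proc/sys/vm/drop_caches

drop_caches_once() {
    # Write dirty pages to disk first; drop_caches only frees clean entries.
    sync
    # 3 = page cache plus dentries and inodes. Needs root.
    if { echo 3 > "$DROP_FILE"; } 2>/dev/null; then
        echo "caches dropped"
    else
        echo "skipped: could not write $DROP_FILE (not root?)"
    fi
}

# One flush before the run, no intermediate flushes.
drop_caches_once

start=$(date +%s)
sh -c "$BENCH_CMD"
end=$(date +%s)
echo "elapsed: $((end - start))s"
```

Run it as root, e.g. `sudo BENCH_CMD='./my-benchmark' sh bench.sh` (both names are placeholders). Without root it just reports that the flush was skipped and times the warm-cache run.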
2m58 on the second one
though i highly doubt some parallelisation would hurt, with multiqueue access etc
Doing parallel disk access tends to hurt
just wrote this up real quick, so guess we'll find out
isnt that what multiqueueing is for?
"multiqueueing"?
https://docs.kernel.org/block/blk-mq.html
huh... didnt expect to get a... JsonReaderException?
oh, i see.
it was reading metadata files i was writing into subdirs because i was getting root level keys wrong
result: 32s

lets try without FS cache

1m16s without filesystem cache
so if im seeing it right, that's > 2x as fast
37s without filesystem cache, if i replace the ConcurrentDictionary with a regular Dictionary + lock
so what, 6x faster?
i'd assume it'd start thrashing on HDDs though...
but that should be solved by using a decent scheduler at the OS kernel level
hm, with the merge logic, that gives me 12m42s, and that doesnt even do the final merge
implemented the merging recursively, its quite fast now but allocates a bunch of ReadStackFrames
testing without filesystem cache:
- chunked+threading approach: 13m27s
- recursive with async (div2): 2m20s
both depend on GetSerializedUnoptimisedKeysParallel as defined above
honestly im pretty happy with that result
something seems a bit odd here ngl lol
though i probably should reorder the data to be chronological...