C#•3mo ago
OptoCloud

Filesystem packer slows down after 30k files

After the filesystem packer has hashed all 255k files, the DB operations start to slow down the entire application. The DB writes get to 30k files before the TAR writer catches up and slows down to the DB writer's speed. Then it takes hours, maybe days, to finish... Is there any way I can speed this up? https://github.com/OptoCloud/OptoPacker
GitHub - OptoCloud/OptoPacker: Pre-packs huge filesystems containing repositories or other projects for compression, parsing gitignore files to exclude unnecessary files from packing
3 Replies
OptoCloud•3mo ago
The TAR writer has to wait for the DB job before doing its thing, because if a BLOB with a matching hash has already been written to the TAR file, there is no use writing it again.

Application workflow:
1. Discover all files, respecting gitignore files along the way.
2. Register all directories traversed to the database and build a directory graph.
3. Then, asynchronously using IAsyncEnumerator, with batching and other stuff, for every file discovered:
   - Hash the file. The hash is used to ensure the contents of identical files are only written to the TAR file once; a unique entry of file contents is referred to as a BLOB.
   - Check whether a BLOB with that hash has already been registered in the database; if not, write the blob entry to the database.
   - Register the file record with its relation to the blob record in the database and set its relation to the directory hierarchy.
   - **Only if** the blob record was inserted into the database, write the file to the TAR file with a filename that is the hash of its content (the blob hash).
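The per-file steps above can be sketched roughly like this. It is a minimal illustration and not the actual OptoPacker code. It assumes SHA-256 as the hash (the post does not name the algorithm), uses an in-memory `HashSet` and `Dictionary` as stand-ins for the blob and file tables, and relies on `System.Formats.Tar` from .NET 7 or later. The names `DedupPackerSketch`, `HashFile`, and `Pack` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Formats.Tar;
using System.IO;
using System.Security.Cryptography;

public static class DedupPackerSketch
{
    // Returns the lowercase hex SHA-256 of a file's contents (the "blob hash").
    // SHA-256 is an assumption; any stable content hash would do.
    public static string HashFile(string path)
    {
        using var stream = File.OpenRead(path);
        using var sha = SHA256.Create();
        return Convert.ToHexString(sha.ComputeHash(stream)).ToLowerInvariant();
    }

    // Writes each unique file content to the TAR stream exactly once and
    // returns the file-path -> blob-hash records that would go to the database.
    public static Dictionary<string, string> Pack(IEnumerable<string> files, Stream tarOutput)
    {
        var seenBlobs = new HashSet<string>();              // stand-in for the blob table
        var fileRecords = new Dictionary<string, string>(); // stand-in for the file table

        using var tar = new TarWriter(tarOutput, TarEntryFormat.Pax, leaveOpen: true);
        foreach (var path in files)
        {
            string hash = HashFile(path);
            fileRecords[path] = hash; // every file gets a record, duplicate or not

            // HashSet.Add returns false when the hash is already present,
            // so "check if the blob exists, insert if not" is a single step.
            if (seenBlobs.Add(hash))
            {
                // The entry name inside the TAR is the blob hash, not the original path.
                tar.WriteEntry(fileName: path, entryName: hash);
            }
        }
        return fileRecords;
    }
}
```

In this sketch the "was the blob newly inserted?" decision is made in memory, so the TAR write does not depend on a database round-trip per file. Whether that fits the real application depends on whether the set of hashes fits in memory and whether the database has to be the source of truth during packing.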
Jimmacle•3mo ago
Dumping a ton of binary blobs into a database (especially SQLite) sounds slow. Do you really need the file contents in there?
OptoCloud•3mo ago
I'm not dumping the binary blobs into the database, only their hashes. The blobs I'm writing to a TAR stream that's streamed into 7zip.
Sorry for the miswrite, I tried to explain it better now.
I updated my code a bit more to try to maybe optimize it? Idk how much more I can do.
Bump? @Jimmacle 🤔