When storage template migration shows as not active, is it still working?
I think I ended up with my files in a bit of an odd state. I did a template migration to change the layout, then a few minutes later changed the storage tag on a user. The 'jobs' screen didn't show the first one as active, so I thought it was safe... should it have been safe?
Storage migration usually happens really fast, so you would rarely catch it in the active state
even on a slower external disk?
hmm... it also doesn't seem to have removed the original storage key... odd
so I have .../admin/2023/2023-... with around the right amount of files, and .../kenny/2023-... with only about half. The first has the original template and storage tag, the second has the updated one for both
which storage key are you thinking about?
Looking at the webui it's called 'Storage Label'
(I also seem to have some files in .../upload/uuid for the user... but I don't know what they're about, so maybe they're a red herring)
'upload' is the location where files arrive on the server before being moved according to the storage template
oh... when/how do I move them?
They are automatically moved in the process
if there are files in there, there seems to have been an issue with the move
Usually running thumbnail generation for missing files will help move them to the correct location
Looking at the db, I don't see any records in 'assets' with the admin tag
in the originalPath?
Is the assets table the authoritative source for what exists? (i.e. if I script up something that writes entries in there, will everything 'just work'?)
yes, looking at originalPath
So what you are describing is that the files got moved but the path never got updated?
No, all the paths are updated but the files aren't moved
looks like about half the files were moved, both on the disk and in the db. None of the files were deleted. About half the db entries appear to be missing
Hmm
oh, for the other user it seems to have duplicated all the files to the new template path, but not removed the old ones. Looks like all db entries are updated in that case though. Curiouser and curiouser
@jrasm91 Any thoughts on this?
I can poke around at getting things where I want. I don't suppose there's an fsck tool to verify that all the originals match the db?
There's no easy way to verify the integrity between the asset paths and the files on disk, no.
Not exactly sure what you have done or what steps you have taken, but it looks like you somehow have gotten the files in the database out of sync with the files on disk. The storage migration process should (1) move a file and (2) update the database at the same time. There is an open bug (!!) where sometimes the database is busy and the database update fails and the files get out of sync. Maybe this happened here?
perhaps... I was concerned the bug might be around doing two different migrations near the same time
That's possible as well.
I guess the TLDR is we need to add some more robustness around that process still, and there are some open issues for that. In general I would recommend more caution when doing anything with the migration though.
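As an illustration of the "move the file and update the database at the same time" step discussed above, here is a minimal sketch of one way to keep the two in sync: move the file, attempt the database update, and move the file back if the update fails. This is not the project's actual migration code. It uses Python's sqlite3 as a stand-in database, and the `move_asset` helper and the `id` column are assumptions made for the example; only the 'assets' table and the originalPath column come from the thread.

```python
import shutil
import sqlite3
from pathlib import Path


def move_asset(conn, asset_id, new_path):
    """Move one file and update its row; undo the move if the update fails.

    Hypothetical helper: sqlite3 stands in for the real database, and the
    'id' column is an assumption made for this sketch.
    """
    row = conn.execute(
        "SELECT originalPath FROM assets WHERE id = ?", (asset_id,)
    ).fetchone()
    if row is None:
        raise KeyError(asset_id)

    old_path, new_path = Path(row[0]), Path(new_path)
    new_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(old_path), str(new_path))

    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE assets SET originalPath = ? WHERE id = ?",
                (str(new_path), asset_id),
            )
    except sqlite3.Error:
        # The database write failed (for example, it was busy):
        # put the file back so disk and database still agree.
        shutil.move(str(new_path), str(old_path))
        raise
```

With this ordering a failed update leaves the file at the path the database still records, rather than at a new path the database does not know about.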
good advice. I did a btrfs snapshot of the subvolume so I knew the files were safe. I probably should have dumped the assets table too
Yeah this has been an outstanding issue for a while, but it is semi-complicated so it hasn't been worked on lol
I can script something up to run some checks locally, just want to confirm that 'assets' is the only place I need to pay attention to (there's no albums yet)
That is correct.
Even if you had albums, that just references asset ids. The only place we link assets to files is in the assets table and via the originalPath column.
In theory you could exclude the files that are correctly linked. Then you would have a list of assets and files that mismatch. The sha1 hash of the file is in the assets table so it should be easy to verify and re-match them with that.
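The check described above could be sketched roughly like this. It assumes the originalPath and sha1 values have already been exported from the 'assets' table as (path, hex digest) pairs, and that the exported paths are written the same way as the paths found under the library root. The `reconcile` function and the export format are hypothetical, not part of any existing tool.

```python
import hashlib
from pathlib import Path


def sha1_of(path):
    """Return the hex SHA-1 of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reconcile(records, library_root):
    """Compare exported asset rows against the files on disk.

    records: iterable of (original_path, sha1_hex) pairs (assumed export).
    Returns (relinks, missing, orphans):
      relinks - (recorded path, path where the same content was found)
      missing - recorded paths with no matching file anywhere
      orphans - files on disk that no row accounts for
    """
    records = [(str(path), checksum.lower()) for path, checksum in records]
    known_paths = {path for path, _ in records}

    # Hash every file that no row points at, keyed by content hash.
    unlinked = {}
    for f in sorted(Path(library_root).rglob("*")):
        if f.is_file() and str(f) not in known_paths:
            unlinked.setdefault(sha1_of(f), []).append(str(f))

    relinks, missing = [], []
    for path, checksum in records:
        if Path(path).is_file() and sha1_of(path) == checksum:
            continue  # correctly linked, so exclude it
        candidates = unlinked.get(checksum)
        if candidates:
            relinks.append((path, candidates.pop(0)))
        else:
            missing.append(path)

    orphans = sorted(p for paths in unlinked.values() for p in paths)
    return relinks, missing, orphans
```

The function only reports: it does not touch the database or the files, so the three lists can be reviewed before anything is changed.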
Okay. On some other digging, it looks like uploads/uploads has files that were originally truncated by too low of a limit on nginx. Will they be cleaned up automatically and/or are they safe to delete?
They should be safe to delete.
In our code we only have logic to remove uploads that have been fully uploaded and then found to be duplicates.
I would have expected nginx to clean up aborted uploads though.
okay. I was thinking I'd run rmlint to remove the duplicates (keeping the right version), then deal with what's missing in the db. Actually I have those so maybe I'll just upload them again... I know the android app still isn't all that happy about its counts unless it does the upload
How many file uploads are we talking about in your library here? And how many users?
The duplicates are in 1 user with around 3000 files. The oddness is in a second user with only around 300 files
For a total of 3300 files it might be faster to just wipe the assets table in the database and re-upload all of them. Just a thought.
(I could probably just rm -rf 2023 to remove the duplicates, but might as well let rmlint verify they actually are duplicates first)
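For illustration, the "verify they actually are duplicates first" step could also be done with a short script instead of rmlint. The sketch below is not how rmlint works internally. It hashes every file outside the old directory, then sorts the files inside the old directory into those whose content also exists elsewhere and those that exist nowhere else. The `confirmed_duplicates` function name is hypothetical, and nothing is deleted.

```python
import hashlib
from pathlib import Path


def file_sha1(path):
    """Return the hex SHA-1 of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def confirmed_duplicates(old_dir, keep_root):
    """Split the files under old_dir into (safe_to_delete, unique).

    A file is 'safe' only if a file with identical content exists under
    keep_root but outside old_dir. This function deletes nothing.
    """
    old_dir = Path(old_dir).resolve()

    # Hash everything that will be kept, skipping the old directory itself
    # (old_dir may live inside keep_root, as with 2023/ in the library).
    kept_hashes = set()
    for f in Path(keep_root).resolve().rglob("*"):
        if f.is_file() and old_dir not in f.parents:
            kept_hashes.add(file_sha1(f))

    safe, unique = [], []
    for f in sorted(old_dir.rglob("*")):
        if f.is_file():
            (safe if file_sha1(f) in kept_hashes else unique).append(f)
    return safe, unique
```

If `unique` comes back empty, every file under the old directory has a byte-identical copy elsewhere, which is the condition under which removing the old directory loses nothing.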
true... except I uploaded them then told my daughter she could let ___Photos do the cleanup it wanted to do 🙂
I mean, you have all the originals still, you could copy those out and re-upload them.
at least it seems everything failed safe, in that it duplicated files instead of losing any
oh... I see. Yes, I tried that when I still had the files on her phone and the app tried to reupload them all anyway, but in this case maybe that'd be better
Where/how are you seeing duplicates?
I moved the template from 2023/2023-01-01 to 2023-01-01. It copied all the files, but also left the originals, so I have a couple directories (2023, 2022) that have the same files as the outer ones
Hmm, I've never seen that before. I thought it did a move instead of a copy/delete.