R2 Files loading Blank Page

I've had a bunch of foldered files that make up a training course package stored in R2 for a while (over a year). At the root of the folder is an HTML file (index.html or story.html) that, when accessed, loads the training material using the other files within the folder structure. This worked for over a year, until this week, when it all stopped working.

I've tried:
- deleting the files and re-uploading
- messing with CORS
- creating a fresh bucket
- multiple methods of uploading (GUI, API, S3-Browser, CyberDuck)

Nothing seems to work, and the only errors I get are 404s for select files that index.html is trying to call within the same folder. I suspect this is related to multipart upload issues, but I cannot get the files to upload correctly, which I note was not an issue in the past. This also doesn't explain why the previous batch of uploaded folders/files that were working all of a sudden stopped.

Looking for guidance on how I should be uploading these files so that the upload completes with all files in their correct place. For reference, the total folder size is 400+ MB with about 200+ individual files ranging from bytes/KBs all the way to 1, 5, 10, or even 50 MB for some videos and pictures. Hence the need for proper multipart uploading.

Outside of that, looking for any insight as to what might have changed on CF's side in how they handle HTML files stored in and served from R2?
24 Replies
Chaika, 2mo ago
Did you get this working? I saw you asked some questions in #r2. rclone is indeed a really helpful and useful tool that should just work; you shouldn't even need to mess with its settings for simple transfers.
"For references the total folder size is 400+MB with about 200+ individual files ranging from bytes/kbs all the way to 1, 5, 10, or even 50MB for some videos and pictures. Hence the need for proper Multipart Uploading."
You don't even need to use multipart for objects up to 5 GiB, although some tools will switch to it ahead of that limit for performance and to avoid re-uploading an entire file when a transfer fails.
"Outside of that, looking for any insight as to what might have changed on CF's and how they are handling html files stored and served from R2?"
Nothing recently.
Kyle @ KyTech, 2mo ago
Hey, so far I haven't been able to get things working. Or at least working the way they were before with R2. I got all the rsync commands going, but like you said, there probably wasn't a need for that. I've resorted to using Linode's S3 object storage for now, as that seems to work with the same folders and the same rsync upload cmd (plus needing to set files to public read). But on R2 I can't get those files to render, and I get 404s or 400s for select files in the folder structure when loading the index.html file. It also seems like the folder or URL/URI path is not translating correctly from the root R2 domain to where these files are actually sitting within the R2 folder structure. Not sure if others have had something similar?
Chaika, 2mo ago
generally most of the issues I've seen with files/folders in R2 (or object storage in general) are just confusion about them, since object storage doesn't actually have folders; it's all virtual. If you go to where you think the assets are, do they load fine? If you click on a file name in the R2 bucket, it'll give you the R2 Custom Domain URL it's reachable at. Worth sanity checking "is this asset actually reachable, and if so, what is the HTML page trying to load/is it any different?"
Kyle @ KyTech, 2mo ago
ya for sure, using "folder" as a general term. And yes, I have tried that. As far as I can tell the individual files load, but it is hard to tell if they are working correctly since they are things like .css, .js and .woff files for elements that get referenced in an online training package output that I don't control. It's standard practice to store and deliver these packages via S3 storage, and I have done that for over a year via S3, and via R2 for the last 6 months or so. The odd thing is the package (folder) I uploaded 6 months ago that had been working just stopped, which is what caused me to delete it and try re-uploading, and that's where I'm at lol. Again, Linode's S3 has continued to work the whole time, including new uploads, so my guess was that something about the R2 config had changed in some way.
Chaika, 2mo ago
could you give the link of a broken page on r2, or look at the console (ctrl+shift+i -> console or right click -> inspect -> console) and look for errors about assets?
Kyle @ KyTech, 2mo ago
Sure, here's the link to the broken folder:
https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html
Here for reference are the same folders/files on Linode's S3:
https://acumen-training.us-east-1.linodeobjects.com/how-to-protect-your-data/content/index.html
Thanks for following up and having a look!
Chaika, 2mo ago
See how the slashes are different between R2 and S3 in your links? That's not R2 doing it (although I think the dashboard might show them wrong). It looks like when you do that, the browser takes them as escapes and thinks the relative directory is /, so it tries to load the resources with the wrong relative directory, looking for /lib/icomoon.css. But if you do the slashes right:
https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/index.html
it works fine, and your relative assets work: /lms-test-rise-3/content/lib/icomoon.css
Kyle @ KyTech, 2mo ago
hmm, interesting. I still feel like this is a change or bug on the R2 side, cause like I said I had files uploaded (same type of folder setup) which were working and that all of a sudden stopped working. Then all future uploads like this have the same issue. I've tried uploading with multiple S3 clients and CLIs, on both Mac and Windows, so I don't think all of those are messing with the slash direction? The only thing I can think of outside of a bug with our R2 tenant is a setting in Cloudflare for the custom domain that's changing those paths. Maybe the URI formatting, could that do it?
Chaika, 2mo ago
as far as I can see, this is just a browser thing/whatever is generating your links to them. You can mess up Linode in exactly the same way:
https://acumen-training.us-east-1.linodeobjects.com/how-to-protect-your-data%2Fcontent%2Findex.html
guess the real question is how do you generate those links/what is generating those links URL-encoded?
Kyle @ KyTech, 2mo ago
I copied the link I gave you directly from the R2 dashboard:
[attached image, no description]
Chaika, 2mo ago
ahh yea, the dash is messed up with that and has been forever, but that wouldn't explain why it only broke now. Maybe the better question is: if you fix up all your links to use / instead of %2F, are there any that are still broken/don't work?
Kyle @ KyTech, 2mo ago
ya, I have had a few weird things happen with links from there, but nothing like this and nothing that stays after multiple refreshes, relogs, etc. I only have 2 folders right now for testing, and I just tried both and they work with the swap, in prod or even dev. I will be grabbing these links via the R2 API or CF Workers, so I don't think it will be an issue, but this weirdness was stopping me from even starting that lol
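When links are generated from object keys rather than copied from the dashboard, one way to avoid %2F is to percent-encode the key while leaving "/" alone. This is only a minimal sketch: the helper name `public_url` is hypothetical, and the base URL and key are the ones quoted in this thread.

```python
from urllib.parse import quote


def public_url(base: str, key: str) -> str:
    """Build a URL for an object key, keeping "/" as a real path separator.

    Hypothetical helper for illustration; not part of any R2 or Workers API.
    """
    # safe="/" percent-encodes spaces and other reserved characters,
    # but leaves the slashes in the key untouched.
    return base.rstrip("/") + "/" + quote(key, safe="/")


print(public_url("https://lmsfiles.aesi-inc.com",
                 "lms-test-rise-3/content/index.html"))
```

The point of the design is that the key's slashes stay literal, so the browser sees /lms-test-rise-3/content/ as the page's directory.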
Chaika, 2mo ago
idk how your other things are set up, but they wouldn't break like this unless they relied on relative directory links like that specific page does. Normally you see relative from base (/index.css) or absolute (https://r2.example.com/index.css), which would both be fine.
Kyle @ KyTech, 2mo ago
Ya, I have another bucket for PDF docs, so single files, and they work fine.
Chaika, 2mo ago
yea, single files would work fine too, no relative linking at all in that case. Well, either way, sounds like the issue was just the R2 dash showing slashes in the preview URLs as %2F (URL-escaped) and then the browser breaking on relative asset links in that exact setup. Maybe I could try to push for it to be fixed now; before, it was just a harmless display thing, but it breaking webpages in some cases is super annoying.
Kyle @ KyTech, 2mo ago
I wonder if this setting is playing into it?
[attached image, no description]
Chaika, 2mo ago
nah, this is R2 display + browser behavior. Normalization would change what CF gets/R2 gets, which maybe could mess with the HTML being served or something, but it's not the underlying issue. R2 quite simply just shouldn't show slashes as %2F/URL-encoded.
Kyle @ KyTech, 2mo ago
ya, I just wonder if it's because it's being filtered through the custom domain with these settings? Cause I have a personal CF R2 account, and that one also shows the % in URLs, but I've tried folders there as well and it pastes with the %, but when the browser loads it, it converts it correctly.
[attached image, no description]
Chaika, 2mo ago
no, it just depends on how you are loading the other resources (if you are loading any at all). Is it the exact same content/setup? Do you have a link to it?
Kyle @ KyTech, 2mo ago
no, it's not the exact same, but maybe I will try an upload there to test. I just know that for all files in either account the dashboard has always shown the % slashes, and it always converted fine.
Chaika, 2mo ago
right, the issue isn't purely conversion. Maybe I did a poor job explaining. There are three general ways HTML can link to other resources:
1. absolute URLs (https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/lib/icomoon.css)
2. relative URL off hostname/root path (/lms-test-rise-3/content/lib/icomoon.css)
3. relative URL off current path (lib/icomoon.css) <--- this is what you do
When you use %2F, it looks like the browser (at least Firefox) treats it as an escaped character inside a single path segment rather than as a proper path separator (I suppose it isn't one). As such:
https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html -> base is /, at root
https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/index.html#/ -> base is /lms-test-rise-3/content/
So lib/icomoon.css with the right base becomes /lms-test-rise-3/content/lib/icomoon.css, which works. However, if you visit the broken page, you can see the icomoon.css path it tries is /lib/icomoon.css, which is broken.
So this bug:
- Only occurs with HTML linking to other resources
- Only occurs with links to other resources using relative URLs off the current path (pretty rare as far as I know; most of the time you just use relative off root)
So this has to do with the way the browser resolves those relative URLs and the way R2 displays the URL encoded with %2F, nothing else. Nothing you do in CF would change those relative URLs to fully qualified ones, or make /lib/icomoon.css know it's supposed to be under /lms-test-rise-3/content
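The resolution described above can be reproduced outside the browser. This is a rough sketch that assumes Python's `urllib.parse.urljoin` resolves these particular URLs the same way a browser does; the constant and function names (`GOOD_PAGE`, `BAD_PAGE`, `LINKS`, `resolve_all`) are made up for illustration, while the URLs are the ones from this thread.

```python
from urllib.parse import urljoin

# Same page, once with real slashes and once as copied from the dashboard.
GOOD_PAGE = "https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/index.html"
BAD_PAGE = "https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html"

# The three link styles from the list above.
LINKS = {
    "absolute": "https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/lib/icomoon.css",
    "root-relative": "/lms-test-rise-3/content/lib/icomoon.css",
    "path-relative": "lib/icomoon.css",
}


def resolve_all(page_url: str) -> dict[str, str]:
    """Resolve each link style against the URL of the page that contains it."""
    return {style: urljoin(page_url, href) for style, href in LINKS.items()}


for page in (GOOD_PAGE, BAD_PAGE):
    print(page)
    for style, resolved in resolve_all(page).items():
        print(f"  {style:14} -> {resolved}")
```

Only the path-relative link changes between the two pages: against the %2F URL it resolves to https://lmsfiles.aesi-inc.com/lib/icomoon.css, because the whole encoded string counts as one path segment sitting at the root.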
Kyle @ KyTech, 2mo ago
ah ok, ya, that makes sense. I did also find this forum post after searching for the % slash term with R2. It's very recent, and it looks like they are aware of this issue and working on it. This also points to this being a more recent issue, which lines up with the timeline.
https://community.cloudflare.com/t/r2-keys-with-multiple-slashes-in-key-require-url-encoding-or-it-404s/641375/4
Chaika, 2mo ago
it's been an issue ever since those preview URLs have been around, afaik. It's purely just a display bug (well, one that then causes other issues). Anyway, tldr would be: %2F is the enemy, replace it with /. I'll see if I can escalate the issue with the knowledge that it can in edge cases break webpages.
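For links already copied with %2F in them, the replacement can be scripted. A minimal sketch, assuming every %2F in the path stands for a real "/" in the object key; `fix_encoded_slashes` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def fix_encoded_slashes(url: str) -> str:
    """Turn %2F in the path back into "/", leaving query and fragment alone."""
    parts = urlsplit(url)
    # Percent-escapes are case-insensitive, so handle %2F and %2f.
    path = parts.path.replace("%2F", "/").replace("%2f", "/")
    return urlunsplit(parts._replace(path=path))


print(fix_encoded_slashes(
    "https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html"
))
```

Restricting the replacement to the path keeps any %2F that legitimately appears in a query string intact.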
Kyle @ KyTech, 2mo ago
That would be awesome, thanks for taking the time on this!