Download .xml.gz sitemaps

I'm trying to parse the sitemaps of a website that serves them as .xml.gz files. In Python I could use gunzip to decompress them and then use them. In Crawlee we only have the "downloadListOfUrls" method. How can I decompress those files before using them? Sitemap: https://www.zoro.com/sitemaps/usa/sitemap-product-10.xml.gz
6 Replies
NeoNomade (OP), 3y ago
I parsed them, but using tools from Node. It would be nice to have this built into Crawlee.
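For readers looking for a starting point, here is a minimal sketch of what "using tools from Node" could look like. It is not the solution shared later in this thread. The function names `extractUrlsFromSitemap` and `downloadGzippedSitemap` are hypothetical, the global `fetch` assumes Node 18 or newer, and the regex-based `<loc>` extraction is a simplification of real XML parsing.

```javascript
const zlib = require('node:zlib');

// Hypothetical helper: takes the raw bytes of a sitemap (gzipped or not)
// and returns the URLs found in its <loc> elements.
function extractUrlsFromSitemap(body) {
    // Gzip streams start with the magic bytes 0x1f 0x8b. Checking for them
    // covers servers that already decompressed the body in transit.
    const isGzip = body.length >= 2 && body[0] === 0x1f && body[1] === 0x8b;
    const xml = (isGzip ? zlib.gunzipSync(body) : body).toString('utf8');

    const urls = [];
    // Assumption: a regex is good enough for a sketch; a real XML parser
    // would be more robust. Only the &amp; entity is decoded here.
    for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
        urls.push(match[1].replace(/&amp;/g, '&'));
    }
    return urls;
}

// Hypothetical helper: downloads one sitemap and returns its URLs.
async function downloadGzippedSitemap(sitemapUrl) {
    const response = await fetch(sitemapUrl);
    if (!response.ok) {
        throw new Error(`Failed to download ${sitemapUrl}: ${response.status}`);
    }
    const body = Buffer.from(await response.arrayBuffer());
    return extractUrlsFromSitemap(body);
}
```

The array returned by `downloadGzippedSitemap` can then be handed to a crawler in place of the result of "downloadListOfUrls", for example via `crawler.run(urls)`.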
flat-fuchsia, 3y ago
Replied in a different thread. Also passed the question/suggestion to the team 👍
NeoNomade (OP), 3y ago
I can share my solution if needed
flat-fuchsia, 3y ago
If you don't mind, I could definitely pass it to the team 👍 Thanks
NeoNomade (OP), 3y ago
Pastebin
async function downloadSitemaps() { let compressed_sitemaps = []...
flat-fuchsia, 3y ago
Thanks 👍
