Download .xml.gz sitemaps
I'm trying to parse the sitemaps of a website that serves them as .xml.gz files. In Python I could use gunzip to decompress them before using them.
In Crawlee we only have the "downloadListOfUrls" method. How can I make it decompress those files before using them?
sitemap: https://www.zoro.com/sitemaps/usa/sitemap-product-10.xml.gz
6 Replies
I parsed them, but using tools from Node.
It would be nice to have this built into Crawlee.
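For anyone landing here, a minimal sketch of the "tools from Node" approach, assuming Node 18+ (for the global fetch) and the built-in zlib module. This is not the solution shared later in the thread and not a Crawlee API; the helper names extractUrlsFromSitemap and downloadGzippedSitemap are hypothetical, and the regex-based <loc> extraction is a simplification rather than a full XML parser.

```typescript
import * as zlib from 'node:zlib';

// Hypothetical helper: takes the raw bytes of a sitemap (gzipped or plain XML)
// and returns the URLs found in its <loc> elements.
export function extractUrlsFromSitemap(data: Buffer): string[] {
    // Gzip streams start with the magic bytes 0x1f 0x8b. Checking for them lets
    // the helper also accept bodies that were already decompressed in transit.
    const isGzipped = data.length >= 2 && data[0] === 0x1f && data[1] === 0x8b;
    const xml = (isGzipped ? zlib.gunzipSync(data) : data).toString('utf-8');

    // Simplification: pull <loc> values with a regex instead of parsing the XML.
    const locPattern = /<loc>\s*([\s\S]*?)\s*<\/loc>/g;
    const urls: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = locPattern.exec(xml)) !== null) {
        // Sitemaps escape "&" as "&amp;"; other XML entities are not handled here.
        urls.push(match[1].replace(/&amp;/g, '&'));
    }
    return urls;
}

// Hypothetical helper: downloads one sitemap and returns the URLs it lists.
export async function downloadGzippedSitemap(sitemapUrl: string): Promise<string[]> {
    const response = await fetch(sitemapUrl);
    if (!response.ok) {
        throw new Error(`Failed to download ${sitemapUrl}: HTTP ${response.status}`);
    }
    const body = Buffer.from(await response.arrayBuffer());
    return extractUrlsFromSitemap(body);
}
```

The returned URLs could then be handed to the crawler, for example with crawler.addRequests(urls). For very large sitemaps a streaming decompressor (zlib.createGunzip) would likely use less memory than gunzipSync.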
flat-fuchsia•3y ago
Replied in a different thread. Also passed the question/suggestion to the team 👍
I can share my solution if needed
flat-fuchsia•3y ago
If you don't mind - I could definitely pass it to the team 👍 Thanks
Pastebin
async function downloadSitemaps() { let compressed_sitemaps = []...
flat-fuchsia•3y ago
Thanks 👍