Download PDF file from URL?

Does anyone know of a simple npm library for downloading files from a URL in JavaScript/TypeScript?
7 Replies
ambitious-aqua, 3y ago
I'd like to know this as well. I'm getting an error saying that only other formats are supported and that application/pdf is not.
variable-lime (OP), 3y ago
I tried using axios, http.get, and clicking the download button. Nothing works.
variable-lime (OP), 3y ago
Basic crawler | Crawlee
This is the most bare-bones example of using Crawlee, which demonstrates some of its building blocks such as the BasicCrawler. You probably don't need to go this deep though, and it would be better to start with one of the full-featured crawlers
ambitious-aqua, 3y ago
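// Assumes `fs` is imported and `fetch` returns a Node.js stream as `response.body`
// (as node-fetch does); a web ReadableStream has no .pipe() method.
router.addDefaultHandler(async ({ request }) => {
    const file = fs.createWriteStream('filename.pdf');
    const response = await fetch(request.url);
    response.body.pipe(file);
});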
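A possible reason the snippet above fails outside node-fetch: with the `fetch` built into Node.js 18+, `response.body` is a web `ReadableStream`, which has no `.pipe()`. The following is a minimal sketch of the same idea using only built-in modules. The helper name `downloadWithFetch` is made up for illustration, and it assumes Node.js 18 or later.

```typescript
import { createWriteStream } from "node:fs";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";
import type { ReadableStream as NodeReadableStream } from "node:stream/web";

// Hypothetical helper (not from the thread); assumes the global fetch of Node.js 18+.
async function downloadWithFetch(url: string, targetFile: string): Promise<void> {
  // fetch follows redirects by default, so no manual redirect handling is needed here.
  const response = await fetch(url);
  if (!response.ok || response.body === null) {
    throw new Error(`Download failed: ${response.status} ${response.statusText}`);
  }

  // Convert the web stream into a Node.js Readable so it can be written to a file stream.
  const nodeStream = Readable.fromWeb(response.body as unknown as NodeReadableStream);

  // pipeline() resolves once the file is fully written and rejects on any stream error.
  await pipeline(nodeStream, createWriteStream(targetFile));
}
```

It would be called as `await downloadWithFetch(request.url, "filename.pdf")` inside the handler. Unlike a bare `.pipe()`, awaiting `pipeline` means the handler does not return before the file is complete.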
variable-lime (OP), 3y ago
Thanks, I tried your suggestion, but unfortunately it won't compile. I fixed it with this code:
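import * as Fs from "fs";
import * as Https from "https";
import type { IncomingMessage } from "http";

async function downloadFile(url: string, targetFile: string): Promise<void> {
    return await new Promise<void>((resolve, reject) => {
        Https.get(url, (response: IncomingMessage) => {
            const code = response.statusCode ?? 0;

            if (code >= 400) {
                response.resume(); // discard the body so the socket is released
                return reject(new Error(response.statusMessage));
            }

            // handle redirects: settle this promise with the outcome of the follow-up request
            if (code > 300 && code < 400 && response.headers.location) {
                response.resume();
                // the Location header may be relative, so resolve it against the current URL
                const next = new URL(response.headers.location, url).toString();
                return downloadFile(next, targetFile).then(resolve, reject);
            }

            // save the file to disk
            const fileWriter = Fs.createWriteStream(targetFile)
                .on("finish", () => resolve())
                .on("error", reject); // surface write errors instead of hanging

            response.pipe(fileWriter);
        }).on("error", (error: Error) => {
            reject(error);
        });
    });
}

// `link` is the PDF URL obtained elsewhere in the crawler code
await downloadFile(link, "file.pdf");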
rival-black, 3y ago
Don't forget to add additionalMimeTypes to the crawler options; then you can handle files with the Cheerio crawler.
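As a rough sketch of what that suggestion could look like: the example below assumes the `crawlee` package is installed, uses a placeholder URL and output file name, and relies on Crawlee handing over the raw response body for non-HTML content types. It is one possible arrangement, not necessarily what the poster had in mind.

```typescript
import { writeFile } from "node:fs/promises";
import { CheerioCrawler } from "crawlee";

// Placeholder URL for illustration only; replace with the real PDF link.
const pdfUrl = "https://example.com/file.pdf";

const crawler = new CheerioCrawler({
  // Accept this Content-Type in addition to the HTML types the crawler handles by default.
  additionalMimeTypes: ["application/pdf"],
  async requestHandler({ request, body, contentType }) {
    // Skip anything that is not a PDF response.
    if (contentType.type !== "application/pdf") return;

    // The body may be typed as string or Buffer; normalise it to a Buffer before writing.
    const data = Buffer.isBuffer(body) ? body : Buffer.from(body);
    await writeFile("file.pdf", data);
    console.log(`Saved ${request.url} (${data.length} bytes)`);
  },
});

await crawler.run([pdfUrl]);
```

Without `additionalMimeTypes`, the crawler rejects the response because of its content type, which matches the "application/pdf is not supported" error mentioned earlier in the thread.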
variable-lime (OP), 3y ago
Thanks
