Invalid PDF structure
I'm using the crawl endpoint, and one of the URLs it discovered is https://www.gamweb.com/assets/files/lsk.pdf. However, I get an "Invalid PDF structure" error when the page is scraped by FireCrawl. I can see why: it's a webpage with an embedded PDF rather than the raw PDF that the URL implies. Still, I think FireCrawl should be able to handle this gracefully.
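To illustrate the mismatch, here is a rough sketch of how a scraper could tell a raw PDF from an HTML page that merely embeds one. This is not FireCrawl's actual code or API. The names `looks_like_pdf`, `find_embedded_pdf_urls`, and `classify` are hypothetical, and the check assumes a real PDF payload begins with the `%PDF-` header.

```python
import re

PDF_MAGIC = b"%PDF-"  # header that opens a PDF file

# Hypothetical pattern: src/data attribute on embed, iframe, or object tags.
EMBED_RE = re.compile(
    r'<(?:embed|iframe|object)\b[^>]*?\b(?:src|data)\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def looks_like_pdf(body: bytes) -> bool:
    """Return True if the payload starts with the PDF header (leading whitespace allowed)."""
    return body.lstrip().startswith(PDF_MAGIC)


def find_embedded_pdf_urls(html: str) -> list[str]:
    """Return candidate URLs referenced by embed/iframe/object tags in the HTML."""
    return EMBED_RE.findall(html)


def classify(body: bytes) -> str:
    """Classify a response body as 'pdf', 'html', or 'unknown' by its content, not its URL."""
    if looks_like_pdf(body):
        return "pdf"
    head = body[:1024].lower()
    if b"<html" in head or b"<!doctype html" in head:
        return "html"
    return "unknown"


if __name__ == "__main__":
    page = b'<!DOCTYPE html><html><body><embed src="/files/real.pdf" type="application/pdf"></body></html>'
    kind = classify(page)
    print(kind)
    if kind == "html":
        # A URL ending in .pdf returned HTML, so look for the embedded document instead.
        print(find_embedded_pdf_urls(page.decode("utf-8")))
```

With a check like this, a URL ending in `.pdf` that returns HTML could be routed to the normal page scraper, or the embedded document could be followed, instead of failing in the PDF parser.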
3 Replies
Hey @micah.stairs , adding this as a GitHub issue. We should def be able to handle it!
Thanks for letting us know!
GitHub
[Feat] Ability to scrape embedded pdfs · Issue #839 · mendableai/fi...
Thanks! I subscribed to the issue