Firecrawl · 2mo ago
Rich

Crawling Website Issues with n8n

Hi all! I am having trouble crawling websites (from Google Sheets) looking for specific keywords. Not using a main URL. Here is an example: https://www.whitehouse.gov/presidential-actions/executive-orders/

Looking for keywords like:
- trade
- freight
- foreign trade
- export
- import
- commerce
- sanctions
- customs
- tariff
- licensing
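For reference, the keyword check itself can be done locally once the page text is available. The following is a minimal sketch, assuming the crawled text arrives as a plain string; the function name `find_keywords` is hypothetical and not part of Firecrawl or n8n.

```python
import re

# Keywords from the post above.
KEYWORDS = [
    "trade", "freight", "foreign trade", "export", "import",
    "commerce", "sanctions", "customs", "tariff", "licensing",
]

def find_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur in text as whole words, ignoring case."""
    found = []
    for kw in keywords:
        # \b word boundaries keep "import" from matching inside "important".
        pattern = r"\b" + re.escape(kw) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(kw)
    return found

if __name__ == "__main__":
    sample = "New tariff on important goods affects foreign trade."
    print(find_keywords(sample))
```

Whole-word matching is strict: a plural such as "tariffs" would not match "tariff" here, so the list may need plural forms added if those should count.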
7 Replies
micah.stairs · 2mo ago
Hey! I'm not sure if I completely understand your use case, but would the /search endpoint help here? https://docs.firecrawl.dev/features/search
Rich (OP) · 2mo ago
I want to crawl https://www.whitehouse.gov/presidential-actions/executive-orders/ and I want the data to come back in these fields (on Google Sheets):
- Source Link
- Issuing Agency
- Title
- Date
- Keywords
- Document Type
- File Reference
- Full Text
micah.stairs · 2mo ago
To extract structured data during crawling, please check out JSON mode: https://docs.firecrawl.dev/features/llm-extract.
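As an illustration of what a schema for the requested fields might look like, here is a minimal sketch that builds a standard JSON Schema from the column names in the thread. The split of the column names into eight fields is one reading of the post. The request body shape in `build_payload` (`formats` and `jsonOptions`) is an assumption about one version of the Firecrawl scrape API and may differ in the current one, so check it against the linked docs before use. No request is sent here.

```python
import json

# Column names as listed earlier in the thread (one reading of the field list).
FIELDS = [
    "Source Link", "Issuing Agency", "Title", "Date",
    "Keywords", "Document Type", "File Reference", "Full Text",
]

def to_key(name):
    """Turn a sheet column name into a snake_case JSON key."""
    return name.lower().replace(" ", "_")

def build_schema():
    """Build a JSON Schema object with one string property per column."""
    properties = {to_key(f): {"type": "string"} for f in FIELDS}
    # Keywords is naturally a list, so model it as an array of strings.
    properties["keywords"] = {"type": "array", "items": {"type": "string"}}
    return {"type": "object", "properties": properties}

def build_payload(url):
    """ASSUMPTION: hypothetical request body shape; verify against the docs."""
    return {
        "url": url,
        "formats": ["json"],
        "jsonOptions": {"schema": build_schema()},
    }

if __name__ == "__main__":
    target = "https://www.whitehouse.gov/presidential-actions/executive-orders/"
    print(json.dumps(build_payload(target), indent=2))
```

Keeping the schema keys derived from the sheet headers means each extracted object can be mapped back to a Google Sheets row without a separate lookup table.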
Rich (OP) · 2mo ago
Oh I've been all over that. Unfortunately I keep getting this error: "URL must have a valid top-level domain or be a valid path"
micah.stairs · 2mo ago
Oh that error doesn't have anything to do with JSON mode. Can you share your request? Seems that the URL you passed in is not formatted correctly.
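The thread does not show the actual request, so the cause of the error is unknown. One plausible source, when URLs come from spreadsheet cells, is stray whitespace, wrapping parentheses, or a missing scheme. The following is a minimal sketch of cleaning a cell value before passing it on; `normalize_url` is a hypothetical helper, not part of Firecrawl or n8n, and its checks are a rough heuristic rather than the validation the API performs.

```python
from urllib.parse import urlparse

def normalize_url(raw):
    """Return a cleaned absolute URL, or None if it does not look crawlable."""
    # Trim whitespace, then wrapping punctuation often pasted along with a link.
    url = raw.strip().strip("()<>\"'")
    if not url or any(ch.isspace() for ch in url):
        return None
    # Assume https when the cell holds a bare domain such as "example.com".
    if "://" not in url:
        url = "https://" + url
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Require an http(s) scheme and a host that contains a dot (e.g. a TLD).
    if parsed.scheme not in ("http", "https") or "." not in host:
        return None
    return url

if __name__ == "__main__":
    cell = " (https://www.whitehouse.gov/presidential-actions/executive-orders/) "
    print(normalize_url(cell))
```

Rows for which this returns `None` can be skipped or flagged in the sheet instead of being sent to the crawler.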
Rich (OP) · 2mo ago
I just can't seem to figure out how to crawl a website for specific info... Pretty discouraged, but still fighting the good fight with Google.
micah.stairs · 2mo ago
Hmm, sorry to hear that. Which website are you trying to crawl?