Crawl a lead website to create a knowledge base

Hey guys, I'm new to Firecrawl and figured I'd get a response quicker if I posted here.

For context: I'm creating chatbots and voice AI assistants powered by LLMs for businesses.

Question: given a fresh lead with a web page URL, I want to create a knowledge base (Markdown or embeddings) for my chatbot/voice agent LLM. What is the most straightforward way to gather 'facts' or 'Q&A' pairs from the website? Do I use the extractor? Or /crawl? Or gather a list of subpages from the website first?

P.S. Right now I gather customer knowledge manually, but I realize I might want a ready-to-advertise system that does this automatically for potential leads. Please help me understand what my options are. Thanks!
1 Reply
micah.stairs (5mo ago)
@sined99 the Extract endpoint is a good starting point for this! Here's an example on the playground which does this (fetches a list of Q&A pairs for an entire website): https://www.firecrawl.dev/extract-playground?=%7B%22state%22%3A%7B%22fields%22%3A%7B%22xcs4Vjcve40gpBqRs2PD8%22%3A%7B%22id%22%3A%22xcs4Vjcve40gpBqRs2PD8%22%2C%22name%22%3A%22faqs%22%2C%22type%22%3A%22array%22%2C%22required%22%3Afalse%2C%22parentId%22%3Anull%2C%22children%22%3A%5B%22dgkzSVr9cSJjGfltuflaB%22%5D%7D%2C%22CvijhhHHXyFA4hdvYzXtS%22%3A%7B%22id%22%3A%22CvijhhHHXyFA4hdvYzXtS%22%2C%22name%22%3A%22%22%2C%22type%22%3A%22string%22%2C%22required%22%3Afalse%2C%22parentId%22%3Anull%2C%22children%22%3A%5B%5D%7D%2C%22dgkzSVr9cSJjGfltuflaB%22%3A%7B%22id%22%3A%22dgkzSVr9cSJjGfltuflaB%22%2C%22name%22%3A%22%22%2C%22type%22%3A%22object%22%2C%22required%22%3Afalse%2C%22parentId%22%3A%22xcs4Vjcve40gpBqRs2PD8%22%2C%22children%22%3A%5B%22i5D4dZgVanvtLPzpylESY%22%2C%22WbzdfSUuGqXgxJ76EpP_i%22%5D%7D%2C%22i5D4dZgVanvtLPzpylESY%22%3A%7B%22id%22%3A%22i5D4dZgVanvtLPzpylESY%22%2C%22name%22%3A%22question%22%2C%22type%22%3A%22string%22%2C%22required%22%3Afalse%2C%22parentId%22%3A%22dgkzSVr9cSJjGfltuflaB%22%2C%22children%22%3A%5B%5D%7D%2C%22WbzdfSUuGqXgxJ76EpP_i%22%3A%7B%22id%22%3A%22WbzdfSUuGqXgxJ76EpP_i%22%2C%22name%22%3A%22answer%22%2C%22type%22%3A%22string%22%2C%22required%22%3Afalse%2C%22parentId%22%3A%22dgkzSVr9cSJjGfltuflaB%22%2C%22children%22%3A%5B%5D%7D%7D%2C%22rootFields%22%3A%5B%22xcs4Vjcve40gpBqRs2PD8%22%2C%22CvijhhHHXyFA4hdvYzXtS%22%5D%2C%22inputUrls%22%3A%5B%7B%22id%22%3A%22KLCyBQxslSKcb4WFbySGq%22%2C%22url%22%3A%22https%3A%2F%2Ffirecrawl.dev%2F*%22%7D%5D%2C%22auxiliarPrompt%22%3A%22Create+a+list+of+20+diverse+FAQs+about+this+company%27s+product.+Keep+questions+and+answers+brief+and+concise.%5Cn%5CnThe+FAQs+will+likely+not+be+listed+explicitly%2C+so+infer+what+you+think+a+user+might+ask%2C+based+on+the+documentation+you+see.%22%2C%22enableWebSearch%22%3Afalse%2C%22enableWebAgent%22%3Afalse%2C%22agentModel%
22%3A%22NONE%22%2C%22exampleName%22%3A%22%22%7D%2C%22version%22%3A0%7D&isManualExtractView=true
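
For anyone who would rather script this than use the playground, here is a minimal sketch of the same request in Python. The prompt, the `faqs` array of `question`/`answer` objects, and the `/*` wildcard on the site URL are taken from the state encoded in the playground link above. Everything else is an assumption to check against the current Firecrawl docs: the endpoint path `https://api.firecrawl.dev/v1/extract`, the body field names `urls`, `prompt` and `schema`, and the Bearer-token header. The helper names (`build_extract_payload`, `build_request`, `faqs_to_markdown`) are hypothetical and only for illustration.

```python
import json
import urllib.request

# Assumption: endpoint path and auth scheme; verify against the Firecrawl docs.
EXTRACT_URL = "https://api.firecrawl.dev/v1/extract"

# Mirrors the playground fields: an array "faqs" of {question, answer} objects.
FAQ_SCHEMA = {
    "type": "object",
    "properties": {
        "faqs": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "question": {"type": "string"},
                    "answer": {"type": "string"},
                },
            },
        }
    },
}

# Prompt text copied from the playground link.
PROMPT = (
    "Create a list of 20 diverse FAQs about this company's product. "
    "Keep questions and answers brief and concise.\n\n"
    "The FAQs will likely not be listed explicitly, so infer what you think "
    "a user might ask, based on the documentation you see."
)


def build_extract_payload(site_url: str) -> dict:
    """Build the request body: wildcard URL for the whole site, prompt, schema."""
    return {
        "urls": [site_url.rstrip("/") + "/*"],
        "prompt": PROMPT,
        "schema": FAQ_SCHEMA,
    }


def build_request(site_url: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the lead's website."""
    body = json.dumps(build_extract_payload(site_url)).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return urllib.request.Request(EXTRACT_URL, data=body, headers=headers, method="POST")


def faqs_to_markdown(faqs: list) -> str:
    """Render Q&A pairs as Markdown sections, skipping incomplete entries."""
    sections = []
    for item in faqs:
        question = (item.get("question") or "").strip()
        answer = (item.get("answer") or "").strip()
        if question and answer:
            sections.append(f"## {question}\n\n{answer}\n")
    return "\n".join(sections)
```

To actually send it, pass the result of `build_request(...)` to `urllib.request.urlopen`. The thread does not show what the response looks like, so inspect it before relying on any particular shape; once you have the list of pairs, `faqs_to_markdown` turns it into a Markdown knowledge base file.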
