Firecrawl · 5mo ago
avem

/extract endpoint metadata per URL

Is there a way to get per-URL metadata from /extract? For example, if I provide 10 URLs to the extract endpoint, instead of receiving a single final blob parsed from all of these URLs, I want to get a mapping of url => data.
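The mapping I'm after could be built by calling extract once per URL and keying each result by its source URL. A rough sketch of that shape, with a hypothetical stub standing in for the real SDK call (the actual app.extract needs a network connection and API key):

```python
import asyncio

# Hypothetical stub in place of the real Firecrawl SDK call,
# which would hit the network and return extracted data per request.
async def extract(urls, prompt):
    return {"data": {"pricingModel": "subscription"}}

async def extract_per_url(urls, prompt):
    # One request per URL, so each result can be keyed by its source URL.
    results = {}
    for url in urls:
        results[url] = await extract(urls=[url], prompt=prompt)
    return results

mapping = asyncio.run(extract_per_url(
    ["https://www.firecrawl.dev"],
    "Extract the pricing model from the website",
))
print(mapping)
```

This costs one API call per URL instead of one batched call, but it guarantees the url => data association.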
2 Replies
micah.stairs · 5mo ago
Hey @avem! From looking at the documentation, it seems you can get a bit more information about where the data actually came from by setting showSources to true. I know that's not quite what you're asking for, but it seems to be the best that's available at this time. See https://docs.firecrawl.dev/api-reference/endpoint/extract#body-show-sources for more information about this field. I tried to get you an example of the output this returns using the Python SDK, but unfortunately a call like the following throws an internal exception:
await app.extract(
urls=["https://www.firecrawl.dev"],
prompt='Extract the pricing model from the website',
show_sources=True
)
Exception:
pydantic_core._pydantic_core.ValidationError: 1 validation error for ExtractResponse
sources
Input should be a valid list [type=list_type, input_value={'pricingModel': ['https://www.firecrawl.dev/']}, input_type=dict]
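The error suggests the SDK's ExtractResponse model declares sources as a list while the API returned a dict keyed by extracted field name. A minimal sketch reproducing that mismatch with a simplified stand-in model (only the sources field and the input value come from the traceback; the rest is hypothetical):

```python
from pydantic import BaseModel, ValidationError

# Simplified stand-in for the SDK's ExtractResponse: `sources` is declared
# as a list, but the API returned a dict keyed by extracted field name.
class ExtractResponse(BaseModel):
    sources: list

try:
    ExtractResponse(sources={"pricingModel": ["https://www.firecrawl.dev/"]})
except ValidationError as e:
    # Same list_type validation error as in the traceback above.
    print(e.errors()[0]["type"])  # list_type
```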
I just filed an issue for this error: https://github.com/mendableai/firecrawl/issues/1591.
avem (OP) · 5mo ago
Thank you micah! I will check it out.