Firecrawl 14mo ago
Arbaz

Unable to crawl

I'm unable to crawl this website: https://www.coditas.com

This is my request body:

{
  "url": "https://www.coditas.com",
  "limit": 5,
  "scrapeOptions": {
    "formats": ["markdown"],
    "waitFor": 1000
  }
}

The same crawl works in the Firecrawl playground, though. I'm running Firecrawl with Docker.
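For anyone trying to reproduce this against a self-hosted instance, here is a minimal sketch of how the request body above could be packaged as an HTTP request. The base URL (`http://localhost:3002`) and the `/v1/crawl` path are assumptions about a typical local deployment, not details given in this thread, and `build_crawl_request` is a hypothetical helper name. The request is only built, not sent.

```python
import json
import urllib.request

# Assumption: a self-hosted instance reachable at localhost:3002 that
# exposes a /v1/crawl endpoint. Adjust both to match your deployment.
BASE_URL = "http://localhost:3002"


def build_crawl_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the crawl payload."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/crawl",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The request body quoted in the post above.
payload = {
    "url": "https://www.coditas.com",
    "limit": 5,
    "scrapeOptions": {"formats": ["markdown"], "waitFor": 1000},
}
request = build_crawl_request(BASE_URL, payload)

# To actually send it (requires the instance to be running):
# with urllib.request.urlopen(request, timeout=30) as resp:
#     print(json.load(resp))
```

Sending the identical payload to both the local instance and the hosted service is one way to confirm that the difference comes from the deployment rather than from the request.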
7 Replies
Arbaz (OP) 14mo ago
I have one more question: If I want to scrape the homepage using a crawler, how can I do that? I am also passing some specific URLs, so I assume that I cannot include '/' in them. Is that correct?
Benos 14mo ago
I'm having a similar issue with another website, also self-hosted on Docker:
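# `app` is presumably a FirecrawlApp client for the self-hosted instance
# (its construction is not shown in the original post).
crawl_status = app.scrape_url(
    "https://developers.hubspot.com/beta-docs/",
    params={
        "formats": ["markdown", "links"],
        "onlyMainContent": False,
        "includeTags": ["#main-content"],
        "waitFor": 4000,
    },
)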
The response is:
{'markdown': '', 'links': [], 'metadata': {'sourceURL': 'https://developers.hubspot.com/beta-docs/', 'statusCode': 200}}
The same request works in the playground, though.
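The response above is easy to mistake for a success because the status code is 200. A small check like the following sketch can flag that case in a script. `looks_empty` is a hypothetical helper, not part of any Firecrawl SDK, and it assumes the result is a plain dict shaped like the one shown above.

```python
def looks_empty(result: dict) -> bool:
    """Return True when the page came back with HTTP 200 but no markdown and no links."""
    status = result.get("metadata", {}).get("statusCode")
    has_markdown = bool((result.get("markdown") or "").strip())
    has_links = bool(result.get("links"))
    return status == 200 and not (has_markdown or has_links)


# The response reported above.
response = {
    "markdown": "",
    "links": [],
    "metadata": {
        "sourceURL": "https://developers.hubspot.com/beta-docs/",
        "statusCode": 200,
    },
}

print(looks_empty(response))  # True
```

A check like this only detects the symptom. It does not say whether the cause is the `includeTags` selector, the `waitFor` delay, or the self-hosted setup itself.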
Arbaz (OP) 14mo ago
@Moderator Any update on it?
Adobe.Flash 14mo ago
@Arbaz are you self-hosting it?
Arbaz (OP) 14mo ago
Yes, I've self-hosted it
Adobe.Flash 14mo ago
We will take a look. I just opened a GitHub issue: https://github.com/mendableai/firecrawl/issues/625
GitHub
[Self-host] Unable to crawl certain pages · Issue #625 · mendableai...
(from discord) +2 This only happens in the self hosted version: Unable to crawl this website: https://www.coditas.com/ This is my req body: { "url": "https://www.coditas.com/", ...
rafaelmiller 12mo ago
Hey @Arbaz @Benos we've made several improvements to our self-host codebase. Could you please update your local repository and give it another try? Additionally, feel free to check out the self-host guide for further assistance. Let me know how it goes!
Firecrawl Docs
Self-hosting | Firecrawl
Learn how to self-host Firecrawl to run on your own and contribute to the project.
