Unable to crawl a website using include paths (filter)
Hi,
I am trying to crawl the web pages on https://www.peigenesis.com/ using path filters,
but I am not getting the pages under the included path. I have been trying for a long time. Can someone please help me?
params2 = {
    "limit": 4,
    "maxDepth": 10,
    "includePaths": ["/part-information/*"],
    # "excludePaths": [],
    # "ignoreSitemap": True,
    "allowBackwardLinks": True,
    "allowExternalLinks": False,
    # "webhook": "<string>",
    "scrapeOptions": {
        "formats": ["markdown", "html"],
        # "headers": {},
        # "includeTags": ["#parts a", "h1 span", "h2 span", "#tools a", "#tools .fieldvalue", "#mates a", "#mates .fieldvalue"],
        "excludeTags": ["img"],
        "onlyMainContent": True,
        "waitFor": 2000,
    },
}
These are the params I am using when sending the API request.
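A minimal sketch for sanity-checking the include pattern locally. It assumes the crawler treats each includePaths entry as a regular expression matched against the URL path, which I have not confirmed. The helper names and the example URLs are hypothetical. Under that assumption, "/*" means "zero or more slashes" rather than "anything", so the result depends on whether the pattern has to match the whole path or only its start.

```python
import re
from urllib.parse import urlparse


def matches_prefix(url, patterns):
    """True if any pattern matches at the start of the URL path (re.match)."""
    path = urlparse(url).path
    return any(re.match(p, path) is not None for p in patterns)


def matches_fully(url, patterns):
    """True if any pattern matches the entire URL path (re.fullmatch)."""
    path = urlparse(url).path
    return any(re.fullmatch(p, path) is not None for p in patterns)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    urls = [
        "https://www.peigenesis.com/part-information/example-part",
        "https://www.peigenesis.com/about-us",
    ]
    # The pattern from the params above, and a regex-style alternative.
    for pattern in ["/part-information/*", "/part-information/.*"]:
        for url in urls:
            print(pattern, url, matches_prefix(url, [pattern]), matches_fully(url, [pattern]))
```

If the pattern only matches in the full-match case when written as "/part-information/.*", that form may be worth trying in includePaths.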
Thank you.