best approach to scrape * sites with * strings?

Hello fellow Apifytes! I have 10 sites (for example). I want to scrape each site for any mention of 10 specific strings. Which 'ACTOR' would you recommend for this task?
4 Replies
Pepa J•2y ago
Hello @dgd, based on your description, it seems that https://apify.com/lukaskrivka/keywords-extractor could be what you need.
Apify
Keywords Extractor · Apify
Use our free website keyword extractor to crawl any website and extract keyword counts on each page.
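To make "keyword counts on each page" concrete, here is a minimal local sketch of the idea in Python. It is not the Actor's code: the function name `count_keywords` is hypothetical, and plain substring counting is an assumption about how matching works.

```python
def count_keywords(text: str, keywords: list[str], case_sensitive: bool = False) -> dict[str, int]:
    """Count non-overlapping substring occurrences of each keyword in a page's text."""
    # Assumption: case-insensitive matching is done by lowercasing both sides.
    haystack = text if case_sensitive else text.lower()
    counts = {}
    for keyword in keywords:
        needle = keyword if case_sensitive else keyword.lower()
        counts[keyword] = haystack.count(needle)
    return counts


# Made-up page text, for illustration only.
page_text = "Oberoi Hotels. Book the oberoi group resorts."
print(count_keywords(page_text, ["Oberoi", "Hilton"]))
# prints {'Oberoi': 2, 'Hilton': 0}
```

The Actor does the crawling part for you and reports counts per page, so a helper like this is only useful for checking a result by hand.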
national-goldOP•2y ago
Pepa J•2y ago
As far as I can see, you have maxDepth set to 0. At one point in the video you set it to 100, but then there is a cut and the results shown are for maxDepth 0 again. Set it to the default of 5, and please actually type the value 5: once you delete the value, I believe the field shows a placeholder of 5, but 0 is what actually gets used. I tried this input:
{
  "caseSensitive": false,
  "keywords": [
    "Oberoi"
  ],
  "linkSelector": "a[href]",
  "maxConcurrency": 50,
  "maxDepth": 5,
  "maxPagesPerCrawl": 100,
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "retireInstanceAfterRequestCount": 50,
  "scanScripts": false,
  "startUrls": [
    {
      "url": "https://www.sodis.ru"
    }
  ],
  "useBrowser": false,
  "useChrome": false,
  "pseudoUrls": []
}
and I got about 3 occurrences of that keyword.
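For the original case of several sites and several strings, the same input can be generated rather than typed by hand. This is only a sketch: the helper `build_input` is hypothetical, the field names are copied from the input above, and the second URL and keyword are placeholders. Whether `maxPagesPerCrawl` applies per start URL or to the whole run is not confirmed here, so check that before relying on it.

```python
import json


def build_input(sites, keywords, max_depth=5, max_pages=100):
    """Build a Keywords Extractor style input for many sites and keywords.

    Field names are taken from the input shown in this thread; this is not
    a complete or authoritative schema for the Actor.
    """
    return {
        "caseSensitive": False,
        "keywords": list(keywords),
        "linkSelector": "a[href]",
        # Written out explicitly, because an empty field reportedly behaves as 0.
        "maxDepth": max_depth,
        "maxPagesPerCrawl": max_pages,
        "proxyConfiguration": {"useApifyProxy": True},
        "startUrls": [{"url": url} for url in sites],
        "useBrowser": False,
    }


# Placeholder values; replace with your own sites and strings.
sites = ["https://www.sodis.ru", "https://example.com"]
keywords = ["Oberoi", "Example"]

actor_input = build_input(sites, keywords)
print(json.dumps(actor_input, indent=2))
```

The printed JSON can be pasted into the Actor's JSON input view. Always setting `maxDepth` in code avoids the placeholder problem described above.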
national-goldOP•2y ago
Thank you!
