multiple-amethyst

Generalizing a Simple Cheerio Scraper for Different Pages

I have a working Cheerio Task that uses a glob to find links and CSS classes to find the content to be extracted on the child page. I have almost 40 more similar pages, but each will have a different glob for links and a different CSS class for the target information. What is the best way to generalize my working scraper for the other 40, and maybe soon 100, pages? My thought is that I need to collect the same info (glob and classes) from all the pages... how terribly boring. Is there any AI for that, either on the platform or elsewhere? Assuming I get that done manually, I am guessing there is something smarter than creating 100 clones of the Task. I have done some JS automation on Uilicious, calculating the inputs in a Google Sheet to populate an array. Same thing here? For each URL: the glob, the excerpt CSS class from the main page, and the target class on the detail page? I am pretty noobish on the platform and JS, so I really appreciate simple answers. TIA.
2 Replies
multiple-amethyst (OP), 2y ago
Not interesting to anyone?
Pepa J, 2y ago
Hello Key Lyle, Generally, scraping 100 different websites with a single Actor/Task is not a great idea. Having a single solution that scrapes everything sounds great, but only in theory. I have had no good experiences with that, since websites change very often and maintaining it is a long-term effort. Unfortunately, there is no clever AI solution that would do this for you.
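
For reference, the per-site input table described in the question (glob, excerpt class, target class for each URL) could be sketched roughly as below. This is only an illustration under assumptions: every URL, glob, selector, and function name here is a made-up placeholder, and the glob matcher is a deliberately simplified stand-in, not the platform's own glob handling. It also does not remove the maintenance burden mentioned above, since each row still has to be updated when a site changes.

```javascript
// Hypothetical per-site configuration: one row per site, e.g. exported from a spreadsheet.
// All URLs, globs, and selectors below are placeholders, not real targets.
const siteConfigs = [
  {
    startUrl: 'https://example.com/news',
    linkGlob: 'https://example.com/news/*',
    excerptSelector: '.teaser',
    targetSelector: '.article-body',
  },
  {
    startUrl: 'https://example.org/blog',
    linkGlob: 'https://example.org/blog/**',
    excerptSelector: '.summary',
    targetSelector: '.post-content',
  },
];

// Simplified glob matcher for illustration only:
// `**` matches any characters, `*` matches any characters except `/`.
function globToRegExp(glob) {
  // Escape regex metacharacters, leaving `*` untouched.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const pattern = escaped
    .replace(/\*\*/g, '\u0000') // temporary marker so `**` is not treated as two `*`
    .replace(/\*/g, '[^/]*')
    .replace(/\u0000/g, '.*');
  return new RegExp(`^${pattern}$`);
}

// Return the config row whose glob matches a detail-page URL, or undefined if none does.
function configForUrl(url) {
  return siteConfigs.find((cfg) => globToRegExp(cfg.linkGlob).test(url));
}
```

Inside a single page handler, the matching row would then supply the selector to use, for example `$(cfg.targetSelector).text()` with Cheerio, so one Task could read its behavior from the table instead of being cloned per site.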
