Error: OnlyMainContent parameter not working in scraping
Hey, I'm trying to scrape these URLs to get the main content from the pages:
https://www.northeastern.edu/research
https://www.northeastern.edu/graduate
But I get back messy, polluted data full of HTML and UI/navigation residue that the OnlyMainContent parameter should prevent. What do I need to do to scrape only the main content from these URLs? PS: I'm building a large dataset, so using the extract function would be too time-consuming.
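For reference, this is roughly the kind of cleanup I'd otherwise have to do myself on the raw HTML. It's only a minimal sketch in plain Python (standard library only), not part of the scraping tool: `extract_main_text`, `MainTextExtractor`, and `SKIP_TAGS` are hypothetical names, and it assumes the pages wrap their content in a `<main>` element, which I haven't verified for these URLs.

```python
from html.parser import HTMLParser

# Tags treated as navigation/UI residue (an assumption; adjust per site).
SKIP_TAGS = {"nav", "header", "footer", "aside", "script", "style", "noscript", "form"}


class MainTextExtractor(HTMLParser):
    """Collects text found inside <main>, skipping anything nested in SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.main_depth = 0  # > 0 while inside a <main> element
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.main_depth += 1
        elif tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "main" and self.main_depth:
            self.main_depth -= 1
        elif tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep text only when inside <main> and outside every skipped element.
        if text and self.main_depth and not self.skip_depth:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Return the visible text of the <main> element, one chunk per line."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    sample = (
        "<html><body><nav>Home</nav>"
        "<main><h1>Research</h1><nav>Menu</nav><p>Labs &amp; centers</p></main>"
        "<footer>Contact</footer></body></html>"
    )
    print(extract_main_text(sample))
```

Doing this per page for a large dataset is exactly what I'm hoping the parameter saves me from, and it would break on pages without a `<main>` element or with unclosed tags.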
Meghan Gocke
Northeastern University Graduate Programs
Explore 200+ graduate programs, including certificates, master's degrees, and professional doctorates across 13 campuses and online.