Cheerio Crawler inner text

I have a simple Cheerio crawler that retrieves all the contents of a specific div tag in the fetched page. If I use the .text() method, it strips out all the HTML tags and concatenates the content of the different child tags without any spacing or delimiters. If I build a similar crawler with Puppeteer and read innerText on the same tag, it puts spacing and newlines between the content of the different child tags. Is there any way in the Cheerio crawler to pull the content out the way Puppeteer does?
1 Reply
dependent-tan, 3y ago
Sadly, that's a disadvantage of Cheerio. You could, for example, first replace certain HTML tags with a newline (or some other delimiter) and then call .text(). Or you could use more specific selectors, extract the text of each element one by one, and then join the pieces into one string. But I am not aware of any more straightforward solution.
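A minimal sketch of the first suggestion (replace tags with newlines, then call .text()) might look like the following. The helper names `markLineBreaks`, `tidyLines`, and `textWithLineBreaks`, as well as the list of block-level tags, are assumptions made for illustration and are not part of Cheerio. The tag replacement is regex-based, so it should be fine for simple markup but is not a full HTML parser, and the result only approximates innerText, which depends on the browser's layout and CSS.

```javascript
// Assumed list of block-level tags; adjust it to the markup being scraped.
const BLOCK_TAGS = ['p', 'div', 'li', 'tr', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6'];
const CLOSING_BLOCK = new RegExp(`</(?:${BLOCK_TAGS.join('|')})\\s*>`, 'gi');

// Hypothetical helper: collapse source whitespace, then put a newline in place
// of each <br> and after each closing block tag, so the text of neighbouring
// children no longer runs together.
function markLineBreaks(html) {
  return html
    .replace(/\s+/g, ' ')
    .replace(/<br\s*\/?>/gi, '\n')
    .replace(CLOSING_BLOCK, (tag) => `${tag}\n`);
}

// Hypothetical helper: normalise whitespace per line and drop empty lines.
function tidyLines(text) {
  return text
    .split('\n')
    .map((line) => line.replace(/\s+/g, ' ').trim())
    .filter((line) => line.length > 0)
    .join('\n');
}

// `$` is the Cheerio instance the crawler passes to the request handler.
function textWithLineBreaks($, selector) {
  const innerHtml = $(selector).first().html() || '';
  // Re-parse the marked-up HTML in a detached element, then read its text.
  const text = $('<div></div>').html(markLineBreaks(innerHtml)).text();
  return tidyLines(text);
}
```

Inside the crawler's request handler this would be called as something like `textWithLineBreaks($, 'div.content')`, where `div.content` is a placeholder selector. The second suggestion (select each child and join) can be done with Cheerio directly, for example by mapping over `$('div.content').children()`, reading `.text()` on each, and joining the results with `'\n'`.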
