Excluding hidden elements from HTML or Markdown
Hey all... I'm crawling a page that has display: none elements, and they are being included in both the HTML and the Markdown. Is there any way to exclude these?
Thanks!
Hey! You'll want to use excludeTags to filter them out using CSS selectors.
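For anyone landing here later, a rough sketch of what passing selectors through excludeTags could look like. Only the excludeTags name comes from this thread; the `url` and `formats` fields, the helper name, and the example selectors are assumptions for illustration, so check the request shape against the API reference before relying on it.

```python
import json


def build_scrape_payload(url, exclude_selectors):
    """Assemble a request body asking the scraper to drop matching elements.

    Hypothetical helper: only the "excludeTags" key is taken from the thread.
    """
    return {
        "url": url,                          # assumed field name
        "formats": ["markdown", "html"],     # assumed field name and values
        "excludeTags": list(exclude_selectors),
    }


# The selectors below are made-up examples of classes/ids known in advance.
payload = build_scrape_payload(
    "https://example.com",
    [".visually-hidden", "#cookie-banner"],
)
print(json.dumps(payload, indent=2))
```

This only helps when the selectors are known up front, which is the limitation raised in the next reply.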
Hey, thanks for getting back to me. This is fine if I know the classes to exclude beforehand, but this is not necessarily the case. There's no other way to do it? Seems a bit odd that a Markdown version of a web page can include stuff that will never appear on that web page...
Makes sense! I logged this as a feature request.
If that display:none is inlined on the particular site you are scraping, you could exclude the [style*="display:none"] CSS selector, I think.
Is that feature request somewhere publicly visible, like GitHub issues? Just wondering if I can track progress...
Good question! No, it's tracked internally.
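Until something built-in exists, one possible workaround is to post-process the returned HTML yourself. Below is a minimal sketch using only the Python standard library; the class and function names are made up for this example. It drops any element whose inline style contains display: none (tolerating whitespace and case, which a literal [style*="display:none"] substring match would not). It cannot see elements hidden through a stylesheet or script, since that needs a rendered page, it assumes reasonably well-formed HTML inside the hidden subtree, and it discards comments and the doctype.

```python
import re
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Matches "display:none", "display: none", "DISPLAY : NONE", and so on.
HIDDEN_STYLE = re.compile(r"display\s*:\s*none", re.IGNORECASE)


class HiddenStripper(HTMLParser):
    """Re-emits HTML while skipping subtrees hidden via an inline style."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a hidden subtree

    def _is_hidden(self, attrs):
        style = dict(attrs).get("style") or ""
        return bool(HIDDEN_STYLE.search(style))

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if self._is_hidden(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a subtree.
        if not self.skip_depth and not self._is_hidden(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_inline_hidden(html):
    """Return html with inline display:none elements removed."""
    parser = HiddenStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = ('<div><p>Shown</p><span style="display: none">Hidden <b>x</b></span>'
          '<p style="DISPLAY:NONE">gone</p><br>end</div>')
cleaned = strip_inline_hidden(sample)
print(cleaned)
```

You would then convert the cleaned HTML to Markdown yourself instead of using the Markdown returned by the crawl.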