Regex to match text between nested tags
Question:
Write a program that extracts all the text without any tags and attribute values from an HTML document.
Sample text:
<html> <head><title>News</title></head> <body><p><a href="http://softuni.org">Software University</a>aims to provide free real-world practical training for young people who want to turn into skillful software engineers.</p></body> </html>
Sample result:
News Software University aims to provide free real-world practical training for young people who want to turn into skillful software engineers.
I solved it without using regex. Here's the code: https://paste.mod.gg/orqiwpelwkzq/0
But I was wondering what the regex pattern would look like for matching text within nested tags. I asked a similar question before, but it didn't involve nested tags. Link: https://discord.com/channels/143867839282020352/1358651997745709267
5 Replies
OK, so it might be possible with balancing groups, but it's definitely not an ideal choice for real projects.
Regular Expression Language - Quick Reference - .NET
In this quick reference, learn to use regular expression patterns to match input text. A pattern has one or more character literals, operators, or constructs.
Thanks, will close post after 24 hrs
It makes so much sense.
Funnily enough, I also look for text between > and < when traversing the string character by character. But I tried to use full HTML tags in the regex 😅
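For anyone reading later, here is a minimal sketch of that "text between > and <" idea as a regex. It is written in Python purely for illustration (the thread is about .NET, where the same pattern should work with Regex.Matches, but that is an assumption, not something tested here). The names TEXT_BETWEEN_TAGS and extract_text are made up for this example. It ignores nesting entirely, which is why it copes with the sample, and it will break on things like a literal > inside an attribute value, comments, or script blocks.

```python
import re

# Sample document from the question.
html = (
    '<html> <head><title>News</title></head> <body><p>'
    '<a href="http://softuni.org">Software University</a>'
    'aims to provide free real-world practical training for young people '
    'who want to turn into skillful software engineers.</p></body> </html>'
)

# Match every run of non-bracket characters that sits directly between
# a '>' and the next '<'. The lookarounds leave the brackets unconsumed,
# so back-to-back text runs are all found.
TEXT_BETWEEN_TAGS = re.compile(r'(?<=>)[^<>]+(?=<)')


def extract_text(document: str) -> str:
    """Join the non-empty text runs found between tags with single spaces."""
    pieces = (run.strip() for run in TEXT_BETWEEN_TAGS.findall(document))
    return ' '.join(piece for piece in pieces if piece)


print(extract_text(html))
```

Because the pattern never looks at tag names, nesting depth does not matter: only the nearest > and < around each text run are considered.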
Thank you so much for the explanation.
Please don't. Write a parser; you can't parse HTML with regex.
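To show what the parser route can look like, here is a rough sketch using Python's standard-library html.parser module. Python is used only for illustration; in .NET you would reach for an HTML parsing library instead. The class name TextExtractor and the function extract_text_with_parser are hypothetical names for this example. Note that handle_data also fires for the contents of script and style elements, so a real extractor would probably want to skip those.

```python
from html.parser import HTMLParser

# Sample document from the question.
html = (
    '<html> <head><title>News</title></head> <body><p>'
    '<a href="http://softuni.org">Software University</a>'
    'aims to provide free real-world practical training for young people '
    'who want to turn into skillful software engineers.</p></body> </html>'
)


class TextExtractor(HTMLParser):
    """Collects text nodes; tags and attribute values are never stored."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text found between tags, at any nesting depth.
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text_with_parser(document: str) -> str:
    """Feed the document to the parser and join the collected text."""
    parser = TextExtractor()
    parser.feed(document)
    parser.close()
    return ' '.join(parser.chunks)


print(extract_text_with_parser(html))
```

The parser tracks where tags start and end for you, so quoted attribute values and nested elements do not need any special handling in your own code.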
$htmlregex
Stack Overflow
RegEx match open tags except XHTML self-contained tags
I need to match all of these opening tags:
<p>
<a href="foo">
But not self-closing tags:
<br />
<hr class="foo" />
I came up with this and wanted to make