C# · 17mo ago
Slim

✅ Beginner needs help with C# HtmlAgilityPack and LINQ query

Hi! I'm a beginner trying to scrape some information about soccer players from a website (transfermarkt.de) for personal use.
I already get the information I want into a DataTable/DataGridView with the following code:


using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;

// Download the page as UTF-8 text.
WebClient webClient = new WebClient();
webClient.Encoding = Encoding.UTF8;
string page = webClient.DownloadString("https://www.transfermarkt.de/statistik/neuestetransfers");

// Parse the downloaded HTML.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(page);

// One inner list per table row, holding the trimmed text of each cell.
List<List<string>> table = doc.DocumentNode.SelectSingleNode("//table[@class='items']")
            .Descendants("tr")
            .Where(tr => tr.Elements("td").Count() > 1)
            .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
            .ToList();


This fills my DataGridView and works fine so far.
The only thing I still want (and have failed to get so far) is a link for each soccer player, which is not part of "td.InnerText" but of "td.InnerHtml".

In the screenshot you can see the original website on the left, my scraped DataGridView in the middle, and on the right the last piece of information that I also want to scrape into the DataGridView (for each player / each data row).

How can I extend the LINQ table query to also extract just that one value (the line marked in my browser's inspector in the screenshot), or do I need to create an additional query?
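
To make the question more concrete, here is a rough sketch of the kind of extension I have in mind. It is untested against the real site: the sample HTML is made up, and I am only assuming that the player link is the first <a> with an href inside each row. It uses Descendants and GetAttributeValue from HtmlAgilityPack to append the href as an extra last column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical sample markup; the real table on the site may be structured differently.
string page = @"
<table class='items'>
  <tr><th>Player</th><th>Age</th></tr>
  <tr><td><a href='/example-player/profil/spieler/1'>Example Player</a></td><td>24</td></tr>
  <tr><td>No link here</td><td>30</td></tr>
</table>";

var doc = new HtmlDocument();
doc.LoadHtml(page);

List<List<string>> table = doc.DocumentNode.SelectSingleNode("//table[@class='items']")
    .Descendants("tr")
    .Where(tr => tr.Elements("td").Count() > 1)
    .Select(tr =>
    {
        // Same trimmed cell texts as in the original query.
        List<string> cells = tr.Elements("td").Select(td => td.InnerText.Trim()).ToList();

        // Assumption: the first <a> in the row that has an href is the player link.
        string link = tr.Descendants("a")
            .Select(a => a.GetAttributeValue("href", ""))
            .FirstOrDefault(href => href != "") ?? "";

        // Append the link (or an empty string if none was found) as an extra column.
        cells.Add(link);
        return cells;
    })
    .ToList();

foreach (List<string> row in table)
    Console.WriteLine(string.Join(" | ", row));
```

If the href turns out to be relative, I guess the site's base URL would have to be prepended before the link is usable.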



Thanks!! 🙂
sn.png