C# • 2y ago
kurumi

Reading a large XML file from an archive with XmlReader in parallel

Hello 👋. I am looking for a way to read data from an XML file inside an archive in parallel.

I have an archive, someFiles.zip, that contains the data I need in a file named largeXmlFile.xml. This file is 40 GB. It looks roughly like this (but with thousands of objects):
<root>
  <OBJECT data1="123" data2="456" />
  <OBJECT data1="321" data2="654" />
</root>


Right now I open this file from the archive and get a Stream:

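using var zipFile = ZipFile.OpenRead(@"someFiles.zip");
// Fail early with a clear message if the entry is missing, instead of a NullReferenceException on Open().
var myFile = zipFile.Entries.FirstOrDefault(file => file.Name is "largeXmlFile.xml")
    ?? throw new FileNotFoundException("largeXmlFile.xml was not found in the archive.");
// Dispose the entry stream once reading is finished.
using var myFileStream = myFile.Open();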


Then I pass this Stream to an XmlReader:
using var xmlReader = XmlReader.Create(myFileStream, new XmlReaderSettings { Async = true });


And then I simply read it:
var objects = new List<MyObject>();
while (await xmlReader.ReadAsync())
{
    if (xmlReader is { NodeType: XmlNodeType.Element, Name: "OBJECT" })
    {
        objects.Add(ReadMyObject(xmlReader));
    }
}
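
For context, ReadMyObject only needs to pull the two attributes off the element the reader is currently positioned on. A minimal sketch of what such a helper could look like, assuming a hypothetical MyObject record with Data1 and Data2 properties and a hypothetical ObjectReader class (the real type and helper body are not shown in this post):

```csharp
using System.Xml;

// Hypothetical shape of MyObject; the real type is not shown in this post.
public sealed record MyObject(string Data1, string Data2);

public static class ObjectReader
{
    // Reads the data1/data2 attributes of the <OBJECT> element the reader
    // is currently positioned on. Missing attributes become empty strings.
    public static MyObject ReadMyObject(XmlReader reader) =>
        new MyObject(
            reader.GetAttribute("data1") ?? string.Empty,
            reader.GetAttribute("data2") ?? string.Empty);
}
```

Since it only calls GetAttribute, the helper does not advance the reader, so the loop above still controls all movement through the file.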


Reading this file takes ages, so my question is:
How can I change my code so that it reads this XML in parallel?