C# · Ezlanding

Best way to implement a regex-based lexer [Answered]

In a regex lexer, you can loop over every pattern and do something like this:
if (match.Success && (match.Index - currentIndex) == 0)
//Token Found
The problem with this is that, unlike other regex implementations, when C# regex sees that the first char does not match the pattern, it doesn't return null/false. Instead it keeps going until a match is found, meaning the if statement is needed and extra match calls are made. Is there any way to get this to work without the extra calls? I could collect every pattern's match into a list ordered by the match's index position, but that seems needlessly complicated if there is a better way.
Ezlanding · 558d ago
An example of how I'd want regex matching to work is in JS. See here, where it returns null if the first char doesn't match: https://gist.github.com/pepasflo/4afa5813606b6ee73526a0d21d0d1035#file-lexer-js. So for example, the pattern " " (a single space character) would return null for the string "hello world" in JS, while in C# it would match the space between o and w (index 5).
@nekodjin (sorry for the ping) can you help me with this? You seem to be knowledgeable about C# regex.
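To make the behaviour described above concrete, here is a minimal C# sketch. The class and method names (`UnanchoredDemo`, `FirstMatchIndex`, `MatchesAt`) are made up for illustration; only `Regex.Match`, `Match.Success` and `Match.Index` come from .NET.

```cs
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original question.
public static class UnanchoredDemo
{
    // Returns the index where an unanchored pattern first matches, or -1.
    // The engine scans forward from the start of the input until it finds a match.
    public static int FirstMatchIndex(string pattern, string input)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Index : -1;
    }

    // The check from the question: start searching at currentIndex and accept
    // the match only if it also begins exactly at currentIndex.
    public static bool MatchesAt(string pattern, string input, int currentIndex)
    {
        Match match = new Regex(pattern).Match(input, currentIndex);
        return match.Success && match.Index == currentIndex;
    }
}
```

With these helpers, `FirstMatchIndex(" ", "hello world")` returns 5, and `MatchesAt(" ", "hello world", 0)` returns false only after the engine has already scanned ahead and found the space, which is the extra work the question is about.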
nekodjin · 558d ago
Put ^ at the beginning of your regex pattern to match the beginning of the string. Then it will only match if the pattern occurs at the very beginning. However, this will require you to take substrings of the input string.
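A rough sketch of that suggestion, assuming each pattern already starts with `^`. `SubstringLexer` and `MatchLengthAt` are hypothetical names chosen for this example; the `Substring` call is where the extra allocation comes from.

```cs
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the "^ plus substring" approach.
public static class SubstringLexer
{
    // Tries a ^-anchored pattern against the input starting at `position`.
    // Returns the length of the matched text, or -1 if the pattern does not
    // match at that position.
    public static int MatchLengthAt(Regex anchoredPattern, string input, int position)
    {
        // Substring allocates a new string on every call.
        string rest = input.Substring(position);
        Match match = anchoredPattern.Match(rest);
        return match.Success ? match.Length : -1;
    }
}
```

For example, `MatchLengthAt(new Regex("^ "), "hello world", 0)` returns -1 instead of finding the space at index 5, while the same call with position 5 returns 1.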
Ezlanding · 558d ago
If that's the best solution I'll do it. It's not my favorite though :|
nekodjin · 558d ago
Yeah, it's not ideal. However, that is what the JS solution you posted does.
I looked in the API and unfortunately it doesn't seem as if there is such an option.
That is unfortunate, because taking substrings can be expensive.
Oh well..
Ezlanding · 558d ago
¯\_(ツ)_/¯
nekodjin · 558d ago
It'd be nice if it had that option, but alas.
The closest thing is that you can specify an index to stop looking, but that's ever-so-slightly different and doesn't work with arbitrarily long tokens like identifiers.
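One alternative worth checking is .NET's `\G` anchor combined with the `Regex.Match(string, int)` overload: the documentation for that overload states that `\G` is satisfied at `startat`, so a pattern beginning with `\G` can only match exactly at the position you pass in, without taking a substring. The sketch below works under that assumption; the `StickyLexer` class, the `Tokenize` method and the three token rules are hypothetical and only there to show the loop.

```cs
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical lexer sketch; the rule set is an assumption for illustration.
public static class StickyLexer
{
    // Each pattern starts with \G so it can only match exactly at the
    // position passed to Match(input, startat).
    private static readonly (string Kind, Regex Pattern)[] Rules =
    {
        ("Whitespace", new Regex(@"\G\s+")),
        ("Identifier", new Regex(@"\G[A-Za-z_][A-Za-z0-9_]*")),
        ("Number",     new Regex(@"\G[0-9]+")),
    };

    public static List<(string Kind, string Text)> Tokenize(string input)
    {
        var tokens = new List<(string Kind, string Text)>();
        int position = 0;
        while (position < input.Length)
        {
            bool matched = false;
            foreach (var (kind, pattern) in Rules)
            {
                // No substring: the whole input is passed with a start index.
                Match match = pattern.Match(input, position);
                if (match.Success && match.Length > 0)
                {
                    tokens.Add((kind, match.Value));
                    position += match.Length;
                    matched = true;
                    break;
                }
            }
            if (!matched)
            {
                throw new FormatException(
                    $"Unexpected character '{input[position]}' at index {position}.");
            }
        }
        return tokens;
    }
}
```

Calling `StickyLexer.Tokenize("hello world 42")` yields five tokens in order: Identifier, Whitespace, Identifier, Whitespace, Number. The first rule that matches wins, so rule order matters if patterns overlap.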
Anton · 558d ago
I've heard ANTLR is good for this. You could try that.
nekodjin · 558d ago
Yeah, if you care about really optimizing stuff there are other options, but for a hobby project you generally don't, and for things more complicated than hobby projects, lexers and parsers are practically never the bottleneck in langdev.
Ezlanding · 558d ago
ok thanks
Accord · 558d ago
✅ This post has been marked as answered!