C# · 3mo ago
divadop

Difference algorithm that Git uses

Hello! Quick rundown: I need to detect changes in custom bytecode sequences. Essentially, I need to compare two bytecode files (just consider them string sequences, lines of text even) and find:

1) Unchanged lines.
2) Deleted lines.
3) Inserted lines.
4) Moved lines.

Modified lines can be considered deleted and inserted. The key is that I NEED to detect moved lines.

DiffPlex and Python's difflib both satisfy requirements 1, 2 and 3, but neither seems able to detect MOVED lines. Git's diff does seem able to: I tested it with --color-moved and it correctly identified the moved bytecode blocks, whereas DiffPlex and difflib report them as deleted and inserted.

I believe Git uses the Myers diff algorithm? I don't think the Myers algorithm recognizes moved lines though, so that must be extra logic on Git's side.

I could come up with a custom algorithm for detecting the moved lines, like taking the deleted lines from A and finding matching sequences in B with LCS or something like that, but surely someone has already done this? This doesn't seem like an uncommon problem, so I would be surprised if there isn't an open-source, polished and tested solution already, but I haven't been able to find one.

C# is preferable, but it doesn't have to be C#; any language or tool will do. I would even consider parsing Git's diff output if I had no other options and if it were better structured. Any input is appreciated!
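For reference, a minimal sketch of the custom approach described above, in Python with difflib since it is mentioned in the post. It is not what Git does internally and not a polished solution. The function name `diff_with_moves` and the `min_block` parameter are made up for illustration. The first pass is an ordinary difflib diff; the second pass greedily pairs each deleted run in A with the longest identical run of inserted lines in B and reports the pair as a move.

```python
from difflib import SequenceMatcher


def diff_with_moves(a, b, min_block=1):
    """Classify lines as unchanged, deleted, inserted, or moved.

    Hypothetical helper, not part of difflib. Returns a dict with:
      unchanged: list of (index_in_a, index_in_b)
      moved:     list of (start_in_a, start_in_b, length)
      deleted:   list of indices into a
      inserted:  list of indices into b
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    unchanged, deleted, inserted = [], set(), set()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            unchanged.extend(zip(range(i1, i2), range(j1, j2)))
        else:
            # "replace" counts as a deletion plus an insertion.
            deleted.update(range(i1, i2))
            inserted.update(range(j1, j2))

    # Index the inserted lines of b by content for quick lookup.
    positions = {}
    for j in sorted(inserted):
        positions.setdefault(b[j], []).append(j)

    moved = []
    i = 0
    while i < len(a):
        best_j, best_len = -1, 0
        if i in deleted:
            for j in positions.get(a[i], []):
                k = 0
                # Extend the candidate block while both sides are still
                # unclaimed and the lines keep matching.
                while (i + k in deleted and j + k in inserted
                       and a[i + k] == b[j + k]):
                    k += 1
                if k > best_len:
                    best_j, best_len = j, k
        if best_len > 0 and best_len >= min_block:
            moved.append((i, best_j, best_len))
            deleted.difference_update(range(i, i + best_len))
            inserted.difference_update(range(best_j, best_j + best_len))
            i += best_len
        else:
            i += 1

    return {
        "unchanged": unchanged,
        "moved": moved,
        "deleted": sorted(deleted),
        "inserted": sorted(inserted),
    }


old = ["push 1", "push 2", "add", "store x", "load x", "ret"]
new = ["load x", "push 1", "push 2", "mul", "store x", "ret"]
result = diff_with_moves(old, new)
print(result["moved"])     # [(4, 0, 1)]  "load x" moved from A[4] to B[0]
print(result["deleted"])   # [2]          "add"
print(result["inserted"])  # [3]          "mul"
```

The matching is greedy from the top of A, so it is not guaranteed to find the globally best set of moves, and raising `min_block` would suppress spurious one-line "moves" of common instructions.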
3 Replies
Petris · 3mo ago
Doesn't Git use an external tool for diffing? AFAIR it doesn't have anything built in: https://git-scm.com/docs/git-difftool. I think it just defaults to GNU diff.
Joreyk ( IXLLEGACYIXL )
You can use fc.exe with the /b parameter to make a binary comparison (run fc.exe /? for help); it is preinstalled on Windows. You can also use WinMerge to compare with a GUI: https://manual.winmerge.org/en/Compare_bin.html
divadop · 2mo ago
Didn't get a notif for this - thanks for the responses!