C
C#2w ago
Theos

✅ Optimizing number matching algorithm for a logical game

Hey, I'm trying to optimize one function from my solver thats is the most performance critical area. The idea is that I have a grid x by y of numbers 1-9 (0 is an empty cell). Numbers must be next to each other horizontally, vertically, or diagonally. There can't be any number between them. Distance between the cells has no limit. Find all valid pairs. I have 2 approaches 1.https://pastecode.io/s/11vo1bcs - this approach first pre-processes the grid to find and store the locations of every number. After this phase, it iterates thru each number's stored positions and checks every possible pair to see if they can be connected. A separate function verifies if the path between two positions is valid. 2. https://pastecode.io/s/gmp701ig checks every cell in the grid. For each cell containing a number, it does a linear search in specific directions (right, down, and two diagonals) to find a matching number. The search stops if a match is found or if the path is blocked by a non-zero number I compared these 2 approaches on 3 different grids (10 rows, 9 cells)
_sparseGrid = GenerateGrid(10, 9, 0.2, 42); // 20% filled
_denseGrid = GenerateGrid(10, 9, 0.8, 42); // 80% filled
_mediumGrid = GenerateGrid(10, 9, 0.5, 42); // 50% filled
_sparseGrid = GenerateGrid(10, 9, 0.2, 42); // 20% filled
_denseGrid = GenerateGrid(10, 9, 0.8, 42); // 80% filled
_mediumGrid = GenerateGrid(10, 9, 0.5, 42); // 50% filled
I've attached image showing the results. Original_ stands for the first approach and Optimized_ for the second one. Looks like my optimized approach isn't really faster which kinda makes no sense for me cuz it should make less checks, but anyway. Which approach should I choose? Are there better ways to do that? What low level optimizations can I do? I'm trying to push this as far as I can. Thanks in advance!
No description
45 Replies
canton7
canton72w ago
I'm trying to work out how your code actually works. What does Solve return? (The fact that you've used cryptically short variable names everywhere, and there are no comments, really doesn't help)
ティナ
ティナ2w ago
does your solve function return duplicate pair?
Theos
TheosOP2w ago
it returns row and column of numbers that can be connected and what number it is let me rename variables to more human readable sorry for that, that's how I code when I don't intend to share my code with anyone ;v
canton7
canton72w ago
"Connected" means that two cells in the same row/column contain the same number, and all the cells between them are empty?
Theos
TheosOP2w ago
in the same row/column or diagonally yes
ティナ
ティナ2w ago
btw a way guarantee O(# of row x # of col) is to use two pointers for scanning but you would end up with three nested loops
Theos
TheosOP2w ago
I've updated my code, check new links in the original post but I'm using two pointers for scanning
canton7
canton72w ago
Right, I see Yeah I don't see how the "optimised" version really removes any of the cost. You've just moved the cost around a bit Is there a limit on the max number of rows/cols?
Theos
TheosOP2w ago
well, the game that I'm trying to "solve" has 9 cols and infinite rows. I guess I can write some specific optimizations for that but I also want it to work on any grid can't decide on that. But I think we can focus on 9 cols and inifnite rows to make it easier I guess
canton7
canton72w ago
I wonder if there's a way to do this with a single pass through the board
ティナ
ティナ2w ago
public void M() {
int[,] grid = null;
for(int y = 0;y < 100;++y) {
int slow = -1;
for(int x = 0;x < 100;++x) {
if(grid[y, x] != 0) {
if(slow == -1) { /// i hate this btw, i usually initialize slow with other loop
slow = x;
} else if(grid[y, x] == grid[y, slow]) {
/// :) !!!!
} else {

}
slow = x;
} else {
// do nothing :)
}
}
}
}
public void M() {
int[,] grid = null;
for(int y = 0;y < 100;++y) {
int slow = -1;
for(int x = 0;x < 100;++x) {
if(grid[y, x] != 0) {
if(slow == -1) { /// i hate this btw, i usually initialize slow with other loop
slow = x;
} else if(grid[y, x] == grid[y, slow]) {
/// :) !!!!
} else {

}
slow = x;
} else {
// do nothing :)
}
}
}
}
canton7
canton72w ago
E.g. scan along a row. Whenever you encounter a number, that becomes the "current" number. If you encounter a differetn number, the "current" number changes, if you encounter the same number then that number gets marked as being connected I can't see how to avoid testing each row/col/diagonal individually there. Maybe there's a smart way
Theos
TheosOP2w ago
yep, I bet that there is some smart way to do this in O(n), that's why I came here to ask ;p this is only to find pairs in a given row right?
canton7
canton72w ago
Actually yeah. If you have an array of the current number for each column, then when you scan along a row, you can update the "current" for each column as you go Same for diagonals That should give you O(rows*cols)
ティナ
ティナ2w ago
yes, so thats why i say you would end up with three nested loops
canton7
canton72w ago
Argh fucking lab.razor.fyi froze on me
canton7
canton72w ago
Here we go. This looks like it works for rows and columns: https://paste.mod.gg/sqckmjurfrpq/0
BlazeBin - sqckmjurfrpq
A tool for sharing your source code with the world!
Theos
TheosOP2w ago
looks compact, I will benchmark this in a sec
canton7
canton72w ago
Diagonals are doable, just a tad more fiddly. Give me a min
Theos
TheosOP2w ago
sure thing there is one more check that I will need to implement later, but for now I'm skipping it cuz you can also pair the last number of one row with the first number of the next row
0012300
0030125
0012300
0030125
you can connect 3 here but ye, its not important for now
ティナ
ティナ2w ago
you can still solve it by two pointers by iterate row by row and pair the first to last or last to first 4 nested loops rn
canton7
canton72w ago
I've got 2 nested loops You can do the whole thing in 1 pass
canton7
canton72w ago
BlazeBin - kiggbxhpnpvu
A tool for sharing your source code with the world!
Theos
TheosOP2w ago
alright, thanks, lets test it
canton7
canton72w ago
BlazeBin - arlezvqiogcs
A tool for sharing your source code with the world!
canton7
canton72w ago
Can you do the same for columns, and for diagonals?
Theos
TheosOP2w ago
nope, only rows
canton7
canton72w ago
Heh, that's really easy to support
Theos
TheosOP2w ago
yep, I just wanted to simplify the problem for now
canton7
canton72w ago
BlazeBin - ralfnnjwdved
A tool for sharing your source code with the world!
Theos
TheosOP2w ago
oh u are fast haha
Theos
TheosOP2w ago
No description
Theos
TheosOP2w ago
ur approach is faster in all grids the biggest difference is in super dense grids (u are A2)
canton7
canton72w ago
Tbh I'm surprised the difference is that small with the sparse grids I'd expect the biggest difference in dense grids: that's when you're doing a lot of looping, as you go off searching every time you encounter a number
Theos
TheosOP2w ago
I have some other small ideas that can optimize it, will test them and see if I can improve benchamarking takes 4 minutes :OhNo:
canton7
canton72w ago
Ooh?
Theos
TheosOP2w ago
[MethodImpl(MethodImplOptions.AggressiveInlining)] on UpdateCurrent pre-allocate var result = new List<Result>(numRows * numCols); making Current non-nullable
Theos
TheosOP2w ago
well, AggressiveInlining doesn't change much A1 is without it and A2 is with it
No description
canton7
canton72w ago
Yeah I wouldn't necessarily expect inlining to help much there numRows * numCols is a fairly arbitrary initial size for result? And I'd expect the cost of adding to result to be completely amortised Making Current non-nullable could help though, if the number of rows/cols/diagonals is struggling to fit into a cache line
Theos
TheosOP2w ago
it didnt help, maybe my implementation is stupid but anyway, I think that its already fast enough. Time to implement the rest of the solver thanks for help :)
canton7
canton72w ago
Cool, that was a fun little challenge!
Theos
TheosOP2w ago
Advent of code teaser
canton7
canton72w ago
I always like to approach those assuming it can be done in O(n). So you're forced to imagine walking through the grid cell-by-cell, and the question becomes "what information do I need about this cell right now, because I'm not going to be coming back to it?"
Theos
TheosOP2w ago
Actually I have one more question about time complexity O(n), n means the number of cells right? Then isnt O(x * y) the same here? x - grid width y - grid height
canton7
canton72w ago
Yeah, I was taking n there to be the number of cells (Actually, I was being very lax, and just using "O(n)" to mean "linear time complexity") (There's no way that you can solve this without visiting each cell. So searching for anything sub-linear is pointless. So there's nothing n could usefully be which isn't the number of cells)

Did you find this page helpful?