PCLMULQDQ CRC32

i'll be porting the non-bit-reflected version of that PCLMULQDQ CRC algo to c#
27 Replies
TechPizza
TechPizza11mo ago
oh cool, this will allow me to get rid of the lookup table
Maxine
Maxine11mo ago
lookup tables are bad for CPU cache
TechPizza
TechPizza11mo ago
u wot m8
Maxine
Maxine11mo ago
it's true
it has to store the whole table in the cache or else you're getting cache misses all over the place
TechPizza
TechPizza11mo ago
an 8kB table was better than a 4kB table, which was better than having no table at all
also the cpu doesn't know the length of tables
Maxine
Maxine11mo ago
well it depends
TechPizza
TechPizza11mo ago
it caches whatever is looked up
Maxine
Maxine11mo ago
sometimes the algorithm is helped by it
it caches in blocks
but yeah, if you can get rid of the need for the lookup table without making it slower, you make it faster
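For context on the table sizes being argued about, here is a minimal sketch of table-driven CRC for a non-reflected 32-bit polynomial. It assumes the 4kB and 8kB tables mentioned above are "slicing" tables (4 or 8 tables of 256 32-bit entries), which the chat does not confirm. The polynomial and all function names are chosen for illustration only.

```python
POLY = 0x04C11DB7  # assumed non-reflected CRC-32 polynomial, for illustration
MASK = 0xFFFFFFFF

def crc_bitwise(data, crc=0):
    """Reference version: MSB-first, one bit at a time, no table."""
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLY) & MASK
            else:
                crc = (crc << 1) & MASK
    return crc

def make_tables(count):
    """Build `count` tables of 256 entries; each entry would be 4 bytes in C or C#."""
    t0 = [crc_bitwise(bytes([i])) for i in range(256)]
    tables = [t0]
    for _ in range(1, count):
        prev = tables[-1]
        # Table k holds the CRC of byte i followed by k zero bytes.
        tables.append([((v << 8) & MASK) ^ t0[v >> 24] for v in prev])
    return tables

def crc_table(data, t0, crc=0):
    """One 1 KiB table, one byte per step."""
    for byte in data:
        crc = ((crc << 8) & MASK) ^ t0[(crc >> 24) ^ byte]
    return crc

def crc_slice4(data, tables, crc=0):
    """Four tables (4 KiB total), four bytes per step."""
    t0, t1, t2, t3 = tables
    end = len(data) - len(data) % 4
    for i in range(0, end, 4):
        crc ^= int.from_bytes(data[i:i + 4], "big")
        crc = (t3[crc >> 24] ^ t2[(crc >> 16) & 0xFF]
               ^ t1[(crc >> 8) & 0xFF] ^ t0[crc & 0xFF])
    return crc_table(data[end:], t0, crc)  # leftover tail bytes

TABLES = make_tables(4)
```

Under that assumption, each extra table lets the loop consume one more byte per iteration, and the cost is the extra cache footprint debated above.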
TechPizza
TechPizza11mo ago
in this case SIMD blasts through the algorithm
ok, i tried porting two C implementations to c#
neither gives me what i want
i will have to port the assembly version
TechPizza
TechPizza11mo ago
what is this formatting
TechPizza
TechPizza11mo ago
who did this
TechPizza
TechPizza11mo ago
oh hey, it worked first try
TechPizza
TechPizza11mo ago
aaaand it went to the code path i didn't want it to go
this will be FUN
TechPizza
TechPizza11mo ago
holy shit
TechPizza
TechPizza11mo ago
my assembly port was right all along
TechPizza
TechPizza11mo ago
lmfao
copied 3 crc implementations off stackoverflow
all of them are different
@kaijellinghaus see, this is why i gave up on this yesterday
it's hopeless
but i am proud i managed to port assembly to c# first try
Kai
Kai11mo ago
I have like no idea what is happening
TechPizza
TechPizza11mo ago
i am trying to get a CRC algo that uses pclmulqdq to work for vorbis
Kai
Kai11mo ago
right
have you considered just comparing the ASM 1:1? like
gdb --tui /path/to/a
layout asm
break <file>:<line>
r
y
x2, put windows side by side, step through and compare?
TechPizza
TechPizza11mo ago
i mean, the asm is no longer the problem
my c# port is the same as the asm, as seen here
the result local is the actual assembly, the cAsm local is my port
i could also step through assembly in VS
Kai
Kai11mo ago
I have never used the VS C++/ASM/C/whatever debugger so I wouldn't know
what's the problem then?
TechPizza
TechPizza11mo ago
vorbis uses a different polynomial and possibly endianness
so i need to patch things up for the intel crc to match the vorbis crc
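To make the mismatch concrete, here is a small sketch comparing two CRC-32 conventions. The Ogg container spec describes its page checksum as polynomial 0x04C11DB7, processed MSB-first, with initial value 0 and no final XOR. The reflected variant below is the common zlib one and is included only as an example of a different convention; the chat does not say which parameters the Intel-derived code used. Function names are illustrative.

```python
import zlib

def crc32_msb_first(data, poly=0x04C11DB7, init=0, xor_out=0):
    """Non-reflected CRC-32; the defaults follow the Ogg page checksum parameters."""
    crc = init
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ poly) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc ^ xor_out

def crc32_reflected(data, poly=0xEDB88320, init=0xFFFFFFFF, xor_out=0xFFFFFFFF):
    """Bit-reflected CRC-32 as used by zlib, shown for contrast."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ xor_out

def reflect32(value):
    """Reverse the bit order of a 32-bit value."""
    return int(f"{value:032b}"[::-1], 2)

sample = b"123456789"
ogg_style = crc32_msb_first(sample)
zlib_style = crc32_reflected(sample)
```

The two polynomial constants are bit-reversals of each other (`reflect32(0x04C11DB7) == 0xEDB88320`), but the results still differ because of bit order, initial value and final XOR. Note that an Ogg page is checksummed with its CRC field set to zero.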
Kai
Kai11mo ago
I see
TechPizza
TechPizza11mo ago
if i do get it working, crc validation of vorbis packets will basically be free lol
compared to the rest of the decoding process
and i feel like this may be a good exercise in case i ever use OPUS for something
unless OPUS has a completely different checksum algo
oh i'm silly
OPUS is not responsible for the checksums, it's the ogg container
i see that as an absolute win
damn i did it
https://github.com/jeffareid/crc/blob/master/crc32f/crc32fa.asm
ported this to c#
i only had to fix two branches in the c# because x86 flags are funky
TechPizza
TechPizza11mo ago
preliminary results
will make a markdown table tomorrow after i clean everything up