Write or modify program to leverage CRC collision samples
Hello,
Inpsired by @rcgldr amazing posts here https://cboard.cprogramming.com/c-pr...-crc-16-a.html
That I am still trying to make full sense out of
Goal:
Reverse engineer this checksum format
0100100A00000000004080000000FFFFFFFFFFD7BB
0100100A00000000000060000000FFFFFFFFFFD7BB
0100100A00000000002010000000FFFFFFFFFFD7BB
(many more collision samples will be available soon)
Either by fully reverse engineering it, or coming up with an alternate algorithm that produces collisions at the "same times" that the "Blackbox" algorithm does.
This would allow me to create a lookup table.
If I can come up with a more typical CRC algorithm / parameters that causes collisions (even if the collision values are different from the "Blackbox" algorithm collisions) whenever the "Blackbox" algorithm produces collisions on a given set of samples, then I can create a lookup table from "Normal CRC" to "Blackbox CRC"
The relationship in this lookup table between "Normal CRC" and "Blackbox CRC" does not have to be 1:1, nor would it need to be. I.e. the "Blackbox" algorithm may introduce more collisions on top of what the underlying CRC produces
The main requirement for this to work is that the "Normal CRC" algorithm does not produce collisions when the "Blackbox" algorithm does not
The final output of the "Blackbox" algorithm can produce more collisions than my "Normal CRC" algorithm, but not the other way around.
Assumptions:
1.) There is a "Blackbox" algorithm that obfuscates a CRC16 in a manner more complex than "messing" with initial values and/or XorOut values
2.) The "underlying" algorithm is a fairly typical CRC, with very good entropy and low collision rate, that protects against positional swapping of the same bytes, and can distinguish between 00 and FF values.
What I need Help With:
Conceptually, am I full of **** with this strategy? If not, then I would either:
1.) Modify an existing program such as SRP16 to test CRC parameters until it finds a configuration that causes a collision on all 3 input samples, without caring what the actual collision CRC value is. Sort of a wildcard CRC, but all 3 wildcard CRC outputs must match each other to be considered a candidate / result.
2.) Write an entirely new program to leverage the collisions, looking for a set of CRC parameters that results in collisions (of any value) for all supplied input samples.
Potential issues:
1.) Due to the possibility of additional "extra" collisions being introduced by the "Blackbox" algorithm, not all samples in a given collision set are "good" to analyze. I.e. in a set of 5 sample message / CRC collisions, 3 of them may "make sense" from a CRC poly multiples standpoint, while the other 2 may make "no sense at all" because they are "artificially introduced" by week/poorly designed math in the "Blackbox" algorithm.