(Pre-note: This is neither DNA nor a compression scheme.)
I have a bunch of large (up to 20MB) binary strings (bytes), and I'm looking for repeating patterns within those strings.
I believe the strings consist of many contiguous repetitions of a substring of unknown length and I wish to find that substring.
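Wrinkles:
The string may not start or end on a repetition boundary (eg. it might be 'bcd abcd abcd ab' (without the spaces)).
The repetitions may not be exact; individual bytes may differ. Eg. (again spaces for clarification only) 'cdef abcdef abbdef aacdef abcdcf abcdef abd'
The length of the repeated substring is unknown. Could be 10s, 100s, 1000s, 10000s etc.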
Any thoughts on a way to tackle this?
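One possible line of attack, sketched here in Python purely as an illustration (it is not necessarily the approach in tye's reply): for each candidate length p, count how often byte i equals byte i+p, take the smallest p that scores close to the best, then rebuild the unit by majority vote over every p-th byte. The names `match_fraction`, `guess_period` and `consensus_unit`, and the `window` and `slack` parameters, are all made up for this sketch. It assumes corrupted bytes are a small minority and that `max_period` is well below the string length, since candidates that leave only a few bytes to compare give unreliable scores.

```python
from collections import Counter


def match_fraction(data: bytes, period: int, window: int) -> float:
    """Fraction of the first `window` positions i where data[i] == data[i + period]."""
    n = min(window, len(data) - period)
    if n <= 0:
        return 0.0
    hits = sum(a == b for a, b in zip(data[:n], data[period:period + n]))
    return hits / n


def guess_period(data: bytes, max_period: int,
                 window: int = 100_000, slack: float = 0.02) -> int:
    """Smallest candidate period whose score is within `slack` of the best score.

    Multiples of the true period score about as well as the period itself,
    so the smallest near-best candidate is taken rather than the outright best.
    """
    limit = min(max_period, len(data) - 1)
    if limit < 1:
        raise ValueError("need at least two bytes and max_period >= 1")
    scores = {p: match_fraction(data, p, window) for p in range(1, limit + 1)}
    best = max(scores.values())
    return min(p for p, s in scores.items() if s >= best - slack)


def consensus_unit(data: bytes, period: int) -> bytes:
    """Most common byte in each of the `period` columns.

    The result is a rotation of the repeated substring, because the string
    need not start on a repetition boundary.
    """
    return bytes(Counter(data[k::period]).most_common(1)[0][0]
                 for k in range(period))
```

On the inexact example above (spaces removed), `guess_period(s, 12)` gives 6 and `consensus_unit(s, 6)` gives `b'cdefab'`, a rotation of 'abcdef'. Scanning every candidate length this way is slow in pure Python for 20MB strings and lengths in the 10000s, so in practice the window would be kept small or the comparison vectorised.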
Update: Now solved, thanks to tye. See Re^2: Analysing a (binary) string. for details.
In reply to Analysing a (binary) string. (Solved) by BrowserUk