For every length N you want to consider:
    For each packet:
        For each substring of length N in the packet:
            Increment that substring's occurrence count by 1

Then look at all the data you've amassed about which substrings occur with what frequency and spit out whatever statistics you need.
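The counting loop described above could be sketched in Python roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes each packet is already available as a `bytes` object. The names `substring_frequencies`, `most_common_substrings`, `packets`, and `lengths` are hypothetical and chosen only for this example.

```python
from collections import Counter


def substring_frequencies(packets, lengths):
    """Count how often each substring of each requested length occurs.

    packets: iterable of bytes objects (assumed input format).
    lengths: iterable of substring lengths N to consider.
    Returns a dict mapping N to a Counter of substring -> count.
    """
    packets = list(packets)  # allow several passes, one per length
    counts = {n: Counter() for n in lengths}
    for n in counts:
        for packet in packets:
            # Slide a window of width n across the packet; packets
            # shorter than n yield an empty range and add nothing.
            for start in range(len(packet) - n + 1):
                counts[n][packet[start:start + n]] += 1
    return counts


def most_common_substrings(counts, top=3):
    """Summarise the tallies: the `top` most frequent substrings per length."""
    return {n: counter.most_common(top) for n, counter in counts.items()}
```

For example, calling `substring_frequencies([b"abcab", b"bcab"], [2, 3])` tallies every 2-byte and 3-byte window across both packets, and `most_common_substrings` then picks out the most frequent ones for each length. Overlapping occurrences are counted separately, which matches the "for each substring of length N" wording; memory use grows with the number of distinct substrings, so large captures may need a cap on N.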