Along with the disqualifying criteria already mentioned here, Mr. Friedl does give some useful rough criteria in the "Owls" book. He describes a sufficiently long string as being "at least several kilobytes", and gives as an example of a useful application checking each chapter of his book, held as a single string, for "mistaken markup" by running a number of regexes against that same string. (I presume it's a number of them; he describes it only as "a bevy of checks".) He doesn't mention having done any comparative benchmarking of that operation, nor does he say how to estimate how many regexes it takes to make it worthwhile.
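To make the "bevy of checks" idea concrete, here is a minimal sketch of running several regexes over one multi-kilobyte string and timing the whole pass. It is written in Python purely for illustration; the sample "chapter", the pattern names, and the patterns themselves are all made up, not taken from the book, and the timing covers only the checks themselves.

```python
import re
import timeit

# Hypothetical markup checks (illustrative only, not Friedl's actual patterns).
CHECKS = {
    "empty bold": re.compile(r"<b>\s*</b>"),
    "doubled word": re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE),
    "space before period": re.compile(r"\s+\."),
}


def run_checks(text, checks=CHECKS):
    """Apply every compiled pattern to the same string and count its matches."""
    return {
        name: sum(1 for _ in pattern.finditer(text))
        for name, pattern in checks.items()
    }


# A made-up "chapter" held as a single string of several kilobytes.
chapter = "<p>This is is a <b>sample</b> paragraph with <b> </b> markup.</p>\n" * 200

report = run_checks(chapter)
for name, count in report.items():
    print(f"{name}: {count}")

# Rough timing of the full set of checks over the one string.
# The repeat count of 20 is an arbitrary choice for this sketch.
seconds = timeit.timeit(lambda: run_checks(chapter), number=20)
print(f"20 passes over {len(chapter)} characters took {seconds:.4f}s")
```

Repeating the timing with more or fewer entries in `CHECKS`, and with longer or shorter strings, is one way to get a feel for where any fixed up-front cost would start to pay for itself; the numbers will of course depend on the regex engine and the patterns used.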