1. starting with the longest string and continuing in descending order

    I don't get the idea of putting the longest first?

    The idea of putting the shortest first is that you can use the third parameter to index (the start position) to skip over the shorter strings once you've checked them. Longer strings can never be contained by shorter ones, and starting the search part way into the string is much cheaper than trimming the shorter ones off the end.

  2. then only appending the non-embeddable strings to $all

    I do not know what you mean by "non-embeddable" in this context?

  3. I'm also wondering if the reallocation of new memory when appending to $all could be avoided by starting with a maximal length string and then shortening $all again.

    If you mean counting the space required for $all, allocating to that final size and then copying the elements into the string--rather than building it up by appending each element in turn--that is exactly what join does.

  4. Maybe uniq() from List::MoreUtils is faster

    Not in my tests. Mine usually works out ~15% faster.

  5. or could be completely avoided (after sorting identical strings always appear in a sequence)

    That would mean sorting the duplicates too. Sorting is O(N log N), whereas de-duping with a hash is O(1) per element, so O(N) overall. And after the sorting, you'd still need to make a complete pass with grep to remove the dups before joining.
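
For anyone following along, here is a minimal sketch of the shortest-first approach described in the replies above. It is not the code from earlier in the thread: the sub name unique_non_substrings and the newline separator are my own assumptions for illustration, and it only works if no input string contains the separator.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative, not from the thread):
# returns the input strings that are not contained in any other string.
sub unique_non_substrings {
    my @strings = @_;

    # De-dupe with a hash: one O(1) lookup per element, no sort needed.
    my %seen;
    my @uniq = grep { !$seen{$_}++ } @strings;

    # Shortest first, so every possible container sits later in $all.
    my @sorted = sort { length($a) <=> length($b) } @uniq;

    # Build $all in one go with join rather than appending piece by piece.
    my $sep = "\n";    # assumption: no input string contains a newline
    my $all = join $sep, @sorted;

    my @keep;
    my $pos = 0;
    for my $s (@sorted) {
        # Advance past this string's own copy (and its separator), then
        # use index's third parameter to search only the remainder of $all.
        $pos += length($s) + length($sep);
        push @keep, $s if index($all, $s, $pos) < 0;
    }
    return @keep;
}

my @result = unique_non_substrings(qw(abc bc abcd xyz abc yz));
print join(',', @result), "\n";
```

Because the separator cannot occur inside any string, a match found by index always lies wholly within one of the later (equal-length or longer) strings, so nothing ever has to be trimmed out of $all.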


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^9: list of unique strings, also eliminating matching substrings by BrowserUk
in thread list of unique strings, also eliminating matching substrings by lindsay_grey
