thanks,

Could you take a few seconds and explain what is going on here? I've not seen glob used in this context before. All those references are also too nice not to use again.

-james

Re^3: extracting all possible n-mers from an array
by QM (Parson) on Jun 07, 2006 at 02:20 UTC
    Let me try my hand at it...

    Original code:

    print "$_\n" for glob "{@{[ join ',', @letters ]}}" x $len;
    From the inside out:
    join ',', @letters
    creates a comma separated string: A,B,C,D,E,F.

    [...] creates a reference to an anonymous array of 1 element, which @{...} then dereferences. Putting @{[...]} inside double quotes interpolates that 1-element array into the double-quoted string, where it is surrounded by the outside pair of {}, yielding {A,B,C,D,E,F}.
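    To watch the interpolation step on its own, here is a minimal sketch. The three-letter @letters and the $set variable are made up for the illustration and are not part of the original one-liner.

```perl
use strict;
use warnings;

my @letters = qw(A B C);    # illustrative data, shorter than the thread's list

# [ ... ] builds a reference to a one-element anonymous array;
# @{ ... } dereferences it, so that single element interpolates
# into the double-quoted string between the literal { and }.
my $set = "{@{[ join ',', @letters ]}}";

print "$set\n";    # prints {A,B,C}
```

    The same trick works for interpolating the result of any expression into a string, not just join.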

    x $len repeats the string, in this case 4 times, to give:

    {A,B,C,D,E,F}{A,B,C,D,E,F}{A,B,C,D,E,F}{A,B,C,D,E,F}
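    glob treats each {...} as a set of alternatives, and generates all strings with one element from each set. Since braces are the only wildcard characters in the pattern, no filenames are matched; the strings are simply returned.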
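    A small sketch of the brace expansion by itself, using two made-up sets so it is easy to see which element comes from which set (the @pairs name is only for this example):

```perl
use strict;
use warnings;

# Two brace sets: every result takes one element from the first set
# followed by one element from the second.
my @pairs = glob "{A,B}{X,Y}";

print "@pairs\n";              # prints AX AY BX BY
print scalar(@pairs), "\n";    # prints 4 (2 choices times 2 choices)
```

    One caveat worth keeping in mind: glob splits its argument on whitespace, and characters like * or ? are real wildcards, so this only behaves as a pure string generator when the elements are plain text such as single letters.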

    I've always felt like this was almost undocumented, because the glob entry points to 2 other entries, one of which is completely unhelpful in this case, and the other of which backs into it as an afterthought. If you're not paying attention, or don't already know it from some other context, the glob entry doesn't improve the situation.

    The reason for the "{@{[...]}}" magic is to take the string from join and sandwich it between the curlies. That's a lot of work, and not always easy to follow. I prefer the following, just because it's easier to explain:

    print "$_\n" for glob (('{'. join( ',', @letters).'}') x $len);
    I sometimes see code that is a bit convoluted because someone forgot that join can also take parens to disambiguate its arguments from the rest of the expression. In this case, glob uses parens for precedence with x $len, though the outer set of parens could be replaced with a leading +, as in:
    print "$_\n" for glob +('{'. join( ',', @letters).'}') x $len;
    As has been noted before (see this subthread), glob isn't necessarily good for large list generation.
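    If the full list is too big to hold in memory, one alternative (not from the thread, just a sketch) is to produce the n-mers one at a time with an odometer-style counter instead of glob. The nmer_iterator name and its interface are hypothetical, invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: returns a closure that
# yields one n-mer per call, and undef once all have been produced.
sub nmer_iterator {
    my ($len, @letters) = @_;
    my @idx  = (0) x $len;                  # one "odometer wheel" per position
    my $done = !(@letters && $len > 0);     # nothing to do for empty input
    return sub {
        return if $done;
        my $nmer = join '', @letters[@idx];
        # Advance the rightmost wheel, carrying leftwards on overflow.
        my $pos = $#idx;
        while ($pos >= 0 && ++$idx[$pos] == @letters) {
            $idx[$pos--] = 0;
        }
        $done = 1 if $pos < 0;              # every wheel rolled over: finished
        return $nmer;
    };
}

my $next = nmer_iterator(2, qw(A B C));
while (defined(my $nmer = $next->())) {
    print "$nmer\n";                        # AA, AB, AC, BA, ... CC
}
```

    This gives the strings in the same order as the glob version for this input, but only keeps $len indexes in memory at any time.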

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

      thanks, very clear and informative.

      -james