Update: Since the original code was severely buggy, I have replaced it altogether with the version also posted in this followup. The original code is available here.

This is another short program that for me is a complete application, and I hope that this time there's not a common utility to do the same...

Basically it expands strings of the form

foo[01:100]bar-[fred,barney,wilma]
to the list
foo01bar-fred foo01bar-barney foo01bar-wilma foo02bar-fred ... foo100bar-wilma
More info here!
#!/usr/bin/perl -ln

use strict;
use warnings;

sub expand;
sub doit;

print for expand $_;

sub expand {
    doit split /\[(.*?)\]/, shift, -1;
}

sub doit {
    return @_ if @_ == 1;
    my ($pre,$pat,$post)=splice @_, 0, 3;
    map { doit $pre . $_ . $post, @_ }
      map { /(\w+):(\w+)/ ? $1..$2 : $_ }
      split /,/, $pat;
}

__END__

Re: String expansion script
by ambrus (Abbot) on Feb 03, 2005 at 19:37 UTC

    This is great. I especially like the distinction between [01:20] and [1:20].

    The zsh shell has a brace expansion function, so you can write this in zsh:

    echo f-{fred,barney,vilma}/h{08..22}.t
    which prints
    f-fred/h08.t f-fred/h09.t f-fred/h10.t f-fred/h11.t f-fred/h12.t f-fred/h13.t f-fred/h14.t f-fred/h15.t f-fred/h16.t f-fred/h17.t f-fred/h18.t f-fred/h19.t f-fred/h20.t f-fred/h21.t f-fred/h22.t f-barney/h08.t f-barney/h09.t f-barney/h10.t f-barney/h11.t f-barney/h12.t f-barney/h13.t f-barney/h14.t f-barney/h15.t f-barney/h16.t f-barney/h17.t f-barney/h18.t f-barney/h19.t f-barney/h20.t f-barney/h21.t f-barney/h22.t f-vilma/h08.t f-vilma/h09.t f-vilma/h10.t f-vilma/h11.t f-vilma/h12.t f-vilma/h13.t f-vilma/h14.t f-vilma/h15.t f-vilma/h16.t f-vilma/h17.t f-vilma/h18.t f-vilma/h19.t f-vilma/h20.t f-vilma/h21.t f-vilma/h22.t
    Zsh makes the distinction between {08..22} and {8..22}. Bash 3.0 has cloned this numeric brace expansion feature too (it had non-numeric brace expansion originally), but it expands {08..11} as 8 9 10 11. (Brace expansion with multiple braces sometimes expands in a different order in the shells than in your program.)

    By the way, your program does not fully expand m[2:5]-{a:t:g:c}, it just prints the four strings m[2:5]-a m[2:5]-t m[2:5]-g m[2:5]-c. Is that intentional?

    Also, your program does not seem to be able to expand braces embedded in each other, such as {/usr{:/local}:/var}/lib. Bash expands this to /usr/lib /usr/local/lib /var/lib.

      I for one would prefer not to see this idea made specific to pathname matching. I see it more as a kind of narrow-case templating. Comparing it to capabilities currently available in shells may not be fair or desirable; for example, the brace expansion features of zsh/bash result only in the names of existing paths that match; so if you say ls m{8:11} but have no files matching that pattern, the result is null, not m8 m9 m10 m11.

      As for nesting - I think calling the expander reiteratively (or recursively ;-) could give you that.

        Brace expansion is not specific to pathname matching. I've shown you: type

        echo a{1..3}
        you get
        a1 a2 a3
        even if such files do not exist.
        echo > a1; ls a{1..3}
        prints
        ls: a2: No such file or directory
        ls: a3: No such file or directory
        a1

        There is a filename matching feature too, in bash it looks like this:

        shopt -s extglob
        echo a@(1|2|3)
        is really similar to the friendly echo a[1-3], it prints
        a1
        if a1 exists but the other files don't. (Don't ask me how it works in zsh.) There is no numeric range variant of this though.
Re: String expansion script
by Tanktalus (Canon) on Feb 17, 2005 at 23:53 UTC

    I just want to thank you - being able to read this thread has saved me hours of work. Or cost me hours. Depends on how you look at it. I spent about 3 hours trying to get a similar version to work, and it finally does now. Hopefully this gives me such a simplified API to doing what I am trying to do that it will save me (and my cow-orkers) enough time to justify that... Even if it doesn't, it's a cool API to work with. :-)

    Here it is - I've co-opted the colon for a shell-like experience. However, if you want to provide a function that maps ".." or "," or whatever you want, you can go ahead. My requirements state that () and {} are both valid token markers, but you can do whatever you want with it. (I'm hoping to remove the ambiguity later, but I'm stuck with it for now.)

    Also - in my case, it's possible to come up with duplication (partly because of the shell-likeness, if you're using it), so I came up with a method to remove duplicates while preserving order. That order part isn't well-tested, so I'd appreciate hearing from anyone who notices anything amiss there :-)
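For readers wondering what order-preserving duplicate removal can look like, here is a minimal sketch of the usual Perl idiom. It is not the code referred to in this post; the helper name uniq_in_order and the sample list are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not the poster's actual code): drop repeated
# strings, keeping the first occurrence of each in its original position.
sub uniq_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Sample data invented for the example.
my @expanded = qw(a1 a2 a1 b1 a2 b2);
my @unique   = uniq_in_order(@expanded);
print "@unique\n";
```

The hash only records what has already been seen, and grep walks the list front to back, so the surviving elements keep their relative order. Reasonably recent versions of List::Util ship a uniq function that behaves the same way.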

    Update: Code fixes - missed @_==0 case, allow scalar return if expansion does not give multiple return values. Remember - this code can do more than just turn a simple scalar into a list (although that's where the thread started). It can perform arbitrary replacements which may be one-to-one. The expand_variable routine can return a single string which causes the whole thing to return a single string.

      I just want to thank you - being able to read this thread has saved me hours of work. Or cost me hours. Depends on how you look at it. I spent about 3 hours trying to get a similar version to work, and it finally does now. Hopefully this gives me such a simplified API to doing what I am trying to do that it will save me (and my cow-orkers) enough time to justify that... Even if it doesn't, it's a cool API to work with. :-)
      And I thank you in turn for your kind words. I glanced at your code and noticed that it's not a snippet any more. Unfortunately I can't look at it in more detail: I wonder whether in the end you're matching the patterns balanced-wise so as to allow for nested ones...
Details [was: "String expansion script"]
by blazar (Canon) on Feb 03, 2005 at 13:16 UTC
    Note 1: I wrote this as a quick hack one day that I got tired of writing ad hoc solutions for the same task. I am perfectly aware that the use of $`, $& and $' is generally discouraged for various reasons. In this case it seemed to me that it provided the simplest way to achieve what I wanted.

    Note 2: I am perfectly aware that all in all this is probably not the most efficient way to do this, with potentially huge lists of parameters passing through recursive subs. All I can say is that it has worked excellently under all the conditions in which I've needed to use it. I typically use it like this:

    echo 'http://foo.org/{bar:baz}/gallery[1:100]/pic[1:12].jpg' | xstr | wget -i - -nH -x
Re: String expansion script
by jdporter (Paladin) on Feb 03, 2005 at 15:22 UTC
    Well that's fairly nifty. However, it has a few shortcomings... but rather than say "Here's my version which is better," I'd like to propose to you a set of enhancements, and see how you would approach it - if you're up for the game.
    1. The essential functionality is not reusable, except as a command-line tool. Extract it into a module, or at least a perl library file, so that I can call it from other perl code.
    2. The expander is hard-coded to expect only "-[1:10]-{a:b:c}-" patterns. What about "-{a:b:c}-[1:10]-" or "-[1:10]-[2:11]-" or "-{a:b:c}-{d:e:f}-" or even "-[1:10]-" or "-{a:b:c}-", etc? Make the expander recognize and expand any number of "[1:10]" and "{a:b:c}" patterns in the string.
    -- 
    jdporter
      The essential functionality is not reusable, except as a command-line tool. Extract it into a module, or at least a perl library file, so that I can call it from other perl code.
      Well, basically you're tempting me to let my hubris take over my laziness. However consider that
      1. as a general rule I still consider myself to be at most an advanced newbie,
      2. unfortunately time is not really an option. See for example this article in clpmisc, also available from Google groups or Google groups-beta.
      The expander is hard-coded to expect only "-[1:10]-{a:b:c}-" patterns.
      No, it isn't!
      What about "-{a:b:c}-[1:10]-" or "-[1:10]-[2:11]-" or "-{a:b:c}-{d:e:f}-" or even "-[1:10]-" or "-{a:b:c}-", etc? Make the expander recognize and expand any number of "[1:10]" and "{a:b:c}" patterns in the string.

      But it already does!! (Maybe you missed the point in which I stated that I wrote this as a general purpose solution to avoid having to create many ad hoc ones.)

      I apologize for not spelling out all the details and only hinting at the "format" of input strings...

      However:

      1. a range of "numbers" (but not only, thanks to Perl's smart .. operator) [<num1>:<num2>] expands to the list of numbers from <num1> to <num2> in a smart way, e.g. with the correct number of leading zeroes,
      2. a colon separated list of "words" {<word1>:<word2>:...:<wordn>} expands to that list.
      I am perfectly aware that this description is not too clear and foolproof either, but I'm confident it will shed some light on the damned thing.
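The "smart" handling of leading zeroes described above comes from Perl's range operator itself: when the left operand is a string that starts with a zero (or is not a number at all), the operator uses the magical string increment instead of numeric counting. A small sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

# No leading zero: ordinary numeric counting.
my @plain = ( '8' .. '11' );

# Leading zero on the left operand: magical string increment,
# so the zero padding is kept.
my @padded = ( '08' .. '11' );

# Non-numeric strings use the same string increment.
my @alpha = ( 'a' .. 'd' );

print "@plain\n";     # 8 9 10 11
print "@padded\n";    # 08 09 10 11
print "@alpha\n";     # a b c d
```

This is why [01:20] and [1:20] give differently formatted results without any extra formatting code in the script.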

      I am aware it could be improved in many other ways as well. As I wrote in the first place it is well suited to the use I'm making of it. Of course I'd be curious to see any suggestion about it both from the UI and the implementation POVs.

        D'Oh. --me (jdporter--) for not actually testing your code before making a statement about its behavior.

        So I tested your code, and it doesn't do for me what you say it does for you:

        /[09:10]/
        /09/ /10/
        /{x:y}/
        /x/ /y/
        /[09:10]/{x:y}/
        /[09:10]/x/ /[09:10]/y/
        O.k., here's how I would do it. The following sub is stand-alone. I've changed the spec slightly: lists within curlies are comma-separated rather than colon separated. That seems a bit more natural to me.
        sub expand {
            local $_ = shift;
            if ( /^(.*?)\{([^}]+)\}(.*)$/ ) {
                my( $pre, $spec, $post ) = ( $1, $2, $3 );
                return map expand($pre.$_.$post), split /,/, $spec
            }
            if ( /^(.*?)\[(\d+):(\d+)\](.*)$/ ) {
                my( $pre, $lo, $hi, $post ) = ( $1, $2, $3, $4 );
                return map expand($pre.$_.$post), $lo .. $hi
            }
            $_
        }
        One thing about it that I think could use some investigation and tweaking is whether it might be preferable to use /s or /g (or both) on the regexes.
        -- 
        jdporter

        Update: Here's a slightly different way to code it:
        sub expand {
            local $_ = shift;
            my @a;
            (@a=/^(.*?)\{([^}]+)\}(.*)$/)?map(expand($a[0].$_.$a[2]),split/,/,$a[1]):
            (@a=/^(.*?)\[(\d+):(\d+)\](.*)$/)?map(expand($a[0].$_.$a[3]),$a[1]..$a[2]):
            $_
        }
Original code [was "Re: String expansion script"]
by blazar (Canon) on Feb 09, 2005 at 09:24 UTC
    #!/usr/bin/perl -ln

    use strict;
    use warnings;

    sub expand;
    sub doit;

    print for expand $_;

    sub expand {
        local $_=shift;
        return doit $`, (split /:/, $1), $' if /\{([\w:]+)\}/;
        return doit $`, $1 .. $2 , $' if /\[(\w+):(\w+)\]/;
        $_;
    }

    sub doit {
        my ($pre,$post)=(shift,pop);
        map { my $pre=$pre . $_;
          map $pre . $_, expand $post } @_;
    }

    __END__

    Original description:

    ("specifications" have changed too)
    This is another short program that for me is a complete application, and I hope that this time there's not a common utility to do the same...

    Basically it expands strings of the form

    foo[01:100]bar-{fred:barney:wilma}
    to the list
    foo01bar-fred foo02bar-fred ... foo100bar-fred foo01bar-barney ... foo100bar-wilma
    More info here!

      I've just noticed your new (fixed) script. I like the new input syntax better than the old one. I also like this syntax: /dev/hd[a:d][,1:8], and broken ranges: [1:5,7:10].
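To make the two syntaxes mentioned here concrete, the following sketch wraps the logic of the fixed script in ordinary subs and runs it on an optional-suffix pattern and a broken range. The sub names expand_items, expand_parts and expand_string are illustrative, and 'part[1:5,7:10]' is a made-up input; only the splitting and recursion follow the posted code.

```perl
use strict;
use warnings;

# Expand one bracket body such as "a:d", ",1:8" or "1:5,7:10"
# into the list of strings it stands for.
sub expand_items {
    my ($pat) = @_;
    my @items;
    for my $item ( split /,/, $pat ) {
        if ( $item =~ /(\w+):(\w+)/ ) {
            my ( $lo, $hi ) = ( $1, $2 );
            push @items, $lo .. $hi;    # range, e.g. 1:5 or a:d
        }
        else {
            push @items, $item;         # literal, possibly empty
        }
    }
    return @items;
}

# Same recursion as the posted doit(): consume (text, body, text) triples.
sub expand_parts {
    my @parts = @_;
    return @parts if @parts == 1;       # only literal text left
    my ( $pre, $pat, $post ) = splice @parts, 0, 3;
    my @out;
    for my $item ( expand_items($pat) ) {
        push @out, expand_parts( $pre . $item . $post, @parts );
    }
    return @out;
}

sub expand_string {
    my ($string) = @_;
    return expand_parts( split /\[(.*?)\]/, $string, -1 );
}

print "$_\n" for expand_string('/dev/hd[a:d][,1:8]');
print "$_\n" for expand_string('part[1:5,7:10]');
```

The leading comma in [,1:8] yields an empty alternative, so each disk name appears once bare and once per partition number; the comma in [1:5,7:10] simply joins two ranges.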

      Pity it still can't handle brackets embedded in each other, such as /dev/[hd[a:d][1:8],fd[0:1]], but that'd be difficult to implement properly.

      (Also, this code prints a warning on empty input lines.)

        Of course it shouldn't be too hard with Text::Balanced. I would still be delighted to do it in one single regex. I'm not really sure if this is actually impossible, but for sure it's hard.

        I'm intrigued by the possibility of doing it with a single regex and the help of a few pre- and post-transformations. I have a well-defined idea of how to do that, but it won't be so easy either... sooner or later I suspect that my hubris will take over and eventually I'll do it!
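For comparison, nesting can also be handled with neither Text::Balanced nor a single regex, by scanning for the matching bracket with a depth counter. The sketch below is only one possible approach, not the solution hinted at in this post; the sub names split_top_level and expand_nested are invented, and unbalanced input is simply returned unchanged.

```perl
use strict;
use warnings;

# Split a bracket body on commas that are not inside an inner [...] group.
sub split_top_level {
    my ($body) = @_;
    my @alternatives;
    my $current = '';
    my $depth   = 0;
    for my $char ( split //, $body ) {
        $depth++ if $char eq '[';
        $depth-- if $char eq ']';
        if ( $char eq ',' && $depth == 0 ) {
            push @alternatives, $current;
            $current = '';
        }
        else {
            $current .= $char;
        }
    }
    return ( @alternatives, $current );
}

# Expand the first top-level [...] group, recursing into its alternatives
# and into the text that follows it.
sub expand_nested {
    my ($text) = @_;
    my $open = index $text, '[';
    return ($text) if $open < 0;    # nothing left to expand

    # Walk forward from the opening bracket until the depth is back to 0.
    my ( $depth, $close ) = ( 0, -1 );
    for my $pos ( $open .. length($text) - 1 ) {
        my $char = substr $text, $pos, 1;
        $depth++ if $char eq '[';
        $depth-- if $char eq ']';
        if ( $depth == 0 ) { $close = $pos; last }
    }
    return ($text) if $close < 0;   # unbalanced: leave the text alone

    my $pre   = substr $text, 0, $open;
    my $body  = substr $text, $open + 1, $close - $open - 1;
    my @tails = expand_nested( substr $text, $close + 1 );

    my @items;
    for my $alternative ( split_top_level($body) ) {
        if ( $alternative =~ /^(\w+):(\w+)$/ ) {
            my ( $lo, $hi ) = ( $1, $2 );
            push @items, $lo .. $hi;                    # plain range
        }
        else {
            push @items, expand_nested($alternative);   # may nest further
        }
    }

    my @results;
    for my $item (@items) {
        push @results, map { $pre . $item . $_ } @tails;
    }
    return @results;
}

print "$_\n" for expand_nested('/dev/[hd[a:d][1:8],fd[0:1]]');
```

With the example from the parent post this prints the 32 hd names followed by /dev/fd0 and /dev/fd1. Everything is held in memory as full lists, so the efficiency caveat from the original notes applies here as well.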

        Update: I'm not bothered by the fact that this has been downvoted, but I'd be very glad to know why it has... however to show with an example (of a somewhat related problem) what I meant with the above, I refer you to Re: Shuffling cards, but in that case the state of affairs is particularly simple because I can trust input strings which in turn are (guaranteed to be) particularly simple.