nobot has asked for the wisdom of the Perl Monks concerning the following question:

Most esteemed and splendid holy ones,

I need to parse a string of numbers and put them in an array.
The format of the string can differ as shown below.

1) Single number : 80
2) Range : 17-199
3) Multiple numbers: 18,512,21,78
4) Combination : 18,5,1790,19-66,212,213

Result shall be an array filled with the numbers sorted
in ascending order.
I have tried several methods involving split, push and
regexps. But I have not found a really clean and nice
solution.
Especially the combination part gets me into doing ugly
loops all over the place.
Note, I am not some kid trying to get you guys to do my
homework.
Actually I am not even a kid, but a pretty old dude :)

Regards

Replies are listed 'Best First'.
Re: number sequence
by kyle (Abbot) on Mar 23, 2007 at 17:24 UTC

    This seems to work:

    my $comb = '18,5,1790,19-66,212,213';
    $comb =~ s/(\d+)-(\d+)/join ',', $1 .. $2/ge;
    my @seq = sort { $a <=> $b } split ',', $comb;

      Yes, looks like it works. I'd prefer to expand the range format ("10-66") after the split:
      my @seq = sort { $a <=> $b } map /(\d+)-(\d+)/ ? $1 .. $2 : $_, split /,\s*/, $comb;
      Anno
Re: number sequence
by Old_Gray_Bear (Bishop) on Mar 23, 2007 at 17:55 UTC
    So, you write two subroutines, one to handle the range case and the other to handle the simple number. Call them ranger() and simple().

    Then you write a routine to manage the worker routines (manager()). The manager function tests its argument for the presence of a '-'. If the hyphen is present, it passes the argument to ranger($arg); otherwise it calls simple($arg). In either case it returns the result.

    Finally you write your controller control(), which is responsible for decomposing your input string into smaller bits:

    1. If there is a comma in the input then
      • split the string in two at the comma (stra, strb)
      • process the first string -- $ra = control($stra)
      • process the second one -- $rb = control($strb)
    2. If there are no commas in the input, pass the string to the manager routine $ret = manager($parm_string).
    3. Remember to do the right thing with the return strings.
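
The steps above can be sketched roughly as follows. This is only one reading of the description: the sub names come from the post, but the bodies are assumptions, and the routines return lists rather than strings so the caller can sort them directly.

```perl
use strict;
use warnings;

# simple(): a single number becomes a one-element list.
sub simple {
    my ($arg) = @_;
    return ($arg + 0);
}

# ranger(): "19-66" becomes 19, 20, ..., 66.
sub ranger {
    my ($arg) = @_;
    my ($lo, $hi) = split /-/, $arg, 2;
    return ($lo .. $hi);
}

# manager(): dispatch on the presence of a hyphen.
sub manager {
    my ($arg) = @_;
    return $arg =~ /-/ ? ranger($arg) : simple($arg);
}

# control(): split at the first comma and recurse on both halves;
# a string without commas goes straight to manager().
sub control {
    my ($str) = @_;
    if ($str =~ /,/) {
        my ($stra, $strb) = split /,/, $str, 2;
        return (control($stra), control($strb));
    }
    return manager($str);
}

my @seq = sort { $a <=> $b } control('18,5,1790,19-66,212,213');
print "@seq\n";
```

The sketch assumes well-formed input; an empty piece (for example from a trailing comma) would reach simple() and trigger a warning.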

    Welcome to the world of iterative parsing; recursive descent is your friend.

    Update -- added tag to close the list.

    ----
    I Go Back to Sleep, Now.

    OGB

Re: number sequence
by GrandFather (Saint) on Mar 23, 2007 at 22:04 UTC

    You forgot about join and map:

    use strict;
    use warnings;

    while (<DATA>) {
        chomp;
        print join ' ', sort { $a <=> $b }
            map { /(\d+)-(\d+)/ ? $1 .. $2 : $_ } split ',';
        print "\n";
    }

    __DATA__
    18,5,1790,19-66,212,213

    Prints:

    5 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 212 213 1790

    DWIM is Perl's answer to Gödel
Re: number sequence
by brian_d_foy (Abbot) on Mar 24, 2007 at 03:20 UTC

    You want Set::IntSpan. The only way I know this is that newsrc files use that format, and that's what News::Newsrc uses. :)

    Update: bobf notes that Set::IntSpan needs a run in increasing order, and that most solutions pretty much do the same thing that I've done below. However, I'm thinking about the next part of the problem, where some part of the program is going to ask if a value is in that list. Making the range is easy with simple Perl. Getting data out of it is a bit more complicated. :)

    #!/usr/bin/perl
    use Set::IntSpan;

    while( <DATA> ) {
        chomp;
        my $set = Set::IntSpan->new( join ",", sort { $a <=> $b } split /,/, $_ );
        print "$_ ---> @{ [ $set->elements ] }\n";
    }

    __END__
    80
    17-199
    18,512,21,78
    18,5,1790,19-66,212,213

    --
    brian d foy <brian@stonehenge.com>
    Subscribe to The Perl Review
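
For the follow-up question raised in the update (asking whether a value is in the list), a plain-Perl lookup table is one option. The sketch below is an illustration only and does not use Set::IntSpan; the name %in_list and the probe values are made up for the example.

```perl
use strict;
use warnings;

# Expand the list with the same map/split idea used elsewhere in the thread.
my $comb = '18,5,1790,19-66,212,213';
my @seq  = map { /^(\d+)-(\d+)$/ ? $1 .. $2 : $_ } split /,/, $comb;

# Build a lookup table: every number in the list becomes a hash key.
my %in_list = map { $_ => 1 } @seq;

# Membership is now a single hash lookup per query.
for my $n (42, 100, 1790) {
    printf "%d is %s the list\n", $n, $in_list{$n} ? 'in' : 'not in';
}
```

The hash costs memory proportional to the number of expanded integers, so for very wide ranges a structure that stores only the range endpoints would scale better.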

      I was going to post the same thing, until I created a test script with the OP's example data. The subsequent error message led me to the docs, where I found this (emphasis mine):

      If a string is given, it is taken to be a run list. A run list specifies a set using a syntax similar to that in newsrc files. A run list is a comma-separated list of runs. Each run specifies a set of consecutive integers.
      Restrictions: The runs in a run list must be disjoint, and must be listed in increasing order.

      Since the OP's example data is not ordered, a solution like the ones provided above is necessary before Set::IntSpan can be used. This, however, makes Set::IntSpan moot for the purpose of the OP. :-)
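
One way to get from unordered input to a run list that meets the restriction quoted above (disjoint runs in increasing order) is to expand, de-duplicate, sort, and then collapse consecutive integers back into runs. The helper name to_run_list() is hypothetical and not part of Set::IntSpan; this is a sketch that assumes well-formed input.

```perl
use strict;
use warnings;

# to_run_list(): hypothetical helper that normalizes a list such as
# "18,5,1790,19-66" into sorted, disjoint runs.
sub to_run_list {
    my ($str) = @_;

    # Expand every item, drop duplicates, and sort numerically.
    my %seen;
    my @nums = sort { $a <=> $b }
               grep { !$seen{$_}++ }
               map  { /^(\d+)-(\d+)$/ ? $1 .. $2 : $_ }
               split /,\s*/, $str;

    # Collapse consecutive integers back into "lo-hi" runs.
    my @runs;
    while (@nums) {
        my $lo = my $hi = shift @nums;
        $hi = shift @nums while @nums && $nums[0] == $hi + 1;
        push @runs, $lo == $hi ? $lo : "$lo-$hi";
    }
    return join ',', @runs;
}

print to_run_list('18,5,1790,19-66,212,213'), "\n";
```

Because adjacent values are merged (18 and 19-66 become a single run), the result contains no overlapping or touching runs.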

Re: number sequence
by moklevat (Priest) on Mar 23, 2007 at 17:17 UTC
    Hi nobot,

    It would be helpful if you showed an example of what you tried and how it didn't work. I'm also not sure what your input data look like. Do you want ranges treated as a high and low value only, or do you want all integers in between as well?

      I am currently at work and the code is at home. The code I have works. It is just that my solution is not really that elegant. I hoped you guys would show me some magic :)
      And all the numbers in (between) the range shall be put into the array as well.
Re: number sequence
by andye (Curate) on Mar 23, 2007 at 18:03 UTC
    Here's one way of doing it:

    my $comb = '18,5,1790,19-66,212,213';
    $comb =~ s/-/../g;
    my @uns;
    push @uns, eval "($_)" foreach split ',', $comb;
    print join "\n", sort { $a <=> $b } @uns;

    Of course, using eval like this, you have to be able to trust the source data not to contain any nasties.
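
If the source data cannot be trusted, one possible guard is to check the whole string against a strict pattern before anything reaches eval. The sub name safe_expand() and the exact pattern are assumptions made for this sketch, not part of the original post.

```perl
use strict;
use warnings;

# safe_expand(): hypothetical wrapper that refuses to eval anything
# other than comma-separated numbers and number-number ranges.
sub safe_expand {
    my ($comb) = @_;

    # No leading zeros: eval would read "010" as an octal literal.
    my $num  = qr/(?:0|[1-9]\d*)/;
    my $item = qr/$num(?:-$num)?/;

    die "refusing to eval '$comb'\n"
        unless $comb =~ /\A$item(?:,$item)*\z/;

    # Only digits, commas and hyphens remain, so the eval is harmless.
    (my $expr = $comb) =~ s/-/../g;
    return sort { $a <=> $b } eval "($expr)";
}

my @uns = safe_expand('18,5,1790,19-66,212,213');
print join("\n", @uns), "\n";
```

Once the input has been validated this strictly, the eval is no longer needed at all; the map-based solutions elsewhere in the thread do the same expansion without it.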

    HTH, andye

Re: number sequence
by bennymack (Pilgrim) on Mar 23, 2007 at 18:05 UTC

    I highly recommend getting your hands on a copy of Mark Jason Dominus' "Higher-Order Perl". While it may be overkill for the purpose of finding a solution to this particular problem, it will show several different ways to elegantly solve this problem that will not only work for now but work for any variations you'd like to add in the future. So, if you're serious about your Perl knowledge and looking to grow therein, then definitely check it out. I believe the chapter you'll most be interested in is chapter 8 but you'll need to browse a few of the earlier chapters for background. Sorry about the fake meta-help...