in reply to What is sprintf doing?

If your real data is not the same as the sample data that you posted, then you will need to adjust the code I offered to suit.

The basic problem is that you want to sort your strings according to a numeric field embedded partway through each string. To do that, you need to extract the numeric value from each string in order to compare them. In the sample data you supplied, the numeric field followed a (unique) comma and a space, so extracting that using a regex was easy.

$string =~ m[
    ,       # a comma
    \s+     # followed by at least one space
    (       # capture
        \d+ # one or more digits
    )
]x;
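As a minimal sketch of that extraction on its own, the snippet below wraps the same regex in a helper. The name `extract_key` is made up for illustration and is not part of the code I posted; it returns undef for a record that has no such field.

```perl
use strict;
use warnings;

# Hypothetical helper: return the number that follows the comma,
# or undef if the record has no such field.
sub extract_key {
    my ($record) = @_;
    return $record =~ m[,\s+(\d+)] ? $1 : undef;
}

print extract_key('Item3 - 1 foo, 3 bar'), "\n";    # prints 3

# A record without a comma-then-digits field yields undef.
print defined( extract_key('no numeric field here') ) ? 'match' : 'no match', "\n";
```

Returning undef rather than dying is only a convenience for the sketch; in the sort itself you would still want to know about records that fail to match.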

In the following code, I've done two things that hopefully will clarify what is going on.

  1. I've added an extra map block before the sort that prints out the modified records so that you can see the effect of the sprintf.
  2. I've added a test before the sprintf to detect and die if any records are encountered that don't match the regex.
#! perl -slw
use strict;

my @array = split '\n', <<'EOA';
Item1 - 2 foo, 2 bar
Item2 - 0 foo, 1 bar
Item3 - 1 foo, 3 bar
Item4 - 1 foo, 2 bar
EOA

my @sorted = map{
    # Strip the first five characters that we added earlier.
    substr $_, 5;
} sort {
    # make the sort descending
    $b cmp $a
} map {
    # print out the modified record to show what sprintf is doing
    print "before sort: '$_'";
    $_;
} map {
    ## try the regex and die displaying
    ## the failing record if it doesn't match
    die "Regex did not match: '$_'" unless m[,\s+(\d+)];

    # It matched, so $1 contains the value to prepend.
    # pad to width 5.
    # use a bigger value here and above in the substr
    # if your numbers can be bigger than 99999.
    sprintf '%05d%s', $1, $_;
} @array;

print $/;
print for @sorted;

__END__
P:\test>test2
before sort: '00002Item1 - 2 foo, 2 bar'
before sort: '00001Item2 - 0 foo, 1 bar'
before sort: '00003Item3 - 1 foo, 3 bar'
before sort: '00002Item4 - 1 foo, 2 bar'


Item3 - 1 foo, 3 bar
Item4 - 1 foo, 2 bar
Item1 - 2 foo, 2 bar
Item2 - 0 foo, 1 bar

Hopefully, that will clarify what the code does and also allow you to modify the regex to suit your data.
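To see why the zero-padding matters, here is a small sketch that compares keys as strings with and without `sprintf '%05d'`. It is not part of the code above, and the helper name `padded_key` is hypothetical. Without padding, `cmp` puts '10' before '9' because it compares character by character; once every key is padded to the same width, string order and numeric order agree.

```perl
use strict;
use warnings;

# Hypothetical helper: build the same fixed-width key that the sort code prepends.
sub padded_key {
    my ($n) = @_;
    return sprintf '%05d', $n;
}

# Unpadded: compared as strings, '10' sorts before '2' and '9'.
print join( ' ', sort { $a cmp $b } 9, 10, 2 ), "\n";    # prints: 10 2 9

# Padded: string order now matches numeric order.
print join( ' ', sort { $a cmp $b } map { padded_key($_) } 9, 10, 2 ), "\n";    # prints: 00002 00009 00010
```

That is also why the width has to be large enough for your biggest number: a six-digit value would no longer line up with the five-character keys.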

You can delete the extra map block once you've seen what the sprintf is doing, but I would leave the die...unless code in place so that any malformed records are detected and reported.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Re: Re: What is sprintf doing?
by McMahon (Chaplain) on May 25, 2004 at 18:10 UTC
    OK, so I *think* what I needed to do was cull all of the unwanted data before sorting. This works, gives no errors, and is substantially faster than what I was doing before. The double regex looks wrong, though:

    my @wanted;
    foreach $msg(@msgs) {
        if ($msg =~ m[,\s+(\d+)]) {
            push @wanted, $msg;
        }
    }

    my @sorted = map{
        substr $_, 5;
    } sort {
        $b cmp $a
    } map{
        sprintf '%05d%s', $_ =~ m[,\s+(\d+)], $_;
    } @wanted;

      Sorry, but I do not understand what you are doing here at all.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
        My source data array is actually more like:

        Item1 - 2 foo, 2 bar
        argle bargle
        Item2 - 0 foo, 1 bar
        floogle
        Item3 - 1 foo, 3 bar
        Item4 - 1 foo, 2 bar
        choogle


        I used the regex in the foreach to remove everything that causes the "uninitialized" and "isn't numeric" messages.

        But I still seem to need the regex in the sprintf statement, to make the sort come out right.
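One possible way to avoid running the regex twice, sketched here on the assumption that non-matching lines should simply be dropped (as the foreach filter does): have the map return an empty list for records that do not match, so filtering and key-prepending happen in a single pass. The sub name `sort_by_key` and the sample records are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper: sort records descending by the number after the comma,
# silently skipping records that have no such field.
sub sort_by_key {
    my @records = @_;
    return
        map  { substr $_, 5 }    # strip the 5-character key again
        sort { $b cmp $a }       # descending string sort on the padded key
        map  { m[,\s+(\d+)] ? sprintf( '%05d%s', $1, $_ ) : () }
        @records;
}

# Assumed sample data: matching records mixed with junk lines.
my @msgs = (
    'Item1 - 2 foo, 2 bar', 'argle bargle',
    'Item2 - 0 foo, 1 bar', 'floogle',
    'Item3 - 1 foo, 3 bar',
);
print "$_\n" for sort_by_key(@msgs);
```

The regex is still needed inside the map because that is where the key comes from; the sort only ever sees the padded digits that sprintf puts at the front of each record. Note that this version drops bad records quietly, whereas the die...unless version reports them.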