McMahon has asked for the wisdom of the Perl Monks concerning the following question:

Hello again monks...

Yesterday I received some great advice about sorting in this thread: Sort array according to a value in each element?

Out of the box, the Guttman-Rosler Transform from BrowserUk did almost exactly what I needed. (I've been trying to understand the ST and GRT since then.)

I've got the following code which is working fine:

my @sorted = map  { substr $_, 5; }
             sort { $b cmp $a }
             map  { sprintf '%05d%s', $_ =~ m[,\s+(\d+)], $_; } @msgs;

#SOME STUFF

foreach my $warning (@sorted) {
    print OUT $warning unless ($warning =~ "<stuff I don't want>");
}


but sprintf reports the following warning on every line of a 7253-line LOG file (line 43 is the "sort..." line):

Use of uninitialized value in sprintf at script.pl line 43, <LOG> line 7253.
Argument <line from LOG> isn't numeric in sprintf at script.pl line 43, <LOG> line 7253


Oddly enough, this code works just as well:

my @sorted = map  { substr $_, 125; }
             sort { $b cmp $a }
             map  { sprintf '%0125d%s', $_ =~ m[,\s+(\d+)], $_; } @msgs;


I just have a feeling that if I could figure out the sprintf/substr interaction, it would be easier to figure out the rest of the code. I'm pretty sure I know what substr is doing here, but I'm not getting the relation to sprintf.

Again, I've got working code, now I'm just trying to make sure I understand why it works.

Re: What is sprintf doing?
by Roy Johnson (Monsignor) on May 25, 2004 at 15:31 UTC
    sprintf is just composing a new string that has the extracted number on the front, zero-padded to a width of five characters. substr is taking that back off.
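A minimal sketch of that round trip, assuming a made-up log line (the real data is not shown in the thread). It also suggests why the '%0125d' variant behaves the same: the width only has to agree between sprintf and substr.

```perl
use strict;
use warnings;

# Hypothetical log line; the real data format is not shown in the thread.
my $line = "disk warning, 42 occurrences\n";

# In list context the match returns the captured digits: ('42').
my ($count) = $line =~ m[,\s+(\d+)];

# '%05d' zero-pads the number to 5 characters; '%s' appends the original line.
my $keyed = sprintf '%05d%s', $count, $line;

# The key is exactly 5 characters wide, so substr($keyed, 5) drops it again.
my $restored = substr $keyed, 5;

print $keyed;       # 00042disk warning, 42 occurrences
print $restored;    # disk warning, 42 occurrences

# Any width works as long as sprintf and substr agree on it, which is why
# '%0125d' paired with substr(..., 125) gives the same result.
my $wide = sprintf '%0125d%s', $count, $line;
print substr($wide, 125) eq $line ? "same\n" : "different\n";
```

The zero padding is what lets a plain string comparison (cmp) order the records numerically. Note that the width in '%05d' is a minimum: a number with more than five digits would produce a longer key, and substr with a fixed offset of 5 would then leave stray digits on the front of the record.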

    The uninitialized warning suggests that the pattern match isn't successful. Could you provide some of the data you've got, so we can reproduce the warnings?


    The PerlMonk tr/// Advocate
Re: What is sprintf doing?
by Nkuvu (Priest) on May 25, 2004 at 15:31 UTC
    From the error message, I'd say that the value from the $_ =~ m[,\s+(\d+)] isn't a number. It looks like the match m[,\s+(\d+)] is failing.
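To illustrate how a failed match could lead to both warnings, here is a small sketch using a made-up line that contains no ", <digits>". In list context a failed match returns an empty list, so sprintf receives only one argument: the line itself lands in the '%d' slot and the '%s' slot gets nothing. The exact wording of the second warning depends on the Perl version (older Perls report an uninitialized value; newer ones, as far as I know, report a missing argument).

```perl
use strict;
use warnings;

# Hypothetical line with no ", <digits>" in it.
my $line = "no number here\n";

my @warnings;
my $keyed;
{
    # Collect warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # The failed match returns an empty list, so sprintf's argument list
    # is just ($line): '%d' gets the whole line, '%s' gets nothing.
    $keyed = sprintf '%05d%s', $line =~ m[,\s+(\d+)], $line;
}

print "warning: $_" for @warnings;
print "keyed is '$keyed'\n";
print "after substr: '", substr($keyed, 5), "'\n";
```

In this sketch the non-numeric line is treated as 0, so the keyed string is just the five zeroes, and substr then leaves an empty string. If the same thing happens with the real data, the unmatched lines would silently disappear from the sorted output, which may be why the code appears to work despite the warnings.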
Re: What is sprintf doing?
by halley (Prior) on May 25, 2004 at 16:44 UTC
    I think your question was answered above. Just a minor and irrelevant style consideration I wanted to raise.

    Tasks which munge lists into other lists might be more readable if you consider them to be a pipeline. You could get lost in all those curly braces if they keep chaining like that. Each task on the pipeline becomes clear if you line them up a bit:

    my @sorted = map  { substr $_, 5 }
                 sort { $b cmp $a }
                 map  { sprintf '%05d%s', $_ =~ m[,\s+(\d+)], $_ }
                 @msgs;

    --
    [ e d @ h a l l e y . c c ]

Re: What is sprintf doing?
by BrowserUk (Patriarch) on May 25, 2004 at 17:38 UTC

    If your real data is not the same as the sample data that you posted, then you will need to adjust the code I offered to suit.

    The basic problem is that you want to sort your strings according to a numeric field embedded part way through your strings. To do that, you need to extract the numeric value from each string in order to compare them. In the sample data you supplied, the numeric field followed a (unique) comma and a space, so extracting that using a regex was easy.

    $string =~ m[
        ,       # a comma
        \s+     # followed by at least one space
        (       # capture
            \d+ # one or more digits
        )
    ]x;

    In the following code, I've done two things that hopefully will clarify what is going on.

    1. I've added an extra map block before the sort that prints out the modified records so that you can see the effect of the sprintf.
    2. I've added a test before the sprintf to detect and die if any records are encountered that don't match the regex.

    Hopefully, that will clarify what the code does and also allow you to modify the regex to suit your data.

    You can delete the extra map block once you've seen what the sprintf is doing, but I would leave the die...unless code in place so that you will detect and get a record of any malformed records in use.
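The two changes described above might look roughly like the following. This is a sketch reconstructed from the description, not the code originally posted; the sample records are hypothetical, and it uses `or die` where the post mentions die...unless.

```perl
use strict;
use warnings;

# Hypothetical sample records; substitute the lines read from your LOG file.
my @msgs = (
    "alpha, 7 hits\n",
    "beta, 123 hits\n",
    "gamma, 42 hits\n",
);

# Read the pipeline from the bottom up: steps 1 to 4.
my @sorted =
    map  { substr $_, 5 }                  # 4. strip the 5-character sort key
    sort { $b cmp $a }                     # 3. descending string sort on the key
    map  { print STDERR "keyed: $_"; $_ }  # 2. debugging only: show each keyed record
    map  {
        # 1. die on any record the regex cannot handle, then prepend the key
        my ($n) = m[,\s+(\d+)]
            or die "Malformed record: $_";
        sprintf '%05d%s', $n, $_;
    } @msgs;

print @sorted;
```

With these sample records the debugging map prints keys such as 00007 and 00123 on STDERR, and the final list comes out in descending numeric order (beta, gamma, alpha).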


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
      OK, so I *think* what I needed to do was cull all of the unwanted data before sorting. This works, gives no errors, and is substantially faster than what I was doing before. The double regex looks wrong, though:

      my @wanted;
      foreach my $msg (@msgs) {
          if ($msg =~ m[,\s+(\d+)]) {
              push @wanted, $msg;
          }
      }

      my @sorted = map  { substr $_, 5; }
                   sort { $b cmp $a }
                   map  { sprintf '%05d%s', $_ =~ m[,\s+(\d+)], $_; } @wanted;
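One possible way to avoid matching each record twice, sketched here with hypothetical sample records: let the first map return an empty list for records that do not match, so they drop out of the pipeline before the sort.

```perl
use strict;
use warnings;

# Hypothetical sample records, including one without ", <digits>".
my @msgs = (
    "alpha, 7 hits\n",
    "noise without a count\n",
    "beta, 123 hits\n",
);

my @sorted =
    map  { substr $_, 5 }
    sort { $b cmp $a }
    # Match once per record: return the keyed string on success and an
    # empty list on failure, which drops the record from the pipeline.
    map  { m[,\s+(\d+)] ? sprintf('%05d%s', $1, $_) : () }
    @msgs;

print @sorted;
```

Whether silently dropping unmatched records is acceptable depends on the data; if they indicate a problem, dying or logging them is the safer choice.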

        Sorry, but I do not understand what you are doing here at all.

