http://qs1969.pair.com?node_id=689020

steph_bow has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

I would like to sort files whose names look like this, for example: M3_output_ZGZ22_02_20061022_duration_200mn_comptage_60.txt

I have built the following subroutine:

use strict;
use File::Copy;

sub files_sort {
    my $first_number;
    my $second_number;
    if ($a =~ /(\d+)_(\d+)_duration_(\w+)_comptage_(\d+).txt$/){
        $first_number = $1;
    }
    if ($b =~ /(\d+)_(\d+)_duration_(\w+)_comptage_(\d+).txt$/){
        $second_number = $1;
    }
    if ($1 < $2){ -1 }
    elsif($1 > $2){ 1 }
    else{ 0}
}

my @TXT = glob("*.txt");
@TXT = sort files_sort @TXT;

Is that OK? Thanks

Update: thanks to psini for correcting my mistake. I just want to order the list by the number (02, 04, 12), not by the date (20061022).

Replies are listed 'Best First'.
Re: sorting files
by reasonablekeith (Deacon) on May 29, 2008 at 14:57 UTC
    With reference to your update, you were pretty close; you just needed to actually compare the values you went to the trouble of extracting :) ...

    sub files_sort {
        my $first_number;
        my $second_number;
        if ($a =~ /(\d+)_(\d+)_duration_(\w+)_comptage_(\d+).txt$/){
            $first_number = $1;
        }
        if ($b =~ /(\d+)_(\d+)_duration_(\w+)_comptage_(\d+).txt$/){
            $second_number = $1;
        }
        $first_number <=> $second_number;
    }

    Have a look at 'perldoc perlop' if you don't know what the 'spaceship' operator (<=>) does. It's _very_ handy for sort functions, and it does what you were trying to do with your ifs and elses.
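
To illustrate the operator mentioned above, here is a minimal sketch. The sample values are made up for the example; they only mimic the zero-padded index field in the file names.

```perl
use strict;
use warnings;

# <=> compares its operands as numbers and yields -1, 0, or 1.
my @results = ( 2 <=> 12, 12 <=> 2, 4 <=> 4 );

# Zero-padded strings such as "02" are treated as numbers by <=>.
my @numeric = sort { $a <=> $b } qw(12 02 4);

# cmp compares as strings instead, so "12" lands before "4".
my @textual = sort { $a cmp $b } qw(12 02 4);

print "numeric: @numeric\n";
print "textual: @textual\n";
```

The numeric order is what the question asks for; the string order is what a plain `sort` with no comparison block would give.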

    I would add, though, that if this were my code, I'd probably write a sub to extract the index number from the file name, and write the whole thing like this:

    sub my_sort {
        parse_file_name($a) <=> parse_file_name($b);
    }

    sub parse_file_name {
        if ($_[0] =~ /(\d+)_\d+_duration_\w+_comptage_\d+.txt$/) {
            return $1;
        } else {
            return 0;
        }
    }
    ---
    my name's not Keith, and I'm not reasonable.
Re: sorting files
by toolic (Bishop) on May 29, 2008 at 15:01 UTC
    You might want to clarify your requirements a little by providing a few example input filenames and the expected output (contents of your @TXT array).

    I have made a few changes to your script:

    • Added use warnings;
    • Stored the regex in a variable, $re, to avoid duplication.
    • Removed the unnecessary capturing parentheses from the regex; since you are only using $1, you only need 1 set.
    • Compared $first_number instead of $1, and $second_number instead of $2.

    Here is the code:

    use strict;
    use warnings;
    use File::Copy;

    sub files_sort {
        my $first_number;
        my $second_number;
        my $re = qr/(\d+)_\d+_duration_\w+_comptage_\d+.txt$/;
        if ($a =~ $re) {
            $first_number = $1;
        }
        if ($b =~ $re) {
            $second_number = $1;
        }
        if ($first_number < $second_number) { -1 }
        elsif ($first_number > $second_number) { 1 }
        else { 0 }
    }

    my @TXT = glob("*.txt");
    @TXT = sort files_sort @TXT;
    print "$_\n" for @TXT;

    Here is the output:

    % ls -1 *.txt
    A3_output_ZGZ22_03_20061022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_02_20061022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_12_20051022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_12_20061022_duration_200mn_comptage_60.txt
    % 689020.pl
    M3_output_ZGZ22_02_20061022_duration_200mn_comptage_60.txt
    A3_output_ZGZ22_03_20061022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_12_20051022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_12_20061022_duration_200mn_comptage_60.txt

    Is this what you expect?

    Caveat: you must make sure that all the *.txt files in your directory match your regex.
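
One way to act on that caveat is to filter the list before sorting it. The following is only a sketch under assumptions: the file list is hard-coded (in the real script it would come from glob("*.txt")), 'notes.txt' is a made-up non-matching name, and the dot before "txt" is escaped, which is a small change from the pattern used in the thread.

```perl
use strict;
use warnings;

# Pattern from the thread, with the dot before "txt" escaped (my change).
my $re = qr/(\d+)_\d+_duration_\w+_comptage_\d+\.txt$/;

# Hypothetical listing standing in for glob("*.txt").
my @names = (
    'M3_output_ZGZ22_12_20061022_duration_200mn_comptage_60.txt',
    'notes.txt',
    'M3_output_ZGZ22_02_20061022_duration_200mn_comptage_60.txt',
);

# Split the names into those the pattern understands and the rest.
my @matching = grep { $_ =~ $re } @names;
my @skipped  = grep { $_ !~ $re } @names;
warn "skipping $_\n" for @skipped;

# Every remaining name matches, so the captured index is always defined.
my @sorted = sort { ( $a =~ $re )[0] <=> ( $b =~ $re )[0] } @matching;

print "$_\n" for @sorted;
```

Whether stray files should be skipped, reported, or sorted to one end is a design choice the thread leaves open; this sketch skips them and prints a warning.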

      I'd rewrite the sub as:

      sub files_sort {
          my $re = qr/(\d+)_\d+_duration_\w+_comptage_\d+.txt$/;
          my $first_number  = $a =~ $re ? $1 : -1;
          my $second_number = $b =~ $re ? $1 : -1;
          return $first_number <=> $second_number;
      }

      which is slightly more compact and deals better with bad matches by avoiding comparing undefined values. Explicit returns are good too.


      Perl is environmentally friendly - it saves trees
Re: sorting files
by psini (Deacon) on May 29, 2008 at 14:37 UTC

    It could be ok, it depends on what you are expecting it to do...

    As written, you are extracting only the "02" part of the filename in $first_number and the "20061022" part in $second_number. I don't really believe that is what you meant to do :)

    Please explain more clearly what you are trying to do: the example filename has 4 numeric parts, and you should at least decide the order of precedence among them (which one is sorted on first). Secondly, if the format of the string is known (the length of the numeric parts, for instance), there could be more efficient ways than in the generic case.
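
As an illustration of what "order of precedence" means in a sort, here is a sketch that orders by the index first and uses the date only to break ties. The helper name index_and_date, the assumption that the date is always 8 digits, and the (0, 0) fallback for non-matching names are all choices made for this example, not something stated in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (index, date) for a file name,
# or (0, 0) when the name does not have the expected shape.
sub index_and_date {
    my ($name) = @_;
    return $name =~ /_(\d+)_(\d{8})_duration_/ ? ( $1, $2 ) : ( 0, 0 );
}

my @names = (
    'M3_output_ZGZ22_12_20061022_duration_200mn_comptage_60.txt',
    'A3_output_ZGZ22_03_20061022_duration_200mn_comptage_60.txt',
    'M3_output_ZGZ22_12_20051022_duration_200mn_comptage_60.txt',
    'M3_output_ZGZ22_02_20061022_duration_200mn_comptage_60.txt',
);

my @sorted = sort {
    my ( $a_index, $a_date ) = index_and_date($a);
    my ( $b_index, $b_date ) = index_and_date($b);
    $a_index <=> $b_index          # primary key: the index
        || $a_date <=> $b_date;    # tie-breaker: the date
} @names;

print "$_\n" for @sorted;
```

Swapping the two comparisons would make the date the primary key instead. The helper runs twice per comparison here, which is fine for a handful of files but worth precomputing for long lists.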

    Rule One: Do not act incautiously when confronting a little bald wrinkly smiling man.

Re: sorting files
by graff (Chancellor) on May 30, 2008 at 02:22 UTC
    I'm surprised that I seem to be the first one to suggest a Schwartzian Transform:
    @TXT = map { s/\d+ //; $_ }
           sort
           map { s/(.*?)_(\d+)_/sprintf("%04d %s_%s_",$2,$1,$2)/e; $_ }
           glob( "*.txt" );
    (Not fully tested, but updated to add a missing "_" in the replacement string of the s///e operation)
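
The one-liner above prepends the key to each string and strips it again after sorting. The form of the Schwartzian Transform usually shown in tutorials carries the key next to the name in an anonymous array instead. A sketch of that variant follows, assuming a hard-coded list in place of glob("*.txt") and a fallback key of 0 for names that do not match (both are choices made for this example).

```perl
use strict;
use warnings;

# Hypothetical listing standing in for glob("*.txt").
my @names = (
    'M3_output_ZGZ22_12_20061022_duration_200mn_comptage_60.txt',
    'M3_output_ZGZ22_02_20061022_duration_200mn_comptage_60.txt',
    'A3_output_ZGZ22_03_20061022_duration_200mn_comptage_60.txt',
);

# Read the pipeline from the bottom up.
my @sorted =
    map  { $_->[1] }                                     # 3. keep only the name
    sort { $a->[0] <=> $b->[0] }                         # 2. compare the stored keys
    map  { [ /_(\d+)_\d+_duration_/ ? $1 : 0, $_ ] }     # 1. pair each name with its key
    @names;

print "$_\n" for @sorted;
```

The pattern runs once per file name rather than once per comparison, which is the point of the technique.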
Re: sorting files
by poolpi (Hermit) on May 30, 2008 at 13:06 UTC


    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;
    use Sort::Key::Multi qw(iikeysort);

    chomp( my @s = <DATA> );
    my @is = iikeysort { /\A .+ _ (\d+) _ (\d+) .+ txt \z/msx; } @s;
    print Dumper \@is;

    __DATA__
    M3_output_ZGZ22_02_20061022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_078902_20061022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_11_20061022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_1_20061022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_078902_20051022_duration_200mn_comptage_60.txt
    M3_output_ZGZ22_128_20061022_duration_200mn_comptage_60.txt

    Output:

    $VAR1 = [
              'M3_output_ZGZ22_1_20061022_duration_200mn_comptage_60.txt',
              'M3_output_ZGZ22_02_20061022_duration_200mn_comptage_60.txt',
              'M3_output_ZGZ22_11_20061022_duration_200mn_comptage_60.txt',
              'M3_output_ZGZ22_128_20061022_duration_200mn_comptage_60.txt',
              'M3_output_ZGZ22_078902_20051022_duration_200mn_comptage_60.txt',
              'M3_output_ZGZ22_078902_20061022_duration_200mn_comptage_60.txt'
            ];

    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb