Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Tips for elegance

by hasimir44 (Sexton)
on Jun 06, 2006 at 04:31 UTC ( [id://553741]=perlquestion: print w/replies, xml ) Need Help??

hasimir44 has asked for the wisdom of the Perl Monks concerning the following question:

This little snippet works fine to grab a (compressed) file name and it's time stamp, but I'm sure that the regexp part could be written better. Also, does it have to be in an "if" statement?
chomp($ll[0] = `ls -lto $path/*B1006* |head -n 1`); chomp($ll[1] = `ls -lto $path/*B1106-* |head -n 1`); chomp($ll[2] = `ls -lto $path/*B11062* |head -n 1`); $next = 0; foreach (@ll) { if (m#([A-Z][a-z][a-z]\s[0-9][0-9]\s[0-9][0-9]:[0-9][0-9])\s(/.+\. +Z)#) { $stamp[$next] = $1; $name[$next] = $2; } print "\$name[\$next]: $name[$next]\n"; print "\$stamp[\$next]: $stamp[$next]\n"; $next++; }
The Output looks like this:
$name[$next]: /<path>/B1006-cnvexp.20060605.200841.Z $stamp[$next]: Jun 05 20:08 $name[$next]: /<path>/B1106-cnvexp.20060605.2008.Z $stamp[$next]: Jun 05 20:09 $name[$next]: /<path>/B11062-cnvexp.20060605.2008.Z $stamp[$next]: Jun 05 20:09
I am the young grasshopper. In one hour, I will be off work and practicing my drunken style :) Cheers.

2006-06-06 Retitled by planetscape, as per Monastery guidelines

( keep:0 edit:11 reap:0 )

Original title: 'Tips for elegence'

Replies are listed 'Best First'.
Re: Tips for elegance
by Zaxo (Archbishop) on Jun 06, 2006 at 05:06 UTC

    Shelling out for file lists is pretty inelegant. Built-in glob can handle that,

    my @ll = map { (glob "$path/$_")[0] } qw/*B1006* *B1106-* *B11062*/;

    You can dispense with the index $next by pushing elements onto the arrays. Your regex can be dressed up a little with quantifiers and built-in character classes.

    my (@stamp, @name); foreach (@ll) { if (m#([A-Z][a-z]{2}\s\d{2}\s\d{2}:\d{2})\s(/.+\.+Z)$#) { push @stamp, $1; push @name, $2; } }
    The if is necessary. Without it, you don't know that $1 and $2 are fresh. I added an end-anchoring $ to ensure .Z but not .Zoo files.

    I left off printing until they can happen in one discrete block.

    for (0..#$stamp) { printf "\$name[$_]:\t%s\n\$stamp[$_]:\t%s\n\n", $name[$_], $stamp[$_]; }

    Update: Woops, japhy++ spotted a feature I completely overlooked. That simplifies everything, and makes the regex completely unnecessary. With the above @ll is @name. New version, minus printing:

    my @name = map { (glob "$path/$_")[0] } qw/*B1006*.Z *B1106-*.Z *B11062*.Z/; my @stamp = map { scalar localtime( (stat)[9]) } @name;
    That retains $path in the names, but those can be trimmed if you like using File::Basename.

    After Compline,
    Zaxo

Re: Tips for elegance
by GrandFather (Saint) on Jun 06, 2006 at 05:56 UTC

    There area number of things I'd alter here. For a start using \d rather than the more cumbersome character class [0-9] helps as does using the quantifier {2} in the regex. Personally I prefer to reserve # for comments where possible so I'd choose something else for the regex delimiters.

    Rather than using two arrays which you have to keep in sync, I'd use one array of hashes. That avoids having to use indexes everywhere to access the arrays.

    Assigning the capture list directly to variables is often cleaner than using the special capture variables (my ($stamp, $name) = m<...>;), although in this case just using the capture variables directly is clean.

    use strict; use warnings; my @ll = <DATA>; my @items; chomp @ll; foreach (@ll) { next if ! m<([A-Z][a-z]{2}\s\d{2}\s\d{2}:\d{2})\s(/.+\.Z)>; push @items, {stamp => $1, name => $2}; } print "$_->{name}: $_->{stamp}\n" for @items; __DATA__ Jun 05 20:08 /<path>/B1006-cnvexp.20060605.200841.Z Jun 05 20:09 /<path>/B1106-cnvexp.20060605.2008.Z Jun 05 20:09 /<path>/B11062-cnvexp.20060605.2008.Z

    Prints:

    /<path>/B1006-cnvexp.20060605.200841.Z: Jun 05 20:08 /<path>/B1106-cnvexp.20060605.2008.Z: Jun 05 20:09 /<path>/B11062-cnvexp.20060605.2008.Z: Jun 05 20:09

    DWIM is Perl's answer to Gödel
Re: Tips for elegance
by japhy (Canon) on Jun 06, 2006 at 12:08 UTC
    Is no one paying attention to the fact that the OP is using `ls -lto ...` to get the modification time of the files? I strongly suggest you use stat() or the -M filetest for this instead. It's much cleaner, faster, and system-independent, and there's no hairy regex necessary for parsing out the date. And by the way, my `ls -lto ...` doesn't zero-pad the day, so your regex would fail on my system.
    my @months = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec ); my @dates; for my $f (@files) { my $mod_time = (stat $f)[9]; my ($min, $hour, $day, $mon) = (localtime $mod_time)[1..4]; print "$f modified on $months[$mon] $day, $hour:$min\n"; # or if you want zero-padding... printf "%s modified on %3s %02d, %02d:%02d\n", $f, $months[$mon], $d +ay, $hour, $min; }
    Please see stat for more details.

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: Tips for elegance
by Samy_rio (Vicar) on Jun 06, 2006 at 05:05 UTC

    Hi hasimir44, Try this,

    use strict; use warnings; my $next = 0; my (@stamp, @name); while (<DATA>){ chomp; ($stamp[$next], $name[$next]) = $_ =~ m/([a-z]{3}(?:\s[0-9]{2}){2} +:[0-9]{2})\s(\/.+\.Z)/i; print "\$name[\$next]: $name[$next]\n"; print "\$stamp[\$next]: $stamp[$next]\n"; $next++; } __DATA__ Jun 05 20:08 /test/B1006-cnvexp.20060605.200841.Z Jun 05 20:09 /test/B1106-cnvexp.20060605.2008.Z Jun 05 20:09 /test/B11062-cnvexp.20060605.2008.Z Output is: $name[$next]: /test/B1006-cnvexp.20060605.200841.Z $stamp[$next]: Jun 05 20:08 $name[$next]: /test/B1106-cnvexp.20060605.2008.Z $stamp[$next]: Jun 05 20:09 $name[$next]: /test/B11062-cnvexp.20060605.2008.Z $stamp[$next]: Jun 05 20:09

    Regards,
    Velusamy R.


    eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

Re: Tips for elegance
by polettix (Vicar) on Jun 06, 2006 at 11:05 UTC
    Others already suggested some cleanup for your regex, but I'd suggest to comment the regex a bit using the x modifier (starting from GrandFather's version):
    m< ( # Start of capture #1 [A-Z][a-z]{2} # Month name \s \d{2} # Day \s \d{2}:\d{2} # Hour:Minutes ) # End of capture #1 \s (/.+\.Z) # Capture #2: File name >x; # End of regex, note 'x' modifier
    Of course the actual indentation and commenting style is at your will.

    Flavio
    perl -ple'$_=reverse' <<<ti.xittelop@oivalf

    Don't fool yourself.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://553741]
Approved by ikegami
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others imbibing at the Monastery: (6)
As of 2024-04-20 02:34 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found