Tips for elegance

hasimir44 has asked for the wisdom of the Perl Monks concerning the following question:

This little snippet works fine to grab a (compressed) file name and it's time stamp, but I'm sure that the regexp part could be written better. Also, does it have to be in an "if" statement?

chomp($ll[0] = `ls -lto $path/*B1006* |head -n 1`);
chomp($ll[1] = `ls -lto $path/*B1106-* |head -n 1`);
chomp($ll[2] = `ls -lto $path/*B11062* |head -n 1`);

$next = 0;
foreach (@ll) {
    if (m#([A-Z][a-z][a-z]\s[0-9][0-9]\s[0-9][0-9]:[0-9][0-9])\s(/.+\.
+Z)#) {
        $stamp[$next] = $1;
        $name[$next] = $2;
    }
    print "\$name[\$next]: $name[$next]\n";
    print "\$stamp[\$next]: $stamp[$next]\n";
    $next++;
}
[download]

The Output looks like this:

$name[$next]: /<path>/B1006-cnvexp.20060605.200841.Z
$stamp[$next]: Jun 05 20:08
$name[$next]: /<path>/B1106-cnvexp.20060605.2008.Z
$stamp[$next]: Jun 05 20:09
$name[$next]: /<path>/B11062-cnvexp.20060605.2008.Z
$stamp[$next]: Jun 05 20:09
[download]

I am the young grasshopper. In one hour, I will be off work and practicing my drunken style :) Cheers.

2006-06-06 Retitled by planetscape, as per Monastery guidelines

( keep:0 edit:11 reap:0 )

Original title: 'Tips for elegence'

Comment on Tips for elegance Select or Download Code

Replies are listed 'Best First'.
Re: Tips for elegance by Zaxo (Archbishop) on Jun 06, 2006 at 05:06 UTC
Shelling out for file lists is pretty inelegant. Built-in glob can handle that, `my @ll = map { (glob "$path/$_")[0] } qw/B1006 B1106- B11062/;` [download] You can dispense with the index $next by pushing elements onto the arrays. Your regex can be dressed up a little with quantifiers and built-in character classes. `my (@stamp, @name); foreach (@ll) { if (m#([A-Z][a-z]{2}\s\d{2}\s\d{2}:\d{2})\s(/.+\.+Z)$#) { push @stamp, $1; push @name, $2; } }` [download] The if is necessary. Without it, you don't know that $1 and $2 are fresh. I added an end-anchoring $ to ensure .Z but not .Zoo files. I left off printing until they can happen in one discrete block. `for (0..#$stamp) { printf "\$name[$_]:\t%s\n\$stamp[$_]:\t%s\n\n", $name[$_], $stamp[$_]; }` [download] Update: Woops, japhy++ spotted a feature I completely overlooked. That simplifies everything, and makes the regex completely unnecessary. With the above @ll is @name. New version, minus printing: `my @name = map { (glob "$path/$_")[0] } qw/B1006.Z B1106-.Z B11062.Z/; my @stamp = map { scalar localtime( (stat)[9]) } @name;` [download] That retains $path in the names, but those can be trimmed if you like using File::Basename. After Compline, Zaxo	[reply] [d/l] [select]
Re: Tips for elegance by GrandFather (Saint) on Jun 06, 2006 at 05:56 UTC
There area number of things I'd alter here. For a start using `\d` rather than the more cumbersome character class `[0-9]` helps as does using the quantifier `{2}` in the regex. Personally I prefer to reserve # for comments where possible so I'd choose something else for the regex delimiters. Rather than using two arrays which you have to keep in sync, I'd use one array of hashes. That avoids having to use indexes everywhere to access the arrays. Assigning the capture list directly to variables is often cleaner than using the special capture variables (`my ($stamp, $name) = m<...>;`), although in this case just using the capture variables directly is clean. `use strict; use warnings; my @ll = <DATA>; my @items; chomp @ll; foreach (@ll) { next if ! m<([A-Z][a-z]{2}\s\d{2}\s\d{2}:\d{2})\s(/.+\.Z)>; push @items, {stamp => $1, name => $2}; } print "$_->{name}: $_->{stamp}\n" for @items; __DATA__ Jun 05 20:08 /<path>/B1006-cnvexp.20060605.200841.Z Jun 05 20:09 /<path>/B1106-cnvexp.20060605.2008.Z Jun 05 20:09 /<path>/B11062-cnvexp.20060605.2008.Z` [download] Prints: `/<path>/B1006-cnvexp.20060605.200841.Z: Jun 05 20:08 /<path>/B1106-cnvexp.20060605.2008.Z: Jun 05 20:09 /<path>/B11062-cnvexp.20060605.2008.Z: Jun 05 20:09` [download] DWIM is Perl's answer to Gödel	[reply] [d/l] [select]
Re: Tips for elegance by japhy (Canon) on Jun 06, 2006 at 12:08 UTC
Is no one paying attention to the fact that the OP is using `ls -lto ...` to get the modification time of the files? I strongly suggest you use stat() or the -M filetest for this instead. It's much cleaner, faster, and system-independent, and there's no hairy regex necessary for parsing out the date. And by the way, my `ls -lto ...` doesn't zero-pad the day, so your regex would fail on my system. `my @months = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec ); my @dates; for my $f (@files) { my $mod_time = (stat $f)[9]; my ($min, $hour, $day, $mon) = (localtime $mod_time)[1..4]; print "$f modified on $months[$mon] $day, $hour:$min\n"; # or if you want zero-padding... printf "%s modified on %3s %02d, %02d:%02d\n", $f, $months[$mon], $d +ay, $hour, $min; }` [download] Please see stat for more details. Jeff `japhy` Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and `perl` hacker How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart	[reply] [d/l] [select]
Re: Tips for elegance by Samy_rio (Vicar) on Jun 06, 2006 at 05:05 UTC
Hi hasimir44, Try this, use strict; use warnings; my $next = 0; my (@stamp, @name); while (<DATA>){ chomp; ($stamp[$next], $name[$next]) = $_ =~ m/([a-z]{3}(?:\s[0-9]{2}){2} +:[0-9]{2})\s(\/.+\.Z)/i; print "\$name[\$next]: $name[$next]\n"; print "\$stamp[\$next]: $stamp[$next]\n"; $next++; } __DATA__ Jun 05 20:08 /test/B1006-cnvexp.20060605.200841.Z Jun 05 20:09 /test/B1106-cnvexp.20060605.2008.Z Jun 05 20:09 /test/B11062-cnvexp.20060605.2008.Z Output is: $name[$next]: /test/B1006-cnvexp.20060605.200841.Z $stamp[$next]: Jun 05 20:08 $name[$next]: /test/B1106-cnvexp.20060605.2008.Z $stamp[$next]: Jun 05 20:09 $name[$next]: /test/B11062-cnvexp.20060605.2008.Z $stamp[$next]: Jun 05 20:09 [download] Regards, Velusamy R. eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@\|6%,53!-9@2~j';	[reply] [d/l] [select]
Re: Tips for elegance by polettix (Vicar) on Jun 06, 2006 at 11:05 UTC
Others already suggested some cleanup for your regex, but I'd suggest to comment the regex a bit using the `x` modifier (starting from GrandFather's version): `m< ( # Start of capture #1 [A-Z][a-z]{2} # Month name \s \d{2} # Day \s \d{2}:\d{2} # Hour:Minutes ) # End of capture #1 \s (/.+\.Z) # Capture #2: File name >x; # End of regex, note 'x' modifier` [download] Of course the actual indentation and commenting style is at your will. Flavio perl -ple'$_=reverse' <<<ti.xittelop@oivalf Don't fool yourself.	[reply] [d/l] [select]


No such thing as a small change
	PerlMonks