My solution

Hello, brethren!
This is what i did after your precious advices.
Now my mailman program correctly parses that messy html and extracts the right fields to use 'em in renaming the file itself according to name and date.
You should then check another post o'mine named "Epiphany" that I'll post in the "Meditations" section, as it's related to this script developing process...something i'm pretty silly!
It also checks for already existing filenames and adds a roman number to distinguish between them.
And, *WOW*, it's strict compliant! ;-)

Thanks to everyone, ++ you all!

Here's my code:


#!/usr/bin/perl -w
#
# mailman
#

$\="\n";

use strict;
use HTML::TokeParser;
use Date::Manip;



$ARGV[0] || die "\n\tUsage: rp FILENAME\n\n";
my $p = HTML::TokeParser->new("$ARGV[0]") || die "\nthe file $ARGV[0] 
+doesn't exist.\n\n";
[download]

########################################
#
# insert here the list of known names
#
########################################

my %nomi=(
    'elenaleonardi' => 'ele',
    'lellobove' => 'cte',
    'dark.prg' => 'mazzini',
    'robertopar' => 'parolisi',
    'meloinfo' => 'melo',
    'massimogab' => 'max',
    'simofiore' => 'simone',
    'mpagnucci'=> 'pagnucci',
    'zeromega'=> 'pagnucci',
    'melojunior'=> 'melo',
    'mmelillo'=> 'melo',
    'marcomelillo'=> 'melo',
    'matteobagn'=> 'matteo',
    'uncoucou' => 'lorena',
    'zappagalattica' => 'zappa',
    'gavrilus' => 'zappa',
    'carloalbertodue' => 'carlone',
    'bugman996' => 'bug',
    'michel' => 'ziobudda',
    'Bagnoli' => 'matteo',
    'salciaiola' => 'stegualerci',
      );
      
########################################
#
# hash of main data types
#
########################################
      
my %hash;
while (my $t = $p->get_token){
next    unless     $t->[0] eq 'S'
    and        $t->[1] eq 'th'
    and        defined $t->[2]{'class'}
    and        $t->[2]{'class'} eq 'DatTh';
my $key = $p->get_text('/th');
chop $key;
$p->get_tag('td');
my $value = $p->get_text('/td');
$value =~ s/^\s*//;
$value =~ s/\s*\[Add\]\s*$//;
$value=~s/[AP]M.*$//;
$hash{$key} = $value;
}

my $name=$hash{From};

my $date=$hash{Date};

&Date_Init("TZ=GMT","DateFormat=UK");
$date=UnixDate($date,"%y%m%d");
$name=~s/\@.*//;
$name=~s/.*\"//;
$name=~s/.*\<//;
if (exists $nomi{$name})
    {
    $name=$nomi{$name};
    }
$name=$name.$date.".html";
###########################################
#
# find contemporaries and number'em 
# using roman numbers upto V. more than V
# messages a day from the same folk are
#          *EVIL* !!! :-P
#
###########################################

if (-e $name)
    {
    $name=~s/.html/II.html/;
    }
if (-e $name)
    {
    $name=~s/II.html/III.html/;
    }
if (-e $name)
    {
    $name=~s/III.html/IV.html/;
    }
if (-e $name)
    {
    $name=~s/IV.html/V.html/;
    }
rename $ARGV[0], $name if (! -e $name) || die "\nToo many! Please rena
+me manually.\n\n";
print "\nRenamed as: ";
print "$name\n\n";
[download]

Comment on My solution Select or Download Code

Replies are listed 'Best First'.
Re: My solution by Mr_Jon (Monk) on Aug 23, 2004 at 17:22 UTC
A nice way of adding incremental roman numerals, as you do at the end of your script, would be to use the Math::Roman module (handily mentioned in the Perl Cookbook). This would impose no arbitrary limit on the messages you receive from the same person - it even goes above the 'highest' Roman numeral of 5000. `#!/usr/bin/perl -w use strict; use Math::Roman qw(roman); my $roman = new Math::Roman; $\ = "\n"; while (<DATA>) { chomp; my $old_name = $_; if ($old_name =~ /^(.+\d{6})(\w+)?(\.htm)$/) { my $old_roman = $2 \|\| 1; my $new_roman = roman("$old_roman") + 1; my $new_name = $1 . $new_roman . $3; print "$old_name => $new_name"; } } __DATA__ filename230804.htm filename230804IV.htm filename230804V.htm filename230804II.htm filename230804X.htm` [download] Output: `filename230804.htm => filename230804II.htm filename230804IV.htm => filename230804V.htm filename230804V.htm => filename230804VI.htm filename230804II.htm => filename230804III.htm filename230804X.htm => filename230804XI.htm` [download] Of course, you could simply append 'normal' numbers to achieve the same result, but that wouldn't be half as much fun...	[reply] [d/l] [select]

Replies are listed 'Best First'.

Re: My solution
by Mr_Jon (Monk) on Aug 23, 2004 at 17:22 UTC

Math::Roman

Perl Cookbook

#!/usr/bin/perl -w
use strict;

use Math::Roman qw(roman);
my $roman = new Math::Roman;

$\ = "\n";

while (<DATA>) {
    chomp;
    my $old_name = $_;

    if ($old_name =~ /^(.+\d{6})(\w+)?(\.htm)$/) {
        my $old_roman = $2 || 1; 
        my $new_roman = roman("$old_roman") + 1;
        my $new_name = $1 . $new_roman . $3;
        print "$old_name => $new_name";
    }
}

__DATA__
filename230804.htm
filename230804IV.htm
filename230804V.htm
filename230804II.htm
filename230804X.htm
[download]

filename230804.htm => filename230804II.htm
filename230804IV.htm => filename230804V.htm
filename230804V.htm => filename230804VI.htm
filename230804II.htm => filename230804III.htm
filename230804X.htm => filename230804XI.htm
[download]

[reply]
[d/l]
[select]