in reply to Parsing a text file

nimajneb, try something like this:

### Always use strict =)
use strict;

### Set up a hash table ("associative array") to associate
### numbers with names
my %totals;

### Open the file
open my $fIN, "<msgcount.txt";

### Read through the file line by line. Inside the loop, the
### special variable $_ will refer to the "current line" and
### this loop will move to the next line in turn with each
### loop iteration. It will quit when it runs out of file to read.
###
### I'll define some regexp patterns here, but you could do it
### all at once, of course. I'm only splitting it here to make
### the code in the loop more readable.
my $date  = '\d\d\/\d\d\/\d\d\d\d'; ### Matches MM/DD/YYYY
my $name  = '\w+';                  ### Matches one or more of a-z, A-Z, 0-9, or _
my $count = '\d+';                  ### Matches one or more digits.

while( <$fIN> ) {
    ### We only care about this line if it's our special format. This
    ### will ignore lines that don't have valid data, such as
    ### trailing blank lines in the file or something.
    ### matches 01/03/2008 yasmin 67
    if( $_ =~ /($date) ($name) ($count)/ ) {
        ### The parentheses above captured the data in this line (if
        ### applicable). Now we can access the first, second, and third
        ### matches:
        my $this_date  = $1; ### The first () field
        my $this_name  = $2; ### The second () field
        my $this_count = $3; ### The third () field

        ### Do one thing if the user already has data, and something
        ### else if not.
        if( not exists $totals{$this_name} ) {
            ### If this person isn't already in the hash, add her with
            ### this data.
            $totals{$this_name} = $this_count;
        }
        else {
            ### This person is already in the hash, so just add to the
            ### existing totals count.
            $totals{$this_name} += $this_count;
        }
    }
}

### Don't forget to close your file.
close $fIN;

### That's it! Now you've got a hash full of your data (guaranteed
### not to have duplicates =) Now you can access someone's
### data directly:
print 'yasmin has ' . $totals{'yasmin'} . ' posts.';

### or with a loop
foreach my $username (keys(%totals)) {
    print "User $username has " . $totals{$username} . " posts.\n";
}

That could all be done in a much more compact manner, but that makes it harder to learn at first, of course =)
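For the curious, here is a rough sketch of what a more compact version might look like. The sub name tally_posts and every sample line except the yasmin one are made up for illustration, and it takes lines from a list rather than a file so it runs on its own:

```perl
use strict;
use warnings;

# Hypothetical helper: sum the per-line counts for each name.
# Lines that do not look like "MM/DD/YYYY name count" are skipped.
sub tally_posts {
    my @lines = @_;
    my %totals;
    for my $line (@lines) {
        # += creates the hash entry on first use, so no exists() test is needed
        $totals{$2} += $3 if $line =~ m{(\d\d/\d\d/\d{4}) (\w+) (\d+)};
    }
    return %totals;
}

# Made-up sample data in the same format as msgcount.txt
my %totals = tally_posts(
    "01/03/2008 yasmin 67\n",
    "\n",
    "01/04/2008 yasmin 3\n",
    "01/04/2008 bob 12\n",
);

# Prints:
#   User bob has 12 posts.
#   User yasmin has 70 posts.
print "User $_ has $totals{$_} posts.\n" for sort keys %totals;
```

The whole if/else collapses into a single += because adding to a hash entry that doesn't exist yet simply starts it from zero.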

For information about using hashes in Perl, do a Google search for "perl hash tutorial".

For information about capturing data as I did above (using "regular expressions"), check out the Perl Regular Expressions documentation page.
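As a small taste of capturing, here is a sketch that pulls the three fields out of one line. The sub name parse_line is made up, and unlike the code above it anchors the pattern with ^ and $ so the whole line has to fit the format:

```perl
use strict;
use warnings;

# Hypothetical helper: return (date, name, count) for a line in
# "MM/DD/YYYY name count" format, or an empty list if it does not match.
sub parse_line {
    my ($line) = @_;
    # In list context a successful match returns the captured groups,
    # so they can be named directly instead of read from $1, $2, $3.
    return $line =~ m{^(\d\d/\d\d/\d{4}) (\w+) (\d+)$};
}

if ( my ($date, $name, $count) = parse_line("01/03/2008 yasmin 67") ) {
    print "date=$date name=$name count=$count\n";
}
else {
    print "no match\n";
}
```

Each pair of parentheses in the pattern becomes one element of the returned list, in left-to-right order.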

Hope it helps!

Re^2: Parsing a text file
by johngg (Canon) on Apr 16, 2008 at 17:45 UTC
    That's good advice, "Always use strict." Unfortunately, you've fallen almost at the first hurdle by forgetting to use my when opening your lexical filehandle. It is also recommended practice to test for the success of the open statement (and close as well) and to use its three-argument form.

    open my $fIN, q{<}, q{msgcount.txt} or die qq{open: msgcount.txt: $!\n};

    Failing to test for success can lead to "readline() on closed filehandle $fIN at myscript.pl line nnn" errors if, say, you mis-type the path or the file has been deleted or ...
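
    Putting both checks together might look something like the sketch below. The sub name count_lines is made up for illustration, and the eval is only there so the example reports the failure rather than stopping when msgcount.txt is missing:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: count the lines in a file, dying with a
    # useful message if either open or close fails.
    sub count_lines {
        my ($path) = @_;

        # Three-argument open keeps the mode separate from the file name.
        open my $fh, '<', $path
            or die "open: $path: $!\n";

        my $n = 0;
        $n++ while <$fh>;

        # close can fail too, more often for writes and pipes than for reads.
        close $fh
            or die "close: $path: $!\n";

        return $n;
    }

    # A missing file is reported instead of silently reading nothing.
    my $n = eval { count_lines('msgcount.txt') };
    print defined $n ? "msgcount.txt has $n lines\n" : "failed: $@";
    ```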

    I hope this is of interest.

    Cheers,

    JohnGG