Davo1977 has asked for the wisdom of the Perl Monks concerning the following question:
Analysing text files to obtain statistics on their content

You are to write a Perl program that analyses text files to obtain statistics on their content. The program should operate as follows:

1) When run, the program should check if an argument has been provided. If not, the program should prompt for, and accept input of, a filename from the keyboard.

2) The filename, either passed as an argument or input from the keyboard, should be checked to ensure it is in MS-DOS format. The filename part should be no longer than 8 characters and must begin with a letter or underscore character followed by up to 7 letters, digits or underscore characters. The file extension should be optional, but if given it should be ".TXT" (upper- or lowercase).
If no extension is given, ".TXT" should be added to the end of the filename. So, for example, if "testfile" is input as the filename, this should become "testfile.TXT". If "input.txt" is entered, this should remain unchanged.

3) If the filename provided is not of the correct format, the program should display a suitable error message and end at this point.

4) The program should then check to see if the file exists using the filename provided. If the file does not exist, a suitable error message should be displayed and the program should end at this point.

5) Next, if the file exists but the file is empty, again a suitable error message should be displayed and the program should end.

6) The file should be read and checked to display crude statistics on the number of characters, words, lines, sentences and paragraphs that are within the file.

I am very new to Perl and have managed to put this code together using examples from various books. Could anyone look over this code and suggest how it could be improved?

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $filename;
    if (@ARGV == 0) {    # no filename provided as a command line argument
        print "Please enter a filename: ";
        $filename = <STDIN>;
        die "No filename entered\n" unless defined $filename;
        chomp $filename;
    }
    else {               # got a filename as an argument
        $filename = $ARGV[0];
    }

    # perform the specified checks
    # check if filename is valid, exit if not:
    # a letter or underscore, up to 7 letters/digits/underscores, optional ".txt"
    if ($filename !~ m/\A[a-z_][a-z0-9_]{0,7}(?:\.txt)?\z/i) {
        die "File format not valid\n";
    }

    # add the default extension if none was given
    if ($filename !~ m/\.txt\z/i) {
        $filename .= ".TXT";
    }

    # check if the file exists, exit if it does not
    if (!-e $filename) {
        die "File does not exist\n";
    }

    # check if the file is empty, exit if it is
    if (-z $filename) {
        die "File is empty\n";
    }

    my $i     = 0;
    my $p     = 1;
    my $words = 0;
    my $chars = 0;

    open(my $fh, '<', $filename) or die "Can't open file '$filename': $!\n";

    # then use a while loop and a series of statements similar to the following
    while (<$fh>) {
        chomp;                # removes the input record separator
        $i = $.;              # "$." is the input line number; $i++ would also work
        $p++ if m/^$/;        # count paragraphs (crude: every empty line starts a new one)
        my @t = split ' ';    # split the line into "words"
        $words += @t;         # add count to $words
        $chars += tr/ //c;    # count all characters except spaces and add to $chars
    }
    close($fh);

    # display results
    print "There are $i lines in $filename\n";
    print "There are $p paragraphs in $filename\n";
    print "There are $words words in $filename\n";
    print "There are $chars characters in $filename\n";
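
Steps 2 and 6 of the assignment (the filename rule, and the sentence and paragraph counts) can be sketched as two small subroutines. This is only a minimal sketch under stated assumptions, not a definitive solution: the names `normalise_filename` and `text_stats` are hypothetical, a "sentence" is assumed to end at a run of `.`, `!` or `?` followed by whitespace or the end of the text, a "paragraph" is assumed to be a block of text separated from the next by at least one blank line, and "characters" are assumed to mean non-whitespace characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper for step 2: returns the filename, with ".TXT" appended
# when no extension was given, or undef if the name breaks the rule.
sub normalise_filename {
    my ($name) = @_;
    return undef unless defined $name;
    # A letter or underscore, then up to 7 letters, digits or underscores,
    # then an optional ".txt" in any letter case.
    return undef
        unless $name =~ /\A[A-Za-z_][A-Za-z0-9_]{0,7}(\.txt)?\z/i;
    return defined $1 ? $name : "$name.TXT";
}

# Hypothetical helper for step 6: crude statistics for a string of text.
# The definitions of "sentence" and "paragraph" are assumptions (see above).
sub text_stats {
    my ($text) = @_;
    my %count;

    # Lines: count newlines, plus one if the last line has no newline.
    $count{lines} = () = $text =~ /\n/g;
    $count{lines}++ if length($text) && $text !~ /\n\z/;

    # Words: whitespace-separated tokens (split ' ' skips leading whitespace).
    my @words = split ' ', $text;
    $count{words} = scalar @words;

    # Characters: everything except spaces, tabs and line endings.
    $count{chars} = ($text =~ tr/ \t\r\n//c);

    # Sentences: a run of . ! or ? followed by whitespace or end of text.
    $count{sentences} = () = $text =~ /[.!?]+(?=\s|\z)/g;

    # Paragraphs: non-empty blocks separated by one or more blank lines.
    $count{paragraphs} = grep { /\S/ } split /\n\s*\n/, $text;

    return \%count;
}

# Small demonstration on an inline string rather than a real file.
my $stats = text_stats("One line. Another one!\n\nA second paragraph.\n");
print "$_: $stats->{$_}\n" for sort keys %$stats;
```

In a full program the file contents would be read into a string first (for example by setting `$/` to `undef` locally before reading the filehandle) and then passed to `text_stats`. Counting on the whole text rather than line by line makes it easier to handle sentences that span lines and runs of several blank lines between paragraphs.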

Replies are listed 'Best First'.

Re: Analysing text files to obtain statistics on their content.
by kyle (Abbot) on Jun 25, 2008 at 17:10 UTC

Re: Analysing text files to obtain statistics on their content.
by toolic (Bishop) on Jun 25, 2008 at 17:09 UTC

Re: Analysing text files to obtain statistics on their content.
by moritz (Cardinal) on Jun 25, 2008 at 17:14 UTC

Re: Analysing text files to obtain statistics on their content.
by johngg (Canon) on Jun 25, 2008 at 19:43 UTC
by Gavin (Archbishop) on Jun 26, 2008 at 09:32 UTC