
Thank you, this gets me closer to what I need!! I guess I will need two more things... First, how do I print the table to a file? I can open a file handle, but I don't know how to print the entire table to a file; so far I have only figured out how to print lines... (Perl baby steps!) Second, can I modify this to count the shared words between stories, too? Thanks!

Re^3: Add colums and rows
by apl (Monsignor) on Apr 22, 2008 at 12:54 UTC
    I can open a file handle but I don't know how to print the entire table to a file

    Let's assume you opened filehandle REPORT. The following code from Grandfather

    for my $word (sort keys %freqs) {
        $freqs{$word}{$_} ||= 0 for keys %{$freqs{all}};
        printf "$word\t";
        print join "\t", join "\t", map $freqs{$word}{$_}, sort keys %{$freqs{$word}};
        print "\n";
    }

    could be changed to:

    for my $word (sort keys %freqs) {
        $freqs{$word}{$_} ||= 0 for keys %{$freqs{all}};
        printf REPORT "$word\t";
        print REPORT join "\t", join "\t", map $freqs{$word}{$_}, sort keys %{$freqs{$word}};
        print REPORT "\n";
    }

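
A minimal, self-contained sketch of the same idea, using a lexical filehandle and the three-argument form of open. The file name report.txt and the sample %freqs data are invented for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data in the same shape as %freqs in the thread:
# $freqs{word}{column} = count, with 'all' holding the column totals.
my %freqs = (
    all => { Story1 => 3, Story2 => 2, total => 5 },
    Cat => { Story1 => 2, Story2 => 1, total => 3 },
    Dog => { Story1 => 1, Story2 => 1, total => 2 },
);

my $file = 'report.txt';    # assumed output name
open my $report, '>', $file or die "Cannot open $file: $!";

my @columns = sort keys %{ $freqs{all} };

# Header line first, then one tab-separated line per word
print {$report} join("\t", '', @columns), "\n";
for my $word (sort keys %freqs) {
    print {$report} join("\t", $word, map { $freqs{$word}{$_} || 0 } @columns), "\n";
}

close $report or die "Cannot close $file: $!";
```

Printing to a file works just like printing to the screen: the handle goes between print and the list, with no comma after it.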
    The next thing, can I modify this to count the shared words between stories, too?
    Certainly. Will you be reading several stories from the same DATA block? Then you don't need to change a thing.

    Will you change the program to read from several different input files? How do you want to give those filenames to your program?

      That was easy!! I will read from one big file, just like Grandfather's data sample. One big file that will look like:
      [Story 1]
      blblblblbalblablablabal
      blblblblbalblablablabal
      [Story 2]
      blblblblbalblablablabal
      blblblblbalblablablabal
      [Story 3]
      blblblblbalblablablabal
      blblblblbalblablablabal
      [Story 4]
      blblblblbalblablablabal
      blblblblbalblablablabal
      etc...
Re^3: Add colums and rows
by lechateau (Initiate) on Apr 23, 2008 at 15:11 UTC
    Thanks to everyone who helped me get this done. I was finally able to do the shared word count on my own! Thanks again for all your time and help! And apl, "trivial" is a very relative term!!! I spent 20 hours getting the counting piece done. :)
    $filename = "tryit.txt";
    open(IN, $filename) || die;
    open(OUT, ">test1.csv") || die;
    open(OUT1, ">test2.csv") || die;

    my %freqs;
    my $story;    # Current story name

    while (<IN>) {
        if (/^\<(.*)\>\s*$/) {
            $story = ucfirst $1;
            die "Duplicate story title: $story" if exists $freqs{$story};
            next;
        }

        next unless defined $story;    # wait until we have a story title

        s/[\.,:;\?"!\(\)\[\]\{\}(--)_]//g;
        for my $word (/\w+/g) {
            # Current story counts
            $word = ucfirst $word;
            $freqs{$word}{$story}++;
            $freqs{all}{$story}++;
            # Total counts
            $freqs{$word}{total}++;
            $freqs{all}{total}++;
        }
    }

    # Print title line
    print OUT "\t", (join "\t", sort keys %{$freqs{all}}), "\n";

    # Print table
    for my $word (sort keys %freqs) {
        $freqs{$word}{$_} ||= 0 for keys %{$freqs{all}};
        printf OUT "$word\t";
        print OUT join "\t", join "\t", map $freqs{$word}{$_}, sort keys %{$freqs{$word}};
        print OUT "\n";
    }

    @info=sort keys %{$freqs{all}};
    my @countop;
    for($i=0;$i<scalar(@info)-1;$i=$i+1) {
        $story1=$info[$i];
        for($j=0;$j<scalar(@info)-1;$j=$j+1) {
            $story2=$info[$j];
            for my $word (sort keys %freqs) {
                $freqs{$word}{$_} ||= 0 for keys %{$freqs{all}};
                if($freqs{$word}{$story1} > 0){
                    if($freqs{$word}{$story2} > 0){
                        $countop[$i][$j]++;
                    }
                }
            }
        }
    }

    my $m;
    my $n;
    for($m=0;$m<scalar(@info)-1;$m=$m+1) {
        for($n=0;$n<scalar(@info)-1;$n=$n+1) {
            printf OUT1 "$info[$m],$info[$n],$countop[$m][$n] \n";
        }
    }

    close IN;
    close OUT;
    close OUT1;

      A few things to keep in mind while you learn Perl:

      • Always use strictures (use strict; use warnings).
      • Use the three-parameter version of open.
      • Use consistent indentation.
      • Don't use C-style for loops unless you really, really must.
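
As a rough illustration of those points together, here is a small sketch. The file name stories_demo.txt and its contents are invented for the example.

```perl
use strict;      # undeclared variables become compile-time errors
use warnings;    # reports uninitialized values and other likely mistakes

my $filename = 'stories_demo.txt';    # hypothetical file name

# Three-parameter open: mode and file name are separate arguments,
# and the handle is a lexical variable rather than a bareword.
open my $out, '>', $filename or die "Cannot write $filename: $!";
print {$out} "<Story 1>\n", "the cat sat\n";
close $out or die "Cannot close $filename: $!";

open my $in, '<', $filename or die "Cannot read $filename: $!";
chomp(my @lines = <$in>);
close $in;

# Perl-style loop: iterate over the elements, no index bookkeeping
my @lengths;
for my $line (@lines) {
    push @lengths, length $line;
}

# If the index really is needed, a range still avoids the C-style form
for my $i (0 .. $#lines) {
    print "$i: $lines[$i] ($lengths[$i] chars)\n";
}
```

With strictures on, a mistyped variable name is caught when the script is compiled rather than silently creating a new empty variable.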

      Consider the following "cleaned up" version of your added code:

      my @info = sort keys %{$freqs{all}};
      my %countop;

      for my $story1 (@info) {
          next if $story1 eq 'total';
          for my $story2 (@info) {
              next if $story2 eq 'total';
              for my $word (sort keys %freqs) {
                  next if $word eq 'all';    # totals row, not a real word
                  next unless $freqs{$word}{$story1};
                  next unless $freqs{$word}{$story2};
                  $countop{$story1}{$story2}++;
              }
          }
      }

      for my $m (sort keys %countop) {
          for my $n (sort keys %{$countop{$m}}) {
              print OUT1 "$m, $n, $countop{$m}{$n}\n";
          }
      }
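
To see what this kind of loop produces, here is a small run on invented data in the same hash-of-hashes layout. This sketch makes two assumptions of its own: it skips the 'all' bookkeeping row so that it is not counted as a shared word, and it collects the output lines in an array instead of writing them to OUT1.

```perl
use strict;
use warnings;

# Invented counts: $freqs{word}{story}; 'all' and 'total' are bookkeeping
my %freqs = (
    all  => { A => 3, B => 2, total => 5 },
    Cat  => { A => 2, B => 1, total => 3 },
    Dog  => { A => 1,         total => 1 },
    Fish => {         B => 1, total => 1 },
);

my @stories = grep { $_ ne 'total' } sort keys %{ $freqs{all} };
my %shared;

for my $story1 (@stories) {
    for my $story2 (@stories) {
        for my $word (keys %freqs) {
            next if $word eq 'all';    # totals row, not a real word
            next unless $freqs{$word}{$story1};
            next unless $freqs{$word}{$story2};
            $shared{$story1}{$story2}++;
        }
    }
}

my @rows;
for my $m (sort keys %shared) {
    for my $n (sort keys %{ $shared{$m} }) {
        push @rows, "$m, $n, $shared{$m}{$n}";
    }
}
print "$_\n" for @rows;
```

Here stories A and B share only "Cat", so both off-diagonal pairs come out as 1, while each story paired with itself gives its number of distinct words (2 in both cases).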

      Perl is environmentally friendly - it saves trees