crep has asked for the wisdom of the Perl Monks concerning the following question:

Hello all. I have scoured this source for errors and can find none. However, when run with ActiveState on windows, it tells me: Bad switch statement (problem in parenthesies?) in frequency.pl line 291. That's funny, because 291 is the last line of the file, and there is no code on it. :(. Here's the code. Basically, it goes through a file, and tells you how often a charactor occurs. This is usefull for cryptographic things. BTW, yes I know there must be better/more efficient ways of doing this. But I am yet a newbie to perl.
#!/usr/bin/perl use Switch; print "Frequency Counter\nEnter file to analyze:"; my $infile = <STDIN>; chomp $infile; my $indata = ""; my $count=0; my $a=0; my $b=0; my $c=0; my $d=0; my $e=0; my $f=0; my $g=0; my $h=0; my $i=0; my $j=0; my $k=0; my $l=0; my $m=0; my $n=0; my $o=0; my $p=0; my $q=0; my $r=0; my $s=0; my $t=0; my $u=0; my $v=0; my $w=0; my $x=0; my $y=0; my $z=0; my $num0=0; my $num1=0; my $num2=0; my $num3=0; my $num4=0; my $num5=0; my $num6=0; my $num7=0; my $num8=0; my $num9=0; open(F,"$infile") or die "can't open $infile: $!"; while (read(F, $buffer, 1024)) { chomp $buffer; $indata = $indata.$buffer; } close F; print "Now analyzing..."; my $i=0; my $len = length $indata; my $tmp = ""; for (($i=0; $i<$len; $i++) { $tmp = uc(substr(($indata,$i,1)); switch (($tmp) { case "A" { $a += 1; $count = $count + 1; } case "B" { $b += 1; $count = $count + 1; } case "C" { $c += 1; $count = $count + 1; } case "D" { $d += 1; $count = $count + 1; } case "E" { $e += 1; $count = $count + 1; } case "F" { $f += 1; $count = $count + 1; } case "G" { $g += 1; $count = $count + 1; } case "H" { $h += 1; $count = $count + 1; } case "I" { $i += 1; $count = $count + 1; } case "J" { $j += 1; $count = $count + 1; } case "K" { $k += 1; $count = $count + 1; } case "L" { $l += 1; $count = $count + 1; } case "M" { $m += 1; $count = $count + 1; } case "N" { $n += 1; $count = $count + 1; } case "O" { $o += 1; $count = $count + 1; } case "P" { $p += 1; $count = $count + 1; } case "Q" { $q += 1; $count = $count + 1; } case "R" { $r += 1; $count = $count + 1; } case "S" { $s += 1; $count = $count + 1; } case "T" { $t += 1; $count = $count + 1; } case "U" { $u += 1; $count = $count + 1; } case "V" { $v += 1; $count = $count + 1; } case "W" { $w += 1; $count = $count + 1; } case "X" { $x += 1; $count = $count + 1; } case "Y" { $y += 1; $count = $count + 1; } case "Z" { $z += 1; $count = $count + 1; } case "0" { $num0 += 1; $count = $count + 1; } case "1" { $num1 += 1; $count = $count + 1; } case "2" { $num2 += 1; $count = $count + 1; } case "3" { $num3 += 1; $count = $count + 1; } case "4" { $num4 += 1; $count = $count + 1; } case "5" { $num5 += 1; $count = $count + 1; } case "6" { $num6 += 1; $count = $count + 1; } case "7" { $num7 += 1; $count = $count + 1; } case "8" { $num8 += 1; $count = $count + 1; } case "9" { $num9 += 1; $count = $count + 1; } } } system("cls"); system("clear"); print "Count for $infile.\nOf $count charactors:\n\n"; print "A occured $a times, or ".($a/$count)."%.\n"; print "B occured $b times, or ".($b/$count)."%.\n"; print "C occured $c times, or ".($c/$count)."%.\n"; print "D occured $d times, or ".($d/$count)."%.\n"; print "E occured $e times, or ".($e/$count)."%.\n"; print "F occured $f times, or ".($f/$count)."%.\n"; print "G occured $g times, or ".($g/$count)."%.\n"; print "H occured $h times, or ".($h/$count)."%.\n"; print "I occured $i times, or ".($i/$count)."%.\n"; print "J occured $j times, or ".($j/$count)."%.\n"; print "K occured $k times, or ".($k/$count)."%.\n"; print "L occured $l times, or ".($l/$count)."%.\n"; print "M occured $m times, or ".($m/$count)."%.\n"; print "N occured $n times, or ".($n/$count)."%.\n"; print "O occured $o times, or ".($o/$count)."%.\n"; print "P occured $p times, or ".($p/$count)."%.\n"; print "Q occured $q times, or ".($q/$count)."%.\n"; print "R occured $r times, or ".($r/$count)."%.\n"; print "S occured $s times, or ".($s/$count)."%.\n"; print "T occured $t times, or ".($t/$count)."%.\n"; print "U occured $u times, or ".($u/$count)."%.\n"; print "V occured $v times, or ".($v/$count)."%.\n"; print "W occured $w times, or ".($w/$count)."%.\n"; print "X occured $x times, or ".($x/$count)."%.\n"; print "Y occured $y times, or ".($y/$count)."%.\n"; print "Z occured $z times, or ".($z/$count)."%.\n"; print "0 occured $num0 times, or ".($num0/$count)."%.\n"; print "1 occured $num1 times, or ".($num1/$count)."%.\n"; print "2 occured $num2 times, or ".($num2/$count)."%.\n"; print "3 occured $num3 times, or ".($num3/$count)."%.\n"; print "4 occured $num4 times, or ".($num4/$count)."%.\n"; print "5 occured $num5 times, or ".($num5/$count)."%.\n"; print "6 occured $num6 times, or ".($num6/$count)."%.\n"; print "7 occured $num7 times, or ".($num7/$count)."%.\n"; print "8 occured $num8 times, or ".($num8/$count)."%.\n"; print "9 occured $num9 times, or ".($num9/$count)."%.\n";
Thanks, -Jack C jack@crepinc.com

Replies are listed 'Best First'.
Re: Switch statement?
by Corion (Patriarch) on Jul 18, 2004 at 20:34 UTC

    The problem is most likely because you have one too many opening parenthesis somewhere, and Switch.pm only knows that when it has reached the end of the file.

    But you can write your program much much easier and shorter and thus avoid the counting of parentheses by using a hash, which maps each character to the number it occurs:

    #!/usr/bin/perl -w use strict; my $infile = $ARGV[0]; open INPUT, "<", $infile or die "Couldn't open '$infile' : $!"; my $input = do { local $/; <INPUT> }; my %histogram; for my $char (split //, $input) { $histogram{$char}++; }; for (keys %histogram) { print "$_ occurred $histogram{$_} times.\n"; };
      Corion is right; there are two places where you actually have one too many open parens:
      $tmp = uc(substr(($indata,$i,1)); # a bc cb ... no a switch (($tmp) { # ab b ... no a
        for (($i=0; $i<$len; $i++) { # ab b ... no a
Re: Switch statement?
by revdiablo (Prior) on Jul 19, 2004 at 01:58 UTC

    I'd like to expand a bit on Corion's fine response. In general, you should take it as a big red flag whenever you have many (and in your case, MANY) separate variables, which are all somehow related. Perhaps they share a common prefix, or are an incrementing pattern (such as your code). This red flag should make you consider using a datastructure that actually relates the values together by more than just a name.

    Another big red flag is repetition. Having 36 lines of code that are only different by a few characters should make you stop and think. Copying and pasting things many times is not something you should be doing very often. Subroutines often help avoid this (all-too-common) syndrome.

    Both of these things are present in your program, and in very large doses. Corion gives a nice example of how to clean this specific code up, but the ideas are more general. Use real data structures, use loops, and use subroutines. Copying and pasting is almost always wrong. :-)

    Update: cleaned up some wording.

      Thank you... I suppose I should be ashamed that I wrote a 300 line script that didn't work while corian made one that worked in 20 lines... I was trying indeed to find a way around all that repitition. I know c fairly well, and beleive me, I don't write c like that. But I couldn't figure out how to save all of those in one variable. I was thinking about an array, but I'm just not yet comfortable enough with perl to scripts a function to do that, etc. Thanks again. -Jack C jack@crepinc.com
        Well, I've used the code corian posted. And, o course, it works perfectly. Except, I edited it in places, and now wierd stuff happens. :(. Basically, when I use the -c or -r option, it works great. But when I use the -a option, it tells me I've left off the option, but doesn't give me the error until AFTER it prints SOME of the data. Strange. The other thing is, why does the order change everytime the script is run?
        #!/usr/bin/perl -w use strict; my $infile = $ARGV[0]; my $act = $ARGV[1]; my $included = "n"; my $lvar =""; my @validchars = ("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"); if ($#ARGV < 1) { print "Usage: $0 <file> <chars> Where <chars> can be -a or -c: -a: show all charactors. (ie, spaces, etc.) -c: show only regular charactors, but keep case. (ie, x is diff +erent than X.) -r: show only regular charactors, disregard case. (ie, t and T +are the same.)\n"; exit; } my $count = 0; system("cls"); #win system("clear"); #*nix open INPUT, "<$infile" or die "Couldn't open '$infile' : $!"; my $input = do { local $/; <INPUT> }; my %histogram; for my $char (split //, $input) { $included = "n"; if ($act eq "-a") { $histogram{$char}++; $count++; } elsif ($act eq "-c") { foreach $lvar (@validchars){ if ($lvar eq uc($char)) {$included = "y";} } if ($included eq "y") { $histogram{$char}++; $count++; } } elsif ($act eq "-r") { foreach $lvar (@validchars){ if ($lvar eq uc($char)) {$included = "y";} } if ($included eq "y") { $histogram{uc($char)}++; $count++; } } else { print "Bad charactor option: '$act'.\n"; exit; } }; print "Of $count charactors:\n\n"; for (keys %histogram) { if ($_ eq "\n") { print "'\\n' occurred $histogram{$_} times. (".(($ +histogram{$_}/$count)*100)."%)\n"} else { print "'$_' occurred $histogram{$_} times. (".(($histogram{$ +_}/$count)*100)."%)\n" } };
        Thanks again, -Jack C jack@crepinc.com
        There is no shame in learning something. The shame is if you don't learn it.

Re: Switch statement?
by pbeckingham (Parson) on Jul 18, 2004 at 22:39 UTC

    I see:

    for (($i=0; $i<$len; $i++) { $tmp = uc(substr(($indata,$i,1)); switch (($tmp) { ... } } }
    Which is wrong in three ways - at least - with regards to brackets and parentheses. Four, if the "design" of the program is included.

Re: Switch statement?
by pg (Canon) on Jul 18, 2004 at 23:11 UTC
    "But I am yet a newbie to perl."

    Allow me to be blunt, you probably interpreted the situation wrong. Perl syntax or switch syntax is not the big deal here (although it is wrong), by looking at the fact you thought switch was needed, I believe, what important, and seems to be missing here is the skill to analyze the problem, and the ability to come up the right algorithm.

    I am not try to hurt your feeling, not at all, but rather point out your weak point, so you know what you need to improve. Pay a little bit more attention to that area, it helps.

    But hey, with your couragement, no issue is unresolvable!

      Since he's a newbie, he was probably unaware of hashes, which is the "Perl way" to address this kind of problem. Without hashes, you get to self-flagellate with arrays and switches.

      It's the fault of all the other languages that don't expose people to the concept of a hash.

Re: Switch statement?
by crep (Novice) on Jul 19, 2004 at 21:26 UTC
    Errg... corian, you are amazing.. but I think I've screwed up you nice code in trying to put it in with the old. I get all sorts of errors, particularly one about a bad charactor, 0xff. Take a look:
    #!/usr/bin/perl -w #use strict; my $infile = $ARGV[0]; my $act = $ARGV[1]; my $included = "n"; my $lvar =""; my %validchars; if ($#ARGV < 1) { print "Usage: $0 <file> <chars> Where <chars> can be -a, -c, or -r: -a: show all charactors. (ie, spaces, etc.) -c: show only regular charactors, but keep case. (ie, x is diff +erent than X.) -r: show only regular charactors, disregard case. (ie, t and T +are the same.)\n"; exit; } if ( $act eq '-a' ) { %validchars = map { $_ => uc $_ } ( \x00 .. \xff ); } elsif ( $act eq '-c' ) { %validchars = map { $_ => $_ } ( 'A' .. 'Z', 'a'..'z' ); }; my $count = 0; system("cls"); #win system("clear"); #*nix open INPUT, "<$infile" or die "Couldn't open '$infile' : $!"; my $input = do { local $/; <INPUT> }; my %histogram; for my $char (split //, $input) { if (exists $validchars{$_}) { my $target = $validchars{$_}; $count++; $histogram{$target}++}; }; print "Of $count charactors:\n\n"; for (sort keys %histogram) { if ($_ eq "\n") { print "'\\n' occurred $histogram{$_} times. (".(($ +histogram{$_}/$count)*100)."%)\n"} else { print "'$_' occurred $histogram{$_} times. (".(($histogram{$ +_}/$count)*100)."%)\n" } };
    Thanks again guys (monks)! -Jack

      Sorry for giving you code that didn't directly work earlier - please reply directly to my posts if you expect me to read them, as only then I get a notification.

      The correct code for creating the list of keys and the list of where they should be mapped to is the following. Previously, I had thought that the range operator .. would also work for characters outside the printable range, but it works its magic only for strings between A and Z...

      First, always keep use strict; enabled - it will help you catch spelling errors and other hard to track errors. use strict; is your friend!

      The initialization of the validchars map can be done as follows:

      if ( $act eq '-a' ) { %validchars = map { $_ => uc $_ } map { pack "C", $_ } ( 0 .. 255 ); } elsif ( $act eq '-c' ) { %validchars = map { $_ => $_ } map { pack "C", $_ } ( ord('A').. ord('Z'), ord('a').. +ord('z') ); };

      The sequence of map calls is converting the list at the end of that sequence in two steps. First, the list is converted from the numbers to the characters represented by these numbers, and then, in the first map, every character is converted to a pair mapping the character onto the character that is to be counted.

        Hm... well I switched out the statement, and I get several lines of the same error:
        Use of uninitialized value in exists at frequency-new.pl line 41, <INPUT> line 1.
        where line 41 is:
          if (exists $validchars{$_}) {

        Hm.
        Thanks again, -Jack C