Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Sorry for the stupid question. I'm taking an informatics class and know barely anything about Perl. I am trying to find the syntax error listed below, and I know it's something stupid I'm missing.

syntax error at nucleotide-counting2.pl line 190, near "elsif"
Missing right curly or square bracket at nucleotide-counting2.pl line 295, at end of line
Execution of nucleotide-counting2.pl aborted due to compilation errors.

#
#
# The directives below enforce variable declarations and ensure your program
# will be parsed to provide more feedback about possible syntax and logical errors.
#
use strict;    # enforce variable declarations
use warnings;  # enable warnings

#
# Declare/initialize all variables used in main program body.
#
my ($A_count) = 0;         # number of 'a' or 'A' in user-supplied sequence
my ($base) = '';           # an extracted letter from the sequence
my ($C_count) = 0;         # number of 'c' or 'C' in user-supplied sequence
my ($filename) = '';       # the name of the file containing the input sequence(s)
my (@filesequences) = ();  # array containing lines read from input file
my ($G_count) = 0;         # number of 'g' or 'G' in user-supplied sequence
my ($i) = 0;               # used to index into array of sequences
my ($line_count) = 0;      # number of lines read from input file
my ($option) = 0;          # the user's selected menu option
my ($position) = 0;        # the current position within the sequence
my ($sequence) = '';       # the DNA sequence to be processed
my ($T_count) = 0;         # number of 't' or 'T' in user-supplied sequence

#
# Display welcome message and program explanation.
#
print "\n *****************************************************";
print "\n ** **";
print "\n ** Nucleotide Counter, version 2.0 **";
print "\n ** **";
print "\n ** Written by J.M Wagner 2014 **";
print "\n ** **";
print "\n *****************************************************";
print "\n * *";
print "\n * This program will count the number of different *";
print "\n * nucleotides present in the sequence you provide. *";
print "\n * Both uppercase and lowercase abbreviations will *";
print "\n * be counted. Input can be provided at the *";
print "\n * keyboard or can be read from a file. *";
print "\n * *";
print "\n * *";
print "\n *****************************************************";
print "\n\n\t Program options:";
print "\n\t -----------------------------";
print "\n\t 1. Enter sequence at keyboard";
print "\n\t 2. Read input sequences from a file";
print "\n\t 3. Exit";
print "\n\n Please enter your menu option>";
$option = <STDIN>;

#
# Confirm that user has entered a valid menu option.
#
while (($option < 1) or ($option > 3)) {
    chomp $option;
    print "\n ERROR: The value $option is not a valid menu option.";
    print "\n\n Please enter your menu option>";
    $option = <STDIN>;
} # end while

#
# Obtain input DNA sequence from user.
#

# **************************************************************************
# OPTION 1 - Obtain and process input sentence provided at keyboard.
# **************************************************************************
if ($option == 1) {
    #
    # Obtain input DNA sequence from user.
    #
    print "\n\n Please enter the DNA Sequence to be processed";
    print "\n >";
    $sequence = <STDIN>;
    print "\n Processed DNA sequence: $sequence";

    #
    # Count the number of nucleotides by processing the sequence of characters
    # one character at a time.
    #
    for ($position = 0; $position < length($sequence); ++$position) {
        $base = substr($sequence, $position, 1);
        if (($base eq 'a') or ($base eq 'A') or
            ($base eq 'c') or ($base eq 'C') or
            ($base eq 'g') or ($base eq 'G') or
            ($base eq 't') or ($base eq 'T')) {
            if (($base eq 'a') or ($base eq 'A')) {
                ++$A_count;
            } # end if
            if (($base eq 'c') or ($base eq 'C')) {
                ++$C_count;
            } # end if
            if (($base eq 'g') or ($base eq 'G')) {
                ++$G_count;
            } # end if
            if (($base eq 't') or ($base eq 'T')) {
                ++$T_count;
            } # end if
        } # end if
    } # end for

    #
    # Display final results.
    #
    print "\n Number of A nucleotides: $A_count";
    print "\n Number of C nucleotides: $C_count";
    print "\n Number of G nucleotides: $G_count";
    print "\n Number of T nucleotides: $T_count";
    print "\n\n Program completed successfully.\n";
}end if

# **************************************************************************
# OPTION 2 - Obtain and process input sentences provided in user file.
# **************************************************************************
elsif ($option == 2) {
    #
    # Read input sequence(s) from file provided by user.
    #
    print "\n\n Please enter the filename>";
    $filename = <STDIN>;
    unless(open(SEQUENCEFILE, $filename)) {
        print "\n\t --> ERROR: Cannot open file: $filename";
        exit;
    }
    @filesequences = <SEQUENCEFILE>;
    close(SEQUENCEFILE);
    $line_count = @filesequences;
    print "\n\t --> NOTE: File opened -- $line_count lines of text read.\n";
    # print @filesequences;

    #
    # Count the number of nucleotides by processing the sequence of characters
    # one sequence at a time.
    #
    for ($i = 0; $i < $line_count; ++$i) {
        #
        # Initialize $sequence to the next line of text from file.
        #
        $sequence = $filesequences[$i];
        print "\n $i. Processed sequence: $sequence";

        #
        # Count the number of nucleotides in the current sequence.
        #
        $position = 0;
        while ($position < length($sequence)) {
            $base = substr($sequence, $position, 1);
            if (($base eq 'a') or ($base eq 'A') or
                ($base eq 'c') or ($base eq 'C') or
                ($base eq 'g') or ($base eq 'G') or
                ($base eq 't') or ($base eq 'T')) {
                if (($base eq 'a') or ($base eq 'A')) {
                    ++$A_count;
                } # end if
                if (($base eq 'c') or ($base eq 'C')) {
                    ++$C_count;
                } # end if
                if (($base eq 'g') or ($base eq 'G')) {
                    ++$G_count;
                } # end if
                if (($base eq 't') or ($base eq 'T')) {
                    ++$T_count;
                } # end if
            ++$position;
        } # end while
    } # end for
    print "\n Number of A nucleotides: $A_count";
    print "\n Number of C nucleotides: $C_count";
    print "\n Number of G nucleotides: $G_count";
    print "\n Number of T nucleotides: $T_count";
    print "\n\n Program completed successfully.\n";
} # end elsif

#
# **************************************************************************
# Write final results to default output file.
# **************************************************************************
$filename = "results.dat";
unless(open(OUTFILE, ">$filename")) {
    print "\n\t --> ERROR: Cannot open file: $filename";
    exit;
}
print "\n\n\t --> NOTE: File opened for writing: $filename\n";
print OUTFILE "\n Total number of Adenine: $A_count";
print OUTFILE "\n Total number of Cytosine: $C_count";
print OUTFILE "\n Total number of Guanine: $G_count";
print OUTFILE "\n Total number of Thymine: $T_count";
close(OUTFILE);
print "\n\n Program completed successfully.\n";
exit;

Replies are listed 'Best First'.
Re: Syntax error
by McA (Priest) on Feb 10, 2014 at 00:07 UTC

    Have a look at

    }end if
    # **************************************************************************
    # OPTION 2 - Obtain and process input sentences provided in user file.
    # **************************************************************************
    elsif ($option == 2)

    and you should find the problem. Strip away all the unnecessary lines and you'll keep your hair... ;-)

    Best regards
    McA

Re: Syntax error
by Eily (Monsignor) on Feb 10, 2014 at 00:16 UTC

    Just before option 2, you have written }end if. There's no such thing as an end if in Perl; you forgot the # that would make it a comment :)
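A minimal sketch of that fix, assuming the intent was an end-of-block comment like the others in the script. The branch bodies and the `$chosen` variable are placeholders for illustration, not the original code.

```perl
use strict;
use warnings;

my $option = 2;     # placeholder; the original script reads this from STDIN
my $chosen = '';    # hypothetical variable, only here to show which branch ran

if ($option == 1) {
    $chosen = 'keyboard';
} # end if   (the '#' turns "end if" into a comment, so elsif can follow)
elsif ($option == 2) {
    $chosen = 'file';
} # end elsif

print "Selected input: $chosen\n";
```

The second message ("Missing right curly or square bracket") looks like a separate problem: in the option 2 branch, the outer if inside the while loop never gets its own closing brace before ++$position, so the braces come out one short by the end of the file.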

    I didn't read your whole script. One piece of advice, though: instead of ($base eq 'g' or $base eq 'G') you can write (lc $base eq 'g') (see lc). That should be easy enough to understand. Then, if you want to match $base against several values, you can write (grep { lc $base eq $_ } 'a', 'c', 'g', 't'). Read up on grep for that.
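Both suggestions can be tried in isolation. This is only a sketch with a made-up sample character; `$is_g` and `$is_nucleotide` are names invented for the example.

```perl
use strict;
use warnings;

my $base = 'G';    # sample character; the script takes it from substr()

# lc folds the character to lowercase, so one comparison covers 'g' and 'G'.
my $is_g = ( lc $base eq 'g' ) ? 1 : 0;

# In scalar context grep returns how many list items passed the test,
# so the result is nonzero when $base is any of the four nucleotides.
my $is_nucleotide = grep { lc $base eq $_ } 'a', 'c', 'g', 't';

print "is G: $is_g, is nucleotide: $is_nucleotide\n";
```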

    A lot of Perl users would use regular expressions instead of grep and lc; have a look at perlretut on the subject if you feel adventurous.
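For the adventurous, one possible regex version looks like this. It is a sketch, not the only way to do it; the sample sequence and the `%count` hash are assumptions made for the example.

```perl
use strict;
use warnings;

my $sequence = "ACGTacgtNN\n";    # made-up sample input

my %count = ( a => 0, c => 0, g => 0, t => 0 );
for my $base ( split //, $sequence ) {
    # The character class accepts only a, c, g, t; /i ignores case.
    if ( $base =~ /^[acgt]$/i ) {
        my $key = lc $base;
        $count{$key}++;
    }
}

print "A=$count{a} C=$count{c} G=$count{g} T=$count{t}\n";
```

The N characters and the trailing newline fail the match, so they are simply skipped.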

      Or better still, lowercase $base as soon as you set it:

      $base = lc substr($sequence, $position, 1);

      Then comparisons can just be like:

      if ($base eq 'g' or $base eq 't') { ... }

      ... and you don't need to worry about capital letters at all.

      use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

        Yeah, I guess it might have been a better idea to look a little more through the code before giving advice :). I just commented on the big obvious thing that jumped out at me, but I'm sure there is more than one way to improve it (that does sound perlish already :D).

      Right. Furthermore, the perlish way is to avoid iterating over characters, if possible. Consider this:

      my $A_count = ($sequence =~ tr/Aa/Aa/);
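Extended to all four bases, that could look like the sketch below (the sample sequence is made up). tr returns the number of characters it matched; with an empty replacement list and no /d flag it leaves the string unchanged, so tr/Aa// counts the same way as tr/Aa/Aa/.

```perl
use strict;
use warnings;

my $sequence = "GATTACAgattaca";    # made-up sample input

# Each tr/// scans the whole string once and returns its match count.
my $A_count = ( $sequence =~ tr/Aa// );
my $C_count = ( $sequence =~ tr/Cc// );
my $G_count = ( $sequence =~ tr/Gg// );
my $T_count = ( $sequence =~ tr/Tt// );

print "A=$A_count C=$C_count G=$G_count T=$T_count\n";
```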

Re: Syntax error
by Anonymous Monk on Feb 09, 2014 at 23:56 UTC

    I wanted to add that I have learned a ton of respect for programmers from this project. I have put hours into writing this simple program and I am ready to pull my hair out. How you guys do this on a daily basis is a testament to your patience.

      One of the things we usually do when writing perl code, which makes it quicker and easier and less frustrating, is to create subroutines, rather than repeating lots of lines of code.

      Apart from all the overly verbose comments and excessive printed messages to the user, you have about two times more lines of actual code than you really need, because a lot of it has been written twice. You should put the parts that are repeated into a subroutine, so that they only appear once, then call the subroutine from the places where you originally put each copy of identical code.

      The basic idea is called "DRY": Don't Repeat Yourself.
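As a rough sketch of what that means for this script: the counting loop that appears under both option 1 and option 2 could live in one subroutine. The name `count_bases` and the `%totals` hash are invented for this example, and the sample strings stand in for keyboard and file input.

```perl
use strict;
use warnings;

# Hypothetical helper: tallies a, c, g and t (either case) in one string
# and adds them to the running totals in the hash passed by reference.
sub count_bases {
    my ( $sequence, $totals ) = @_;
    for my $base ( split //, lc $sequence ) {
        $totals->{$base}++ if exists $totals->{$base};
    }
    return;
}

my %totals = ( a => 0, c => 0, g => 0, t => 0 );

# Option 1 would call it once with the line typed at the keyboard ...
count_bases( "ACGT\n", \%totals );

# ... and option 2 would call it once per line read from the file.
count_bases( $_, \%totals ) for ( "aacc\n", "ggtt\n" );

for my $base (qw(a c g t)) {
    printf "Number of %s nucleotides: %d\n", uc $base, $totals{$base};
}
```

The counting logic now exists in one place, so a fix there applies to both menu options.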

      Just to give you an idea of how little effort it should take to do the thing that your script does, here's a version that:
      • allows the user to specify all inputs and options at the command line, so the script knows what to do based on command-line args;
      • uses a hash instead of a set of separate scalar variables;
      • uses loops for input and output.
      #!/usr/bin/perl
      use strict;    # enforce variable declarations
      use warnings;  # enable warnings

      my $Usage = "Usage: $0 [sequence.file]\n (reads stdin if no file name is given)\n";
      die $Usage unless ( -p STDIN or ( @ARGV and -f $ARGV[0] ));

      my %counts;  # tally of letter occurrences

      # read lines from named file(s) or from stdin:
      while (<>) {
          s/\s+$//;  # remove white space (line termination) from end of line
          for my $ltr ( split // ) {  # split into individual characters
              $counts{$ltr}++;
          }
      }

      # print results:
      for my $ltr ( sort keys %counts ) {
          print " Number of $ltr nucleotides: $counts{$ltr}\n";
      }
      Note that this version will count any character inventory you give it; if the input happens to contain something other than ACGT, you'll get the quantity of occurrence for all the letters that occur in the data.
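If only the four nucleotides should be reported, one option is to pre-load the hash with those keys and skip everything else. This is a sketch under that assumption; the two sample lines stand in for the while (<>) input loop, and folding to uppercase is a choice made here so that upper and lower case share one tally.

```perl
use strict;
use warnings;

my %counts = ( A => 0, C => 0, G => 0, T => 0 );

# Sample lines stand in for input read with while (<>).
for my $line ( "ACGTN\n", "acgt-x\n" ) {
    for my $ltr ( split //, uc $line ) {
        # Only characters that already have a key are counted.
        $counts{$ltr}++ if exists $counts{$ltr};
    }
}

for my $ltr ( sort keys %counts ) {
    print " Number of $ltr nucleotides: $counts{$ltr}\n";
}
```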