zw has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am a COMPLETE beginner at Perl. I want to split each line of a file on \t and, if a column matches 'u', 'x', 'i', or 's', push another column's entry into an array. I have the following, but it just says syntax error and I don't know how to fix it.
use strict;
use warnings;

my $filename = 'gff.annotated.gtf';
open(my $fh, '<:encoding(UTF-8)', $filename)
or die "Can't open $filename: $!";

my @transcript_id = ();
my @lines = <$fh>;
foreach my $lines(@lines) {
my @column= split /\t/, $lines;
foreach $element(@column) {
if ($element[16] eq '"u"' || '"x"' || '"i"' || '"s"'){
push @transcript_id, $element[10];
}
}
print @transcript_id;

Re: if array contain push another array
by Athanasius (Archbishop) on Nov 22, 2017 at 03:36 UTC

    Hello zw, and welcome to the Monastery!

    First, there is a right brace (curly bracket) missing, but that’s hard to see because of the way the code is formatted. With a little reformatting, the missing brace is easily seen:

    use strict;
    use warnings;

    my $filename = 'gff.annotated.gtf';
    open(my $fh, '<:encoding(UTF-8)', $filename)
        or die "Can't open $filename: $!";

    my @transcript_id = ();
    my @lines = <$fh>;

    foreach my $lines (@lines)
    {
        my @column= split /\t/, $lines;
        foreach $element (@column)
        {
            if ($element[16] eq '"u"' || '"x"' || '"i"' || '"s"')
            {
                push @transcript_id, $element[10];
            }
        }

    print @transcript_id;

    Second, you have use strict (good!), so $element needs to be declared with my:

    foreach my $element (@column)
    #       ^^

    Third, $element is a scalar variable, which on each iteration of the loop holds a single element of the @column array. But the expressions $element[16] and $element[10] reference an entirely different variable, an array named @element which you haven’t declared. In the case of $element[16], you probably just want to use $element. In the case of $element[10], I’m not sure what you want to do; maybe you meant $column[10]?
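
    As a minimal sketch of that distinction: in the snippet below an array named @element is declared purely for illustration (it is hypothetical and not part of the original script), so that $element[1] compiles under strict. The loop variable $element and the array element $element[1] then turn out to be unrelated.

```perl
use strict;
use warnings;

my @column  = ('a', 'b', 'c');   # the array being looped over
my @element = ('X', 'Y', 'Z');   # hypothetical array, declared only so $element[1] compiles

my @seen;
foreach my $element (@column) {
    # $element is the loop scalar; $element[1] reads from @element, not from @column
    push @seen, $element . '/' . $element[1];
}

print "@seen\n";   # prints "a/Y b/Y c/Y"
```

    The scalar changes on every iteration while $element[1] stays 'Y', which is why the compiler asks for a declaration of @element in the original code.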

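    Fourth, each comparison has to be written out separately. Your condition is valid Perl, but eq binds more tightly than ||, so it is parsed as ($element[16] eq '"u"') || '"x"' || '"i"' || '"s"', and a non-empty string such as '"x"' is always true. What you need is: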

    if ($element eq 'u' || $element eq 'x' ||
        $element eq 'i' || $element eq 's')
    {
        push @transcript_id, $column[10];
    }
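
    To see the difference in practice, here is a small sketch comparing the two forms of the condition. The helper names broken_test and correct_test are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.

# Shorthand form: parsed as ($v eq 'u') || 'x' || 'i' || 's'.
sub broken_test {
    my ($v) = @_;
    return ($v eq 'u' || 'x' || 'i' || 's') ? 1 : 0;
}

# Each comparison spelled out against the same variable.
sub correct_test {
    my ($v) = @_;
    return ($v eq 'u' || $v eq 'x' || $v eq 'i' || $v eq 's') ? 1 : 0;
}

for my $value (qw(u z)) {
    printf "%s: broken=%d correct=%d\n",
        $value, broken_test($value), correct_test($value);
}
# prints:
# u: broken=1 correct=1
# z: broken=1 correct=0
```

    The shorthand form reports a match for 'z' as well, because the bare string 'x' is a true value on its own.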

    Fifth, note that in the above snippet I have removed the extra quotation marks. Your problem description implies that you want to test for equality with the character u, not with the 3-character string "u" as you have in your code.

    Sixth (and finally!), I think the logic of your inner foreach loop is questionable. Do you want to push to @transcript_id on each match in the line, or only once per line if a match is found? If the latter, you need to break out of the loop after the first match:

    foreach my $element (@column)
    {
        if ($element eq 'u' || $element eq 'x' ||
            $element eq 'i' || $element eq 's')
        {
            push @transcript_id, $column[10];
            last;    # stop after the first match on this line
        }
    }
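
    A rough sketch of how the two behaviours differ, assuming a made-up line in which two columns match. The sample data and the use of $column[0] as the value to collect are assumptions for brevity; a hash lookup stands in for the chain of eq tests.

```perl
use strict;
use warnings;

# Hypothetical sample line: two columns ('u' and 's') match.
my @column = ('id42', 'u', 'foo', 's');

# Lookup table of the wanted single-letter codes.
my %wanted = map { $_ => 1 } qw(u x i s);

my (@every_match, @once_per_line);

# Without "last": one push for every matching column.
foreach my $element (@column) {
    push @every_match, $column[0] if $wanted{$element};
}

# With "last": at most one push per line.
foreach my $element (@column) {
    if ($wanted{$element}) {
        push @once_per_line, $column[0];
        last;
    }
}

print scalar(@every_match), " vs ", scalar(@once_per_line), "\n";   # prints "2 vs 1"
```

    Which of the two is right depends on whether duplicate entries in @transcript_id are acceptable for your data.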

    Hope that helps,

    Athanasius <°(((>< contra mundum
    Iustus alius egestas vitae, eros Piratica,

Re: if array contain push another array
by 1nickt (Canon) on Nov 22, 2017 at 03:16 UTC

    Hi, what is the syntax error according to the error message?

    The code you posted produces:

    Global symbol "$element" requires explicit package name (did you forget to declare "my $element"?) at 1203983.pl line 14.
    Global symbol "@element" requires explicit package name (did you forget to declare "my @element"?) at 1203983.pl line 16.
    Global symbol "@element" requires explicit package name (did you forget to declare "my @element"?) at 1203983.pl line 17.
    Missing right curly or square bracket at 1203983.pl line 21, at end of line
    syntax error at 1203983.pl line 21, at EOF
    Execution of 1203983.pl aborted due to compilation errors.

    Note that the "Missing right curly or square bracket" message tells you exactly what one of the errors is.

    I assume the undeclared-variable error was not in your real code and is only a typing error in what you posted here.

    You have a number of unneeded loops and variables in your script. Meanwhile the main error is in your attempt to match one of a list of values against a string. Please see perlrequick for a beginner's intro to regular expressions. You might have to go on to perlretut for character classes, which is what's used below.
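
    As a quick illustration of a character class on its own, here is a small sketch. The helper name is_wanted_code is invented for this example. The pattern /^[uxis]$/ matches a string consisting of exactly one of the four letters, and because $ also matches just before a trailing newline, it still works on a last column that has not been chomped.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $value is exactly one of u, x, i, s
# (optionally followed by a single trailing newline).
sub is_wanted_code {
    my ($value) = @_;
    return $value =~ /^[uxis]$/ ? 1 : 0;
}

for my $value ('u', 's', 'us', 'z', '"u"') {
    printf "%-4s %s\n", $value, is_wanted_code($value) ? 'match' : 'no match';
}
# prints:
# u    match
# s    match
# us   no match
# z    no match
# "u"  no match
```

    The last case shows that a value with literal quotation marks around it, as in the original test against '"u"', is a different string from the bare letter.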

    Here's a version that does what you want, with sample data and a Test::More test to verify that the code is working correctly.

    use strict; use warnings; use feature 'say';
    use Test::More tests => 1;

    # my $filename = 'gff.annotated.gtf';
    # open(my $fh, '<:encoding(UTF-8)', $filename)
    #   or die "Can't open $filename: $!";

    my $fh = \*DATA;

    my @transcript_ids;

    while ( my $line = <$fh> ) {
        # The sample data below is space-separated; split on /\t/ for a tab-separated file.
        my @columns = split / /, $line;
        if ( $columns[16] =~ /^[uxis]$/ ) {
            push @transcript_ids, $columns[10];
        }
    }

    is_deeply( \@transcript_ids, [qw/ bb cc dd ee /], 'the right lines were matched' );

    __DATA__
    1: 01 02 03 04 05 06 07 08 09 aa 11 12 13 14 15 a
    2: 01 02 03 04 05 06 07 08 09 bb 11 12 13 14 15 u
    3: 01 02 03 04 05 06 07 08 09 cc 11 12 13 14 15 x
    4: 01 02 03 04 05 06 07 08 09 dd 11 12 13 14 15 i
    5: 01 02 03 04 05 06 07 08 09 ee 11 12 13 14 15 s
    6: 01 02 03 04 05 06 07 08 09 ff 11 12 13 14 15 z

    Note that I am using the __DATA__ section, but you can read from a file just as well.

    Update: While I was updating this node to show some helpful links and an example, Brother Athanasius composed his excellent reply (although note that he and I came to different conclusions about what your actual problem spec is).

    Hope this helps!


    The way forward always starts with a minimal test.