Arengin has asked for the wisdom of the Perl Monks concerning the following question:

Hi.
I have the following code


#!/usr/bin/perl
open(FILE1, $ARGV[0]) || die "Error: $!\n";
@lines = <FILE1>;
while ($line=<FILE1>){
    if ($line=~/^BEGIN_TAG/){
        <FILE1>; #get next line
        if ($line=~/^X1/ .. /^X2/){ # get all between X1 and X2
            print $line; # print all of it
        }
    }
}

I need to get the values between X1 and X2, even if they span multiple lines, and print them out concatenated (if they span more than one line).
However, I'm not getting anything from the above. The input file (inputtext.txt) looks like:

BEGIN_TAG
X1 test test test
X2 no no no no
X3 yes yes yes
BEGIN_TAG
X1 test test test tes test test
X2 no no no no no nono non no no nononono no no no
X3 hi hi hi hi hi hi hi hi hi hi hi hi

Any ideas? I guess this is trivial for you, but I have been stuck on it for a whole weekend already.
Thanks for any help you can provide.
Arengin

Original contents restored above by GrandFather

Hi.

This is an edited question since the first part of the problem is solved

The code I have is:

#!/usr/bin/perl
use strict;
use warnings;

open (FILE, '<', $ARGV[0]) or die "Could not open file: $!";

my $saveX1=0;
my $saveX2=0;
my $saveX3=0;
my $savetoX1='';
my $savetoX2='';
my $savetoX3='';

while (my $line=<FILE>){
    chomp $line;
    my @parts=split(' ',$line,2);

    if ( (!$saveX1) && $parts[0] eq 'X1') {
        $savetoX1=$parts[1];
        $saveX1 +=1;
    } elsif ($parts[0] eq 'X2' ||($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ) ){
        #print $savetoX1."\n" if ($savetoX1) ;
        $savetoX1='';
        $saveX1=0;
    } elsif ($saveX1) {
        $savetoX1.=' '.$line;
    }

    if ( (!$saveX2) && $parts[0] eq 'X2') {
        $savetoX2=$parts[1];
        $saveX2 +=1;
    } elsif ($parts[0] eq 'X3' ||($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ) ){
        #print $savetoX2."\n" if ($savetoX2) ;
        $savetoX2='';
        $saveX2=0;
    } elsif ($saveX2) {
        $savetoX2.=' '.$line;
    }

    if ( (!$saveX3) && $parts[0] eq 'X3') {
        $savetoX3=$parts[1];
        $saveX3 +=1;
    } elsif ($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ){
        #print $savetoX3."\n" if ($savetoX3) ;
        $savetoX3='';
        $saveX3=0;
    } elsif ($saveX3) {
        $savetoX3.=' '.$line;
    }
}

print "X1: ".$savetoX1."\n" if ($savetoX1) ;
print "X2: ".$savetoX2."\n" if ($savetoX2) ;
print "X3: ".$savetoX3."\n" if ($savetoX3) ;

The values printed at the end are all mixed up. Here is what I get and what I should get:

The wrong one (the one I get) is:

X3: yes yes yes BEGIN_TAG X1 test test test tes test test X2 no no no no no nono non no no nononono no no no X3 hi hi hi hi hi hi hi hi hi hi hi hi


The right one would be:
X1: test test test test test test tes test test
X2: no no no no no no no no no nono non no no nononono no no no
X3: yes yes yes hi hi hi hi hi hi hi hi hi hi hi hi



The inputfile (inputtext.txt) looks like:

BEGIN_TAG
X1 test test test
X2 no no no no
X3 yes yes yes
BEGIN_TAG
X1 test test test tes test test
X2 no no no no no nono non no no nononono no no no
X3 hi hi hi hi hi hi hi hi hi hi hi hi

Any ideas? I guess this is trivial for you, but I have been stuck on it for a whole weekend already.
Thanks for any help you can provide.
Arengin

Re: Scanning a file and extracting certains values within one or multiple lines
by Corion (Patriarch) on Mar 13, 2017 at 09:49 UTC

    Here you read in the whole file:

    open(FILE1, $ARGV[0]) || die "Error: $!\n"; @lines = <FILE1>;

    ... and here you try to read some more, which fails:

    while ($line=<FILE1>){

    Often it helps to add debug print statements to show you the progress of your program, so you see which loops it enters and which it skips.
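A minimal sketch of the effect Corion describes, with a debug print at each step. It reads from an in-memory filehandle, which is an assumption made only to keep the example self-contained (the question reads the file named in $ARGV[0]), and the sample text is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, held in a string so no file is needed.
my $text = "BEGIN_TAG\nX1 test test test\nX2 no no no no\n";
open(my $fh, '<', \$text) or die "Error: $!\n";

# List context: reads every remaining line in one go.
my @lines = <$fh>;
print "slurped ", scalar(@lines), " lines\n";

# Scalar context: the handle is already at end of file, so this is undef.
my $next = <$fh>;
print "next read is ", (defined $next ? "defined" : "undef"), "\n";

# The loop from the question therefore never runs its body.
my $iterations = 0;
while (my $line = <$fh>) {
    $iterations++;
}
print "loop iterations: $iterations\n";
close($fh);
```

Dropping the list-context read, or looping over @lines instead of reading the handle again, avoids the problem.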

Re: Scanning a file and extracting certains values within one or multiple lines
by Discipulus (Canon) on Mar 13, 2017 at 10:12 UTC
    Hello Arengin and welcome to the monastery and to the wonderful world of Perl!

    As the wise Corion already said, in list context (given by the array assignment @lines = <FILE1>) the diamond operator reads the whole file, so your next call to the diamond operator, this time in scalar context ($line = <FILE1>), will retrieve nothing.

    I can just add: always put use strict; use warnings; at the very beginning of your program.

    I also suggest always using the three-argument form of open and a lexically scoped filehandle instead of a bareword one:

    open my $file_handle, "<", $filepath or die "unable to open [$filepath] in reading mode: $!";

    The autodie pragma can also be useful for such open calls.
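A short sketch of the autodie variant, combined with the three-argument open and a lexical filehandle. The helper name read_lines and the file path are hypothetical and used only for illustration. With autodie in effect, a failing open throws an exception by itself, so no explicit "or die" is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;    # open, close and friends now die on failure by themselves

# read_lines is a hypothetical helper name used only for this sketch.
sub read_lines {
    my ($filepath) = @_;
    open my $file_handle, '<', $filepath;   # no "or die" needed with autodie
    my @lines = <$file_handle>;
    close $file_handle;
    chomp @lines;
    return @lines;
}

# Opening a file that does not exist raises an exception we can catch.
my $ok = eval { read_lines('/no/such/dir/inputtext.txt'); 1 };
print "open failed: $@\n" unless $ok;
```

The exception text already names the file and the system error, which is the information one would otherwise put into the die message by hand.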

    L*

    PS: it seems i'm slow answering this morning..

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Scanning a file and extracting certains values within one or multiple lines
by huck (Prior) on Mar 13, 2017 at 10:06 UTC

    open(FILE1, $ARGV[0]) || die "Error: $!\n";
    my $save=0;
    my $saveto='';
    while (my $line=<FILE1>){
        chomp $line;
        my @parts=split(' ',$line,2);
        if ( (!$save) && $parts[0] eq 'X1') {
            $saveto=$parts[1];
            $save=1;
        } elsif ($parts[0] eq 'X2' ||($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ) ){
            print $saveto."\n" if ($saveto) ;
            $saveto='';
            $save=0;
        } elsif ($save) {
            $saveto.=' '.$line;
        }
    }
    print $saveto."\n" if ($saveto) ;
    Im not gonna harp about lexical filehandles or 3 arg opens, somebody else can.

    @lines = <FILE1>; "ate" the whole file at once. There was nothing left for the rest of your loop, which wouldn't have worked anyway.

    There better be a X2 or BEGIN TAG before a X3 or it will eat it too.

      Thanks for all the help.
Re: Scanning a file and extracting certains values within one or multiple lines
by Arengin (Novice) on Mar 13, 2017 at 10:44 UTC
    Ok, I used the code in the comments and tried adding the scanning for X2 and X3 the same way as for X1, but for some reason the saveto vars are mixed up at the end.
    The wrong one I get is:
     X3: yes yes yes BEGIN_TAG X1 test test test tes test test X2 no no no no no  nono non no no  nononono no no  no X3 hi hi hi hi hi hi  hi hi hi hi  hi hi

    The right one would be:
    X1: test test test test test test tes test test
    X2: no no no no no no no no no nono non no no nononono no no no
    X3: yes yes yes hi hi hi hi hi hi hi hi hi hi hi hi



    The code I have is:

    #!/usr/bin/perl
    use strict;
    use warnings;

    open (FILE, '<', $ARGV[0]) or die "Could not open file: $!";

    my $saveX1=0;
    my $saveX2=0;
    my $saveX3=0;
    my $savetoX1='';
    my $savetoX2='';
    my $savetoX3='';

    while (my $line=<FILE>){
        chomp $line;
        my @parts=split(' ',$line,2);

        if ( (!$saveX1) && $parts[0] eq 'X1') {
            $savetoX1=$parts[1];
            $saveX1 +=1;
        } elsif ($parts[0] eq 'X2' ||($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ) ){
            #print $savetoX1."\n" if ($savetoX1) ;
            $savetoX1='';
            $saveX1=0;
        } elsif ($saveX1) {
            $savetoX1.=' '.$line;
        }

        if ( (!$saveX2) && $parts[0] eq 'X2') {
            $savetoX2=$parts[1];
            $saveX2 +=1;
        } elsif ($parts[0] eq 'X3' ||($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ) ){
            #print $savetoX2."\n" if ($savetoX2) ;
            $savetoX2='';
            $saveX2=0;
        } elsif ($saveX2) {
            $savetoX2.=' '.$line;
        }

        if ( (!$saveX3) && $parts[0] eq 'X3') {
            $savetoX3=$parts[1];
            $saveX3 +=1;
        } elsif ($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ){
            #print $savetoX3."\n" if ($savetoX3) ;
            $savetoX3='';
            $saveX3=0;
        } elsif ($saveX3) {
            $savetoX3.=' '.$line;
        }
    }

    print "X1: ".$savetoX1."\n" if ($savetoX1) ;
    print "X2: ".$savetoX2."\n" if ($savetoX2) ;
    print "X3: ".$savetoX3."\n" if ($savetoX3) ;

      I'm not entirely sure I understand the rules by which you want newlines to be added to the output or not. In the following, if you don't want newlines in the output, change the two occurrences of "\n" to " ".

      #!/usr/bin/env perl
      use warnings;
      use strict;

      my %data;
      my $cur_tag;
      while (<DATA>) {
          chomp;
          if ( my ($tag,$line) = /^(X\S+)\s+(.+)$/ ) {
              $data{$tag} .= "\n" if exists $data{$tag};
              $data{$tag} .= $line;
              $cur_tag = $tag;
          }
          elsif ($cur_tag && !/^BEGIN_TAG/) {
              $data{$cur_tag} .= "\n".$_;
          }
      }
      for my $tag (sort keys %data) {
          print "$tag: $data{$tag}\n";
      }
      __DATA__
      BEGIN_TAG
      X1 test1 test1 test1
      X2 no1 no1 no1 no1
      X3 yes1 yes1 yes1
      BEGIN_TAG
      X1 test2 test2 test2
      test3 test3 test3
      X2 no2 no2 no2 no2 no2
      no3 no3 no no3 no3
      no4 no4 no4 no4 no4 no4 no4
      X3 hi2 hi2 hi2 hi2 hi2 hi2
      hi3 hi3 hi3 hi3
      hi4 hi4

      Output:

      X1: test1 test1 test1
      test2 test2 test2
      test3 test3 test3
      X2: no1 no1 no1 no1
      no2 no2 no2 no2 no2
      no3 no3 no no3 no3
      no4 no4 no4 no4 no4 no4 no4
      X3: yes1 yes1 yes1
      hi2 hi2 hi2 hi2 hi2 hi2
      hi3 hi3 hi3 hi3
      hi4 hi4

      Update: Inverted "\n" vs " ".

        Wow that's it thank you!!!!

      Sorry, I missed the underscore in BEGIN_TAG, even though I thought I remembered going up to look to make sure. That meant BEGIN_TAG didn't serve as the ending trigger like it should have; instead I was looking for "BEGIN TAG" with a space. This would fix my example:

      open(FILE1, $ARGV[0]) || die "Error: $!\n";
      my $save=0;
      my $saveto='';
      while (my $line=<FILE1>){
          chomp $line;
          my @parts=split(' ',$line,2);
          if ( (!$save) && $parts[0] eq 'X1') {
              $saveto=$parts[1];
              $save=1;
          } elsif ($parts[0] eq 'X2' ||($parts[0] eq 'BEGIN_TAG' ) ){
              print $saveto."\n" if ($saveto) ;
              $saveto='';
              $save=0;
          } elsif ($save) {
              $saveto.=' '.$line;
          }
      }
      print $saveto."\n" if ($saveto) ;
      And you never said you wanted to capture all of them to the end, so after each set was complete (by hitting X2 or, now fixed, BEGIN_TAG) I printed it AND reset $saveto to blank. When you commented out the print, you didn't comment out the reset to blank, so only the last set was saved.
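A rough sketch of what keeping every set could look like: the saved text is never reset, only the "currently collecting" tag is. This is one possible approach, not the code from either post. The helper name append_text and the hash %saved are made up for this example, and the line breaks in the sample input are an assumption, since the thread does not show where the continuation lines start.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input modeled on the thread; the exact line breaks are an assumption.
my $input = <<'END';
BEGIN_TAG
X1 test test test
X2 no no no no
X3 yes yes yes
BEGIN_TAG
X1 test test test
tes test test
X2 no no no no no
nono non no no
nononono no no
no
X3 hi hi hi hi hi hi
hi hi hi hi
hi hi
END

my %saved;          # tag => text accumulated across all records
my $current = '';   # tag currently being collected, '' when none

# Hypothetical helper: append $text to the tag's value, space separated.
sub append_text {
    my ($store, $tag, $text) = @_;
    return unless defined $text && length $text;
    $store->{$tag} = defined $store->{$tag} ? "$store->{$tag} $text" : $text;
}

for my $line (split /\n/, $input) {
    my ($first, $rest) = split ' ', $line, 2;
    next unless defined $first;                # skip blank lines
    if ($first eq 'BEGIN_TAG') {
        $current = '';                         # new record: stop collecting,
    }                                          # but keep what was saved
    elsif ($first =~ /^X[123]$/) {
        $current = $first;                     # start or resume this tag
        append_text(\%saved, $current, $rest);
    }
    elsif ($current) {
        append_text(\%saved, $current, $line); # continuation line
    }
}

print "$_: $saved{$_}\n" for sort keys %saved;
```

Under the assumed line breaks this prints the three lines given as the expected output in the question. Keying the hash by tag name also means a fourth tag would only need a change to the regex, not another pair of variables.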

      Please show wrong and expected output and code with proper indentation.

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Je suis Charlie!

        Edited with wrong and expected output plus formatted source