in reply to Scanning a file and extracting certains values within one or multiple lines

OK, I used the code from the comments and tried adding the scanning for X2 and X3 the same way as for X1, but for some reason the saveto vars are mixed up at the end.
The wrong output I get is:
 X3: yes yes yes BEGIN_TAG X1 test test test tes test test X2 no no no no no  nono non no no  nononono no no  no X3 hi hi hi hi hi hi  hi hi hi hi  hi hi

The right one would be:
X1: test test test test test test tes test test
X2: no no no no no no no no no nono non no no nononono no no no
X3: yes yes yes hi hi hi hi hi hi hi hi hi hi hi hi



The code I have is:

#!/usr/bin/perl
use strict;
use warnings;

open (FILE, '<', $ARGV[0]) or die "Could not open file: $!";

my $saveX1=0;
my $saveX2=0;
my $saveX3=0;
my $savetoX1='';
my $savetoX2='';
my $savetoX3='';

while (my $line=<FILE>){
    chomp $line;
    my @parts=split(' ',$line,2);

    if ( (!$saveX1) && $parts[0] eq 'X1') {
        $savetoX1=$parts[1];
        $saveX1 +=1;
    } elsif ($parts[0] eq 'X2' ||($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ) ){
        #print $savetoX1."\n" if ($savetoX1) ;
        $savetoX1='';
        $saveX1=0;
    } elsif ($saveX1) {
        $savetoX1.=' '.$line;
    }

    if ( (!$saveX2) && $parts[0] eq 'X2') {
        $savetoX2=$parts[1];
        $saveX2 +=1;
    } elsif ($parts[0] eq 'X3' ||($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ) ){
        #print $savetoX2."\n" if ($savetoX2) ;
        $savetoX2='';
        $saveX2=0;
    } elsif ($saveX2) {
        $savetoX2.=' '.$line;
    }

    if ( (!$saveX3) && $parts[0] eq 'X3') {
        $savetoX3=$parts[1];
        $saveX3 +=1;
    } elsif ($parts[0] eq 'BEGIN' && $parts[2] eq 'TAG' ){
        #print $savetoX3."\n" if ($savetoX3) ;
        $savetoX3='';
        $saveX3=0;
    } elsif ($saveX3) {
        $savetoX3.=' '.$line;
    }
}

print "X1: ".$savetoX1."\n" if ($savetoX1) ;
print "X2: ".$savetoX2."\n" if ($savetoX2) ;
print "X3: ".$savetoX3."\n" if ($savetoX3) ;

Re^2: Scanning a file and extracting certains values within one or multiple lines
by haukex (Archbishop) on Mar 13, 2017 at 11:09 UTC

    I'm not entirely sure I understand the rules by which you want newlines to be added to the output or not. In the following, if you don't want newlines in the output, change the two occurrences of "\n" to " ".

    #!/usr/bin/env perl
    use warnings;
    use strict;

    my %data;
    my $cur_tag;
    while (<DATA>) {
        chomp;
        if ( my ($tag,$line) = /^(X\S+)\s+(.+)$/ ) {
            $data{$tag} .= "\n" if exists $data{$tag};
            $data{$tag} .= $line;
            $cur_tag = $tag;
        }
        elsif ($cur_tag && !/^BEGIN_TAG/) {
            $data{$cur_tag} .= "\n".$_;
        }
    }
    for my $tag (sort keys %data) {
        print "$tag: $data{$tag}\n";
    }

    __DATA__
    BEGIN_TAG
    X1 test1 test1 test1
    X2 no1 no1 no1 no1
    X3 yes1 yes1 yes1
    BEGIN_TAG
    X1 test2 test2 test2
    test3 test3 test3
    X2 no2 no2 no2 no2 no2
    no3 no3 no no3 no3
    no4 no4 no4 no4 no4 no4 no4
    X3 hi2 hi2 hi2 hi2 hi2 hi2
    hi3 hi3 hi3 hi3
    hi4 hi4

    Output:

    X1: test1 test1 test1
    test2 test2 test2
    test3 test3 test3
    X2: no1 no1 no1 no1
    no2 no2 no2 no2 no2
    no3 no3 no no3 no3
    no4 no4 no4 no4 no4 no4 no4
    X3: yes1 yes1 yes1
    hi2 hi2 hi2 hi2 hi2 hi2
    hi3 hi3 hi3 hi3
    hi4 hi4

    Update: Inverted "\n" vs " ".
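
For readers who want everything for a tag on one line, here is a minimal sketch of the space-joined variant described above, wrapped in a sub so it can be tried without a data file. The sub name `collect_tags` and the sample lines are assumptions made for illustration; they are not from the thread.

```perl
#!/usr/bin/env perl
use warnings;
use strict;

# Hypothetical helper: gather the text that follows each X-tag and
# join every piece with a single space instead of a newline.
sub collect_tags {
    my @lines = @_;    # work on a copy so the caller's data is untouched
    my ( %data, $cur_tag );
    for my $line (@lines) {
        chomp $line;
        if ( my ( $tag, $text ) = $line =~ /^(X\S+)\s+(.+)$/ ) {
            # A tag line: start (or continue) the entry for this tag.
            $data{$tag} .= " " if exists $data{$tag};
            $data{$tag} .= $text;
            $cur_tag = $tag;
        }
        elsif ( defined $cur_tag && $line !~ /^BEGIN_TAG/ ) {
            # A continuation line: append it to the most recent tag.
            $data{$cur_tag} .= " " . $line;
        }
    }
    return \%data;
}

# Assumed sample input, shaped like the data in this thread.
my @sample = (
    "BEGIN_TAG", "X1 test1 test1", "X2 no1 no1", "X3 yes1",
    "BEGIN_TAG", "X1 test2", "test3 test3", "X2 no2", "X3 hi2", "hi3",
);
my $data = collect_tags(@sample);
print "$_: $data->{$_}\n" for sort keys %$data;
```

With the sample lines above, each tag ends up with one space-separated string that spans both BEGIN_TAG sets.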

      Wow that's it thank you!!!!
Re^2: Scanning a file and extracting certains values within one or multiple lines
by huck (Prior) on Mar 13, 2017 at 12:17 UTC

    Sorry, I missed the underscore in BEGIN_TAG, and I even thought I remembered going up to look to make sure. That meant that BEGIN_TAG didn't serve as the ending trigger like it should have; instead I was looking for "BEGIN TAG" with a space. So this would fix my example.

    open(FILE1, $ARGV[0]) || die "Error: $!\n";
    my $save=0;
    my $saveto='';
    while (my $line=<FILE1>){
        chomp $line;
        my @parts=split(' ',$line,2);
        if ( (!$save) && $parts[0] eq 'X1') {
            $saveto=$parts[1];
            $save=1;
        } elsif ($parts[0] eq 'X2' ||($parts[0] eq 'BEGIN_TAG' ) ){
            print $saveto."\n" if ($saveto) ;
            $saveto='';
            $save=0;
        } elsif ($save) {
            $saveto.=' '.$line;
        }
    }
    print $saveto."\n" if ($saveto) ;

    And you never said you wanted to capture all of them to the end, so after each set was complete (by hitting X2 or, now fixed, BEGIN_TAG) I printed it AND reset $saveto to blank. When you commented out the print, you didn't comment out the reset to blank, so only the last set was saved.
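
If the goal is to keep everything until the end, one option is to stop collecting at BEGIN_TAG without blanking what has already been saved. The following is only a sketch under that assumption: the sub name `accumulate`, the `%saveto` hash and the sample lines are made up for illustration. It also compares `$parts[0]` against 'BEGIN_TAG' directly, because `split(' ', $line, 2)` returns at most two fields, so `$parts[2]` is never set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: accumulate the text for each listed tag across
# every BEGIN_TAG set, without resetting what was collected earlier.
sub accumulate {
    my ( $tags, @lines ) = @_;
    my %is_tag = map { $_ => 1 } @$tags;
    my %saveto;
    my $current;    # tag currently being collected, if any

    for my $line (@lines) {
        chomp $line;
        my @parts = split ' ', $line, 2;    # at most [0] and [1]
        next unless @parts;                 # skip blank lines

        if ( $parts[0] eq 'BEGIN_TAG' ) {
            $current = undef;               # end the block, keep the text
        }
        elsif ( $is_tag{ $parts[0] } ) {
            $current = $parts[0];
            my $text = defined $parts[1] ? $parts[1] : '';
            $saveto{$current} =
              defined $saveto{$current}
              ? "$saveto{$current} $text"
              : $text;
        }
        elsif ( defined $current ) {
            $saveto{$current} .= ' ' . $line;    # continuation line
        }
    }
    return %saveto;
}

# Assumed sample input with two sets.
my %got = accumulate(
    [qw(X1 X2 X3)],
    "BEGIN_TAG", "X1 test", "test", "X2 no", "X3 yes",
    "BEGIN_TAG", "X1 tes",  "X2 nono", "X3 hi", "hi",
);
print "$_: $got{$_}\n" for sort keys %got;
```

Keeping one hash keyed by tag name avoids the three parallel if/elsif chains, so adding a tag such as X4 would only mean extending the list passed in.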

Re^2: Scanning a file and extracting certains values within one or multiple lines
by LanX (Saint) on Mar 13, 2017 at 10:53 UTC
    Please show wrong and expected output and code with proper indentation.

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

      Edited with wrong and expected output plus formatted source