in reply to Re: Combined lines from a file into one
in thread Combined lines from a file into one

Can you please explain what is happening in the line $_ = <<END; ? And instead of having the data in the script, can I read it from a file, like this:
use strict;
use warnings;

open (rangeOutput, '>', "test.txt") or die "Cannot Open File: test.txt: $!";
open (rangeFixed, '<', "rangeFile.txt") or die "Cannot Open File: rangeFile.txt $!";

while ( <rangeFixed> ) {
    $_ = s/^(IMB,\d+,V1\s,)(\d+),\K(?:.*\n)+\1\d+,(\d+).*/$3/gmx;
    print rangeOutput $_ . "\n";
}
It didn't work for me when I tried to read the ranges from a file. Any suggestions?

Re^3: Combined lines from a file into one
by poj (Abbot) on Aug 05, 2015 at 15:15 UTC
    open (ImbRangeHelper, '>', "test.txt") opens the file for writing. Did you mean
    open (ImbRangeHelper, '<', "test.txt") ?
    poj
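
On the two points raised in the question, here is a minimal sketch (the variable names are illustrative, and the two sample records are copied from data posted later in this thread). `$_ = <<END;` is a here-document: every line up to the END marker is assigned to $_ as one multi-line string. Separately, writing `$_ = s/.../.../` overwrites $_ with the return value of the substitution rather than the modified text, which is probably not what was intended.

```perl
use strict;
use warnings;

# A here-document: everything up to the END marker becomes one string.
my $text = <<END;
IMB,060410,V1 ,371094378,371096338,1961
IMB,060410,V1 ,371096340,371096486,147
END

# Count the newlines to show the heredoc is a single multi-line string.
my $line_count = () = $text =~ /\n/g;
print "lines in heredoc: $line_count\n";

# Pitfall: the value returned by s///g is the number of replacements,
# not the modified text. The text itself is changed in place.
my $copy   = $text;
my $result = ($copy =~ s/IMB/XYZ/g);
print "s/// returned: $result\n";
print "modified text starts with: ", substr($copy, 0, 3), "\n";
```

Reading a file one line at a time also means a pattern that spans several lines never sees more than one of them, which is why slurping the whole file into one string is the usual approach for a multi-line substitution.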
Re^3: Combined lines from a file into one
by Anonymous Monk on Aug 05, 2015 at 15:35 UTC
    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1137428
    use strict;
    use warnings;

    open (my $rangeFixed, '<', "rangeFile.txt") or die "Cannot Open File: rangeFile.txt $!";
    $_ = join '', <$rangeFixed>;
    s/^(IMB,\d+,V1\s,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;
    open (my $rangeOutput, '>', "test.txt") or die "Cannot Open File: test.txt: $!";
    print $rangeOutput $_;
    close $rangeOutput;
      It didn't work. The output contained all the lines; it didn't combine them.

        Can you put your data in code tags? It looks like you have more than one space here:
        V1      ,
        If so, the regex should have a + here: V1\s+,
        poj
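
A small sketch of the difference between `\s` and `\s+`, assuming a record with several spaces after V1 as suspected above (the sample line is hypothetical, padded for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample line with several spaces between "V1" and the comma.
my $line = "IMB,060410,V1      ,371094378,371096338,1961";

# \s matches exactly one whitespace character, so this fails
# when more than one space precedes the comma.
my $single = ($line =~ /^IMB,\d+,V1\s,/)  ? 1 : 0;

# \s+ matches one or more whitespace characters.
my $multi  = ($line =~ /^IMB,\d+,V1\s+,/) ? 1 : 0;

print "with \\s : ", ($single ? "match" : "no match"), "\n";
print "with \\s+: ", ($multi  ? "match" : "no match"), "\n";
```

If the field could also have no space at all, `\s*` would be the more forgiving choice.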

      It works now; it was missing the space. Thanks for the catch. Now I want to try plugging in variables. Can I use variables for this:
      s/^(IMB,\d+,V1\s+,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;
      s/^($type,\d+,$version,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;
      Do you think it will work?
        Yes, if you mean like this
        my $type = 'IMB';
        my $version = 'V1';
        s/^($type,\d+,$version\s+,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;

        Or do you mean the file contains records other than IMB,V1?
        poj
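
A self-contained sketch of the variable version, run against an in-memory string instead of a file. The records are copied from sample data posted later in this thread. Wrapping the variables in `\Q...\E` is an addition here, not part of the code above: it is a precaution so that characters such as "|" or "." in a type or version name are matched literally.

```perl
use strict;
use warnings;

# Illustrative values; in practice these might come from arguments or a config.
my $type    = 'IMB';
my $version = 'V1';

# Sample records copied from data posted later in this thread.
my $text = <<'END';
IMB,060410,V1 ,371094378,371096338,1961
IMB,060410,V1 ,371096340,371096486,147
IMB,107951,V1 ,981157588,981164939,7352
IMB,107951,V1 ,981164941,981165606,666
IMB,107951,V1 ,981165608,981175100,9493
IMB,107951,V1 ,981175102,981176199,1098
END

# \Q...\E quotes regex metacharacters inside the interpolated variables.
# With /x, the whitespace and line breaks in the pattern are ignored.
$text =~ s/^(\Q$type\E,\d+,\Q$version\E\s+,)(\d+),\K
           (?:.*\n)+
           \1\d+,(\d+).*/$3/gmx;

print $text;
```

Each group of lines sharing the same prefix collapses to one line holding the first start value and the last end value. Everything after `\K` up to the end of the last matching line is replaced by `$3` alone, so the trailing count field is dropped.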
      I tried it with variables and it worked fine, but I have one problem: I am losing the comma at the end of the second line. The first line comes out fine, but the second line loses its comma.
      IMB,060410|,V1 ,371096340,371096486,147
      IMB,107951|,V1 ,981157588,981176199
      How can I add a comma to the end of the second line?
      IMB,060410|,folded ,307057959,307058193,235
      IMB,060410|,selfmail ,307058194,307066458,8265
      IMB,107951|,folded ,958090350,958091491,1142
      IMB,107951|,selfmail ,958091492,958132856,41365
      SEQ,,folded ,000000001,000001377,1377
      SEQ,,selfmail ,000001378,000051007,49630
      It is not always V1; sometimes it might be a different name. What I just noticed is that it won't work if you have different names, like the example above: it will delete the first line. In my case the ultimate solution is that the code looks at the lines, and if there are lines with the same prefix, such as

      IMB,060410|,folded ,

      IMB,060410|,folded ,

      then it will combine them. If they are different, like the example above, it will leave them unchanged.

        Assuming your file is not millions of lines, try this

        #!perl
        use strict;
        use warnings;

        my $infile  = 'rangeFile.txt';
        my $outfile = 'test.txt';

        open IN, '<', $infile or die "Cannot Open InFile: $infile : $!";
        open OUT, '>', $outfile or die "Cannot Open OutFile: $outfile : $!";

        my %out = ();
        my @key = ();
        my $count_in  = 0;
        my $count_out = 0;

        # input
        while (<IN>){
          chomp;
          ++$count_in;
          my @in = split ',',$_;
          my $key = join ',',@in[0..2];
          if ( ! defined $out{$key} ){
            # initialise
            @{$out{$key}} = @in[3..4];
            push @key,$key; # preserve order
          } else {
            # min
            if ($in[3] < $out{$key}[0]){
              $out{$key}[0] = $in[3];
            }
            # max
            if ($in[4] > $out{$key}[1]){
              $out{$key}[1] = $in[4];
            }
          }
        }

        # output
        for my $key (@key){
          print OUT join ',',$key,@{$out{$key}},"\n";
          ++$count_out;
        }
        close IN;
        close OUT;

        print " $count_in records read from $infile $count_out records written to $outfile\n";

        __DATA__
        IMB,060410,V1 ,371094378,371096338,1961
        IMB,060410,V1 ,371096340,371096486,147
        IMB,107951,V1 ,981157588,981164939,7352
        IMB,107951,V1 ,981164941,981165606,666
        IMB,107951,V1 ,981165608,981175100,9493
        IMB,107951,V1 ,981175102,981176199,1098
        IMB,060410|,folded ,307057959,307058193,235
        IMB,060410|,selfmail ,307058194,307066458,8265
        IMB,107951|,folded ,958090350,958091491,1142
        IMB,107951|,selfmail ,958091492,958132856,41365
        SEQ,,folded ,000000001,000001377,1377
        SEQ,,selfmail ,000001378,000051007,49630
        poj
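
On the trailing comma asked about earlier: in the output line above, "\n" is the last element of the joined list, so join places a comma in front of it. A minimal sketch of both variants, with a key and a min/max pair written out by hand (values taken from the sample data, variable names illustrative):

```perl
use strict;
use warnings;

# Key and min/max pair as they might look after the input loop.
my $key   = 'IMB,060410,V1 ';
my @range = (371094378, 371096486);

# "\n" is the last list element, so join puts a comma before it.
my $with_comma = join ',', $key, @range, "\n";

# Appending "\n" after the join gives a line without the trailing comma.
my $without_comma = join(',', $key, @range) . "\n";

print $with_comma;
print $without_comma;
```

Which form is right depends on whether whatever reads the output file expects the extra empty field.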
      It didn't work; it is printing all the lines.
      It works now; it was missing the space.