in reply to Combined lines from a file into one

#!/usr/bin/perl
# http://perlmonks.org/?node_id=1137428
use strict;
use warnings;

$_ = <<END;
IMB,060410,V1 ,371094378,371096338,1961
IMB,060410,V1 ,371096340,371096486,147
IMB,107951,V1 ,981157588,981164939,7352
IMB,107951,V1 ,981164941,981165606,666
IMB,107951,V1 ,981165608,981175100,9493
IMB,107951,V1 ,981175102,981176199,1098
END

s/^(IMB,\d+,V1\s,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;
print;

Replies are listed 'Best First'.
Re^2: Combined lines from a file into one
by emadmahou (Acolyte) on Aug 05, 2015 at 14:38 UTC
    Can you please explain what is happening in the line $_ = <<END; ? And instead of having the data in the script, can I read it from a file, like this:
    use strict;
    use warnings;
    open (rangeOutput, '>', "test.txt") or die "Cannot Open File: test.txt: $!";
    open (rangeFixed, '<', "rangeFile.txt") or die "Cannot Open File: rangeFile.txt $!";
    while ( <rangeFixed> ) {
        $_ = s/^(IMB,\d+,V1\s,)(\d+),\K(?:.*\n)+\1\d+,(\d+).*/$3/gmx;
        print rangeOutput $_ . "\n";
    }
    It didn't work for me when I tried to read the ranges from a file. Any suggestions?
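
A minimal sketch of what the here-document does and why the loop above cannot merge anything, assuming a shortened copy of the sample data. The names $text, @lines, $line and $count are illustrative only. An in-memory filehandle stands in for rangeFile.txt so the example is self-contained.

```perl
use strict;
use warnings;

# A here-document: every line up to the one containing only END
# becomes part of a single string, embedded newlines included.
my $text = <<END;
IMB,060410,V1 ,371094378,371096338,1961
IMB,060410,V1 ,371096340,371096486,147
END

# Reading a handle line by line yields one record at a time, so a
# pattern that has to span several lines never sees more than one.
open my $in, '<', \$text or die "Cannot open in-memory file: $!";
my @lines = <$in>;
close $in;

# s/// returns the number of substitutions (or an empty string when
# nothing matched), not the modified text. Writing "$_ = s/.../.../"
# therefore overwrites the data with that return value.
my $line  = $lines[0];
my $count = ($line =~ s/V1/V2/);

print scalar(@lines), " lines read; s/// returned $count\n";
```

This suggests two changes to the loop: read the whole file into one string before substituting, and apply s/// to the string without assigning its return value back.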
      open (ImbRangeHelper, '>', "test.txt") opens a file for writing. Did you mean
      open (ImbRangeHelper, '<', "test.txt") ?
      poj
      #!/usr/bin/perl
      # http://perlmonks.org/?node_id=1137428
      use strict;
      use warnings;

      open (my $rangeFixed, '<', "rangeFile.txt") or die "Cannot Open File: rangeFile.txt $!";
      $_ = join '', <$rangeFixed>;
      s/^(IMB,\d+,V1\s,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;
      open (my $rangeOutput, '>', "test.txt") or die "Cannot Open File: test.txt: $!";
      print $rangeOutput $_;
      close $rangeOutput;
        It didn't work. The output was all the lines; it didn't combine them.
        It works now; it was missing the space. Thanks for the catch. Now I want to try to plug in variables. Can I use a variable here, changing the first line into the second:
        s/^(IMB,\d+,V1\s+,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;
        s/^($type,\d+,$version,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;
        Do you think it will work?
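
A hedged sketch of interpolating variables into this pattern. Under /x, whitespace in the pattern is ignored, and that includes whitespace that arrives through an interpolated variable, so a value such as "V1 " would lose its space. Wrapping each variable in \Q...\E escapes the space along with any regex metacharacters. The values of $type and $version, and the names $data and $merged, are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical values for the two variables named in the question.
my $type    = 'IMB';
my $version = 'V1 ';    # trailing space, as in the sample data

my $data = <<END;
IMB,060410,V1 ,371094378,371096338,1961
IMB,060410,V1 ,371096340,371096486,147
END

# \Q...\E quotes the interpolated text, so the space in $version
# survives /x and characters like "|" or "." are matched literally.
(my $merged = $data) =~
    s/^(\Q$type\E,\d+,\Q$version\E,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3/gmx;

print $merged;    # prints "IMB,060410,V1 ,371094378,371096486"
```

Without \Q...\E the pattern would look for "V1," with no space and match nothing in this data.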
        I tried it with variables and it worked fine, but I have one problem: I am losing the comma at the end of the second line. The first line comes out fine, but the second line loses its comma.
        IMB,060410|,V1 ,371096340,371096486,147
        IMB,107951|,V1 ,981157588,981176199
        How can I add a comma to the end of the second line?
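
One possible way to keep a trailing comma, sketched against the sample data from the top of the thread (without the "|" shown just above). The trailing `.*` in the pattern consumes the rest of the last matched line, so whatever should follow the end value has to be written into the replacement. The names $data and $merged are illustrative.

```perl
use strict;
use warnings;

my $data = <<END;
IMB,107951,V1 ,981157588,981164939,7352
IMB,107951,V1 ,981164941,981165606,666
IMB,107951,V1 ,981165608,981175100,9493
IMB,107951,V1 ,981175102,981176199,1098
END

# Same pattern as before; only the replacement changes from "$3"
# to "$3," so the merged line ends with a comma.
(my $merged = $data) =~
    s/^(IMB,\d+,V1\s,)(\d+),\K (?:.*\n)+ \1\d+,(\d+).*/$3,/gmx;

print $merged;    # prints "IMB,107951,V1 ,981157588,981176199,"
```

The /x flag applies only to the pattern, so the comma in the replacement is taken literally.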
        IMB,060410|,folded ,307057959,307058193,235
        IMB,060410|,selfmail ,307058194,307066458,8265
        IMB,107951|,folded ,958090350,958091491,1142
        IMB,107951|,selfmail ,958091492,958132856,41365
        SEQ,,folded ,000000001,000001377,1377
        SEQ,,selfmail ,000001378,000051007,49630
        It is not always V1; sometimes it is a different name. What I just noticed is that it won't work when the names differ, as in the example above: it deletes the first line. In my case the ultimate solution is for the code to look at each line and, if there are lines with the same prefix, such as

        IMB,060410|,folded ,

        IMB,060410|,folded ,

        combine them. If the prefixes differ, as in the example above, it should leave the lines unchanged.
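
The behaviour described here could also be handled line by line instead of with one multi-line regex. The following is only a sketch under stated assumptions: the prefix is taken to be the first three comma-separated fields, only consecutive lines are merged, and a merged line keeps the first start and the last end and drops the count, as the regex version does. merge_ranges is a hypothetical helper, not code from this thread.

```perl
use strict;
use warnings;

# Merge consecutive records that share their first three fields.
# A record whose prefix occurs only once is returned unchanged.
sub merge_ranges {
    my @lines = @_;
    my @out;
    my ($key, $start, $end, $first, $n);    # state of the current run

    # Emit the current run: the original line if it stood alone,
    # otherwise prefix, first start and last end.
    my $flush = sub {
        return unless defined $key;
        push @out, $n == 1 ? $first : "$key$start,$end";
    };

    for my $line (@lines) {
        chomp $line;
        if ($line =~ /^((?:[^,]*,){3})(\d+),(\d+)/) {
            my ($k, $s, $e) = ($1, $2, $3);
            if (defined $key && $k eq $key) {
                $end = $e;                   # extend the current run
                $n++;
            }
            else {
                $flush->();                  # close the previous run
                ($key, $start, $end, $first, $n) = ($k, $s, $e, $line, 1);
            }
        }
        else {
            $flush->();                      # unparsable line: pass through
            undef $key;
            push @out, $line;
        }
    }
    $flush->();
    return @out;
}

print "$_\n" for merge_ranges(
    "IMB,060410,V1 ,371094378,371096338,1961",
    "IMB,060410,V1 ,371096340,371096486,147",
    "IMB,060410|,folded ,307057959,307058193,235",
    "IMB,060410|,selfmail ,307058194,307066458,8265",
);
```

With this input the two V1 lines collapse into "IMB,060410,V1 ,371094378,371096486", while the folded and selfmail lines are printed as they came in because their prefixes differ.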
        It didn't work; it prints all the lines.
        It works now; it was missing the space.