Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I hereby seek your wisdom for a problem that I am encountering in a script of mine:
I end up having lines like the following:
.06669701$-.1672469$.02157899$.0346167$.65879324$.91614802$.45012441$rs11080516
-.05563282$-.00141932$.00036263$.00035752$.00541792$.00345471$.92692543$rs11080
> 530
-.00042649$-.00475721$.00167316$.00182299$.99057815$.90393977$.69906733$rs11080
> 537
.10901125$.0361255$.02148475$.00853908$.74113541$.45908988$.68361003$rs11080542
-.03866776$-.05004879$.00185491$.00145606$.38131545$.36141448$.17551403$rs11080
> 557
-.004521$.01312692$.00033174$.00070174$.77873394$.88615378$.63499741$rs1108056
.0339248$.02398934$.00276444$.00303152$.80478053$.55026576$.67512938$rs11080561

What I need to do is this: whenever a line starts with "> ", append the rest of that line to the previous line.
In my example, I would then get:
.06669701$-.1672469$.02157899$.0346167$.65879324$.91614802$.45012441$rs11080516
-.05563282$-.00141932$.00036263$.00035752$.00541792$.00345471$.92692543$rs11080530
-.00042649$-.00475721$.00167316$.00182299$.99057815$.90393977$.69906733$rs11080537
.10901125$.0361255$.02148475$.00853908$.74113541$.45908988$.68361003$rs11080542
-.03866776$-.05004879$.00185491$.00145606$.38131545$.36141448$.17551403$rs11080557
-.004521$.01312692$.00033174$.00070174$.77873394$.88615378$.63499741$rs1108056
.0339248$.02398934$.00276444$.00303152$.80478053$.55026576$.67512938$rs11080561

So, the obvious part of the code would be something like:
while (<>) {
    if ($_ =~ /^> ([\d\.\-]+)/) {
        $part_to_attach = $1;
    }
}

but I am stuck as to how to paste it in the line above.

Re: append data to previous line
by fishmonger (Chaplain) on Apr 15, 2015 at 14:10 UTC

    One approach would be to reset the input record separator.

    #!/usr/bin/perl
    use warnings;
    use strict;

    $/ = "\n> ";    # a record now ends at "newline, >, space"
    while (<DATA>) {
        chomp;      # removes the trailing "\n> ", joining the two pieces
        print;
    }
    __DATA__
    .06669701$-.1672469$.02157899$.0346167$.65879324$.91614802$.45012441$rs11080516
    -.05563282$-.00141932$.00036263$.00035752$.00541792$.00345471$.92692543$rs11080
    > 530
    -.00042649$-.00475721$.00167316$.00182299$.99057815$.90393977$.69906733$rs11080
    > 537
    .10901125$.0361255$.02148475$.00853908$.74113541$.45908988$.68361003$rs11080542
    -.03866776$-.05004879$.00185491$.00145606$.38131545$.36141448$.17551403$rs11080
    > 557
    -.004521$.01312692$.00033174$.00070174$.77873394$.88615378$.63499741$rs1108056
    .0339248$.02398934$.00276444$.00303152$.80478053$.55026576$.67512938$rs11080561

    outputs:

    .06669701$-.1672469$.02157899$.0346167$.65879324$.91614802$.45012441$rs11080516
    -.05563282$-.00141932$.00036263$.00035752$.00541792$.00345471$.92692543$rs11080530
    -.00042649$-.00475721$.00167316$.00182299$.99057815$.90393977$.69906733$rs11080537
    .10901125$.0361255$.02148475$.00853908$.74113541$.45908988$.68361003$rs11080542
    -.03866776$-.05004879$.00185491$.00145606$.38131545$.36141448$.17551403$rs11080557
    -.004521$.01312692$.00033174$.00070174$.77873394$.88615378$.63499741$rs1108056
    .0339248$.02398934$.00276444$.00303152$.80478053$.55026576$.67512938$rs11080561

    EDIT:
    The simple example I gave just outputs the data, but, instead, you could push each line onto an array or append it to a string for further processing.
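
To illustrate the EDIT above, here is a minimal sketch that collects the joined lines into an array instead of printing them. The sub name `read_joined_lines` and the shortened sample data are made up for illustration; the sample is read through an in-memory filehandle so that the sketch is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read $fh with "\n> "
# as the record separator and return the joined lines as a list.
sub read_joined_lines {
    my ($fh) = @_;
    local $/ = "\n> ";        # restored automatically when the sub returns
    my $text = '';
    while ( my $chunk = <$fh> ) {
        chomp $chunk;         # strips a trailing "\n> " if there is one
        $text .= $chunk;
    }
    return split /\n/, $text; # one element per joined line
}

# Shortened, made-up sample in the same shape as the original data.
my $sample = <<'END';
a$b$rs11080
> 530
c$d$rs11080542
e$f$rs11080
> 557
END

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @lines = read_joined_lines($fh);
close $fh;

print "$_\n" for @lines;
# prints:
# a$b$rs11080530
# c$d$rs11080542
# e$f$rs11080557
```

Using `local` keeps the change to `$/` confined to the sub, so other reads in the same program still see ordinary newline-terminated lines.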

      Ah, damn, I should have thought of the record separator... Silly mistake, thanks a lot!
Re: append data to previous line
by Laurent_R (Canon) on Apr 15, 2015 at 17:54 UTC
    The approach offered by fishmonger is neat and clever.

    Using your approach, it could be done with deferred printing: you print the stored line only once you know it has no continuation, that is, after you have read the next line:

    use strict;
    use warnings;
    use feature qw(:5.14);

    my $line_out = <DATA>;
    chomp $line_out;
    while (<DATA>) {
        chomp;
        if ($_ =~ /^> ([\d\.\-]+)/) {
            $line_out .= $1;
        }
        else {
            say $line_out;
            $line_out = $_;
        }
    }
    say $line_out;
    __DATA__
    .06669701$-.1672469$.02157899$.0346167$.65879324$.91614802$.45012441$rs11080516
    [... I have abbreviated your data for this post ...]
    This gives the following output:
    $ perl overlapping_lines.pl
    .06669701$-.1672469$.02157899$.0346167$.65879324$.91614802$.45012441$rs11080516
    -.05563282$-.00141932$.00036263$.00035752$.00541792$.00345471$.92692543$rs11080530
    -.00042649$-.00475721$.00167316$.00182299$.99057815$.90393977$.69906733$rs11080537
    .10901125$.0361255$.02148475$.00853908$.74113541$.45908988$.68361003$rs11080542
    -.03866776$-.05004879$.00185491$.00145606$.38131545$.36141448$.17551403$rs11080557
    -.004521$.01312692$.00033174$.00070174$.77873394$.88615378$.63499741$rs1108056
    .0339248$.02398934$.00276444$.00303152$.80478053$.55026576$.67512938$rs11080561
    Well, fishmonger's solution is obviously simpler and better in this case, but I wanted to show how you could have proceeded, and this technique of deferred printing can be useful in many more complicated cases where redefining the input record separator would not solve the problem.
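
As one possible illustration of such a case: `$/` is a plain string, not a pattern, so it cannot cope with a continuation marker followed by a varying amount of whitespace. The sketch below applies the same deferred-output idea with a regex. The sub name `join_continuations`, the pattern, and the sample data are all assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): merge every line that
# matches $cont_re into the line before it. $cont_re must capture the text
# to append in $1.
sub join_continuations {
    my ( $fh, $cont_re ) = @_;
    my @out;
    my $pending;    # line held back until we have seen the one after it
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( defined $pending && $line =~ $cont_re ) {
            $pending .= $1;    # continuation: glue it onto the held line
        }
        else {
            # The held line is complete. A continuation on the very first
            # line has nothing to attach to and is kept as an ordinary line.
            push @out, $pending if defined $pending;
            $pending = $line;
        }
    }
    push @out, $pending if defined $pending;    # flush the last held line
    return @out;
}

# Made-up sample: the marker is followed by zero or several spaces.
my $sample = <<'END';
a$b$rs11080
>530
c$d$rs11080542
e$f$rs11080
>   557
END

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n" for join_continuations( $fh, qr/^>\s*(\S+)/ );
close $fh;
# prints:
# a$b$rs11080530
# c$d$rs11080542
# e$f$rs11080557
```

Unlike the version that reads the first line before the loop, this sketch also copes with empty input, because nothing is chomped or printed until a line has actually been read.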

    Je suis Charlie.