in reply to Help in joining these lines

#!/usr/bin/perl
# http://perlmonks.org/?node_id=1171851

use strict;
use warnings;

$_ = join '', <DATA>;
1 while s/
    ^\w*\K\n(?=\w*\n)
  |
    ^.*\..*\K\n(?=.*\.)
  //mx;
print;

__DATA__
> gi|11SB_CUCMA Train|1 21
MARSSLFTFLCLAVFINGCLSQIEQQSPWEFQGS
EVWQQHRYQSPRACRLENLRAQDPVRLLLPGFSNAPKLIFV
AQGFGIRGIAIPGCAETYQT
SSSSSSSSSSSSSSSSSSSSS....................
...........................
.................
..........
> gi|1A43_HUMAN Train|1 24
MAVMAPRTLVLLLSGALALTQTWAGSHSMRYFYTSVSRPG
RGEPRFIAVGYVDDTQFVRFDSDAASQRMEPRAPWIEQEGPEYWSQTDRANLGTLRGYYNQSEDGSHTIQR
MYGCDVGPDGRFLRGYQQDAYDGKDYIALNEDLRSWTAADMAAQITQRKWETAHEAE
SSSSSSSSSSSSSSSSSSSSSSSS.........................................................................................
.............
..........................................

Re^2: Help in joining these lines
by Anonymous Monk on Sep 15, 2016 at 15:48 UTC
    Thanks to both of you who helped me!
    Is it possible to explain the pattern match? I am really interested in learning this technique, but I am afraid I can't really understand the expression that the gods wrote here.

      Although use re 'debug'; is available, YAPE::Regex::Explain is a fair amount more detailed in its output when explaining a regex:

      use warnings;
      use strict;
      use YAPE::Regex::Explain;

      my $re = 's/ ^\w*\K\n(?=\w*\n) | ^.*\..*\K\n(?=.*\.) //mx';
      print YAPE::Regex::Explain->new($re)->explain;

      Output:

      The regular expression:

      (?-imsx:s/ ^\w*\K\n(?=\w*\n) | ^.*\..*\K\n(?=.*\.) //mx)

      matches as follows:

      NODE                     EXPLANATION
      ----------------------------------------------------------------------
      (?-imsx:                 group, but do not capture (case-sensitive)
                               (with ^ and $ matching normally) (with . not
                               matching \n) (matching whitespace and #
                               normally):
      ----------------------------------------------------------------------
        s/                     's/ '
      ----------------------------------------------------------------------
        ^                      the beginning of the string
      ----------------------------------------------------------------------
        \w*                    word characters (a-z, A-Z, 0-9, _) (0 or
                               more times (matching the most amount
                               possible))
      ----------------------------------------------------------------------
        \K                     'K'
      ----------------------------------------------------------------------
        \n                     '\n' (newline)
      ----------------------------------------------------------------------
        (?=                    look ahead to see if there is:
      ----------------------------------------------------------------------
          \w*                  word characters (a-z, A-Z, 0-9, _) (0 or
                               more times (matching the most amount
                               possible))
      ----------------------------------------------------------------------
          \n                   '\n' (newline)
      ----------------------------------------------------------------------
        )                      end of look-ahead
      ----------------------------------------------------------------------
                               ' '
      ----------------------------------------------------------------------
       |                       OR
      ----------------------------------------------------------------------
                               ' '
      ----------------------------------------------------------------------
        ^                      the beginning of the string
      ----------------------------------------------------------------------
        .*                     any character except \n (0 or more times
                               (matching the most amount possible))
      ----------------------------------------------------------------------
        \.                     '.'
      ----------------------------------------------------------------------
        .*                     any character except \n (0 or more times
                               (matching the most amount possible))
      ----------------------------------------------------------------------
        \K                     'K'
      ----------------------------------------------------------------------
        \n                     '\n' (newline)
      ----------------------------------------------------------------------
        (?=                    look ahead to see if there is:
      ----------------------------------------------------------------------
          .*                   any character except \n (0 or more times
                               (matching the most amount possible))
      ----------------------------------------------------------------------
          \.                   '.'
      ----------------------------------------------------------------------
        )                      end of look-ahead
      ----------------------------------------------------------------------
         //mx                  ' //mx'
      ----------------------------------------------------------------------
      )                        end of grouping
      ----------------------------------------------------------------------

      For the items that still may not be clear, see perlre.
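For comparison with the use re 'debug' option mentioned above, here is a minimal sketch of how it could be tried on just the first alternative of the pattern. The two-line sample string is made up for illustration; the trace itself is written to STDERR and is not reproduced here.

```perl
use strict;
use warnings;

my $matched;
{
    # The pragma is lexically scoped, so tracing is limited to this block.
    use re 'debug';

    # Hypothetical two-line sample: a word-only line followed by another one.
    $matched = "AB\nCD\n" =~ /^\w*\K\n(?=\w*\n)/m ? 1 : 0;
}

# The compile and execution trace goes to STDERR; this line goes to STDOUT.
print "matched: $matched\n";
```

Redirecting STDERR to a file (for example with 2> trace.txt in a shell) keeps the trace separate from the program's normal output.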

        I think YAPE::Regex::Explain is an excellent tool for those learning the basics. However, it does have its LIMITATIONS:

        "There is no support for regular expression syntax added after Perl version 5.6, particularly any constructs added in 5.10. ..."

        — Ken

        YAPE::Regex::Explain is out of date. Since Perl 5.10, \K is no longer just a literal K: it causes everything matched to its left to be left out of $&, so a substitution leaves that text untouched.
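To make the effect of \K on $& concrete, here is a small sketch, assuming Perl 5.10 or later. The string "foobar" and the variable names are invented for the illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample string, chosen only for illustration.
my $text = "foobar";

# Without \K the whole of "foobar" is the match.
$text =~ /foo bar/x;
my $without_k = $&;

# With \K, "foo" must still be present but is dropped from the match.
$text =~ /foo \K bar/x;
my $with_k = $&;

# In a substitution, only the part to the right of \K is replaced.
(my $replaced = $text) =~ s/foo \K bar/BAZ/x;

print "without \\K: $without_k\n";
print "with \\K:    $with_k\n";
print "replaced:   $replaced\n";
```

Here $without_k is "foobar", $with_k is "bar", and $replaced is "fooBAZ". The same mechanism is why the substitution in the original script removes only the newline and not the line in front of it.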

      The '\K' and '(?=...)' parts are explained in Lookaround Assertions under "perlre: Extended Patterns".
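As a rough illustration of how those two constructs work together in the pattern from this thread, the sketch below repeats the substitution with a comment on each piece. It assumes Perl 5.10 or later; the shortened sample record is hypothetical and only mimics the shape of the real data.

```perl
use strict;
use warnings;

# Hypothetical, shortened sample in the same shape as the thread's data.
my $text = <<'END';
> gi|EXAMPLE Train|1 3
MARS
SLFT
FLCL
SSS..
.....
...
END

# Each pass removes one newline; the loop repeats until nothing matches.
1 while $text =~ s/
      ^ \w* \K \n        # a line of word characters only; \K keeps it, so only the newline is matched
      (?= \w* \n )       # but only when the next line is also word characters only
    |
      ^ .* \. .* \K \n   # a line containing a literal dot; again only the newline is matched
      (?= .* \. )        # but only when the next line also contains a dot
    //mx;

print $text;
```

With this sample the header line is left alone (it contains spaces and other non-word characters, and no dot), the three sequence lines collapse into MARSSLFTFLCL, and the three annotation lines collapse into SSS followed by ten dots. The lookahead checks the next line without consuming it, which is what lets the following pass join that line as well.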

      You can use Regexp::Debugger to see what's happening (step-by-step) as a regex is processed. I find this to be a very useful tool.

      — Ken