in reply to perl regex to match newline followed by number and text

Care to show us some sample data and how you'd like the result to look?

Note that you can use (?:...) to group stuff without capturing which can clean things up quite a bit. As a hint, I've added a little white space to your regex to make the groupings more obvious and added a digit at the start of each capture group. Maybe those numbers are not quite what you expect?

s/((\n) ([^0-9])+ (-)* (Aa-Zz)*) | ((\n) (\d{3}) (-)* (Aa-Zz)*)/$2$3/gx;
# 12    3         4    5           67    8       9    0
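To make the numbering hint concrete, here is a small sketch. The input string is made up (it is not your data) and the patterns are cut-down versions of the regex above, so treat it as an illustration rather than a fix.

```perl
use strict;
use warnings;

# Made-up input for illustration only: a newline followed by three digits.
my $text = "abc\n123-x";

# Capturing groups are numbered by their opening parenthesis, left to right,
# across the whole pattern, so the second alternative starts at group 4.
my @numbered;
if ($text =~ / ((\n) ([^0-9])) | ((\n) (\d{3})) /x) {
    @numbered = map { defined $_ ? $_ : 'undef' } ($1, $2, $3, $4, $5, $6);
}

# With (?:...) doing the grouping, only the interesting parts are captured.
my @grouped;
if ($text =~ / (?: \n ([^0-9]) ) | (?: \n (\d{3}) ) /x) {
    @grouped = map { defined $_ ? $_ : 'undef' } ($1, $2);
}

print "groups 1-3 (first alternative): @numbered[0..2]\n";
print "group 6 (digits in second alternative): $numbered[5]\n";
print "non-capturing version: \$1=$grouped[0] \$2=$grouped[1]\n";
```

When the second alternative is the one that matches, $2 and $3 are undefined and the digits end up in $6, which is probably not what a replacement of $2$3 was meant to use.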

Update:

Maybe what you want to achieve is something like this:

use strict;
use warnings;

my $wholeBallOfWax = do {local $/; <DATA>};
my @records = split /(?<=\n)(?=\d+-)/, $wholeBallOfWax;

s/\n+$/\n/s for @records;
print join "---\n", @records;

__DATA__
1-12 last non-blank field
2-10 data more data
3-21 stuff more stuff Lots of stuff so much stuff there is no following empty field
4-73 Sneeky record with a blank field in the middle!
5-00 Last record

Which prints:

1-12 last non-blank field
---
2-10 data more data
---
3-21 stuff more stuff Lots of stuff so much stuff there is no following empty field
---
4-73 Sneeky record with a blank field in the middle!
---
5-00 Last record

In the split regex there is a lookbehind, (?<=\n), which requires a newline immediately before the current position, and a lookahead, (?=\d+-), which requires one or more digits followed by a hyphen immediately after it. Neither assertion "consumes" any of the string, so the split doesn't drop any characters.
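A quick way to convince yourself that nothing is dropped is to join the pieces back together. This is only a sketch with a made-up string, and the second split is there purely for contrast:

```perl
use strict;
use warnings;

# Made-up sample: two records, each starting with digits and a hyphen.
my $text = "1-12\nfirst\n2-10\nsecond\nmore\n";

# Zero-width split: the pattern matches a position, not any characters.
my @records = split /(?<=\n)(?=\d+-)/, $text;

# Joining the pieces with nothing in between rebuilds the original string.
my $rejoined = join '', @records;

# Contrast: here the newline is part of the match, so split discards it.
my @consuming = split /\n(?=\d+-)/, $text;

print "zero-width split kept everything\n" if $rejoined eq $text;
print "consuming split lost the newline\n" if $consuming[0] eq "1-12\nfirst";
```

With the lookbehind, each record keeps its trailing newline, which is why the code above can print the records without adding any.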

As an aside, the do {local $/; <DATA>} bit temporarily undefines the input record separator $/, so a single read slurps everything from <DATA> into $wholeBallOfWax (although maybe that was obvious?).
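If it wasn't obvious, the following sketch shows the difference between a normal line read and a slurp. It reads from an in-memory file handle rather than <DATA>, and the string and variable names are invented for the example:

```perl
use strict;
use warnings;

# Made-up content, read through an in-memory file handle instead of <DATA>.
my $source = "line one\nline two\nline three\n";
open my $in, '<', \$source or die "Cannot open in-memory handle: $!";

# Normal read: $/ is "\n", so this returns just the first line.
my $first_line = <$in>;

# Slurp: local $/ leaves $/ undefined inside the block, so one read
# returns everything that is left.
my $rest = do { local $/; <$in> };

close $in;

# Outside the do block $/ is back to its previous value.
print "first: $first_line";
print "rest:  $rest";
print "separator restored\n" if defined $/ && $/ eq "\n";
```

Because local is scoped to the do block, the rest of the program keeps reading line by line as usual.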

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

Re^2: perl regex to match newline followed by number and text
by arunkumarzz (Novice) on May 31, 2019 at 15:12 UTC
    Hello Grandfather,

    The example you have provided is a little bit different from my issue. I have updated my question with some sample data. Could you please have a look at it?

    Thanks in advance
      #!/usr/bin/perl
      use strict;
      my $record;
      while (<DATA>){
        s/\n/ /;
        if (/^\d+~/){
          $record =~ s/ +$//; # trim trailing spaces
          printf "%s\n",$record if ($record);
          $record = $_;
        } else {
          $record .= $_;
        }
      }
      $record =~ s/ +$//;
      printf "%s\n",$record if ($record);
      __DATA__
      99~Arun~Kumar~Mobilenum: 1234-567 , from Earth Human
      98~Mahesh~Babu~Mobilenum: 5678-901 , from Earth Human
      poj
        Thank you everyone for your suggestions, poj's method worked for me right now.
      I see your Edit 1.
      Can you enclose the data in code tags so that we can see the new lines?
      The better the problem is described, the better the result will be. Your regex doesn't make much sense to me.

       s/((\n)([^0-9])+(-)*(Aa-Zz)*)|((\n)(\d{3})(-)*(Aa-Zz)*)/$2$3/g
      My brain hurts.