in reply to grep only lines having matched pattern

Both suggestions check for a space after the date. That works well in this example, but will fail for dates that are located at the end of the line.

As you posted your data explicitly, that won't be a problem, but looking at the criterion a bit more defensively, you can also say: match a date *not* followed by a -, or by a digit, letter or underscore (the identifier characters matched by \w).

my @lines = grep { m/ \b (?: 0[1-9] | 1[0-2] ) - (?: 0[1-9] | [12][0-9] | 3[01] ) - [0-9]{4} (?! [-\w] ) /x } @data;
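A minimal sketch of that lookahead in action, to show which lines survive the grep. The contents of @data and the name $date are made up for illustration (shortened lines in the style of the posted data, already chomped), not the original poster's input.

```perl
use strict;
use warnings;

# Hypothetical sample lines, already chomped (no trailing newline).
my @data = (
    '03-15-2021-1 21.1.0-s103 2021/03/15:14:16:39',
    '03-15-2021 21.1.0-s102 2021/03/15:04:00:09',
    '21.1.0-s102 2021/03/15:04:00:09 03-15-2021',
    '21.1.0-s102 2021/03/15:04:00:09 03-15-2021-4',
);

# MM-DD-YYYY that is not followed by '-' or a word character.
my $date = qr/ \b (?: 0[1-9] | 1[0-2] ) - (?: 0[1-9] | [12][0-9] | 3[01] ) - [0-9]{4} (?! [-\w] ) /x;

# Keep only the lines holding a "bare" date.
my @lines = grep { $_ =~ $date } @data;
print "$_\n" for @lines;
```

Only the second and third sample lines are kept: the date followed by a space, and the date at the very end of the string. The lines where the date carries a -1 or -4 suffix are rejected.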

And your input data is horrific: MM-DD-YYYY ... YYYY/MM/DD. How on earth does someone come up with a mixed format like that? (/me is all for a global ban on M/D/Y and Y/D/M format)


Enjoy, Have FUN! H.Merijn

Re^2: grep only lines having matched pattern
by Marshall (Canon) on Apr 01, 2021 at 21:34 UTC
    Well, \n is a whitespace character, so \s+ still matches a date at the end of a line that is read with its newline; see below. However, it appears to me that anchoring this regex to the beginning of the line is just fine.
    use strict;
    use warnings;

    while (<DATA>) {
        if (/\d{2}-\d{2}-\d{4}\s+/) {
            print;
        }
    }

    =Prints:

    03-15-2021 21.1.0-s102 2021/03/15:04:00:09 21.1 21.10-s102
    21.1.0-s102 2021/03/15:04:00:09 21.1 21.10-s102 03-15-2021   **works**

    =cut

    __DATA__
    03-15-2021-1 21.1.0-s103 2021/03/15:14:16:39 21.1 21.10-s103
    03-15-2021-2 21.1.0-s103 2021/03/15:14:16:39 21.1 21.10-s103
    03-15-2021 21.1.0-s102 2021/03/15:04:00:09 21.1 21.10-s102
    21.1.0-s102 2021/03/15:04:00:09 21.1 21.10-s102 03-15-2021
    21.1.0-s102 2021/03/15:04:00:09 21.1 21.10-s102 03-15-2021-4
    I guess that /\d{2}-\d{2}-\d{4}[^-]/ would also work?

      Using DATA in example code tells me nothing about the *real* source for the data. It can be a log file or a database or a process that pipes other sources into a (stream of) single log lines that have no line endings at all.
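      A small sketch of that difference, assuming a record that ends in the date and has no line ending at all. The record and the names $record and %check are hypothetical; the first two patterns are the ones from the parent post, the third uses the negative lookahead.

```perl
use strict;
use warnings;

# Hypothetical record with the date last and no trailing newline,
# e.g. after chomp or when read from a stream without line endings.
my $record = '21.1.0-s102 2021/03/15:04:00:09 03-15-2021';

my %check = (
    space     => qr/\d{2}-\d{2}-\d{4}\s+/,       # needs whitespace after the date
    not_dash  => qr/\d{2}-\d{2}-\d{4}[^-]/,      # needs some non-dash character after the date
    lookahead => qr/\d{2}-\d{2}-\d{4}(?![-\w])/, # only forbids '-' or a word character
);

for my $name (sort keys %check) {
    printf "%-10s %s\n", $name, $record =~ $check{$name} ? 'match' : 'no match';
}
```

      Both patterns that consume a character after the date need something to consume, so they miss the date at the end of the string; the lookahead only asserts what must *not* follow, so it still matches.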

      For *me*, thinking outside that box has sometimes made me overprotective. It not only makes many lines in my code show their intent more explicitly, it also protects against the other ways in which this data can be supplied (in the future).

      Be liberal on the receiving end and be strict on the producing end.

      Been there, done that: you have no idea how completely valid CSV files get corrupted by people in the chain that want to "check" the content using a spreadsheet program like Excel and instead of exiting hit "OK" when the program asks them to write the changed data even if the change is just widening the column or changing the font.


      Enjoy, Have FUN! H.Merijn