in reply to Re^10: regex to return line with name but not if it has a number
in thread regex to return line with name but not if it has a number

... it doesn't have any 0-9 digits in it.

But see update to this. (I suppose you could say that if anyone were the 2nd, 3rd or 4th, etc., of whomever, they're just out of luck.)


Give a man a fish:  <%-{-{-{-<

Re^12: regex to return line with name but not if it has a number
by Marshall (Canon) on Apr 03, 2017 at 23:10 UTC
    You are quite correct that 2nd, 3rd, etc. is theoretically a possibility. I think a rather rare one, but nevertheless a possibility.

    If something like that shows up, then I would go with my second code snippet and change the test on the proposed name to something other than "doesn't contain a digit". Perhaps something as simple as rejecting only names that start with a digit (looks like that fits OP's data):

    if (($name) = $line =~ /^\s*-\s+(.+?)\s*\|/) { print "$name\n" unless $name =~ /^\d/; }
    I suppose that these two regexes could be combined. But most of the time (in my personal experience), reducing the number of regexes doesn't matter. Also, a single more complex regex sometimes runs more slowly than two more straightforward ones.
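
    As a rough sketch of what combining the two tests might look like: the "doesn't start with a digit" check can be folded into the capture itself by requiring the first character of the name to be neither a digit nor whitespace. The sub name extract_name and the sample lines are made up for illustration (the line format is only guessed from the regex above), and the combined pattern is not exactly equivalent to the two-step version in every edge case, such as an all-blank name field.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the name field, or undef if the line
# doesn't match or the name starts with a digit.
sub extract_name {
    my ($line) = @_;
    # [^\d\s] forces the first character of the name to be a non-digit,
    # non-space character. Excluding whitespace too keeps \s+ from
    # backtracking and letting a name like " 42" slip through.
    my ($name) = $line =~ /^\s*-\s+([^\d\s].*?)\s*\|/;
    return $name;
}

# Made-up sample lines, not OP's actual data
my @lines = (
    ' - John Smith | 1234',
    ' -  42 | something',
    ' - Henry the 8th | 5678',
);

for my $line (@lines) {
    my $name = extract_name($line);
    print "$name\n" if defined $name;
}
```

    The backtracking detail in the comment is the kind of subtlety that makes the two-regex version easier to get right.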

    I can think of other weird-looking stuff that could possibly cause this simple scheme to fail.

    When writing an ad-hoc report parser, I usually try to keep things as simple as possible and then run it against as much data as practical while I am developing it. If and when some weirdo case like "3rd" shows up, I add code to handle it.
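
    One way to make such cases surface while running against lots of data is to keep the lines that look like entries but fail the name test, instead of silently dropping them. This is only a sketch under assumptions: triage is a hypothetical sub name, the sample lines are invented, and the name test is the simple "starts with a digit" one from above.

```perl
use strict;
use warnings;

# Hypothetical helper: sorts lines into accepted names and rejected
# entries so the rejects can be eyeballed during development.
sub triage {
    my @lines = @_;
    my (@names, @rejects);
    for my $line (@lines) {
        # Skip lines that aren't "- name |" entries at all
        next unless $line =~ /^\s*-\s+(.+?)\s*\|/;
        my $name = $1;
        if ($name =~ /^\d/) {
            push @rejects, $line;   # looks like an entry, fails the name test
        }
        else {
            push @names, $name;
        }
    }
    return (\@names, \@rejects);
}

# Made-up sample lines, not OP's actual data
my ($names, $rejects) = triage(
    ' - Jane Doe | x',
    ' - 3rd Earl of Somewhere | y',
    'some header line',
);

print "$_\n" for @$names;
warn "rejected: $_\n" for @$rejects;
```

    A reject like the "3rd Earl" line is the signal that the name test needs another look.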

    There is always some sort of trade-off between "perfection" and development time. For some modules and subs, I scour the universe looking for all possible cases and test against them. Other times, I go with something imperfect, but likely "good enough", especially when dealing with a format that is likely to change often and require re-coding anyway. There is a lot of YMMV in this stuff! I have one parser with a documented defect which can theoretically happen according to the report spec(s). However, after a decade and millions of actual examples, it hasn't happened yet. So I haven't bothered to write code to handle this very rare and hard-to-handle situation.