in reply to Re^8: regex to return line with name but not if it has a number
in thread regex to return line with name but not if it has a number

Thanks for your help. I get your comment re the name regex; it's certainly a tricky one. Here's an example file and some example code:

- 51 Laboratory Director Jobs in Redwood City, CA | › Jobs › Laboratory Director Jobs Apply to 51 Laboratory Director jobs in Redwood City, CA on . ... Apex Life Sciences; formerly known as Lab Support (2) BioReference Laboratories (2)
- Laboratory Director Relocate to Los Angeles Job at ... /jobs/view/250533025 We are looking for a Lab Director to lead our growing team, ... Director, CLIA Lab Illumina Redwood City, CA, US. Posted 15 days ago. Sponsored. Laboratory Technician
- 32 Director Laboratory Operations Jobs | › Jobs 32 Director Laboratory Operations jobs in United States on . ... and foster a high ... GCP and CLIA laboratory. Director of Lab Operations - CRO. We are ...
- 21 Associate Director Operations Jobs in San Mateo, CA ... › Jobs › Associate Director Operations Jobs Foster City, CA, US. Our worldwide ... Minimum of 5 years managerial experience in a CLIA lab environment and strong ... The Associate Director of Clinical Operations ...
- 321 Laboratory Operations Jobs in San Francisco, CA | › Jobs › Laboratory Operations Jobs Minimum of 5 years managerial experience in a CLIA lab environment and strong leadership skills to ... Associate Technical Director ... based in Foster City , CA …
- John Smith| Clinical Lab Operations Director at … ·
- 43 Laboratory Manager Jobs in Menlo Park, CA | › Jobs › Laboratory Manager Jobs Director; Executive; ... role in the overall running and safety of the laboratory and is responsible for managing the daily operations of the lab. ... Foster City, CA ...
- Joe Bubba | /Affairs and Quality … ·
- 169 Clinical Laboratory Scientist Jobs in California ... › Jobs › Clinical Laboratory Scientist Jobs Culver City, California (1) Monterey, California (1) West Sacramento, California (1) Company. Clinical Management Consultants (28) University of California, San ...
- David W. Anderson | /in/david-w-anderson-29b58326 View David W. Anderson’s professional profile on . ... In-licensed & integrated 2 products into CLIA lab ... •Provide leadership to foster team vision ...
- Related Searches for "CLIA Lab Director Foster City" site:w… clia lab director course
- Related searches clia lab director course

code below:

use strict;
use warnings;
use Cwd qw(getcwd);

print "I'm in trim\n";
# $ENV{PWD} is not updated by chdir, so ask the OS for the directory instead.
print "current working directory is ", getcwd(), "\n";

# Truncate the output file once and keep the handle open for the whole run.
# (Reopening "output2.txt" after the chdir would create a second file in ../Html3.)
open(my $out, ">", "output2.txt") or die "Cannot open output2.txt: $!";

chdir "../Html3" or die "Cannot chdir to ../Html3: $!";
print "current working directory is ", getcwd(), "\n";

my @files2 = grep { -f } glob("*.txt");
foreach my $file (@files2) {
    open(my $in, "<", $file) or die "Cannot open $file: $!";
    while (my $line = <$in>) {
        # Basic name match only: the whole line must look like a name.
        if ($line =~ m/^[A-Z]'?[- a-zA-Z]+$/) {
            print $out $line;
            print $line;
        }
    }
    close $in;
}
close $out;

Right now I'm thinking the best strategy is to match the leading hyphen and the name to extract the line, but I can't figure that regex out; I just have the basic name matching in there now. Thanks in advance for the assistance.

Re^10: regex to return line with name but not if it has a number
by Corion (Patriarch) on Apr 03, 2017 at 19:00 UTC

    Maybe it helps you if you approach your problem in a different way.

    From your data, it seems to me that the first field is always delimited by |. So maybe you should simply extract that first field and then worry about determining whether it's a person or a department.

    For example, the following regular expression will match everything up to the first bar:

    /^(.*?)\|/;

    Then you have the data in a far more manageable form and can worry about determining whether it's a name or a department.
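
    As a rough sketch of that two-step approach: the helper names `first_field` and `looks_like_name` are made up for illustration, the sample lines are shortened from the OP's data, and the "no digits" test is only a placeholder for whatever name/department check turns out to be needed.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text before the first '|', with the
# leading "- " marker and trailing whitespace removed. Returns undef
# (in scalar context) if the line contains no '|'.
sub first_field {
    my ($line) = @_;
    my ($field) = $line =~ /^(.*?)\|/ or return;
    $field =~ s/^\s*-\s+//;    # drop the leading hyphen marker
    $field =~ s/\s+$//;        # drop trailing whitespace
    return $field;
}

# Hypothetical classifier: assume a field is a person's name when it
# contains no digits. Deliberately crude; swap in a better test later.
sub looks_like_name {
    my ($field) = @_;
    return $field !~ /\d/;
}

# Sample lines, shortened from the data posted in the thread.
my @lines = (
    '- 32 Director Laboratory Operations Jobs | Jobs 32 Director ...',
    '- John Smith| Clinical Lab Operations Director at ...',
    '- Joe Bubba | /Affairs and Quality ...',
    '- Related searches clia lab director course',
);

my @names;
for my $line (@lines) {
    my $field = first_field($line);
    next unless defined $field;              # no '|' on this line
    push @names, $field if looks_like_name($field);
}
print "$_\n" for @names;
```

    Keeping the extraction and the classification in separate subs means the name test can be tightened later without touching the field-splitting regex.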

Re^10: regex to return line with name but not if it has a number
by Marshall (Canon) on Apr 03, 2017 at 19:17 UTC
    I guess that a name could be almost anything: Dr. Jon Smith, IV; Mary Tillson-Jones and lots of other possibilities. I took the approach of identifying what is to the left and right of a "name", rather than defining exactly what a name is, other than that it doesn't have any 0-9 digits in it.

    #!/usr/bin/perl
    use strict;
    use warnings;

    while (my $line = <DATA>) {
        my $name;
        if (($name) = $line =~ /^\s*-\s+(\D+?)\s*\|/) {
            print "$name\n";
        }
    }

    # Prints:
    # John Smith
    # Joe Bubba
    # David W. Anderson

    __DATA__
    - 51 Laboratory Director Jobs in Redwood City, CA | › Jobs › Laboratory Director Jobs Apply to 51 Laboratory Director jobs in Redwood City, CA on . ... Apex Life Sciences; formerly known as Lab Support (2) BioReference Laboratories (2)
    - Laboratory Director Relocate to Los Angeles Job at ... /jobs/view/250533025 We are looking for a Lab Director to lead our growing team, ... Director, CLIA Lab Illumina Redwood City, CA, US. Posted 15 days ago. Sponsored. Laboratory Technician
    - 32 Director Laboratory Operations Jobs | › Jobs 32 Director Laboratory Operations jobs in United States on . ... and foster a high ... GCP and CLIA laboratory. Director of Lab Operations - CRO. We are ...
    - 21 Associate Director Operations Jobs in San Mateo, CA ... › Jobs › Associate Director Operations Jobs Foster City, CA, US. Our worldwide ... Minimum of 5 years managerial experience in a CLIA lab environment and strong ... The Associate Director of Clinical Operations ...
    - 321 Laboratory Operations Jobs in San Francisco, CA | › Jobs › Laboratory Operations Jobs Minimum of 5 years managerial experience in a CLIA lab environment and strong leadership skills to ... Associate Technical Director ... based in Foster City , CA …
    - John Smith| Clinical Lab Operations Director at … ·
    - 43 Laboratory Manager Jobs in Menlo Park, CA | › Jobs › Laboratory Manager Jobs Director; Executive; ... role in the overall running and safety of the laboratory and is responsible for managing the daily operations of the lab. ... Foster City, CA ...
    - Joe Bubba | /Affairs and Quality … ·
    - 169 Clinical Laboratory Scientist Jobs in California ... › Jobs › Clinical Laboratory Scientist Jobs Culver City, California (1) Monterey, California (1) West Sacramento, California (1) Company. Clinical Management Consultants (28) University of California, San ...
    - David W. Anderson | /in/david-w-anderson-29b58326 View David W. Anderson’s professional profile on . ... In-licensed & integrated 2 products into CLIA lab ... •Provide leadership to foster team vision ...
    - Related Searches for "CLIA Lab Director Foster City" site:w… clia lab director course
    - Related searches clia lab director course

    Note this code would do the same thing:
    if (($name) = $line =~ /^\s*-\s+(.+?)\s*\|/) {
        print "$name\n" unless $name =~ /\d/;
    }

      ... it doesn't have any 0-9 digits in it.

      But see update to this. (I suppose you could say that if anyone were the 2nd, 3rd or 4th, etc., of whomever, they're just out of luck.)


      Give a man a fish:  <%-{-{-{-<

        You are quite correct that 2nd, 3rd, etc., is theoretically a possibility. I think it's a rather rare one, but a possibility nevertheless.

        If something like that shows up, I would go with my second code snippet and change the test on the proposed name to something other than "doesn't contain a digit". Perhaps something as simple as only rejecting digits at the start of the proposed name field (which looks like it fits the OP's data):

        if (($name) = $line =~ /^\s*-\s+(.+?)\s*\|/) {
            print "$name\n" unless $name =~ /^\d/;
        }

        I suppose that these two regexes could be combined. But most of the time (in my personal experience), reducing the number of regexes doesn't matter. Also, a more complex regex sometimes runs more slowly than two more straightforward ones.
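
        For what it's worth, one way the two tests might be folded into a single regex is with a negative lookahead. This is only a sketch: `extract_name` is a made-up wrapper, and the sample lines are shortened from the OP's data. The lookahead is written as `(?!\s*\d)` rather than `(?!\d)` so that the regex engine cannot backtrack `\s+` by one space and sneak a digit-led field past the check.

```perl
use strict;
use warnings;

# Hypothetical wrapper around a single combined regex.
# (?!\s*\d) is a negative lookahead: fail if the field starts with a digit.
sub extract_name {
    my ($line) = @_;
    my ($name) = $line =~ /^\s*-\s+(?!\s*\d)(.+?)\s*\|/;
    return $name;    # undef when the line does not match
}

# Sample lines, shortened from the data posted in the thread,
# plus one invented "3rd" case.
for my $line (
    '- 51 Laboratory Director Jobs in Redwood City, CA | Jobs',
    '- John Smith| Clinical Lab Operations Director at',
    '- John Smith 3rd | invented example with a trailing ordinal',
) {
    my $name = extract_name($line);
    print defined $name ? "name: $name\n" : "skipped\n";
}
```

        Whether this is any clearer than the two-step version is a matter of taste; the two-step form is easier to extend when the name test grows more rules.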

        I can think of other weird looking stuff that could possibly cause this simple scheme to fail.

        When writing an ad-hoc report parser, I usually try to keep things as simple as possible and then run it against as much data as practical while I am developing it. If and when some weirdo case like "3rd" shows up, I add code to handle it.
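
        One possible way to keep such cases from regressing is a small table of known lines and expected results, checked with the core Test::More module. The sketch below assumes a made-up `extract_name` sub wrapping the second snippet (with the leading-digit test); the cases are shortened from the OP's data. When a new oddity shows up, it becomes one more row in the table.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical wrapper around the two-step approach: extract the field
# after the leading hyphen, then reject it if it starts with a digit.
sub extract_name {
    my ($line) = @_;
    my ($name) = $line =~ /^\s*-\s+(.+?)\s*\|/;
    return if !defined $name || $name =~ /^\d/;
    return $name;
}

# Each case: [ input line, expected name or undef ].
my @cases = (
    [ '- John Smith| Clinical Lab Operations Director at',   'John Smith' ],
    [ '- Joe Bubba | /Affairs and Quality',                  'Joe Bubba' ],
    [ '- David W. Anderson | /in/david-w-anderson-29b58326', 'David W. Anderson' ],
    [ '- 43 Laboratory Manager Jobs in Menlo Park, CA | Jobs', undef ],
    [ '- Related searches clia lab director course',           undef ],
);

for my $case (@cases) {
    my ($line, $want) = @$case;
    my $got = extract_name($line);    # scalar context: undef on no match
    is($got, $want, "line: $line");
}

done_testing();
```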

        There is always some sort of trade-off between "perfection" and development time. For some modules and subs, I scour the universe looking for all possible cases and test against them. Other times, I go with something imperfect but likely "good enough", especially when dealing with a format that is likely to change often and require re-coding anyway. There is a lot of YMMV in this stuff! I have one parser with a documented defect which can theoretically happen according to the report spec(s). However, after a decade and millions of actual examples, it hasn't happened yet. So I haven't bothered to write code to handle this very rare and complicated-to-handle situation.