data67 has asked for the wisdom of the Perl Monks concerning the following question:

Hey all, I have been trying to come up with a regular expression that can detect a name in a flat-file database. Let's say, for example, I have a flat file for a dentist's office and I want to add a new patient to this file if he is not already present. Simple enough, right?
But wait, I have patients that share parts of a name.

Example:
Don Phillie Johnson
Alexander simon
simon DeGauo
simon DeGauo Henderson
Alex Johnson
Alex Johnson Stevens

You see, I need to come up with a somewhat smarter regex that can do this. One thing I forgot to mention in the scenario is that the flat file has different doctors,
so if you had something like the following, it is OK:
123!simon DeGauo!rm-200!Dr. Andre
123!simon DeGauo!rm-200!Dr. Banks
Now you see I am going to have to write a regex that looks at both the patient name (variation-sensitive) and the doctor.
I know the explanation is not the best, but I think you can look at the following and get a clue as well.

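$new_patient = "Alex";    # name being looked up (was assigned to $patient, which the loop shadows)
$new_doc = "Kessler";
open (REC, "record.data") or die "Cannot open record.data: $!";
@data = <REC>;
close (REC);
foreach $line (@data) {
    chomp $line;    # drop the newline so it is not printed twice below
    my ($id, $patient, $room, $doc) = split (/!/, $line);
    if (($patient =~ /($new_patient)/i) && ($doc =~ /$new_doc/)) {
        print "$line\n";
    } else {
        print "NO PATIENT BY THAT NAME\n";
    }
}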

Replies are listed 'Best First'.
Re: regular expression issue
by DamnDirtyApe (Curate) on Aug 02, 2002 at 22:14 UTC

    Two clarifications would be helpful:

    1. Are `simon DeGauo' and `simon DeGauo Henderson' two different people, or the same person with their name presented two different ways in the database?

    2. In these two records:
      123!simon DeGauo!rm-200!Dr. Andre
      123!simon DeGauo!rm-200!Dr. Banks
      Are we seeing two records for the same person, or two different people with the same name?

    _______________
    DamnDirtyApe
    Those who know that they are profound strive for clarity. Those who
    would like to seem profound to the crowd strive for obscurity.
                --Friedrich Nietzsche
      Ans. to Question 1
      These are different people who may share parts of their names.

      Ans. to Question 2
      What you wrote in the second question is the same patient but with different doctors. I mentioned this scenario in my initial question.

(shockme) Re: regular expression issue
by shockme (Chaplain) on Aug 03, 2002 at 03:46 UTC
    Perhaps I'm missing something obvious, but (in its most simplistic form) if
    $x = "simon DeGauo";            # aka current <INPUT>
    $y = "simon DeGauo Henderson";  # aka previous <INPUT>

    how can there be any confusion?

    If things get any worse, I'll have to ask you to stop helping me.

      I think you may have misunderstood the question.
      What I am after is to be able to take a name and see whether that particular patient is in the file for that doctor. Simple enough. But I was trying to come up with a regex that is smarter than the common m//i.

      JUST TO CLARIFY: IF YOU SEE A NAME IN THIS FILE THAT IS DIFFERENT IN ANY WAY, THEN IT IS TWO DIFFERENT PEOPLE.
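
If any difference in the name means a different person, one way to get that behaviour while still using a regex is to anchor the pattern so the whole name field has to match. The sketch below is only an illustration using the names from the question. The variable names `@loose` and `@strict` are made up for the example, and it assumes the name field has already been split out of the record.

```perl
use strict;
use warnings;

# Names taken from the example list in the question.
my @names = (
    'Don Phillie Johnson',
    'Alexander simon',
    'simon DeGauo',
    'simon DeGauo Henderson',
    'Alex Johnson',
    'Alex Johnson Stevens',
);

my $new_patient = 'simon DeGauo';

# Unanchored: matches every name that merely contains the text.
my @loose = grep { /\Q$new_patient\E/i } @names;

# Anchored: \A and \z force the whole field to match, and \Q...\E
# stops characters such as "." from acting as regex metacharacters.
my @strict = grep { /\A\Q$new_patient\E\z/i } @names;

print "loose:  ", join(', ', @loose),  "\n";
print "strict: ", join(', ', @strict), "\n";
```

The unanchored pattern picks up both `simon DeGauo` and `simon DeGauo Henderson`, while the anchored one picks up only `simon DeGauo`.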

Re: regular expression issue
by tommyw (Hermit) on Aug 06, 2002 at 16:27 UTC

    Do you really want to do regular expression matching?

    if (lc $patient eq lc $new_patient && lc $doc eq lc $new_doc) {
    will give you a match only when there is an exact match (which appears to be what you're asking for).

    Otherwise you're going to have to ask for more information: if the patient's name is Simon, does this match the existing Simons, or is it a new one? With a regexp, it's an existing patient; with equality, a new one. You've got the same problem with the two Simons in your example.
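
As a rough sketch of the equality approach applied to whole records: the helper name `patient_exists` is hypothetical, the records are held in an array rather than read from `record.data`, and the third record (ID 124, rm-201) is invented for the example. It also assumes the doctor is passed in exactly as it is stored in the file (for example `Dr. Banks`, not just `Banks`).

```perl
use strict;
use warnings;

# Hypothetical helper: true if a record with exactly this patient
# name and doctor already exists (compared case-insensitively).
sub patient_exists {
    my ($new_patient, $new_doc, @records) = @_;
    for my $line (@records) {
        chomp(my $copy = $line);     # work on a copy, strip the newline
        my ($id, $patient, $room, $doc) = split /!/, $copy;
        next unless defined $doc;    # skip malformed lines
        return 1 if lc $patient eq lc $new_patient
                 && lc $doc     eq lc $new_doc;
    }
    return 0;
}

my @records = (
    "123!simon DeGauo!rm-200!Dr. Andre\n",
    "123!simon DeGauo!rm-200!Dr. Banks\n",
    "124!simon DeGauo Henderson!rm-201!Dr. Andre\n",   # invented record
);

for my $query (
    [ 'Simon DeGauo', 'Dr. Banks'   ],
    [ 'simon',        'Dr. Andre'   ],
    [ 'simon DeGauo', 'Dr. Kessler' ],
) {
    my ($name, $doc) = @$query;
    my $status = patient_exists($name, $doc, @records) ? 'found' : 'new';
    print "$name / $doc: $status\n";
}
```

With these records the first query is reported as found, and the other two as new: a bare `simon` does not equal any full name, and `simon DeGauo` has no record under `Dr. Kessler`.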

    --
    Tommy
    Too stupid to live.
    Too stubborn to die.