in reply to regex help
You can use grep.
Some working code (to adapt it to your needs, all you have to do is check whether @x has any elements):
    #!/usr/bin/perl -l
    use strict;
    use warnings;

    my @full_authors = ( "Smith, John", "Smith, John Ronald", "Johnson, James",
                         "James, Ray Jack", "Van der Burg, Jon", "O'Neil, Sarah" );
    my @authors = ( "Smith J", "Jackson J", "James RJ", "Van der Burg J", "O'Neil S" );

    $, = " & ";

    foreach my $a (@authors) {
        # Put everything before the last space in $f and everything after it in $s.
        # Example: if $a is 'James RJ', then $f is 'James' and $s is 'RJ'.
        my ($f, $s) = $a =~ m/(.*)\s+(\w+)$/;

        # Split the initials so that each element of @g is one letter.
        my @g = split //, $s;

        # Escape all special characters in $f, like a . and a space
        # (needed because the /x modifier is in use).
        $f = quotemeta($f);

        # Join the letters with the regex fragment \w+\s+,
        # so that RJ becomes R\w+\s+J (which is used in the regex below).
        my $p = join('\w+\s+', @g);

        my @x = grep (m/
            ^       # Match start of line
            $f      # Match the last name
            \s*     # Match some optional whitespace
            ,       # Match a comma
            \s+     # Match some whitespace (not optional)
            $p      # Match the second part of the name
            \w+     # Match the remaining word characters of this name
            \s*     # Match some optional whitespace
            $       # Match the end
        /xi, @full_authors);

        print $a, @x;
    }
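If you want to look at the pattern-building step on its own, here is a minimal sketch that wraps the same steps in a sub and returns a compiled regex via qr//. The name build_pattern is my own invention for illustration, and this version drops the /x modifier, so treat it as one possible way to package it, not as part of the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): turn an abbreviated
# name like "James RJ" into a compiled regex for "Lastname, First Names".
sub build_pattern {
    my ($abbrev) = @_;

    # Last name is everything before the last space, initials come after it.
    # Return nothing if the input does not have that shape.
    my ($last, $initials) = $abbrev =~ m/^(.*)\s+(\w+)$/
        or return;

    # "RJ" becomes 'R\w+\s+J'; the final \w+ is added in the regex itself.
    my $first = join '\w+\s+', split //, $initials;

    # Escape regex metacharacters in the last name.
    $last = quotemeta $last;

    return qr/^$last\s*,\s+$first\w+\s*$/i;
}

my $re = build_pattern("James RJ");
print "match\n"    if "James, Ray Jack" =~ $re;
print "no match\n" if "James, Ray"      !~ $re;
```

Compiling the pattern once with qr// also means you could reuse it in grep { $_ =~ $re } @full_authors without rebuilding it for every element.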
If you have both 'Smith, John' and 'Smith, Jack' in your @full_authors array and you search for 'Smith J', then @x will contain both of these elements.
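Since the grep can return zero, one, or several names, you will probably want to tell those cases apart. A rough sketch of how that check might look; describe_matches is a hypothetical helper of mine, and the output format is only an assumption about what you would want to print:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name and output format are assumptions):
# describe what the grep returned for one abbreviated name.
sub describe_matches {
    my ($abbrev, @matches) = @_;

    return "$abbrev: no match"    if !@matches;       # empty list
    return "$abbrev: $matches[0]" if @matches == 1;   # exactly one hit

    # More than one full name fits the abbreviation.
    return "$abbrev: ambiguous (" . join(" | ", @matches) . ")";
}

print describe_matches("Jackson J"), "\n";
print describe_matches("James RJ", "James, Ray Jack"), "\n";
print describe_matches("Smith J", "Smith, John", "Smith, Jack"), "\n";
```

In the loop above you would call it as describe_matches($a, @x) instead of printing $a and @x directly.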
Update: added the note about duplicates.