Re: Extracting names from string
by ikegami (Patriarch) on Feb 08, 2005 at 16:06 UTC
|
($reversed_name) = $string =~ m!/(.*?)\|!;
The perlre and perlop help files give more details on regular expressions.
Update: Escaped the |.
|
|
echo "m!/(.*)|!;" | perl -pe's/\|/\\\|/'
?
Update
OK, by the time I posted this you already had. I only took so long to post it 'cos I found it hard to believe ikegami could make mistakes :)
Cheers, R.
Pereant, qui ante nos nostra dixerunt!
|
|
What does "m!/(.*?)|!;" actually do? Aren't you missing the ending "/"? Basically, what I need is to grab all the data between these characters, starting with "/" and ending with "|". I'm not sure whether I can extract the data using a regular expression.
Thanks again,
Jack
|
|
/.../, m/.../, m#...#, m!...! and so on are all equivalent. The character following the m will be used to delimit the regexp.
Use
($reversed_name) = $string =~ m/\/(.*?)\|/;
if you prefer. It means the same thing, but it's not quite as readable.
foreach $string (
"s005219/Doe John|John.Doe\\",
"s0052194/Doe Thomas|Thomas.Doe\\",
) {
($reversed_name) = $string =~ m!/(.*?)\|!;
print($reversed_name, "\n");
}
# Prints
# ======
# Doe John
# Doe Thomas
Alternatively,
foreach $string (
"s005219/Doe John|John.Doe\\",
"s0052194/Doe Thomas|Thomas.Doe\\",
) {
if ($string =~ m!/(.*?)\|!) {
print("$1\n");
}
}
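To illustrate the point about delimiters, here is a minimal sketch (the variable names are made up for the example) that runs the same pattern with three different delimiters against one of the sample strings. All three should capture the same text.

```perl
use strict;
use warnings;

my $string = "s005219/Doe John|John.Doe\\";

# The same pattern, written with three different delimiters.
# Only the first form needs the slash escaped, because "/" is its delimiter.
my ($with_slashes) = $string =~ /\/(.*?)\|/;
my ($with_bang)    = $string =~ m!/(.*?)\|!;
my ($with_braces)  = $string =~ m{/(.*?)\|};

print "$with_slashes\n";   # Doe John
print "$with_bang\n";      # Doe John
print "$with_braces\n";    # Doe John
```

Picking a delimiter that does not occur in the pattern is purely a readability choice; the match itself is unchanged.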
Re: Extracting names from string
by Random_Walk (Prior) on Feb 08, 2005 at 16:08 UTC
|
$string = "s005219/Doe John|John.Doe\\";
$string =~ /s\d+\/([^|]*)/;
print $1;
Assumptions:
- each string starts with an 's' then a number
- the number is a decimal integer
- There will be no | character in the name
It can be simpler than this and remove the limits of assumptions 1 and 2; exercise left for the OP.
Cheers, R.
Pereant, qui ante nos nostra dixerunt!
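For readers who want to compare notes on the exercise, one possible sketch (an assumption about what "simpler" could look like, not necessarily what the poster had in mind) drops the leading s\d+ and anchors only on the slash:

```perl
use strict;
use warnings;

my $string = "s005219/Doe John|John.Doe\\";

# Anchor on the first slash only; capture everything up to,
# but not including, the first pipe. Assumption 3 (no "|" in
# the name) still applies.
my ($name) = $string =~ m{/([^|]*)};

print "$name\n" if defined $name;   # Doe John
```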
Re: Extracting names from string
by Tanktalus (Canon) on Feb 08, 2005 at 16:07 UTC
|
You would use regular expressions. (Please read this as: you didn't provide your attempts, just your restrictions. As far as we can tell, this is a homework assignment.)
$string =~ m:\|(\w+)\.(\w+):;
$extracted = "$2 $1";
(No, that's not quite what you're after, but it should be close enough that you can work from here - if you have more problems, please come back.)
Re: Extracting names from string
by Anonymous Monk on Feb 08, 2005 at 16:08 UTC
|
You mean, you want to extract the part between the slash and the pipe?
I'd do it this way:
my($name) = $string =~ m!/([^|]+)\|!;
Re: Extracting names from string
by dimar (Curate) on Feb 08, 2005 at 16:08 UTC
|
$string = "s005219/Doe John|John.Doe\\";
@array = split/[^\w\s\.]+/, $string;
print join "--",@array;
Assumptions:
You care about separating the strings on delimiters.
Anything that isn't a space, alphanumeric, underscore, or period is a delimiter.
How to get "Doe John" out of the array is left as an exercise for the reader.
|
|
The trouble is, names can contain other things than spaces and alphanumerics (as defined by your locale).
Hyphens, for example. Apostrophes. Accented characters.
Better to split on what you know (the slash and the pipe) than on what you can only guess at.
dave
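A minimal sketch of splitting only on the known delimiters might look like this. The second record is hypothetical (it is not from the thread) and is there only to show a name containing an apostrophe and a hyphen.

```perl
use strict;
use warnings;

my @samples = (
    "s005219/Doe John|John.Doe\\",
    "s000001/O'Brien-Smith Ann|Ann.OBrien\\",   # hypothetical record
);

my @names;
for my $string (@samples) {
    # Split on "/" or "|" only; the limit of 3 leaves anything
    # after the first pipe in a single trailing field.
    my (undef, $name) = split /[\/|]/, $string, 3;
    push @names, $name;
}

print "$_\n" for @names;
# Doe John
# O'Brien-Smith Ann
```

Because the character class names only the two delimiters, whatever appears in the name itself passes through untouched.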
|
|
A valid consideration. In fact you will notice I added the 'dot' (in addition to alphanumerics) simply because one was in the sample provided by the OP. The (simplifying) assumptions I enumerated were guesstimates.
If we *really* wanted to be precise, we'd want both a spec and input validation (to know what flavor of xSV we were dealing with here, and to ensure user-data was 'scrubbed' before ever getting into the file/database/whatever to begin with)
(for example, what happens in the alien case where you get a user who has a pipe or a slash in their name ... hmm ... isn't *slash* the name of some rock band dude ... I digress).
You should know exactly what character sequences are in your delimiter-space, and what characters are in your message-space, and the two sets should be both finite and orthogonal.
It's always a good reminder that (at least in production code) there should be no such 'guesswork' involved at all. ... but close enough for a tutorial on 'split'...
Re: Extracting names from string
by amw1 (Friar) on Feb 08, 2005 at 16:08 UTC
|
Assuming that:
1) Name is always between / and |
2) The first / marks the start of the name
$string =~ /\/([\w\s]+)\|/;
$name = $1;
Re: Extracting names from string
by cog (Parson) on Feb 08, 2005 at 16:08 UTC
|
Seems that /\/(\w+(?:\s\w+))\|/ will do the trick...
After matching, the name will be in $1
Re: Extracting names from string
by sh1tn (Priest) on Feb 08, 2005 at 16:22 UTC
|
$s = "s00+Doe.John+52(John ) 19/Doe John|John.Doe\"";
#collecting the names in array
push @names, $1 while $s =~ m{\W?(\w+(?:\s+|\.)+\w*)+}g;
print "$_\n" for @names
__END__
Doe.John
John
Doe John
John.Doe
Re: Extracting names from string
by gube (Parson) on Feb 09, 2005 at 03:49 UTC
|
$string = "s005219/Doe John|John.Doe\\";
($john) = $string =~ m#\/(.*?)\|#gsi;
print $john;
1. Within a double-quoted string, the trailing backslash must itself be escaped with another backslash. Now it works.
Re: Extracting names from string
by techra (Pilgrim) on Feb 09, 2005 at 05:09 UTC
|
Here's a way to do it without using a regular expression. It's probably slower, but it'll at least work for the examples you provided.
my @strings = ("s005219/Doe John|John.Doe\"", "s0052194/Doe Thomas|Thomas.Doe\"");
my @names; # will hold the finished list of names
foreach my $string (@strings) {
my (undef, $tmp_string) = split('\/', $string);
my ($name, undef) = split('\|', $tmp_string);
push (@names, $name);
}
Re: Extracting names from string
by Anonymous Monk on Feb 09, 2005 at 00:42 UTC
|
Looks like a simple regular expression might do ya.
I'm assuming the / and | chars are always located after the s# and before the 'username' formatted version of the name.
$_ =~ /\/(.+)\|/;
$name=$1;
There are more elegant ways, but this shows how the capturing parens work. Since there's only one set of parens, only $1 is needed.
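Since the pipe tripped up more than one reply in this thread, here is a small sketch of why it has to be escaped. An unescaped | is the alternation operator, so the pattern reads as "slash plus capture, or nothing at all", and the empty alternative succeeds at the very start of the string without ever filling the capture group. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $string = "s005219/Doe John|John.Doe\\";

# Unescaped: alternation between "\/(.+)" and the empty pattern.
# The empty branch matches at position 0, so the group stays undefined.
my ($unescaped) = $string =~ /\/(.+)|/;

# Escaped: "\|" is a literal pipe, so the group captures the name.
my ($escaped) = $string =~ /\/(.+)\|/;

print defined $unescaped ? "unescaped: $unescaped\n" : "unescaped: (no capture)\n";
print "escaped: $escaped\n";
# unescaped: (no capture)
# escaped: Doe John
```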
Re: Extracting names from string
by VladSu (Acolyte) on Feb 09, 2005 at 11:55 UTC
|
Sorry, but the "different length" is not a reason to avoid substr.
Please try this:
print substr( $string
, index($string,'/') + 1
, index($string,'|') - index($string,'/') - 1
);
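One caveat with the index arithmetic: index returns -1 when the character is not found, and the substr call would then quietly return the wrong slice. A guarded variant could be sketched as below; name_between is a hypothetical helper name, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between the first "/" and the
# first "|" that follows it, or undef if either delimiter is missing.
sub name_between {
    my ($string) = @_;

    my $start = index($string, '/');
    return if $start < 0;

    my $end = index($string, '|', $start + 1);
    return if $end < 0;

    return substr($string, $start + 1, $end - $start - 1);
}

my $found   = name_between("s005219/Doe John|John.Doe\\");
my $missing = name_between("no delimiters here");

print "$found\n";                                        # Doe John
print defined $missing ? "found\n" : "not found\n";      # not found
```

Searching for the pipe from $start + 1 onward also avoids a negative length when a pipe happens to appear before the slash.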
Vlad.