Re: extract text between slashes
by mwah (Hermit) on Oct 31, 2007 at 17:26 UTC
my $str = '%/%/ISIN/US1252691001';
my @strings = $str =~ m{(?<=/)[^/]+}g ;
print join ', ', @strings, "\n";
print $strings[1];
Regards
mwa
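I'm not sure whether you intended a trailing ', ' in the output. If not, you could use parentheses:
print join(', ', @strings), "\n";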
I might instead tinker with the list separator
print do
{
local $" = q{, };
qq{@strings\n};
};
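The variants discussed here can be compared side by side. A small sketch, assuming the array holds the three values the earlier match returns for the sample string (filled in by hand here, and the scalar names are only for illustration):

```perl
use strict;
use warnings;

my @strings = ( '%', 'ISIN', 'US1252691001' );

# Without parentheses, "\n" is just one more element for join to separate,
# so a ', ' appears before the newline.
my $without = join ', ', @strings, "\n";

# With parentheses, join sees only @strings.
my $with = join( ', ', @strings ) . "\n";

# Localising $" changes how an array interpolates inside a quoted string.
my $interp = do { local $" = q{, }; qq{@strings\n} };

print $without, $with, $interp;
```

The last two strings are identical; only the first carries the trailing separator.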
Cheers, JohnGG
I'm not sure whether you intended a trailing ', ' in the output. If not, you could use parentheses
.oO you got me ... If I said now "I didn't care whether a trailing comma showed up", nobody would believe me. If I said "I tried to make the output of print "@strings\n"; more distinct for a beginner", the same would happen.
Therefore the best would be to take the blame and admit failure ;-)
Thanks & Regards
mwa
Re: extract text between slashes
by halley (Prior) on Oct 31, 2007 at 16:26 UTC
Your current attempt is good, but the match is greedy: it looks for the longest possible match, not the shortest. Adding a ? after the .+ would work fine. To understand "greediness," check the perldoc page for regular expressions: perlre
m{/(.*?)/}
A second issue is if the string has empty slots between slashes, such as the string "%///US1252691001". You probably want to be able to return an empty result in this case, so I changed your use of .+ (one or more) to .* (zero or more) characters. Otherwise, you might get a match back of "/" for strings like my example.
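Both points can be checked quickly. This is only a sketch, assuming the original attempt used a greedy .+ between the slashes; the variable names are made up for the illustration:

```perl
use strict;
use warnings;

my $str = '%/%/ISIN/US1252691001';

# Greedy: .+ runs as far as it can while a closing slash still follows.
my ($greedy) = $str =~ m{/(.+)/};     # '%/ISIN'

# Non-greedy: .+? stops at the first closing slash.
my ($lazy) = $str =~ m{/(.+?)/};      # '%'

# Empty slot: .+? needs at least one character, so it swallows a slash;
# .*? may match nothing and returns the empty string instead.
my $gappy = '%///US1252691001';
my ($plus) = $gappy =~ m{/(.+?)/};    # '/'
my ($star) = $gappy =~ m{/(.*?)/};    # ''

print "greedy=[$greedy] lazy=[$lazy] plus=[$plus] star=[$star]\n";
```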
Update: As others mentioned but I didn't parse correctly, to get the THIRD field (e.g., "~/~/THIS/~") takes a little more work. Instead of a bunch of complicated lookaheads and lookbehinds, or switching to a split() instead, I would just parse through. This has the advantage of easily changing the pattern to capture the other fields if the requirements change.
m{/.*?/(.*?)/}
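As a rough illustration of "changing the pattern to capture the other fields", the same skeleton can be kept and only the parentheses moved. The sample string is the one from the question; the names $second and $third are mine:

```perl
use strict;
use warnings;

my $str = '%/%/ISIN/US1252691001';

# Third field: step over one "/field/" pair, then capture up to the next slash.
my ($third) = $str =~ m{/.*?/(.*?)/};     # 'ISIN'

# Same skeleton with the parentheses moved one slot to the left.
my ($second) = $str =~ m{/(.*?)/.*?/};    # '%'

print "second=[$second] third=[$third]\n";
```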
-- [ e d @ h a l l e y . c c ]
my $str = '%/%/ISIN/US1252691001';
my @elems = split m{/}, $str;
my $isin = $elems[2];
Cheers, JohnGG
Update:
To do this without split you need a more complex regex that uses zero-width look-around assertions: an alternation of two look-behinds, followed by a look-ahead that itself contains an alternation.
my @elems =
$str =~ m{(?(?<=\A)|(?<=/))(.*?)(?=/|\z)}g;
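A quick way to see that the two approaches agree on the sample string. This sketch writes the group around the two look-behinds as a plain non-capturing (?: ... ) alternation, which is my reading of the description rather than a copy of the pattern above, and it is only checked against this one input:

```perl
use strict;
use warnings;

my $str = '%/%/ISIN/US1252691001';

my @by_split = split m{/}, $str;

# A field starts at the beginning of the string or right after a slash,
# and ends just before the next slash or at the end of the string.
my @by_regex = $str =~ m{(?:(?<=\A)|(?<=/))(.*?)(?=/|\z)}g;

print "split: @by_split\n";
print "regex: @by_regex\n";
print "third field: $by_regex[2]\n";    # ISIN
```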
Also, keep in mind that he wants the contents of the *second* pair of slashes. Assuming that the first field with the percent sign is static, m{/\%/(.*?)/} might work. Otherwise, he could grab all matches and filter out the wrong ones, or split the whole string beforehand:
# method 1
@matches = $string =~ m{/(.*?)/}g;
# method 2
@matches = split m{/}, $string;
# print the one you want
print $matches[1];
__________
Systems development is like banging your head against a wall...
It's usually very painful, but if you're persistent, you'll get through it.
Unfortunately, your method 1 isn't going to do the trick. The regex consumes everything up to and including the second slash (%/%/) during the first match, so the next attempt is left with only ISIN/US1252691001 to work with. That contains a single slash, so the match fails.
$ perl -le '
> $string = q{%/%/ISIN/US1252691001};
> @matches = $string =~ m{/(.*?)/}g;
> print for @matches;'
%
$
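The same thing can be seen by asking where the next search would resume. The second half of this sketch is one possible adjustment, not something proposed earlier in the thread: checking for the closing slash with a look-ahead leaves it available to open the next match. The names $copy, $rest and @fixed are made up for the illustration:

```perl
use strict;
use warnings;

my $string = q{%/%/ISIN/US1252691001};

# One match in scalar context, then look at what is left to search.
my $copy = $string;
my $rest = '';
if ( $copy =~ m{/(.*?)/}g ) {
    $rest = substr( $copy, pos($copy) );    # 'ISIN/US1252691001'
}
print "left to search: $rest\n";

# The look-ahead tests for the closing slash without consuming it.
my @fixed = $string =~ m{/(.*?)(?=/)}g;     # ('%', 'ISIN')
print "$_\n" for @fixed;
```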
Cheers, JohnGG
I think we don't know enough about what he's looking for. It was said that he's looking for the text between the second pair of slashes. What if he is looking for the string between the last %/ and the very next / ? I think the input string is not described well enough in the original question. For all I can tell, he could be treating %/%/ as a fixed token and pulling out all of the following characters up to the first /, but this assumes that all his input strings begin with %/%/ followed by what he needs to extract, which may not be a correct assumption.
Re: extract text between slashes
by Anonymous Monk on Oct 31, 2007 at 19:41 UTC
Won't /ISIN/ work for you?