in reply to Re: Re: Removing duplicate substrings from a string - is a regex possible?
in thread Removing duplicate substrings from a string - is a regex possible?

/Jeff(?:rey)?/

Well, I have two comments: a regex is a cool approach to take, and a regex is the wrong approach to take. You said yourself you could split and use a hash. That's what I'd do. I'd use a regex if I were trying to impress someone (which I'm not). But if you are, then here's my offering:

$cities = "Here/There/Everywhere/There/Again/Here/Too";
$cities =~ s{
  ([^/]+)        # non-slashes (the city): \1
  /              # a /
  (?=            # look ahead for:...
    (?:          # group (not capture)
      [^/]* /    # city and /
    )*?          # 0 or more times, non-greedily
    \1           # is the city here again?
    (?: /|$ )    # a / or end-of-string
  )
}
{}gx;
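For comparison, the split-and-hash route mentioned above could look something like the sketch below. The name dedupe_cities is made up for illustration and is not from the thread. Note one difference in behavior: this version keeps the first occurrence of each city, whereas the substitution above removes the earlier copies and keeps the last one.

```perl
use strict;
use warnings;

# dedupe_cities is a hypothetical helper name, not something from the thread.
sub dedupe_cities {
    my ($cities) = @_;
    my %seen;
    # Split on "/", keep only the first occurrence of each city
    # (order preserved), then join the survivors back together.
    return join '/', grep { !$seen{$_}++ } split m{/}, $cities;
}

print dedupe_cities("Here/There/Everywhere/There/Again/Here/Too"), "\n";
# prints: Here/There/Everywhere/Again/Too
```

Because whole fields are compared as hash keys, a city that is merely a substring of another (York vs. New York) is never mistaken for a duplicate.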

_____________________________________________________
Jeff japhy Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re4: Removing duplicate substrings from a string - is a regex possible?
by Hofmator (Curate) on Jul 24, 2001 at 13:36 UTC

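    Right argument, japhy!! But it's an interesting problem nevertheless. Your regex breaks on input where one city is the tail end of another:

        New York/York/Boston  =>  New York/Boston

    The capture is not anchored at a city boundary, so it matches the "York" inside "New York" and removes it as a duplicate. My fix makes it sadly less elegant: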

    1 while s{
      (/|^)          # add: capture slash or start of string: $1
      ([^/]+)        # non-slashes (the city): \2
      /              # a /
      (?=            # look ahead for:...
        (?:          # group (not capture)
          [^/]* /    # city and /
        )*?          # 0 or more times, non-greedily
        \2           # is the city here again?
        (?: /|$ )    # a / or end-of-string
      )
    }
    {$1}gx;          # add: replacement
    Now this is close to my approach:

        1 while s#(/|^)((?>[^/]*))/(?=(?:.*?/)?\2(?:/|$))#$1#g;

    The advantage of both solutions is that as little as possible is replaced. The disadvantage is that a regex isn't the right way to do it ;) - for all the reasons already given in this thread.
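
A quick way to see the difference on the New York case is a small check like the one below. The helper names dedupe_unanchored and dedupe_anchored are made up for illustration; the patterns are meant to be japhy's original and the anchored variant, with the /x comments and whitespace stripped.

```perl
use strict;
use warnings;

# Hypothetical helper names; the patterns follow the ones quoted in this thread.

# Original pattern: the city capture may start in the middle of a city name.
sub dedupe_unanchored {
    my ($s) = @_;
    $s =~ s{([^/]+)/(?=(?:[^/]*/)*?\1(?:/|$))}{}g;
    return $s;
}

# Anchored pattern: the city must follow a slash or the start of the string.
# The leading slash is consumed by each match, so adjacent duplicates need
# another pass; hence the "1 while".
sub dedupe_anchored {
    my ($s) = @_;
    1 while $s =~ s{(/|^)([^/]+)/(?=(?:[^/]*/)*?\2(?:/|$))}{$1}g;
    return $s;
}

print dedupe_unanchored("New York/York/Boston"), "\n";  # New York/Boston
print dedupe_anchored("New York/York/Boston"), "\n";    # New York/York/Boston
```

On the original test string, both helpers should give Everywhere/There/Again/Here/Too, since only the earlier copies of each duplicate are removed.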

    -- Hofmator