Re: Re: Re: Removing duplicate substrings from a string

/Jeff(?:rey)?/

Well, I have two comments: a regex is a cool approach to take, and a regex is the wrong approach to take. You said yourself you could split and use a hash. That's what I'd do. I'd use a regex if I were trying to impress someone (which I'm not). But if you are, then here's my offering:

$cities = "Here/There/Everywhere/There/Again/Here/Too";

$cities =~ s{
  ([^/]+)      # non-slashes (the city): \1
  /            # a /
  (?=          # look ahead for:...
    (?:        # group (not capture)
      [^/]* /  # city and /
    )*?        # 0 or more times, non-greedily
    \1         # is the city here again?
    (?: /|$ )  # a / or end-of-string
  )
}
{}gx;
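
For comparison, the split-and-hash route mentioned above could look something like the sketch below. The name `dedupe_cities` is made up for illustration and is not from the thread. Note that it keeps the first occurrence of each city, whereas the substitution above leaves the last one in place.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): drop repeated cities,
# keeping the first occurrence of each one.
sub dedupe_cities {
    my ($cities) = @_;
    my %seen;
    # split on "/", keep a city only the first time it is seen, rejoin
    return join '/', grep { !$seen{$_}++ } split m{/}, $cities;
}

print dedupe_cities("Here/There/Everywhere/There/Again/Here/Too"), "\n";
# prints: Here/There/Everywhere/Again/Too
```

Because the comparison is on whole fields, a city whose name merely ends in another city's name is left alone.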

_____________________________________________________
Jeff japhy Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Replies are listed 'Best First'.
Re4: Removing duplicate substrings from a string - is a regex possible?
by Hofmator (Curate) on Jul 24, 2001 at 13:36 UTC

Right argument, japhy!! But it's an interesting problem nevertheless. Your regex breaks on:

    New York/York/Boston => New York/Boston

My fix makes it sadly less elegant:

    1 while s{
      (/|^)        # add: capture slash or start of string: $1
      ([^/]+)      # non-slashes (the city): \2
      /            # a /
      (?=          # look ahead for:...
        (?:        # group (not capture)
          [^/]* /  # city and /
        )*?        # 0 or more times, non-greedily
        \2         # is the city here again?
        (?: /|$ )  # a / or end-of-string
      )
    }
    {$1}gx;  # add: replacement

Now this is close to my approach:

    1 while s#(/|^)((?>[^/]*))/(?=(?:.*?/)?\2(?:/|$))#$1#g;

The advantage of both solutions is that as little as possible is replaced. The disadvantage is that a regex isn't the right way to do it ;) - for all the reasons already given in this thread.
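
To see the difference between the two substitutions, here is a small harness that runs both on the sample string and on the New York case. The wrapper names `dedupe_unanchored` and `dedupe_anchored` are invented for this sketch. The patterns are the ones posted in this thread, only without the comments.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the original substitution (no anchor
# on the left, so the city may start in the middle of a field).
sub dedupe_unanchored {
    my ($s) = @_;
    $s =~ s{ ([^/]+) / (?= (?: [^/]* / )*? \1 (?: /|$ ) ) }{}gx;
    return $s;
}

# Hypothetical wrapper around the fixed substitution: the city must
# start right after a slash or at the start of the string.
sub dedupe_anchored {
    my ($s) = @_;
    # repeat until nothing changes: each match consumes the slash
    # that the next field would need as its left anchor
    1 while $s =~ s{ (/|^) ([^/]+) / (?= (?: [^/]* / )*? \2 (?: /|$ ) ) }{$1}gx;
    return $s;
}

for my $in ("Here/There/Everywhere/There/Again/Here/Too",
            "New York/York/Boston") {
    print "input:      $in\n";
    print "unanchored: ", dedupe_unanchored($in), "\n";
    print "anchored:   ", dedupe_anchored($in), "\n";
}
```

Both versions turn the sample string into `Everywhere/There/Again/Here/Too`. On `New York/York/Boston` the unanchored one gives `New York/Boston`, while the anchored one leaves the string unchanged.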

-- Hofmator
