b4swine has asked for the wisdom of the Perl Monks concerning the following question:

I have a string, and I want a simple test to see whether any letter is duplicated. I thought of the simplistic
print 'dup' if /(.).*$1/;
but that didn't seem to work.

Re: Recognizing duplicates
by rhesa (Vicar) on Oct 19, 2007 at 12:52 UTC
    To quote from perlre:

    The bracketing construct ( ... ) creates capture buffers. To refer to the digit'th buffer use \<digit> within the match. Outside the match use "$" instead of "\". (The \<digit> notation works in certain circumstances outside the match. See the warning below about \1 vs $1 for details.) Referring back to another part of the match is called a backreference

    In other words, use

    print 'dup' if /(.).*\1/;
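
    As a minimal sketch of the corrected pattern in use (the helper name has_dup and the sample words are illustrative assumptions, not from the original post):

```perl
use strict;
use warnings;

# has_dup is a hypothetical helper name chosen for this sketch.
sub has_dup {
    my ($str) = @_;
    # \1 refers back to whatever (.) captured during this same match attempt,
    # so the pattern succeeds only if some character occurs again later.
    return $str =~ /(.).*\1/ ? 1 : 0;
}

for my $word (qw(hello world abc)) {
    printf "%-6s %s\n", $word, has_dup($word) ? 'dup' : 'no dup';
}
```

    Note that with $1 in place of \1 the variable is interpolated before the match begins, so the pattern would not refer to the current capture at all.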
Re: Recognizing duplicates
by Fletch (Bishop) on Oct 19, 2007 at 12:53 UTC

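    The numeric variables ($1 etc.) are only set once a match has succeeded, so they are usable on the right-hand side of a substitution or in code that runs after the match. Inside the pattern itself, $1 is interpolated before matching starts and holds whatever an earlier match left behind. You need to use the corresponding backreference (e.g. \1) on the LHS of a s/// or in a m//. See perlretut and perlre, the former of which uses this exact problem as its example.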

Re: Recognizing duplicates
by perlfan (Parson) on Oct 19, 2007 at 13:45 UTC
    Can't you use a hash?
    my $string = "aabbbcc";
    my @array = split('', $string);
    my %hash = ();
    foreach (@array) {
        print "dup found!\n" if (exists($hash{$_}));
        $hash{$_}++;
    }
Re: Recognizing duplicates
by Anonymous Monk on Oct 19, 2007 at 14:55 UTC
    Note that the dot metacharacter matches any character (except newline, unless the /s regex switch is used). If, instead, you want to check for the duplication of a letter only, i.e., an alphabetic character, you might try /([a-zA-Z]).*\1/ or /([[:alpha:]]).*\1/. Be aware that /(\w).*\1/ is broader than that, since \w also matches digits and the underscore.
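
    A short sketch of the letter-only check described above. The helper name has_dup_letter is hypothetical, and using the POSIX class [[:alpha:]] together with /s is one possible choice rather than something the thread prescribes:

```perl
use strict;
use warnings;

# has_dup_letter is a hypothetical helper name chosen for this sketch.
sub has_dup_letter {
    my ($str) = @_;
    # [[:alpha:]] restricts the capture to letters;
    # /s lets .* cross newlines so repeats on different lines are found.
    return $str =~ /([[:alpha:]]).*\1/s ? 1 : 0;
}

print has_dup_letter("a1b1")    ? "dup\n" : "no dup\n";  # only the digit repeats
print has_dup_letter("ab\nca")  ? "dup\n" : "no dup\n";  # 'a' repeats across a newline
```

    The match is case-sensitive as written, so "Aa" does not count as a duplicate; adding /i would change that.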
Re: Recognizing duplicates
by RaduH (Scribe) on Oct 19, 2007 at 21:29 UTC
    I'd use this function:

    index(STRING, SUBSTRING, POSITION) -- Returns the position of the first occurrence of SUBSTRING in STRING at or after POSITION. If you don't specify POSITION, the search starts at the beginning of STRING

    You'll be looking for the first occurrence of the CURRENT character in the string AFTER the current position. Any way you look at it, it is O(n^2) in the length of your string. At least this solution doesn't use additional memory like the solution with the hash.
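
    The index-based idea might look roughly like this. It is a sketch only; the helper name has_dup_index is an assumption made for illustration:

```perl
use strict;
use warnings;

# has_dup_index is a hypothetical helper name chosen for this sketch.
sub has_dup_index {
    my ($str) = @_;
    # The last character has nothing after it, so stop one short of the end.
    for my $i (0 .. length($str) - 2) {
        my $ch = substr($str, $i, 1);
        # index() returns -1 when $ch does not occur at or after position $i + 1.
        return 1 if index($str, $ch, $i + 1) >= 0;
    }
    return 0;
}

print has_dup_index("aabbbcc") ? "dup\n" : "no dup\n";
```

    Each call to index may scan the rest of the string, which is where the quadratic worst case comes from; no extra data structure is kept between iterations.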
      Actually, the hash based solution is roughly O(length) time. It's (approximately) a constant amount of time to insert/index into the hash, and you only iterate over the string once.
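
      To make the single pass explicit, here is a sketch of the hash approach that stops at the first repeat. The helper name has_dup_hash and the %seen variable are assumptions for illustration:

```perl
use strict;
use warnings;

# has_dup_hash is a hypothetical helper name chosen for this sketch.
sub has_dup_hash {
    my ($str) = @_;
    my %seen;
    for my $ch (split //, $str) {
        # Post-increment yields the old count: true means $ch was seen before.
        return 1 if $seen{$ch}++;
    }
    return 0;
}

print has_dup_hash("aabbbcc") ? "dup\n" : "no dup\n";
```

      Each character is visited once and each hash lookup takes roughly constant time, at the cost of one hash entry per distinct character.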