Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Unique List of Common Characters Between Two Strings

by Limbic~Region (Chancellor)
on Jan 11, 2005 at 19:47 UTC ( [id://421394]=CUFP: print w/replies, xml ) Need Help??

Given string1 and string2 as command line arguments, the snippet will return a unique list of characters both strings have in the order they appear in the second string.
#!/usr/bin/perl use strict; use warnings; my ($str1, $str2) = @ARGV; die "Usage: $0 <string1> <string2>" if ! defined $str2; print join '', In_Common($str1, $str2); sub In_Common { my ($str1, $str2) = @_; my %u; $u{ $_ } = 1 for split //, $str1; return grep { $u{ $_ } ? ( $u{ $_ }-- , $_ ) : () } split //, $str +2; }

Replies are listed 'Best First'.
Re: Unique List of Common Characters Between Two Strings (s///)
by tye (Sage) on Jan 11, 2005 at 20:59 UTC

    I still like eval "\$one =~ tr/\Q$two\E//cd" as part of a solution. But there is also this:

    sub CommonChars { my $comm = shift @_; $comm =~ s/[^\Q$_\E]//g for @_; my %seen; return grep !$seen{$_}++, $comm =~ /(.)/gs; }

    Which works for any number of strings.

    - tye        

•Re: Unique List of Common Characters Between Two Strings
by merlyn (Sage) on Jan 11, 2005 at 21:01 UTC
Re: Unique List of Common Characters Between Two Strings
by davido (Cardinal) on Jan 12, 2005 at 07:29 UTC

    Let's bloat it up with the ultra-cool Quantum::Superpositions. Performing a comparision using all( any( list1 ), any( list2 ) ) returns a quantum superposition of the two lists, as a wierd sort of scalar. Take that scalar's eigenstate and you end up with a bare unique list of characters found only in both lists.

    use strict; use warnings; use Quantum::Superpositions; my ($str1, $str2) = @ARGV; die "Usage: $0 <string1> <string2>" if ! defined $str2; print In_Common($str1, $str2), "\n"; sub In_Common { my ($str1, $str2) = @_; return join '', eigenstates( all( any( split( //, $str1 ) ), any( split( //, $str2 ) ) ) ); }

    Of course this module is mostly a proof of concept, but it's a pretty cool concept.

    By the way, I was just considering how one might modify the code to allow for any number of strings on the commandline. For example: ("abc", "bcd", "cde") would result in the output of "c". Or ("abcd", "bcde", "cdef", "defg") would output "d". I thought of a few strategies, but they all feel inelegant

    Update: Ah, I got it. ...Here's a version of the preceeding script that allows for any number of strings on the command line:

    use strict; use warnings; use Quantum::Superpositions; print In_Common( @ARGV ), "\n"; sub In_Common { return join '', eigenstates( all( map { any( split( //, $_ ) ) } @_ ) ); }

    With this script, ("abcd", "bcde", "cdef", "defg") would return "d".


    Dave

Re: Unique List of Common Characters Between Two Strings
by sleepingsquirrel (Chaplain) on Jan 11, 2005 at 22:38 UTC
    Or, if you're a regex fan, you might like...
    sub In_Common2 { my ($str1, $str2) = @_; (my $common = reverse $str2) =~ s/[^$str1]|(.)(?=.*\1)//g; return scalar reverse $common; }


    -- All code is 100% tested and functional unless otherwise noted.
Re: Unique List of Common Characters Between Two Strings
by Aristotle (Chancellor) on Jan 13, 2005 at 02:14 UTC

    Here's some trickery with the oft-forgotten index.

    sub in_common { my $all = join '', scalar( reverse $_[ 0 ] ), @_[ 1 .. $#_ ]; my $offs = length $_[ 0 ]; my $minpos = 0; return grep { my $pos = index( $all, $_, $offs-- ); $pos >= $minpos++; } split //, $_[ 0 ]; }

    The trick is to fold checking for non-existant characters and already-checked-for characters into a single condition: index returns -1 if it doesn't find the character and a 0-based offset if it does. Therefore in the first iteration, the check is for >= 0. This seen character is then included in the searched string. On the next iteration, the check is for offset >= 1 — ie it must exist in the string (offset > -1) but not be the character already seen (offset >= 1). In the next iteration, the offset must be >= 2 to account for already seen characters.

    In C, this approach would beat all others for speed and would be the most straightforward one too. But in Perl, it is harder to read than the others and slower than most too. sleepingsquirrel's regex is probably the fastest because it doesn't loop explicitly in Perl; merlyn's variation of your own solution is the clearest.

    Makeshifts last the longest.

Re: Unique List of Common Characters Between Two Strings
by de-merphq (Beadle) on Jan 11, 2005 at 19:53 UTC

    Since you throw away the hash I think you can simplify the grep a little:

    grep $u{ $_ }-- >0, split //, $str
    ---
    alter ego of demerphq
      de-merphq,
      ...I think you can simplify the grep a little

      I thought about that too at first but it is flawed thinking. If a letter appears more than once then it can go from 0 to a negative number which will again start returning true values. The intention was to only produce a unique list.

      Cheers - L~R

        Yeah sorry, the >0 should be there.. (i updated.)

        ---
        alter ego of demerphq
Re: Unique List of Common Characters Between Two Strings
by Roy Johnson (Monsignor) on Jan 12, 2005 at 17:06 UTC
    YAWTDI:
    my %u = map {$_ => $_} split //, $str1; return grep defined, delete @u{split //, $str2};
    Or, to bletch it all into one line:
    return grep defined, delete @{{map {$_ => $_} split //, $str1}}{split +//, $str2};

    Caution: Contents may have been coded under pressure.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://421394]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others contemplating the Monastery: (7)
As of 2024-04-24 00:58 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found