Mostly Harmless

emdub has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.

Re: Mostly Harmless
by Athanasius (Archbishop) on Dec 28, 2013 at 06:28 UTC

Hello emdub, and welcome to the Monastery!

Here is one way to do this (for a single pair of lines):

#! perl
use strict;
use warnings;

my $string1 = 'CHARLIE ROOT';
my $string2 = 'HARRY NODE';
my @list1   =  split //, $string1;
my @list2   =  split //, $string2;
my %hash1   =  map { $_ => undef } @list1;
my %hash2   =  map { $_ => undef } @list2;

delete $hash1{' '};
delete $hash2{' '};

for (@list1)
{
    print unless exists $hash2{$_};
}

for (@list2)
{
    print unless exists $hash1{$_};
}
[download]

Output:

16:26 >perl 815_SoPW.pl
CLI TY ND
16:26 >
[download]

There are no doubt more elegant/streamlined approaches, but this should give you an idea. (Extending the script to read and compare lines from the two files is left as an exercise!)

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,

[reply]
[d/l]
[select]

Re^2: Mostly Harmless

by atcroft (Abbot) on Dec 28, 2013 at 08:06 UTC

Here is another approach (again, for one pair of words, so extending would be an exercise for the reader):

use strict;
use warnings;

my $str1 = q{CHARLIE ROOT}; 
my $str2 = q{HARRY NODE}; 
my $output = process( $str1, $str2 ) 
    . q{ } 
    . process( $str2, $str1 ); 
print $output, qq{\n}; 

sub process { 
    my ( $word1, $word2 ) = @_; 
    my @letters_to_remove = grep{ ! m/\s+/; } 
                                split //, $word2; 
    my $remove_str = join q{}, 
                         q{[}, 
                         @letters_to_remove, 
                         q{]}; 
    my $result = $word1; 
    $result =~ s/$remove_str//g; 
    return $result; 
}
[download]

Output:

CLI T Y ND
[download]

In this code, a character group is constructed on the fly, and a substitution operation replaces each member globally in the string with nothing, removing the character. (The string is not as efficient as it could be, since it may contain repeated characters, but this was off-the-cuff code, so there we are.)

Hope that helps-if not in usefulness, then at least in seeing different possible approaches to the same problem.

[reply]
[d/l]
[select]

Re: Mostly Harmless
by CountZero (Bishop) on Dec 28, 2013 at 10:07 UTC

Set::Scalar

use Modern::Perl;
use Set::Scalar;

my $first_word  = 'CHARLIE ROOT';
my $second_word = 'HARRY NODE';

my $first  = Set::Scalar->new( split //, $first_word );
my $second = Set::Scalar->new( split //, $second_word );

say $first - $second;    # difference between $first and $second
say $second - $first;    # difference between $second and $first

say $first % $second;    # symmetric difference
[download]

(C I L T)
(D N Y)
(C D I L N T Y)
[download]

CountZero

A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics

[reply]
[d/l]
[select]

Re: Mostly Harmless
by GrandFather (Saint) on Dec 28, 2013 at 09:07 UTC

A single hash (and a couple of temporary hashes) cleans up the code somewhat at the cost of a higher degree of Perl knowledge for understanding:

use strict;
use warnings;

my $str1   = 'CHARLIE ROOT';
my $str2   = 'HARRY NODE';
my @chars1 = split '', $str1;
my @chars2 = split '', $str2;
my %hits;

$hits{$_} = 1 for @chars1;
++$hits{$_} for keys {map {$_ => 1} @chars2};
print $_ for grep {$_ eq ' ' || $hits{$_}++ == 1} @chars1, @chars2;
[download]

Prints:

CLI TY ND
[download]

True laziness is hard work

[reply]
[d/l]
[select]

Re: Mostly Harmless
by Lennotoecom (Pilgrim) on Dec 28, 2013 at 11:26 UTC

$a[0] = 'CHARLIE ROOT';
$a[1] = 'HARRY NODE';
s/[$a[~$i]]//g and print while ($i, $_) = each @a;
[download]

$a[0] = 'CHARLIE ROOT';
$a[1] = 'HARRY NODE';
s/([$a[~$i]])/$1 eq ' '?' ':''/eg and print while ($i, $_) = each @a;
[download]

[reply]
[d/l]
[select]

Re: Mostly Harmless
by kcott (Archbishop) on Dec 29, 2013 at 00:26 UTC

G'day emdub,

Welcome to the monastery.

The following code shows how you could handle multiple pairs of strings. I've hard-coded a space to match your "Output". You haven't indicated exactly what you mean by "letter": I've used any character matching /\p{Alpha}/ (see "perluniprops: Properties accessible through \p{} and \P{}") — your concept of "letter" may differ.

#!/usr/bin/env perl -l

use strict;
use warnings;

my @pairs = (['CHARLIE ROOT', 'HARRY NODE'], ['abc1', 'cde1']);

print get_non_shared(reverse @$_), ' ', get_non_shared(@$_) for @pairs
+;

sub get_non_shared {
    my ($ref, $src) = @_;
    my %ref_char = map { $_ => 1 } grep { /\p{Alpha}/ } split '' => $r
+ef;
    join '' => grep { ! $ref_char{$_} } split '' => $src;
}
[download]

Output:

CLI T Y ND
ab1 de1
[download]

Note that you haven't indicated how you get your pairs of strings nor what combinations are involved in the operations. Does one string equate to one line from a file? Are you operating on just single pairs from matching lines (i.e. ~300 operations) or all possible combinations (i.e. ~90,000 operations)? Depending on the answers to those — and potentially other questions, such as how often do you plan to run the program — some efficiency optimisations may be useful.

-- Ken

[reply]
[d/l]
[select]