in reply to Find the common beginning substring in an array of strings

Assuming a character set with a single-character-to-byte mapping:

>perl -wMstrict -le "use List::Util qw(reduce); ;; my @ra = qw(ABCDxyz ABCfoo ABCDEbar); ;; use vars qw($a $b); my $min_start_seq = reduce { length($a) < length($b) ? $a : $b } map m{ \A \x00* }xmsg, map $ra[0] ^ $_, @ra ; my $common_starting_substring = $ra[0] & ~$min_start_seq; print qq{'$common_starting_substring'}; " 'ABC'

Replies are listed 'Best First'.
Re^2: Find the common beginning substring in an array of strings
by technojosh (Priest) on Jun 06, 2012 at 18:56 UTC
    This is the code that does the trick against all the examples I have so far... I don't have enough time to post it, but there is a 180-element array I have on my laptop that was not working with the code from the other comment. This was just tricky enough that I thought seeing the way some different monks approached it would be fun. I remain open to other solutions, as this is kind of a fun problem to think about.

    Thanks!!

      In that case, though I'm certain the masking method others have posted is the best solution, here's a recursive/substr method, just for fun:

      #!/usr/bin/env perl use Modern::Perl; sub findstr { if( @_ == 1 ){ return shift; } my $x = shift; my $y = shift; my $l = length $x > length $y ? length $y : length $x; for( my $i=$l; $i>0; $i--){ if( substr($x, 0, $i) eq substr($y, 0, $i)){ return findstr(substr($x,0,$i), @_); } } } my @array_of_test_names = ( 'History - Family History - Suite with Patient - Clinic - Select R +emove(-) Icon of Relative. - Clinic', 'History - Family History - Suite with Patient - Clinic - Save aft +er Entering values in all fields - Clinic', 'History - Family History - Suite with Patient - Clinic - Select A +dd(+) Icon of Relative. - Clinic', 'History - Family History - Suite with Patient - Clinic - Multiple + selection of relatives - Clinic', 'History - Family History - Suite with Patient - Clinic - Interact +ion Checking message not provided if procedure code is selected - Cli +nic', 'History - Family History - Suite with Patient - Clinic - Interact +ion Checking message provided if no procedure code is selected - Clin +ic', ); my $match = findstr(@array_of_test_names); say $match;

      Aaron B.
      Available for small or large Perl jobs; see my home node.

      In that case (and also because I'm only now reading this node), there's this based on a common prefix removal whim I had in the past (with partial credits to moritz for tidying up my original code as seen on my scratchpad):
      #!/usr/bin/perl use strict; use warnings; my @array_of_test_names = ( 'History - Family History - Suite with Pati +ent - Clinic - Select Remove(-) Icon of Relative. - Clinic', 'History - Family History - Suite with Pati +ent - Clinic - Save after Entering values in all fields - Clinic', 'History - Family History - Suite with Pati +ent - Clinic - Select Add(+) Icon of Relative. - Clinic', 'History - Family History - Suite with Pati +ent - Clinic - Multiple selection of relatives - Clinic', 'History - Family History - Suite with Pati +ent - Clinic - Interaction Checking message not provided if procedure + code is selected - Clinic', 'History - Family History - Suite with Pati +ent - Clinic - Interaction Checking message provided if no procedure +code is selected - Clinic', ); my @second_sample_array = ( 'WorkCenter - List Patients - To check the +items displayed in Filter By combo box', 'WorkCenter - List Patients - To check Filt +er By combo box when no patients are assigned on initial login', 'WorkCenter - List Patients - Order in whic +h patients are displayed', ); sub getleastcommonprefix { my @searcharray = @_; my $common = $searcharray[0]; foreach my $index (1 .. $#searcharray) { $_ = $searcharray[0] . reverse $searcharray[$index]; m/(.*)(.*)(??{quotemeta reverse $1})/s; if (length $1 < length $common) { $common = $1; } } ## end foreach my $index (1 .. $#searcharray) return $common; } ## end sub getleastcommonprefix print 'Common prefix for first sample [' . getleastcommonprefix(@array +_of_test_names) . "]\n"; print 'Common prefix for second sample [' . getleastcommonprefix(@seco +nd_sample_array) . "]\n";
      Which gives the following output:
      Common prefix for first sample [History - Family History - Suite with +Patient - Clinic - ] Common prefix for second sample [WorkCenter - List Patients - ]