in reply to Challenge: prefix($x, $y) . suffix($x, $z) eq $x

This seems too easy, which usually means I've misinterpreted the rules somewhere?

Updated: Improved the layout of the results.
#! perl -sw use 5.010; use strict; use Data::Dump qw[ pp ]; my $start = time; open WORDS, '<', 'words.txt' or die $!; my %words; chomp, undef $words{ $_ } while <WORDS>; close WORDS; my $word = shift @ARGV or die "Need a word"; die "$word not in dictionary" unless exists $words{ $word }; my( %pre, %suf ); for ( keys %words ) { for my $p ( 1 .. length() -1 ) { push @{ $pre{ substr $_, 0, $p } }, $_ if substr( $_, 0, $p+1 ) ne substr( $word, 0, $p+1 ); push @{ $suf{ substr $_, - $p } }, $_ if substr( $_, -( $p+1 ) ) ne substr( $word, - ( $p+1 ) ); } } for my $p ( 1 .. length( $word ) - 1 ) { my( $pre, $suf ) = ( substr( $word, 0, $p ), substr( $word, - ( length( $word ) - $p ) ) ); if( exists $pre{ $pre } and exists $suf{ $suf } ) { my( $nPre, $nSuf ) = ( scalar @{ $pre{ $pre } }, scalar @{ $suf{ $suf } } ); say "$pre . $suf = $word"; printf "\t$pre ( %s )\n", $nPre <= 10 ? join( ' ', @{ $pre{ $pre } } ) : join( ' ', @{ $pre{ $pre } }[ 0 .. 9 ], '...', $nPre ); printf "\t$suf ( %s )\n", $nSuf <= 10 ? join( ' ', @{ $suf{ $suf } } ) : join( ' ', @{ $suf{ $suf } }[ 0 .. 9 ], '...', $nSuf ); } } printf "Took: %.2f seconds\n", time() -$start; __END__ [21:21:15.55] c:\test>759369 hello h . ello = hello h ( householder hulled hipster hops hikings honorary hacienda +hob hunts harebrained ... 2352 ) ello ( cello violoncello jello bordello ) he . llo = hello he ( heartier hesitation hearings heaver heftinesses heartache +s hectic hepatitides hemisphere heeds ... 550 ) llo ( cigarillo peccadillo hallo hullo armadillo ) hel . lo = hello hel ( helium helpfully helve helpless heliocentric helm held h +elicoptering helps helping ... 46 ) lo ( gigolo polo tangelo buffalo halo tremolo pueblo silo solo + kilo ... 11 ) hell . o = hello hell ( hellion hellhole hellebore hellish hellions hellholes h +ellebores helluva hells hellishly ... 15 ) o ( dido no legato magneto canto stereo wino commando cockatoo + bongo ... 386 ) Took: 3.00 seconds [21:21:50.32] c:\test>759369 today to . day = today to ( tonsils tormentor toiled towellings tortellinis totterer +tost topographies tonal touring ... 531 ) day ( doomsday someday holiday payday workday heyday weekday b +irthday noonday everyday ... 14 ) tod . ay = today tod ( toddy toddies toddled toddling toddlers toddle toddles t +oddler ) ay ( underway swordplay defray overpay driveway shay hooray ru +naway unsay stairway ... 165 ) Took: 2.00 seconds [21:22:16.70] c:\test>759369 tumultuous tumult . uous = tumultuous tumult ( tumults ) uous ( vacuous assiduous ambiguous unambiguous tenuous conspic +uous incongruous strenuous promiscuous innocuous ... 26 ) Took: 2.00 seconds [21:22:22.57] c:\test>759369 freedom fre . edom = freedom fre ( frequented frescoes freaked fretfulness frequent freckli +ng freighters freakishness freshets freshening ... 75 ) edom ( boredom princedom dukedom ) free . dom = freedom free ( freezing freeholds freezes freer freethinking freeware +freewill freebee freebooter freelancers ... 60 ) dom ( fiefdom random stardom officialdom martyrdom chiefdom se +ldom earldom kingdom sheikdom ... 18 ) freed . om = freedom freed ( freedman freedmen ) om ( backroom unbosom darkroom legroom doom gloom venom custom + phenom maelstrom ... 90 ) Took: 2.00 seconds

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Replies are listed 'Best First'.
Re^2: Challenge: prefix($x, $y) . suffix($x, $z) eq $x
by Limbic~Region (Chancellor) on Apr 22, 2009 at 22:11 UTC
    BrowserUk,
    I believe you have the requirements spot on. I agree it wasn't much of a "challenge" but I like to keep such posts similarly named for searching purposes later. When blokhead first described the problem he was mentioning possibly using suffix trees and other doodads which made me wonder how other people would approach it.

    Update: The number of matches came as a big suprise to blokhead so I like the way you consolidated results. The reason I limited the matching to a single input was because of the enormous amount of output when doing all possible matches as blokhead first asked for though I coded it to find all matches. Would your approach have changed if you were doing all matches not just one for the input provided?

    Cheers - L~R

      Would your approach have changed if you were doing all matches not just one for the input provided?

      Of necessity it would have to change a bit if it were going to look up more than one word per run. I saved a litte time by pre-filtering the pre/suffix hashes to avoid storing prefixed that would contravene the longest prefix only rules. That means those hashes are specific to the word being dealt with. To match more than one word per run, requires deferring the filtering until later.

      Here's my first attempt at that. It deals with the tennis players file in under 1/3 seconds, so I've made no attempt to optimise it:

      #! perl -sw use 5.010; use strict; use Time::HiRes qw[ time ]; use Data::Dump qw[ pp ]; my $start = time; die "Need words filename" unless @ARGV and -e $ARGV[ 0 ]; open WORDS, '<', $ARGV[ 0 ] or die $!; chomp( my @words = <WORDS> ); close WORDS; my( %pre, %suf ); for ( @words ) { for my $p ( 1 .. length() -1 ) { push @{ $pre{ substr $_, 0, $p } }, $_; push @{ $suf{ substr $_, - $p } }, $_; } } for my $word ( @words ) { for my $p ( 1 .. length( $word ) - 1 ) { my $pre = substr $word, 0, $p; my $prePlus = substr $word, 0, $p+1; my $suf = substr $word, -( length( $word ) - $p ); my $sufPlus = substr $word, -( length( $word ) - ( $p-1 ) ); if( exists $pre{ $pre } and exists $suf{ $suf } ) { my @pre = grep{ !/^$prePlus/ } @{ $pre{ $pre } }; next unless @pre; my @suf = grep{ !/$sufPlus$/ } @{ $suf{ $suf } }; next unless @suf; say "$pre . $suf = $word"; printf "\t$pre ( %s )\n", @pre <= 10 ? join( ' ', @pre ) : join( ' ', @pre[ 0 .. 9 ], '... ' . @pre ); printf "\t$suf ( %s )\n", @suf <= 10 ? join( ' ', @suf ) : join( ' ', @suf[ 0 .. 9 ], '... ' . @suf ); } } } printf STDERR "Took: %.2f seconds\n", time() -$start; __END__ C:\test>759369 players.txt >nul Took: 0.29 seconds C:\test>759369 players.txt a . hn = ahn a ( abramovic adamczak afinogenova aguilar akiki akita alawi a +lbanese albuquerque aleksandrova ... 43 ) hn ( mohn ) aki . ki = akiki aki ( akita ) ki ( dabrowski filipovski jovanovski kitazaki lisicki miyazaki + solanki wozniacki ) aki . ta = akita aki ( akiki ) ta ( costa konta namigata pennetta tananta yokota zanchetta ) ... w . ong = wong w ( wang wannasuk warburton washington webleysmith weidemann w +einhold wejnert welford westbury ... 20 ) ong ( hong jeong keothavong tangphong zhong ) wo . ng = wong wo ( woerle wowchuk wozniacki wozniak ) ng ( chang cheng chuang frilling haring herring huang hwang ka +ng king ... 16 ) woznia . cki = wozniacki woznia ( wozniak ) cki ( lisicki ) woznia . k = wozniak woznia ( wozniacki ) k ( antoniychuk black blank buryachok czink ewijk fink fitzpat +rick gawlik grajdek ... 29 ) x . i = xi x ( xu ) i ( akiki alawi alnabhani andrei ani appineni arai bai balducc +i bartoli ... 93 ) x . ie = xie x ( xu ) ie ( binnie delefortrie elie ) x . u = xu x ( xi xie ) u ( anghelescu begu buzarnescu cadantu daniilidou dulgheru faf +aliou georgatou gerasimou hincu ... 28 ) ... z . hong = zhong z ( zabala zafirova zagorska zahlavova zahlavovastrycova zaja +zakopalova zanchetta zaniewska zecpeskiric ... 17 ) hong ( tangphong ) zh . ong = zhong zh ( zhang zhao zharkova zheng ) ong ( jeong keothavong wong ) zho . ng = zhong zho ( zhou ) ng ( chang cheng chuang frilling haring herring huang hwang ka +ng king ... 16 ) zh . ou = zhou zh ( zhang zhao zharkova zheng ) ou ( daniilidou fafaliou georgatou gerasimou ) zho . u = zhou zho ( zhong ) u ( anghelescu begu buzarnescu cadantu dulgheru hincu hisamats +u hsu liu lu ... 24 ) zo . ric = zoric zo ( zotter zovko ) ric ( majeric njiric zecpeskiric ) Took: 3.79 seconds

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.