If you have thousands of lines to process, this may be quicker than some other approaches: it processes 100,000 lines (of 2 to 4 elements each) in under 4 seconds.
I won't attempt to categorise the algorithm, because it relies on hashes for lookups, and there is always the argument about whether hash lookups are O(1) or O(log N).
Update: on empirical evidence this is O(N) (at ~4 secs/100,000 lines) up to the limit of memory.
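The per-pair work is just a hash intersection: one exists() test for each line that contains the first element. A minimal sketch of that step in isolation (the data and variable names here are illustrative, not taken from the script below):

    my %first  = ( 0 => 1, 3 => 1, 7 => 1 );  # line numbers containing element A
    my %second = ( 3 => 1, 7 => 1, 9 => 1 );  # line numbers containing element B
    my @common = grep { exists $second{ $_ } } keys %first;  # 3 and 7, in hash order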
#! perl -slw
use strict;

our $LINES ||= 1000;

## Gen some test data to a simulated file
my @a = qw[ one two three four five six seven eight nine ];
my @lines = map{
    join '_', @a[ map{ rand @a } 1 .. 2 + rand 3 ]
} 1 .. $LINES;

warn "Processing " . @lines . " start: " . localtime;

## Index the lines: element => { line number => count }
my %index;
for my $idx ( 0 .. $#lines ) {
    $index{ $_ }{ $idx }++ for split '_', $lines[ $idx ];
}

## For each pair of elements, print the lines containing both
my @keys = keys %index;
while( my $first = pop @keys ) {
    for my $second ( @keys ) {
        print "$first/$second: ", join "\n\t",
            @lines[ grep{ exists $index{ $second }{ $_ } }
                    keys %{ $index{ $first } } ];
    }
}

warn "Processing " . @lines . " stop: " . localtime;

__END__
P:\test>392861 -LINES=100000 >nul
Processing 100000 start: Wed Sep 22 19:58:46 2004 at P:\test\392861.pl line 10.
Processing 100000 stop: Wed Sep 22 19:58:50 2004 at P:\test\392861.pl line 31.
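Note that popping $first off @keys before iterating over the remainder means each unordered pair of elements is visited exactly once. To run this against real data instead of the generated test lines, you could replace the test-data block with something like the following (this assumes one underscore-separated record per line on stdin, which may not match your input format):

    chomp( my @lines = <> );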
In reply to Re: nested combinations: algorithm advice? by BrowserUk
in thread nested combinations: algorithm advice? by revdiablo