mozart has asked for the wisdom of the Perl Monks concerning the following question:

Hello all the forum users, I want to construct a matrix for different IDs having "1" values at variables lying in the interval of the given data and "0" otherwise. Example: THe input data will be like this

A 1 2 A 7 10 A 15 20 B 3 5 B 11 15 C 5 10 D 10 20

The headers will be different values of no relation to each other, I have input them in the @array e.g @array = (1 , 2, 3 , 5 , 7, 10 , 11, 15 , 20)

I want all the IDs (A, B, C, D) to be in one output file, but each in only one line: Aim

ID 1 2 3 5 7 10 11 15 20 A 1 1 0 0 1 1 0 1 1 B 0 0 1 1 0 0 1 1 0 C 0 0 0 1 1 1 0 0 0 D 0 0 0 0 0 1 1 1 1

So, all the vlaues will be "map" to the arrays but do not know how to look further to see if the Id is coming again, hence put 1 in the corrsponding variables instead of 0? (Id could be repeated with different values for up to 100 times! or more no fixed rule)

I tried to work it out but seems too hard for me!! Thanks,

Replies are listed 'Best First'.
Re: construct a matrix by different intervals
by GrandFather (Saint) on Aug 19, 2007 at 21:26 UTC

    Interesting problem. One key is to use grep to filter a range to generate entries just for the ids that have been given. Because you don't know what ids you will end up with until all the data has been parsed, you need to generate the table data as a second step. Consider:

    use warnings; use strict; my %data; my %ids; while (<DATA>) { chomp; my ($rid, $start, $end) = split; push @{$data{$rid}}, [$start, $end]; @ids{$start, $end} = (); } my @idList = sort {$a <=> $b} keys %ids; for my $rid (keys %data) { my @pairs = @{$data{$rid}}; my %fields; $data{$rid} = []; @fields{grep exists $ids{$_}, $_->[0] .. $_->[1]} = () for @pairs; @{$data{$rid}} = map {exists $fields{$_}} @idList; } my $format = '%-3s' . ('%3d ' x keys %ids) . "\n"; printf $format, 'ID', sort {$a <=> $b} keys %ids; printf $format, $_, @{$data{$_}} for sort keys %data; __DATA__ A 1 2 A 7 10 A 15 20 B 3 5 B 11 15 C 5 10 D 10 20

    Prints:

    ID 1 2 3 5 7 10 11 15 20 A 1 1 0 0 1 1 0 1 1 B 0 0 1 1 0 0 1 1 0 C 0 0 0 1 1 1 0 0 0 D 0 0 0 0 0 1 1 1 1

    DWIM is Perl's answer to Gödel
Re: construct a matrix by different intervals
by jwkrahn (Abbot) on Aug 20, 2007 at 00:29 UTC
    Another solution is to use bitmaps:
    $ perl -e' $x = q[A 1 2 A 7 10 A 15 20 B 3 5 B 11 15 C 5 10 D 10 20 ]; open my $fh, q[<], \$x or die "Cannot open \$x: $!"; my ( $mask, %data, $max ); while ( <$fh> ) { my ( $key, @vals ) = split; for my $val ( @vals ) { vec( $mask, $val, 1 ) = vec( $data{ $key }, $val, 1 ) = 1; $max = $val if $max < $val; } } my @vals; for my $val ( 0 .. $max ) { push @vals, $val if vec $mask, $val, 1; } my $width = length $vals[ -1 ]; printf q[ID] . ( " %${width}d" x @vals ) . "\n", @vals; for my $key ( sort keys %data ) { printf q[%-2s], $key; for my $val ( @vals ) { printf q[ %*d], $width, vec $data{ $key }, $val, 1; } print "\n"; } ' ID 1 2 3 5 7 10 11 15 20 A 1 1 0 0 1 1 0 1 1 B 0 0 1 1 0 0 1 1 0 C 0 0 0 1 0 1 0 0 0 D 0 0 0 0 0 1 0 0 1
Re: construct a matrix by different intervals
by fmerges (Chaplain) on Aug 19, 2007 at 19:34 UTC

    Hi,

    You can use a HoH

    my $result = { 'A' => { 1 => 1, 2 => 1, 7 => 1, 10 => 1, ... }, 'B' => { ... }, ... };

    Using something like this for generating it:

    open my $fh, '<', $filename; while (<$fh>) { chomp; my ($id, @numbers) = split; for my $number (@numbers) { $result->{$id}{$number} = 1; } }

    Update:

    Later if you want to output it, you could do something like this:

    my %unique_numbers = (); for (values %$result) { for my $number (keys %$_) { $unique_numbers{$number} = 1; } } my @numbers = sort {$a <=> $b} keys %unique_numbers; open my $out_fh, '>', $output_filename; print $out_fh "IDs @numbers\n"; for my $id (sort keys %$result) { my @fields = $id; for my $number (@numbers) { if (exists $result->{$id}{$number}) { push @fields, 1; } else { push @fields, 0; } } my $string = join " ", @fields; print $out_fh "$string\n"; }

    I know, it's not so nice, but... if it's a throw away problem...

    For the thing of having a fancy output, you can use format or take a look on CPAN, I remember to see some modules for producing a tabular ASCII output

    Update 2:

    Yes GrandFather, you're right, I just saw the "interval" thing :-)

    Regards,

    fmerges at irc.freenode.net

      Consider what is generated for the OP's data line 'D 10 20' and contrast with OP's desired result. The task is not quite so simple. ;)


      DWIM is Perl's answer to Gödel
Re: construct a matrix by different intervals
by dwm042 (Priest) on Aug 20, 2007 at 03:06 UTC
    This problem interested me in the sense that the data could be stored as ordered pairs. I was curious if a function could be written to efficiently process these data as ordered pairs. I didn't worry as much about possible variable output, though if someone thinks of a way to use a map to generate the final output (so that output could handle arbitrary data ranges) I'd be curious.

    My solution is:

    #!/usr/bin/perl use warnings; use strict; package main; my %ranges = (); while (<DATA>) { my ( $key, $initial_value, $final_value ) = split /\s+/, $_; push @{$ranges{$key}}, [ $initial_value, $final_value ]; } my @headers = (1, 2, 3, 5, 7, 10, 11, 15, 20); my @keys = ( 'A', 'B', 'C', 'D' ); my $value = sprintf "ID %2d %2d %2d %2d %2d %2d %2d %2d %2d\n", @heade +rs; print $value; foreach my $k (@keys) { my @results = map { &in_range($k, $_) } @headers; $value = sprintf "%-2s %2d %2d %2d %2d %2d %2d %2d %2d %2d\n", $k, + @results; print $value; } sub in_range { my $letter = shift; my $test_value = shift; if ( defined( $ranges{$letter} )) { foreach my $ref ( @{$ranges{$letter}} ) { if (($ref->[0] <= $test_value) && ($ref->[1] >= $test_value)) { return 1; } } } return 0; } __DATA__ A 1 2 A 7 10 A 15 20 B 3 5 B 11 15 C 5 10 D 10 20
    and the output is:

    ~/perl/monks$ ./ranges.pl ID 1 2 3 5 7 10 11 15 20 A 1 1 0 0 1 1 0 1 1 B 0 0 1 1 0 0 1 1 0 C 0 0 0 1 1 1 0 0 0 D 0 0 0 0 0 1 1 1 1
      Thanks for you all Guys,
      This was of great help.
      Mozart