in reply to Creating a binary matrix
You didn't mention which part of the problem you are having trouble with. I'm going to assume that you already know enough Perl to open files, so my solution assumes that you've already got the contents of each file in an array of some sort. Because of this assumption (a consequence of the missing detail in your question), you will have to adapt this solution to your needs.
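my @genes = qw( Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 );

# Each element stands in for the contents of one file.
my @raw_files = (
    "Gene1 Gene2 Gene3",
    "Gene2 Gene3 Gene4",
    "Gene3 Gene4 Gene5",
);

# One hash per file, keyed by the gene names found in that file.
my @gene_in_files = map {
    my %content;
    @content{ split " ", $_ } = ();
    \%content;
} @raw_files;

# One row per gene, one column per file; ~~ turns the result of exists into 1 or 0.
my @gene_matrix = map {
    my $gene = $_;
    [ map { ~~exists $_->{$gene} } @gene_in_files ]
} @genes;

print "Gene", $_+1, " @{$gene_matrix[$_]}\n" for 0 .. $#gene_matrix;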
This solution puts the contents of each file into a hash so that it can be quickly determined whether, say, Gene1 can be found in File1. Then it just iterates over the genes and tests each file to see if the gene is found in that file. If so, it sets that gene's flag for the file to 1 in the gene matrix; otherwise, it sets the flag to 0.
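Since the solution assumes the file contents are already in @raw_files, here is one way that array might be filled from files on disk. This is only a sketch: slurp_genes is a made-up helper name, and taking the file names from @ARGV is my assumption, not something from your post.

```perl
use strict;
use warnings;

# Hypothetical helper: return a file's whitespace-separated words
# joined into a single space-separated string.
sub slurp_genes {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @words = map { split " ", $_ } <$fh>;
    close $fh;
    return join " ", @words;
}

# Assumption: the file names are passed on the command line.
my @file_names = @ARGV;
my @raw_files  = map { slurp_genes($_) } @file_names;
```

Each element of @raw_files then has the same shape as the hard-coded strings in the example, so the rest of the solution should work unchanged.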
If your requirement is that you use actual bits rather than an array of 1's and 0's, that too is pretty simple, but I'm going to assume that you know how to read the documentation for vec, and are able to adapt the solution to fit that need.
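For what it's worth, a vec-based variant might look something like the sketch below. It keeps one bit string per gene instead of an array of 1's and 0's; the name @gene_bits is mine, and the bit layout (bit N corresponds to file N) is just one reasonable choice.

```perl
use strict;
use warnings;

my @genes = qw( Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 );
my @raw_files = (
    "Gene1 Gene2 Gene3",
    "Gene2 Gene3 Gene4",
    "Gene3 Gene4 Gene5",
);

# One hash per file, keyed by the gene names found in it.
my @gene_in_files = map {
    my %content;
    @content{ split " ", $_ } = ();
    \%content;
} @raw_files;

# One bit string per gene; bit $i is set when the gene occurs in file $i.
my @gene_bits = map {
    my $gene = $_;
    my $bits = '';
    vec( $bits, $_, 1 ) = exists $gene_in_files[$_]{$gene} ? 1 : 0
        for 0 .. $#gene_in_files;
    $bits;
} @genes;

# Read the bits back out with vec to print the same table as before.
for my $g ( 0 .. $#genes ) {
    my @flags = map { vec( $gene_bits[$g], $_, 1 ) } 0 .. $#raw_files;
    print "$genes[$g] @flags\n";
}
```

With three files each gene fits in a single byte; vec extends the string as needed if there are more.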
Here is the output from my example script:
Gene1 1 0 0
Gene2 1 1 0
Gene3 1 1 1
Gene4 0 1 1
Gene5 0 0 1
Gene6 0 0 0
Also, when you're trying to show us tabular input and output, I suggest that you simply wrap it in <code></code> tags. It's easier to maintain fixed column widths when you don't have to worry about how HTML gobbles up duplicated whitespace, and you won't have to put <br /> after each line of tabular data. See Writeup Formatting Tips. By way of example, when I posted my sample output, I did this:
<code>
[shift-insert, to paste output from my terminal]
</code>
Update: Simplified the solution by eliminating temp variables holding various stages of the data transform.
Update2: And here's my "just for fun" version:
my @genes = qw( Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 );
my @raw_files = (
    "Gene1 Gene2 Gene3",
    "Gene2 Gene3 Gene4",
    "Gene3 Gene4 Gene5"
);

my $gene_num = 1;
print "Gene", $gene_num++, " @{$_}\n" for sub {
    # The unary + makes the inner braces an anonymous hash rather than a block.
    my @in_file = map { +{ map { $_ => 0 } split " ", $_ } } @{+shift};
    map {
        my $gene = $_;
        [ map { ~~exists $_->{$gene} } @in_file ]
    } @{+shift};
}->( \@raw_files, \@genes );
Dave