Re^3: Getting data from a file (Operons and Genes).

I chose not to assume that an operon was unique throughout the file, since that wasn't explicitly stated in the spec.

If I can assume uniqueness, then I'd remove the push and change to a HoA:

$hash->{$fields[0]} = \@genes;
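To make the difference concrete, here is a minimal sketch of the two layouts side by side. It assumes the earlier version pushed a reference to each gene list onto the operon's entry; the row data and the names %by_push and %by_assign are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical rows: an operon name followed by its genes.
my @rows = (
    [ 'opA', 'gene1', 'gene2' ],
    [ 'opB', 'gene3' ],
    [ 'opA', 'gene4' ],    # the same operon appears again later in the file
);

my ( %by_push, %by_assign );
for my $row (@rows) {
    my ( $operon, @genes ) = @$row;

    # Operons may repeat: keep every gene list (hash of arrays of arrays).
    push @{ $by_push{$operon} }, [@genes];

    # Operons are unique: one gene list per key (HoA); a repeat overwrites.
    $by_assign{$operon} = [@genes];
}

print scalar @{ $by_push{opA} }, "\n";    # prints 2: both gene lists are kept
print "@{ $by_assign{opA} }\n";           # prints gene4: only the last list survives
```

With the plain assignment, a repeated operon silently replaces the earlier gene list, which is why it is only safe if uniqueness really holds.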

Where do you want *them* to go today?

Re^4: Getting data from a file (Operons and Genes). by chrisantha (Initiate) on Jun 14, 2007 at 23:46 UTC
Thanks, I was trying this but failing.

#!/usr/bin/perl
use strict;

my $operon;
my %operonHash;

while (<>) {
    chomp;
    if ( /(\b.+?\b)/ ) {    # word boundary + any character at least once, up to the first word boundary.
        #print "Matched: \|$`<$&?>$'\|\n";
        $operon = $_;
        #print $& . " " ;
        $operonHash{$&} = ();
    } else {
        print "No match. \n";
    }
    print "\n";
    if ( /\w+\\|/ ) {    # word boundary + any character at least once, up to the first word boundary.
        print "Matched: \|$`<<$&>>$'\|\n";
    } else {
        print "No match. \n";
    }
}

The problem is:
1. How to get rid of the \| from the expression that was found, and
2. How to get MULTIPLE genes before \| when more than one gene appears on a line?
Re^5: Getting data from a file (Operons and Genes). by thezip (Vicar) on Jun 15, 2007 at 00:18 UTC
From inspection, I noticed that there always seemed to be four spaces delimiting the columns. In the data set provided here, that does not seem to be the case (i.e. there are tabs instead). If tabs are the actual delimiter, then use this line instead:

my @fields = split(/\t/, $_);

The initial goal is to split each line of data into four separate fields.

Where do you want *them* to go today?
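As a rough sketch of that first step, the snippet below splits each line on either a tab or a run of four spaces and checks that four fields come back. The sample lines are invented, and the column layout (operon in the first field, whitespace-separated genes followed by a "|" in the second) is only an assumption, since the real data is not shown in this post. The name %genes_for is likewise hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample lines; the real column contents are an assumption.
my @lines = (
    "opA\tgeneA1 geneA2 |\tx\ty",      # tab-delimited
    "opB    geneB1 |    x    y",       # four-space-delimited
);

my %genes_for;
for my $line (@lines) {
    # Accept either a tab or exactly four spaces as the column delimiter.
    my @fields = split /\t| {4}/, $line;
    next unless @fields == 4;          # skip lines that do not have four columns

    # Assumed layout: field 1 holds the genes, then a trailing "|".
    ( my $gene_field = $fields[1] ) =~ s/\s*\|\s*$//;    # drop the "|"

    # split ' ' breaks on any whitespace, so several genes come back as a list.
    $genes_for{ $fields[0] } = [ split ' ', $gene_field ];
}

print "$_: @{ $genes_for{$_} }\n" for sort keys %genes_for;
```

Stripping the "|" with a substitution and then splitting on whitespace avoids trying to capture a variable number of genes with a single regex.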