Extract and read different columns from the file

sundeep has asked for the wisdom of the Perl Monks concerning the following question:

Re: Extract and read different columns from the file by ikegami (Patriarch) on Oct 27, 2010 at 03:47 UTC
`my %h; while (<>) { chomp; my @rec = split; $h{$rec[???]} = [ @rec[???, ???, ???] ]; }`
Replace the question marks with the appropriate indexes.
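As a minimal sketch of the snippet above with the question marks filled in: the column choice here is purely an assumption (key on the fourth column, keep the second, fifth and sixth), since the question never says which columns are wanted. The sample records are the ones posted later in this thread, read from an in-memory string instead of a file.

```perl
use strict;
use warnings;

# Sample records as posted in the thread.
my $data = <<'END';
3 9606 34 ACADM 187960098 NP_001120800.1
5 9606 37 ACADVL 4557235 NP_000009.1
6 9615 489421 ACAT1 73955189 XP_546539.2
END

my %h;
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
while (my $line = <$fh>) {
    chomp $line;
    my @rec = split ' ', $line;    # split on runs of whitespace
    # Assumed choice: key on index 3, store indexes 1, 4 and 5.
    $h{ $rec[3] } = [ @rec[1, 4, 5] ];
}
close $fh;

# Print each key followed by its stored columns.
print "$_: @{ $h{$_} }\n" for sort keys %h;
```

The array slice `@rec[1, 4, 5]` is copied into a fresh anonymous array on every pass, so each hash value owns its own data.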
Re: Extract and read different columns from the file by kcott (Archbishop) on Oct 27, 2010 at 03:53 UTC
That data looks rather familiar. Are you a colleague of nofutur45? :-) Perhaps my solution to his problem (5 minutes ago) may help you.
-- Ken
Re: Extract and read different columns from the file by aquarium (Curate) on Oct 27, 2010 at 04:03 UTC
Can you please show your attempt at doing this? That would further clarify what structure you're trying to arrive at, and it's much better than asking somebody to do all your (home)work.
the hardest line to type correctly is: stty erase ^H
Re^2: Extract and read different columns from the file by sundeep (Acolyte) on Oct 27, 2010 at 04:11 UTC
The entire text file looks something like this:
3 9606 34 ACADM 187960098 NP_001120800.1
5 9606 37 ACADVL 4557235 NP_000009.1
6 9615 489421 ACAT1 73955189 XP_546539.2
I know how to read each line as input. After this, I should store all the desired columns in a hash table, with the line number as the hash table index, and then perform some string matching operations.
Re^3: Extract and read different columns from the file by ikegami (Patriarch) on Oct 27, 2010 at 04:15 UTC
"with the line number as the hash table index"
Huh, why not just use an array?
`my @a; while (<>) { chomp; my @rec = split; push @a, [ @rec[???, ???, ???] ]; }`
"i should store all the desired columns"
You keep saying you only want certain columns, yet you don't say which. Again, just use the index of the columns you want for the question marks.
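A sketch of the array approach with concrete indexes, followed by one possible string match. Both the chosen columns (fourth and sixth) and the `NP_` pattern are assumptions made for illustration; the thread specifies neither.

```perl
use strict;
use warnings;

# Sample records as posted in the thread.
my $data = <<'END';
3 9606 34 ACADM 187960098 NP_001120800.1
5 9606 37 ACADVL 4557235 NP_000009.1
6 9615 489421 ACAT1 73955189 XP_546539.2
END

my @rows;
for my $line (split /\n/, $data) {
    my @rec = split ' ', $line;
    push @rows, [ @rec[3, 5] ];    # assumed columns: indexes 3 and 5
}

# Line N of the input is element N-1 of the array, so no hash is needed.
my $line_no = 2;
print "line $line_no: @{ $rows[$line_no - 1] }\n";

# Hypothetical string match: keep rows whose second stored field starts with "NP_".
my @np = grep { $_->[1] =~ /^NP_/ } @rows;
print scalar(@np), " rows match\n";
```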
Re^3: Extract and read different columns from the file by aquarium (Curate) on Oct 27, 2010 at 04:38 UTC
`my %somehash; while (my $line = <>) { my ($key, $num1, $num2, $string, $num3, $stringnum) = split ' ', $line; $somehash{$key}{$num1}{$num2}{$string}{$num3} = $stringnum; }`
That puts the data into a "hash", but it is probably not what you want. Whether you use a hash or an array structure largely depends on the data available and the logic/processing required. Sequential processing and the lack of a random access key lend themselves to an array structure. When you have a good logical random access key (not a record sequence number) and need to access the records non-sequentially, use a hash. A hash structure, or even a mix of hash and array structures, may be suitable. But exactly what structure do you want?
Both approaches could be out the window if you have millions of records in the file, in which case some much smarter arrangement would be required. Speaking of which, what is the required logic/processing for these records?
the hardest line to type correctly is: stty erase ^H
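For comparison with the deeply nested hash, here is a sketch of a flatter "mix" of structures: one hash keyed by the first column, where each value is a small hash of named fields. The field names are hypothetical labels invented for this example, and treating the first column as a usable random access key is also an assumption.

```perl
use strict;
use warnings;

# Sample records as posted in the thread.
my $data = <<'END';
3 9606 34 ACADM 187960098 NP_001120800.1
5 9606 37 ACADVL 4557235 NP_000009.1
6 9615 489421 ACAT1 73955189 XP_546539.2
END

# Hypothetical labels for columns 2 to 6; the thread never names them.
my @names = qw(taxid gene_id symbol gi accession);

my %record_for;
for my $line (split /\n/, $data) {
    my ($key, @values) = split ' ', $line;
    my %rec;
    @rec{@names} = @values;        # hash slice: pair each label with its value
    $record_for{$key} = \%rec;
}

# Random access by key, then by field name.
print $record_for{5}{symbol}, "\n";
```

Each record is reached in two lookups regardless of how many columns it has, which is easier to query than a hash nested one level per column.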
Re: Extract and read different columns from the file by umasuresh (Hermit) on Oct 27, 2010 at 12:59 UTC
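You can try `cut -f1,4,7 file_name > subset_columns` if, for example, you need the first, fourth and seventh columns of a tab-delimited file on a Linux or Cygwin command line. Tab is already the default delimiter for `cut`, so no `-d` option is needed; writing `-d"\t"` passes a backslash and a `t` rather than a tab character.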
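If the column selection has to happen inside a Perl script rather than in the shell, the same idea can be sketched as a small helper. `cut_fields` is a hypothetical name, the input is assumed to be tab-delimited, and the `//` operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Return the requested 1-based fields of a tab-delimited line, joined by tabs.
sub cut_fields {
    my ($line, @fields) = @_;
    chomp $line;
    my @rec = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    # Fields past the end of the record come back as empty strings.
    return join "\t", map { $rec[$_ - 1] // '' } @fields;
}

# The sample record from the thread has six columns, so pick 1, 4 and 6.
print cut_fields("3\t9606\t34\tACADM\t187960098\tNP_001120800.1\n", 1, 4, 6), "\n";
```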
Re: Extract and read different columns from the file by talexb (Chancellor) on Oct 28, 2010 at 15:22 UTC
"Can anyone tell me , how to read only the specified required columns...and store into a hash table..."
Your task specification is incomplete. We don't know which columns are the required ones, and we have no idea what kind of data structure you have in mind. But I'll make a wild guess that the fifth element ('187960098') is going to be the index or key into the hash, and that you want to store the sixth element ('NP_001120800.1') as the value. In that case, the code would be
`#!/usr/bin/perl
#
#
use common::sense;
use Data::Dumper;

{
    my %h;
    while(<DATA>) {
        my @f = split;
        $h{$f[4]} = $f[5];
    }
    print Dumper ( \%h );
}

__DATA__
3 9606 34 ACADM 187960098 NP_001120800.1`
When run, this produces
`$ perl -w 867594.pl
$VAR1 = {
          '187960098' => 'NP_001120800.1'
        };
$`
QED.
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds