retrieving in the correct order

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: retrieving in the correct order by halley (Prior) on Dec 16, 2004 at 19:38 UTC
See my response in an old thread: Re: sort an array according to another array. Also, you're initializing `@array2` with `qq()`, which makes a single string, not a list of values. You want either `qw( list of words )` or `('word', 'word', 'word')` or `(value, value, value)` without the `qq`. If you think of your identifiers as words, then you should use `lt`/`eq`/`gt` instead of `<`/`==`/`>` when comparing them, too. -- `[ e d @ h a l l e y . c c ]`
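A minimal sketch of the `qq()` versus `qw()` difference described above. The ids are the ones quoted elsewhere in this thread; the variable names `@as_string`, `@as_list` and `$same` are made up for illustration.

```perl
use strict;
use warnings;

# qq() is a double-quoted string, so the array gets exactly one element.
my @as_string = qq(13470319 13470331 15460001);

# qw() splits on whitespace and yields one element per word.
my @as_list = qw(13470319 13470331 15460001);

printf "qq: %d element(s)\n", scalar @as_string;
printf "qw: %d element(s)\n", scalar @as_list;

# Identifiers treated as words are compared with eq, not with ==.
my $same = $as_list[0] eq '13470319' ? 'match' : 'no match';
print "eq comparison: $same\n";
```

Looping over `@as_string` would run the loop body once with the whole string, which is probably why no id ever matched.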
Re: retrieving in the correct order by VSarkiss (Monsignor) on Dec 16, 2004 at 19:27 UTC
I take it the first array is relatively small? If so, you can just turn your loops inside out:

    for my $i (@array2) {
        for my $line (@array1) {
            my $key = "gi|$i|";
            if (substr($line, 0, length $key) eq $key) {
                print $line;
            }
        }
    }

Notice how you don't even need a regex, just a simple string compare. If `@array1` is sorted, you can speed this up a little by remembering where you left off.

Do not rebuke them with harsh words ... but rather lead them gently - with URLs - so that they may learn wisdom.
Re^2: retrieving in the correct order by Anonymous Monk on Dec 16, 2004 at 20:21 UTC
Hi VSarkiss, Thanks for your solution, but I can't get it to work! Maybe it's because my first array is quite big (~1000 sequences). But wouldn't this just slow it down? Thanks
Re^3: retrieving in the correct order by VSarkiss (Monsignor) on Dec 16, 2004 at 20:47 UTC
Well, some detail on what went wrong would help.... When I tried it against the sample data in your original post, I noticed two things:

- You're testing for `gi|` at the beginning of the line, but your `@array1` values start with `>gi|`. I had to remove the `>`; you'll have to either fix the `$key =` line to match your data, or fix the data to match your test.
- You're populating a single element in `@array2`. If you want each number to be an element of the array, you need to use `qw(...)`, not `qq(...)`.

If these are both copy-and-paste artifacts, please provide more detail on what the error is.
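For illustration, here is one way the two fixes might look together. This is a sketch, not the original poster's code: the contents of `@array1` are reconstructed from header lines quoted elsewhere in the thread with an assumed leading `>`, and `@ordered` is a hypothetical name.

```perl
use strict;
use warnings;

# Assumed sample data: FASTA-style headers that start with ">gi|".
my @array1 = (
    ">gi|13470331|ref|NP_101896.1| hypothetical protein\n",
    ">gi|13470319|ref|NP_101897.1| hypothetical protein\n",
);

# qw() gives one element per id.
my @array2 = qw(13470319 13470331);

my @ordered;
for my $id (@array2) {
    my $key = ">gi|$id|";    # include the leading ">" so the prefix matches
    for my $line (@array1) {
        # index() returns 0 only when the line starts with $key
        push @ordered, $line if index($line, $key) == 0;
    }
}
print @ordered;
```

The outer loop walks the ids, so the output follows the order of `@array2` rather than the order of the data.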
Re^4: retrieving in the correct order by Anonymous Monk on Dec 16, 2004 at 21:24 UTC
Re^5: retrieving in the correct order by VSarkiss (Monsignor) on Dec 16, 2004 at 21:39 UTC
Some notes below your chosen depth have not been shown here
Re^3: retrieving in the correct order by insaniac (Friar) on Dec 16, 2004 at 21:51 UTC
hey, check my code below, i tested it with a file with 3000 lines in it. `time cat gen.txt | perl -w gen.pl` says:

    real 0m0.139s
    user 0m0.109s
    sys  0m0.000s

if you're still looking for an answer...

-- to ask a question is a moment of shame
to remain ignorant is a lifelong shame
Re: retrieving in the correct order by Animator (Hermit) on Dec 16, 2004 at 19:14 UTC
A possible way is to build a hash of the first array, where the key is the id of the element. If that's done you can easily use a hash slice to get an array with the values in the order of the second array.
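A rough sketch of that idea, assuming each line starts with `gi|<id>|` as in the sample data posted in this thread. The names `%by_id` and `@ordered` are invented for the example.

```perl
use strict;
use warnings;

my @array1 = (
    'gi|13490216|ref|NP_101899.1|protein for 216',
    'gi|13470331|ref|NP_101896.1|protein for 331',
    'gi|13470319|ref|NP_101897.1|protein for 319',
);
my @array2 = qw(13470319 13470331 13490216);

# Key each line by the id between the first two pipes.
my %by_id;
for my $line (@array1) {
    next unless $line =~ /^gi\|(\d+)\|/;
    $by_id{$1} = $line;
}

# A hash slice returns the values in the order of the keys given.
my @ordered = @by_id{@array2};
print "$_\n" for @ordered;
```

Building the hash takes one pass over the first array and the slice takes one pass over the second, so there is no nested loop.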
Re: retrieving in the correct order by nedals (Deacon) on Dec 16, 2004 at 19:52 UTC
    # If the files are not too large...
    # Read in the sequence file, putting the data into a hash
    use strict;
    my %hash;
    while (<DATA>) {
        chomp $_;
        my ($id, $protein) = /^gi\|(.+?)\|.+\|(.+)$/;  ## Save what you need
        $hash{$id} = $protein if defined $id;          # skip lines that do not match
    }
    # Now use the second file to print out the hash
    my @array2 = qw(13470319 13470331 15460001 13490216);
    map { print "$hash{$_}\n"; } @array2;
    __DATA__
    gi|13490216|ref|NP_101899.1|protein for 216
    gi|13470331|ref|NP_101896.1|protein for 331
    gi|15460001|ref|NP_101898.1|protein for 001
    gi|13470319|ref|NP_101897.1|protein for 319
Re^2: retrieving in the correct order by Animator (Hermit) on Dec 16, 2004 at 20:21 UTC
A hash slice, as mentioned in my first post, would be faster (I guess)... something like: `print join("\n", @hash{@array2});`
Re^2: retrieving in the correct order by Animator (Hermit) on Dec 16, 2004 at 21:00 UTC
Another thing (just noticed it now): why would you be using `map`? `map` returns a list (filled with x times 1, the return value of `print`), which you aren't using at all... What should be used is `for`/`foreach` (or a hash slice, of course).
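To make the point about `map` concrete, a small sketch with a made-up two-entry `%hash`. `@returned` is a hypothetical name that exists only to capture what `map` hands back.

```perl
use strict;
use warnings;

my %hash   = (13470319 => 'protein for 319', 13470331 => 'protein for 331');
my @array2 = qw(13470319 13470331);

# map collects the return value of print (1 on success) for every element.
my @returned = map { print "$hash{$_}\n"; } @array2;

# A statement-modifier for loop prints the same lines and builds no list.
print "$hash{$_}\n" for @array2;
```

Both forms print the same lines. The `for` version just says what is meant, without a list that is thrown away.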
Re^3: retrieving in the correct order by nedals (Deacon) on Dec 16, 2004 at 22:05 UTC
The map method was already sitting in my 'test' template. The foreach method is a better option, but I liked your hash slice method even better. ++
Re: retrieving in the correct order by insaniac (Friar) on Dec 16, 2004 at 20:54 UTC
or, if the first array is really a text file, say on a UNIX/LINUX system, you could cat the first file and read it line by line.. no?

the scanning perl program gen.pl:

    #!/usr/bin/perl
    use strict;
    my @array = qw(13470319 13470331 15460001 13490216);
    my @array2;
    while (my $line = <>) {
        foreach my $id (0..$#array) {
            $array2[$id] = $line if $line =~ m/^gi\|($array[$id])\|/;
        }
    }
    print "order: ", join(" ", @array), "\n";
    print grep { defined } @array2;    # skip ids that were not found

the text file:

    gi|13470331|ref|NP_101896.1| hypothetical protein
    MFWVTKKALMPFLMLPAGIIFVSAVGYAINWLFSTLFQFQPPLVEGPAGPVTVLIFTITMLLAYDISYYL
    gi|13470319|ref|NP_101897.1| hypothetical protein
    MGAYCQAHPACKVTDRTVIGRRDAAMNAPFVLAIPRTRTFEVVTSAARLAEIAPAWTALWQRAGGLVFQH

the execution:

    # cat gen.txt | perl gen.pl
    order: 13470319 13470331 15460001 13490216
    gi|13470319|ref|NP_101897.1| hypothetical protein
    gi|13470331|ref|NP_101896.1| hypothetical protein

this just looked like a quick and keep it simple job to me..

UPDATE: updated the code to display correct order..
Re^2: retrieving in the correct order by Animator (Hermit) on Dec 16, 2004 at 21:02 UTC
I think you are missing the point... If I understand the poster correctly, then he (or she) wants to output them in the order they appear in the second array, so in this case first line/element 13470319, after that line/element 13470331, and so on.
Re^3: retrieving in the correct order by insaniac (Friar) on Dec 16, 2004 at 21:12 UTC
aah.. oops.. indeed i understood wrongly. update: and wisdom came to me :-D check the above code... simple, line by line and in the correct order. this time i tested the code ;-)
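A possible variation on the line-by-line approach, offered as a sketch rather than as anyone's posted code: a lookup hash from id to position replaces the inner loop over the ids. The names `@wanted`, `%slot`, `@lines`, `@found` and `@ordered` are invented here, and `@lines` stands in for lines read from the file.

```perl
use strict;
use warnings;

my @wanted = qw(13470319 13470331 15460001 13490216);

# Map each id to its position in the wanted order.
my %slot;
@slot{@wanted} = (0 .. $#wanted);

# Stand-in for the lines of the text file (sequence lines omitted).
my @lines = (
    "gi|13470331|ref|NP_101896.1| hypothetical protein\n",
    "gi|13470319|ref|NP_101897.1| hypothetical protein\n",
);

my @found;
for my $line (@lines) {
    # Only header lines with a wanted id are kept.
    next unless $line =~ /^gi\|(\d+)\|/ && exists $slot{$1};
    $found[ $slot{$1} ] = $line;
}

# Ids that never appeared leave undef slots; drop them before printing.
my @ordered = grep { defined } @found;
print @ordered;
```

Each line is examined once, whatever the number of ids, which may matter more as the id list grows.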