PerlMonks  

Re^2: Reading values from one .csv, searching for closest values in second .csv, returning results in third .csv?

by pickleswarlz (Initiate)
on Feb 26, 2013 at 19:38 UTC (#1020747)


in reply to Re: Reading values from one .csv, searching for closest values in second .csv, returning results in third .csv?
in thread Reading values from one .csv, searching for closest values in second .csv, returning results in third .csv?

Thank you very much! I am using the first suggestion, as I don't really understand the binary search yet, and so far it doesn't seem to have problems with searching the 500,000 rows.

I am now running into a problem: the output of each print statement ends up on a separate line in the resulting CSV file. Here is my code:

    #!/usr/bin/perl
    use warnings;
    use strict;

    open my $GENES, '<', 'chr1data.csv' or die $!;
    open my $LOCATIONS, '<', 'chr1snps.csv' or die $!;

    chomp(my @locations = map { (split ',')[2] } <$LOCATIONS>);
    # If IDs are not already sorted, uncomment the following line:
    # @locations = sort { $a <=> $b } @locations;

    for (<$GENES>) {
        my ($chromosome, $start, $end) = split ',';
        print "$chromosome,$start,$end";
        my $idx = 0;        # For $end, start searching where you left for $start.
        my $correction = 0; # Needed for Start(-) == Start and End(+) == End.
        for my $pos ($start, $end) {
            $idx++ while $locations[$idx] <= $pos - $correction
                         and $idx <= $#locations;
            die "No numbers around $pos ($idx) \n"
                if $idx == 0 or $idx > $#locations;
            print ",$locations[$idx-1],$locations[$idx]";
            $correction = 1;
        }
        print "\n";
    }

The statement print ",$locations[$idx-1],$locations[$idx]"; puts its output on a new line. I'd like it to appear on the same line as the output of print "$chromosome,$start,$end"; for each search. Do I have a \n in the wrong place?

Re^3: Reading values from one .csv, searching for closest values in second .csv, returning results in third .csv?
by choroba (Archbishop) on Feb 26, 2013 at 22:42 UTC
    Your $end probably still contains the trailing newline (I split on ' ', which removed it; you split on a comma, which does not). Just add
    chomp $end;
    before printing it.
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      Brilliant, works perfectly! Thanks!
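
For readers following along, here is a minimal sketch of why the fix above works. It uses a hard-coded sample row and made-up neighbour positions (both are assumptions for illustration, not data from the thread) in place of the real chr1data.csv and chr1snps.csv files. Splitting on a comma leaves the line's newline attached to the last field, so chomp is applied to $end before anything is printed.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical sample row standing in for one line read from chr1data.csv.
my $line = "chr1,100,200\n";

# Splitting on a comma keeps the trailing newline on the last field.
my ($chromosome, $start, $end) = split ',', $line;
my $had_newline = $end =~ /\n\z/ ? 1 : 0;

# Strip the newline so the fields appended later stay on the same row.
chomp $end;

# Build the output row; the four extra values are invented placeholders
# for the neighbouring positions the real script looks up.
my $row = "$chromosome,$start,$end";
$row .= ",90,110,190,210";

print "$row\n";
```

An alternative with the same effect would be to chomp the whole line before splitting it, which avoids having to remember which field comes last.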
