in reply to Re: Loading file into memory
in thread Loading file into memory

I appreciate all the help, guys! However, I think I should clarify a few things...

First, I made a typo in my first post. Each record in the file actually looks like this:
ID      NAME      PLK      NUM1      NUM2
daamaya:Daniel R. Amaya,PLK,0000056789,ED97865:10:25:blah:blah

Now, I need every field of that loaded into memory somehow. Then I want to use the $id and $name fields to search through my CSV file. If $name is found (e.g. Daniel Amaya), I want to print the PLK, num1, and num2 from the CSV file to the screen (or a file). I figure that's a faster way to use one file's fields to search another. I was trying to do it by opening the file line by line and then searching the CSV for each line, but it was super slow.
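For what it's worth, here's roughly what I have in mind (just a sketch; the file names and split patterns are guesses based on the sample record above, and I'm not sure of the exact CSV layout):

    use strict;
    use warnings;

    # Build a lookup of the names (and ids) I care about from the record file.
    # NOTE: the split patterns below are guesses based on the sample record.
    my %id_for;
    open my $recs, '<', 'records.txt' or die "Can't open records.txt: $!";
    while ( my $line = <$recs> ) {
        chomp $line;
        my ( $id, $rest ) = split /:/, $line, 2;    # "daamaya" : rest
        my ($name) = split /,/, $rest;              # "Daniel R. Amaya"
        $id_for{$name} = $id;
    }
    close $recs;

    # One pass over the CSV, printing PLK/num1/num2 for names that matched.
    open my $csv, '<', 'data.csv' or die "Can't open data.csv: $!";
    while ( my $line = <$csv> ) {
        chomp $line;
        my ( $name, $plk, $num1, $num2 ) = split /,/, $line;
        print "$plk,$num1,$num2\n" if exists $id_for{$name};
    }
    close $csv;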

Re^3: Loading file into memory
by TGI (Parson) on Aug 05, 2008 at 21:16 UTC

    You want to process the CSV file and put it into a database of some sort, either a DB file or SQLite. If you use SQLite, make sure you index the fields you'll be searching against. If you use a DB file (such as Berkeley DB), you'll want to think carefully about your data structure. MLDBM is also worth looking at. Your query times will improve dramatically.

    So your flow should look something like this:

    my $dbh       = Read_Huge_File_Into_DB( $huge_file_path );
    my @customers = Process_Customer_Information_File( $dbh, $file_path );
    Print_Report( \@customers );

    sub Process_Customer_Information_File {
        my $dbh  = shift;
        my $file = shift;

        open( my $info, '<', $file ) or die "Uh oh $!";

        my @customers_found;
        while ( my $line = <$info> ) {
            my $customer_data = ParseCustomerData($line);
            my $name          = $customer_data->[NAME];

            # Keep only the customers that exist in the database.
            if ( Customer_Found( $dbh, $name ) ) {
                push @customers_found, $customer_data;
            }
        }

        return @customers_found;
    }

    If it were me, I'd use SQLite.
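
    Here's a rough idea of how the two database pieces could look with DBI and DBD::SQLite. The table layout, column names, and CSV field order are just placeholders, since I don't know your actual format:

    use strict;
    use warnings;
    use DBI;

    # Sketch only: table name, columns, and CSV field order are assumptions.
    sub Read_Huge_File_Into_DB {
        my $csv_path = shift;

        my $dbh = DBI->connect( 'dbi:SQLite:dbname=customers.db', '', '',
            { RaiseError => 1, AutoCommit => 0 } );

        $dbh->do('CREATE TABLE IF NOT EXISTS customers (name TEXT, plk TEXT, num1 TEXT, num2 TEXT)');
        $dbh->do('CREATE INDEX IF NOT EXISTS customers_name_idx ON customers (name)');  # index the search field

        my $insert = $dbh->prepare('INSERT INTO customers (name, plk, num1, num2) VALUES (?, ?, ?, ?)');

        open my $csv, '<', $csv_path or die "Can't open $csv_path: $!";
        while ( my $line = <$csv> ) {
            chomp $line;
            my ( $name, $plk, $num1, $num2 ) = split /,/, $line;
            $insert->execute( $name, $plk, $num1, $num2 );
        }
        close $csv;

        $dbh->commit;    # one big transaction keeps the bulk load fast
        return $dbh;
    }

    sub Customer_Found {
        my ( $dbh, $name ) = @_;
        my ($count) = $dbh->selectrow_array(
            'SELECT COUNT(*) FROM customers WHERE name = ?', undef, $name );
        return $count;
    }

    On a really big file you might commit in batches instead of all at once, but the idea is the same.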


    TGI says moo