I am trying to parse the contents of car.csv and split them neatly into a directory tree, one file per make/model. My code works fine; it's just that performance is lagging, since our car database has over 5 million lines. I suspect the bottleneck is the open() and close() calls inside the loop.
$ cat car.csv
make,model,color,year
Honda,Civic,Red,2008
Toyota,Camry,Blue,2002
Honda,Accord,Red,1992
Nissan,Sentra,Blue,2009
Ford,Focus,Green,2009
Honda,Civic,Red,2003
Toyota,Corolla,Green,2002
Honda,Civic,Red,1992
Honda,Civic,Green,2008
Toyota,Camry,Orange,2002
Honda,Accord,Black,1992
Nissan,Sentra,White,2009
Ford,Focus,Green,2007
#!/usr/bin/perl -w
use strict;
# Run the script like: tail -n +2 car.csv | ./foo.pl
# (the tail skips the header line)
my $out_dir = "/var/tmp/cars";   # output directory for the per-model files
while (<>) {
    my ($make, $model) = split /,/, $_, 3;   # only the first two fields matter
    system("mkdir -p $out_dir/$make");
    open my $fh, '>>', "$out_dir/$make/$model" or die $!;
    print $fh $_;
    close $fh;
}
The output will be:
$ cat /var/tmp/cars/Honda/Civic
Honda,Civic,Red,2008
Honda,Civic,Red,2003
Honda,Civic,Red,1992
Honda,Civic,Green,2008
$ cat /var/tmp/cars/Honda/Accord
Honda,Accord,Red,1992
Honda,Accord,Black,1992
$ cat /var/tmp/cars/Ford/Focus
Ford,Focus,Green,2009
Ford,Focus,Green,2007
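One idea I've been toying with, posted here as a rough, untested sketch (it assumes holding one open handle per make/model pair in memory is acceptable): cache the filehandles in a hash so each output file is opened only once, and use the core File::Path module instead of forking `mkdir -p` for every line:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Path qw(make_path);   # core module; no shell fork per line

# Read CSV lines from $in and append each to $out_dir/<make>/<model>,
# opening every output file exactly once instead of once per input line.
sub split_by_model {
    my ($in, $out_dir) = @_;
    my %fh;   # "make/model" => cached append filehandle
    while (my $line = <$in>) {
        my ($make, $model) = split /,/, $line, 3;
        my $key = "$make/$model";
        unless ($fh{$key}) {
            make_path("$out_dir/$make");
            open $fh{$key}, '>>', "$out_dir/$key"
                or die "can't append to $out_dir/$key: $!";
        }
        print { $fh{$key} } $line;
    }
    close $_ for values %fh;
}

# Same invocation as before: tail -n +2 car.csv | ./foo.pl
split_by_model(\*STDIN, "/var/tmp/cars") if -p STDIN;
```

Would that be the right direction, or is there a smarter way to batch the writes?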
Any thoughts?
TIA