in reply to agregating columns in several csv files

At first glance I don't see why your program would loop only once, but it looks as if $_ is being set to a glob rather than to a line from the file. I don't see why that happens, though. In any case, you'll run into other problems.

Most operating systems limit the number of files a program can have open at the same time. This limit is usually around 250 or around 1020 (a default of 256 or 1024 descriptors, minus the few already taken by STDIN, STDOUT and STDERR), so you'll hit it with your 3000 files.

I would import all the CSV files into a database; SQLite, for example, is very convenient for that. If you can't install SQLite anywhere, you can probably get by with a BTree database like DB_File, as your key is only a single column.

If you have the data in an SQL database, calculating the totals etc. becomes trivial, as SQL has the sum() and max() aggregates.

Re^2: agregating columns in several csv files
by Utilitarian (Vicar) on Jun 05, 2009 at 07:58 UTC
    Hi Corion, thanks for the quick reply, however
    MaxFDLimit ~= 65K
    ulimit -n 4096
    So I'm not hitting the OS FD limit. I was hitting my shell's limit until I adjusted ulimit -n.

    If it were in a database this would be easier, but I'm restricted in the tools to hand for this (I can't take data off the server). It's also worth noting that I can only use core modules, as I can't install anything on the server :(

      Anonymous Monk has found why the diamond operator does not work.

      DB_File or another DBM-style store such as SDBM_File is likely already installed with your Perl, and these can be used to conveniently rearrange the data and address it by a common key.
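
      A minimal sketch of that idea, using only core modules. It assumes each line is a plain "key,value" pair with no quoted fields or embedded commas, which may not match your real files. The sub names open_totals and add_totals and the file name are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl;        # O_RDWR, O_CREAT
use SDBM_File;    # ships with Perl

# Tie a hash to an on-disk SDBM file and return a reference to it.
# $path is a hypothetical base name; SDBM adds .dir and .pag itself.
sub open_totals {
    my ($path) = @_;
    tie my %totals, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666
        or die "Cannot tie $path: $!";
    return \%totals;
}

# Add every "key,value" line from one filehandle to the running totals.
# Assumption: naive split, no quoted fields or embedded commas.
sub add_totals {
    my ($totals, $fh) = @_;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;
        my ($key, $value) = split /,/, $line;
        $totals->{$key} = ($totals->{$key} || 0) + $value;
    }
    return;
}
```

      You would call open_totals once, then open, feed and close each CSV file in turn, so only one input file is open at any time. SDBM limits the size of each key/value pair, which should be fine for numeric totals but not for long records.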

      Have you considered concatenating all your CSV files into a single file? If you then sort that single file by the columns you want to use as keys, all your calculations become much simpler as you only have one set of variables that you purge whenever your key changes.
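
      Roughly, the sorted-file approach could look like the sketch below. It again assumes plain "key,value" lines already sorted by key, and the sub name aggregate_sorted is only for illustration. It keeps one sum and one maximum and flushes them whenever the key changes.

```perl
use strict;
use warnings;

# Read "key,value" lines that are already sorted by key and return
# one [key, sum, max] row per distinct key.
# Assumption: naive split, no quoted fields or embedded commas.
sub aggregate_sorted {
    my ($fh) = @_;
    my @rows;
    my ($key, $sum, $max);

    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;
        my ($k, $v) = split /,/, $line;

        # Key changed (or first line): emit the finished group, start anew.
        if (!defined $key || $k ne $key) {
            push @rows, [$key, $sum, $max] if defined $key;
            ($key, $sum, $max) = ($k, 0, undef);
        }

        $sum += $v;
        $max = $v if !defined $max || $v > $max;
    }

    # Emit the last group, if there was any input at all.
    push @rows, [$key, $sum, $max] if defined $key;
    return @rows;
}
```

      Fed from something like "cat *.csv | sort" on STDIN, this only ever holds the current group in memory, so the number of input files no longer matters.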

      For future reference, I will mention "Yes, even you can use CPAN". It won't help right now, though: DBD::SQLite needs a C compiler, so you can't simply paste it into your script.