beachbum has asked for the wisdom of the Perl Monks concerning the following question:

I wrote some scripts that use DBD::CSV 0.22 to read some fields from various flat files. These scripts have been running monthly without a problem until this month, when one of them failed with this error:

Error while reading file .\filename: Bad file descriptor at C:/Perl/site/lib/DBD/CSV.pm line 210, <GEN4> chunk 74885.

The script that is failing reads the largest of the files, and this month's file is the largest to date at (only) 54 MB, 2 MB larger than last month's. I split the file in half and ran the script on each of the two new files without a problem.

Am I missing a file size limitation, or maybe a number of records limitation in this module?

Re: DBD::CSV file size limitation?
by jZed (Prior) on Dec 07, 2005 at 18:41 UTC
    I (maintainer of DBD::CSV and its prereqs) am not aware of any arbitrary limit on file size, and there is certainly no limit on the number of records. It can be slow and a memory hog for large files, but I'm not sure how that would relate to your problem. One thing you might try is reading the file directly with Text::CSV_XS and feeding it line by line to DBD::CSV, which would cut down on memory. I also wonder how your script and large file would fare on a different machine. Please let me know how this all turns out.
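    A minimal sketch of that line-by-line approach, assuming the record type sits in the first field and using made-up file, table, and column names: Text::CSV_XS parses the big file one row at a time, and only the matching rows are inserted into a small DBD::CSV working table.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Text::CSV_XS;
        use DBI;

        my $infile = 'filename';    # hypothetical input file name

        my $csv = Text::CSV_XS->new({ binary => 1 })
            or die Text::CSV_XS->error_diag;
        my $dbh = DBI->connect('dbi:CSV:f_dir=.', undef, undef,
                               { RaiseError => 1 });

        # Small working table holding only the record type we care about.
        unlink 'type_n';            # remove any stale table file from a previous run
        $dbh->do('CREATE TABLE type_n '
               . '(RECORD_TYPE CHAR(2), FIELD_A CHAR(64), FIELD_B CHAR(64))');
        my $ins = $dbh->prepare('INSERT INTO type_n VALUES (?, ?, ?)');

        open my $fh, '<', $infile or die "Cannot open $infile: $!";
        while (my $row = $csv->getline($fh)) {             # one row at a time, low memory
            next unless @$row >= 3 and $row->[0] eq '1';   # keep record type 1 only
            $ins->execute(@{$row}[0 .. 2]);
        }
        $csv->eof or $csv->error_diag;
        close $fh;
        $dbh->disconnect;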
      Could it be a newline issue that is silently fixed when the OP splits the file and "saves as" the two halves (e.g. using his favourite text editor)?


      holli, /regexed monk/
        Nice thought, I'll try it with a hex editor... never mind, I just checked the file, and it does end with 0D 0A.

      Thanks for the quick reply.
      I have tried running it on three different machines, ranging from a 1.3 GHz box with 1 GB of RAM to a new dual-processor box with 4 GB. The RAM is not being consumed; however, at least on the first box, the processor has always maxed out while reading the files.

      This particular file does have 3 distinct record types in it, each with varying fields and lengths. This hasn't been an issue in the past as my query specifies WHERE RECORD_TYPE = n.
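      A rough sketch of that kind of DBD::CSV query, where the table mapping, file name, column names, and record type value are placeholders rather than the real ones:

          use strict;
          use warnings;
          use DBI;

          my $dbh = DBI->connect('dbi:CSV:f_dir=.', undef, undef,
                                 { RaiseError => 1 });

          # Map a table name onto the flat file; if the file has no header
          # row, the column names could also be supplied via col_names here.
          $dbh->{csv_tables}{monthly} = { file => 'filename' };

          my $sth = $dbh->prepare(
              'SELECT FIELD_A, FIELD_B FROM monthly WHERE RECORD_TYPE = ?');
          $sth->execute(1);                       # the record type of interest
          while (my @fields = $sth->fetchrow_array) {
              print join(',', @fields), "\n";
          }
          $dbh->disconnect;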

      I would like to try your suggestion of feeding the data into DBD::CSV line by line, but I'm a little confused as to how to do that. Am I able to issue a DBI->connect() to a string instead of a file?

        > This particular file does have 3 distinct record types
        > in it, each with varying fields and lengths

        That may be a problem. What record types are they? It's possible that the module assumes a given record type and then treats the other record types as if they had the same layout, getting confused about record boundaries and trying to build a single huge record.

Re: DBD::CSV file size limitation?
by beachbum (Beadle) on Dec 09, 2005 at 23:43 UTC

    Solution:
    It appears that having different record types (a varying number and size of fields) in a single file was the issue. I generated an even larger file, 67 MB, all with the same record type, and it was read just fine.
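    One possible pre-pass, sketched below on the assumption that the record type is the first field and the file uses CRLF line endings (file names are made up): split the mixed file into one file per record type with Text::CSV_XS before handing anything to DBD::CSV, so each file has a uniform layout.

        use strict;
        use warnings;
        use Text::CSV_XS;

        my $infile = 'filename';    # hypothetical mixed-type input file
        my $csv = Text::CSV_XS->new({ binary => 1, eol => "\r\n" })
            or die Text::CSV_XS->error_diag;

        open my $in, '<', $infile or die "Cannot open $infile: $!";

        my %out;    # one output handle per record type, opened on demand
        while (my $row = $csv->getline($in)) {
            my $type = $row->[0];               # record type in the first field
            unless ($out{$type}) {
                open $out{$type}, '>', "$infile.type_$type"
                    or die "Cannot create $infile.type_$type: $!";
            }
            $csv->print($out{$type}, $row);     # write the row to its type's file
        }
        $csv->eof or $csv->error_diag;

        close $_ for $in, values %out;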

    Thanks jZed!