chuckd has asked for the wisdom of the Perl Monks concerning the following question:

I know this has probably been asked before but I have to ask it again. If selecting text from a table and the text in the table is 120 MB and the column size is a clob. Can Perl return this with the DBI?

Replies are listed 'Best First'.
Re: question about DBI
by kyle (Abbot) on Aug 12, 2008 at 23:31 UTC

    See the DBI documentation under "Handling BLOB / LONG / Memo Fields"

Re: question about DBI
by est (Acolyte) on Aug 13, 2008 at 02:12 UTC
    I use the LOB data type for storing/retrieving binary files to/from a table - but apparently the size is very small, i.e. ~2Kb as the file is just a one page Xerox metacode file.

    The reason I do this because the physical files are archived in daily basis but we still need to have the file quickly on-the-fly for a period of a month - so having the binary in a database is one option. However I believe storing 120Mb of text into a table is bizarre, if I can't say odd :-)

    Btw, here is a snippet of the code:

    * On inserting
    use DBD::Oracle qw(:ora_types); __PACKAGE__->data_type(cr_data => DBI::SQL_BINARY); __PACKAGE__->data_type(cr_data => { ora_type => ORA_BLOB, ora_field => 'column_name' } ); ....
    * On retrieving:
    # Modify DBH LongReadLen for long binary object... my $dbh = __PACKAGE__->db_Main(); my $current_LongReadLen = $dbh->{LongReadLen}; # need to dynamically set to the length of your record # it's hardcoded to 512Kb for example... $dbh->{LongReadLen} = 512 * 1024; ...
    Please refer to the Class::DBI wiki as it has a really good explanation on the subject.

    Cheers!
Re: question about DBI
by leocharre (Priest) on Aug 13, 2008 at 00:07 UTC

    H0l1 c0vv! Why are you storing 120 megs of text outside of a filesystem!!!!

    Are you really saying that the total return of your call will ammount to 120megs of text?

    Something is fishy here. A database is for storing relational data- data you can compare to itself in various ways. Yes, a database can be for other things- off the top of my head I just get a icky feeling in my stomach that you're probably doing something that's going to bite you in the butt in the long run!

    Care to explain a little bit why you're storing that kind of stuff in something other than a regular file? Maybe we can suggest some alternative ideas.

      Hi, The company I work for does document analysis and retrieval. All of our db tables were created by the engineering teams before I started working with the company. One of our primary tables uses a clob type column, and in this column we store all of our text we analyze. For all of our loading we use DBI and sql loader. Sql loader is used for most of the bulk loading, but I want to know if the DBI is going to have major problems inserting and fetching data of this size (120MB)? So basically, can the Perl DBI be used to insert large text values (>100MB)???
        As pointed out above by kyle, the DBI man page describes how to make sure that you can read a large CLOB field. The only remaining question is: how much RAM do you have on the machine that will be running your script, and how many other things might be going on at the same time to compete for active RAM?

        In other words, what happens when you try will be determined by the machine and other factors, rather than by Perl or DBI. The worst that can happen is that once the data gets loaded into process memory from the database table, everything on the machine will be slowed down to crawl because of swapping virtual memory back and forth between RAM and disk.

        If your machine lacks enough RAM to carry the load (which seems doubtful these days -- lots of folks have 1GB RAM or better), and if you have some other method that is known to work for extracting these huge fields, you might want to have your perl script execute that other method via a system() call, in such a way that the data field is stored directly to a local disk file.

        Then the perl script can work from the disk file (assuming you can do what needs to be done without the entire clob unit being held in memory at one time -- e.g. if the clob data can be processed as lines or paragraphs of text). But that's just as a last resort.

Re: question about DBI
by runrig (Abbot) on Aug 13, 2008 at 15:59 UTC
    With what database? With Oracle, you don't have to fetch a *LOB all at once...see DBD::Oracle. If your database can't do this, then you'll need lots of memory.