in reply to Re: DBD:Pg pg_putcopydata
in thread DBD:Pg pg_putcopydata

James,

This is a very interesting approach, and I had not considered the possibility. I am hoping you might provide a little insight into the combine function. I am a little confused by @$row in particular: $csv is a scalar, as is $row, yet I believe $csv->combine expects a list. I would have guessed some work would be needed to split the scalar into an array.

This is what happens when a T-SQL SPROC junkie decides to break out of the safe bubble I had been in for years and learn some new tricks in an open environment!

Also, in what situations would you recommend this method over the first example? I personally like the idea of processing it as CSV, because I built the arrays to mimic a delimited flat file; it was an easy way to express my intended end result.

Thanks!

Replies are listed 'Best First'.
Re^3: DBD:Pg pg_putcopydata
by james2vegas (Chaplain) on Jul 01, 2010 at 18:40 UTC
    Sure. Also be sure to check out perlreftut, perlref, perllol, perldata and perldsc for details on references and their use in data structures.

    @ArrayInMemory is what is called an AoA (array of arrays). To include an array inside another array without it being flattened into a single flat array, you store it as an array reference (delimited by [ and ] instead of ( and )).

    In the foreach loop, $row is assigned each element of @ArrayInMemory in turn. That element is an array reference, not the list that Text::CSV's combine method requires. Luckily, we can 'dereference' an array reference back into an array by adding an @ to the beginning (@$row), indicating we are interested in the dereferenced array, not the array reference held in $row.
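    A minimal sketch of that dereferencing, with made-up data standing in for the real @ArrayInMemory:

```perl
use strict;
use warnings;

# Hypothetical contents; each element is an array reference,
# so @ArrayInMemory is an AoA (array of arrays).
my @ArrayInMemory = (
    [ 'a', 'b', 'c' ],
    [ 'd', 'e', 'f' ],
);

for my $row (@ArrayInMemory) {
    # $row holds a reference; @$row dereferences it back into a list
    my @fields = @$row;
    print scalar(@fields), " fields: @fields\n";
}
```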

    The reason I would choose this method is that it seems, to me, a natural fit: COPY FROM accepts CSV, and Text::CSV produces it. This is especially helpful if your data might contain values that need to be escaped before being sent to COPY FROM; Text::CSV handles all the ugly details of that for you. The first solution assumes you already have a string for each row you are submitting to Pg, which is difficult for me to imagine being the case; you are more likely to have a collection of fields (in an AoA or similar structure) that you need to combine. Instead of joining them yourself and worrying about escaping rules (which differ between Perl, Pg and CSV), use a well-tested and recommended module instead.
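    A hedged sketch of the combine step: the field values below are made up, and the table/column names in the comments are placeholders, since the COPY calls need a live database handle.

```perl
use strict;
use warnings;
use Text::CSV;

# eol => "\n" appends the row terminator COPY expects
my $csv = Text::CSV->new({ binary => 1, eol => "\n" });

# Hypothetical AoA; note the fields that need CSV escaping
my @ArrayInMemory = (
    [ 'plain', 'needs,escaping', 'embedded "quotes"' ],
);

my @lines;
for my $row (@ArrayInMemory) {
    $csv->combine(@$row) or die "combine failed";
    push @lines, $csv->string;   # Text::CSV quotes and escapes as needed
}
print @lines;

# In the real script each line goes straight to the COPY stream:
#   $dbh->do('COPY mytable (a, b, c) FROM STDIN WITH CSV');
#   $dbh->pg_putcopydata($_) for @lines;
#   $dbh->pg_putcopyend;
```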

      The case you have made for using the Text::CSV module is very compelling and gives me new reason to dig deeper into the module pool for tools. Very helpful, much appreciated.

      I managed to overlook the construction of @ArrayInMemory as an AoA, which was the source of my confusion. In my zeal to complete a functional prototype I did not construct this array as an AoA, even though it would make sense from a structural standpoint to do so.

      The array I am currently using looks something like this:

      index1 - "field 1"/"field 2"/"field3"
      index2 - "field 21"/field 22/"field23"
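      One hedged sketch of converting such flat, slash-delimited strings into an AoA, assuming '/' never occurs inside a field (the variable names here are made up):

```perl
use strict;
use warnings;

# Stand-ins for the flat strings described above
my @flat = (
    '"field 1"/"field 2"/"field3"',
    '"field 21"/field 22/"field23"',
);

my @aoa;
for my $line (@flat) {
    my @fields = split m{/}, $line;   # naive split; breaks if a field contains '/'
    s/^"|"$//g for @fields;           # strip any surrounding quotes
    push @aoa, \@fields;              # store a reference, building the AoA
}
print scalar(@aoa), " rows, first field: $aoa[0][0]\n";
```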

      A fun exercise, for a learner like myself, might be to benchmark the various data structures for my particular application.

      Thank you for all the detailed responses with various methods to move this data. I was shocked to see so few code examples using pg_putcopydata, and I hope others might benefit from the exploration of this simple problem as well. PIPE! How does one (me) so easily forget about pipes and shell scripts? I feel like I should take down my list of UNIX tenets from my wall, in shame...

      I work at a game development studio with many programmers who only have experience with Perl from their days in a classroom. I find communities such as Perl Monks to be invaluable in my pursuit of data wrangling, and thank you all again!