in reply to The future of Text::CSV_XS - TODO

csv() is unexpectedly restrictive, and non-streaming.

It would be nice if the following would work, and processed records one at a time instead of slurping the whole file.

use Text::CSV_XS qw( csv );
csv( in => "a.csv", out => "b.csv", ... );

The error message does suggest how the call could be rewritten so that the above works (nice), but the rewritten call loads the entire file.

Honestly, I just want jq to support CSV :) (It can generate it, but not read it.)

Re^2: The future of Text::CSV_XS - TODO
by Tux (Canon) on Dec 19, 2024 at 17:38 UTC
    1. The function csv () is streaming! (and well documented as such, see e.g. dumping database tables)

    2. Believe me, I have had the same feeling with file-names as arguments to in and out, and I actually started coding on that, but in the end, you do not want it, as it causes too many undocumentable catch-22s.

      csv (in => csv (in => "a.csv", ...), out => "b.csv") is the best alternative and works quite well

    3. My personal fav use of streaming is csv (in => $fh, out => undef, bom => 1, on_in => sub { ... process %_ ... });. If files are really large, this can be used for streaming file to file too using a second handle.


    Enjoy, Have FUN! H.Merijn

      Believe me, I have had the same feeling with file-names as arguments to in and out

      It's the inconsistency of it.

      out \ in          Data Structure  Sub    File Name                 File Handle
      [Absent]          Works           Works  Works                     Works
      Undefined Scalar  Works           Works  Run-time error            Run-time error
      File Name         Works           Works  Instructs the programmer  Run-time error
      File Handle       Works           Works  Run-time error            Run-time error

      The instruction to the programmer is doubly weird because it could simply do what it would do if the programmer used the suggested code.

        Thanks for this trigger!

        The new strict_eol made it possible to overcome all the old blockers, so the next release will have "Works" in all cells \o/

        Still working on tests and documentation.

        Github is up to date, so you can test and comment :)

        Let that be my new-year gift.


        Enjoy, Have FUN! H.Merijn

      The function csv () is streaming!

      You can encode as a stream.

      You can decode as a stream.

      But you can't transform as a stream. The inner call (csv (in => "a.csv", ...)) loads the entire file into memory.
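
      For comparison, here is a minimal sketch of a record-at-a-time transform that bypasses csv () and uses the object interface (new, getline, say) directly. The helper name stream_csv and its callback argument are made up for illustration and are not part of Text::CSV_XS. It assumes both files can be opened as plain byte streams.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical helper, not part of Text::CSV_XS: copy $in_path to $out_path,
# passing every parsed row through $transform, one record at a time.
sub stream_csv {
    my ($in_path, $out_path, $transform) = @_;

    my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

    open my $in,  "<", $in_path  or die "$in_path: $!";
    open my $out, ">", $out_path or die "$out_path: $!";

    # getline returns a single row (an array ref) per call, so only the
    # current record is held in memory.
    while (my $row = $csv->getline ($in)) {
        $transform->($row);        # modify the row in place
        $csv->say ($out, $row);    # write it out followed by a newline
    }

    close $in;
    close $out or die "$out_path: $!";
    return;
}
```

      A call such as stream_csv ("a.csv", "b.csv", sub { $_[0][0] = uc $_[0][0] }) would upper-case the first field of each record while copying. It is more typing than a one-line csv () call, but memory use stays flat however large the input is.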