No, I/we started off by making 4 different, extremely simple versions of the CSV parsing core code, just to see how well the approaches would work.

  1. Chunks of interest
  2. State machine
  3. Grammar based
  4. Brute force

The first three are still alive, and I personally only develop in the chunks version, which is - for me - the easiest to develop.
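To make approach 2 a bit more concrete, here is a minimal sketch of what a state-machine core can look like. It is written in Perl 5 for brevity and is not the code from the repo: the name parse_csv_line and the three state names are made up for this illustration, and it assumes a single line with comma separators, double-quote quoting, and no embedded newlines or options.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the repo): split one CSV line into fields
# using a three-state machine. Assumes "," as separator and '"' as quote.
sub parse_csv_line {
    my ($line) = @_;
    my @fields;
    my $field = "";
    my $state = "unquoted";    # unquoted | quoted | quote_seen

    for my $ch (split //, $line) {
        if ($state eq "unquoted") {
            if    ($ch eq ",")                 { push @fields, $field; $field = ""; }
            elsif ($ch eq '"' && $field eq "") { $state = "quoted"; }
            else                               { $field .= $ch; }   # lenient: stray quotes kept as-is
        }
        elsif ($state eq "quoted") {
            if ($ch eq '"') { $state = "quote_seen"; }
            else            { $field .= $ch; }
        }
        else {    # quote_seen: previous char was a quote inside a quoted field
            if    ($ch eq '"') { $field .= '"'; $state = "quoted"; }    # "" is an escaped quote
            elsif ($ch eq ",") { push @fields, $field; $field = ""; $state = "unquoted"; }
            else               { die "Unexpected '$ch' after closing quote\n"; }
        }
    }
    die "Unterminated quoted field\n" if $state eq "quoted";
    push @fields, $field;    # the last field has no trailing separator
    return \@fields;
}

my $row = parse_csv_line('a,"b,c","say ""hi""",,d');
print join("|", @$row), "\n";
```

A real parser needs a lot more than this (embedded newlines, configurable separator and quote, error codes, and so on), which is exactly where the test suite comes in.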

I did not want to put people off in the initial post, but speed is about the most serious drawback at the moment. Not having CPAN can be worked around with use Inline::Perl5;. Examples of how to do that are available in the git repo, and that includes working with XS modules (including DBI)! (Passing IO arguments is a work in progress.)

When I started in October 2014, my initial version was 6700 times slower than the XS version. Meanwhile it is "just" 1010 times slower. Some of that is because I learned to code more efficiently in perl6, but most of it is because the perl6 core got faster. We're not there yet. Here is a comparison:

    Perl5
    Text::CSV::Easy_XS    0.016  These two have no options and only parse valid CSV
    Text::CSV::Easy_PP    0.016
    Text::CSV_XS          0.039  Highly optimized XS with many options
    Text::CSV_PP          0.514  Pure perl version
    Pegex::CSV            1.356  Ingy's Pegex parser

    Perl6
    csv.pl                8.133  John's state machine
    csv-ip5xs             8.950  Text::CSV_XS with Inline::Perl5
    csv-ip5pp             9.812  Text::CSV_PP with Inline::Perl5
    csv_gram.pl          13.426  Using a grammar-based parser
    test.pl              38.733  My first attempt, no options
    test-t.pl            39.502  Almost compatible with Text::CSV_XS

The numbers shown are the time needed in seconds to parse a valid 10000 line CSV file with 5 columns.
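If you want to get comparable numbers for your own parser, a rough timing harness could look like the sketch below. This is an assumption about one way to measure and not how the figures above were produced: the sample data is made up, parse_line is a naive stand-in that only copes with unquoted CSV, and it times the parsing in-process rather than a whole script run.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Build 10000 lines of valid 5-column CSV in memory (made-up sample data).
my @lines = map { join ",", $_, "foo", "bar baz", "3.14", "end" } 1 .. 10_000;

# Stand-in parser: a naive split, only safe for CSV without quotes.
# Swap in the parser you actually want to measure.
sub parse_line { return [ split /,/, $_[0], -1 ] }

my $start  = time;
my $fields = 0;
$fields += @{ parse_line($_) } for @lines;    # count fields so the work is not skipped
my $elapsed = time - $start;

printf "%d fields in %.3f s\n", $fields, $elapsed;
```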

Back to your question. Of course one cannot start with the test suite, but one can start with the test suite as a guide. So after building the initial core parser, feed it the tests and see what works and what does not. Then use the failing tests as a plan to alter the code until the tests pass: implement error handling, make all the attributes work, catch all exceptions, etc.

Building a bridge would imho be a waste of time: it will not make you learn perl6 any faster, nor will you hit the problem areas that one needs to fix in the code eventually. As perl6 is type-checked and passes arguments by reference (all are objects), supporting array-refs to speed things up is counter-productive, so one needs to match the test suite to what is feasible and sane in perl6: don't slow down to match perl5 behavior. Things are changing anyway. I'm not trying to mimic the old CSV syntax, I'm trying to port its versatility and flexibility while keeping the complete and safe parsing rules.


Enjoy, Have FUN! H.Merijn

In reply to Re^2: Porting (old) code to something else by Tux
in thread Porting (old) code to something else by Tux
