Hmm. While I am not suprised about the P::RD solution that I wrote being the slowest, I would like to point a couple of things out. Most of these solutions are buggy or not appropriate for the data set you are using in some way or another (I can feel the venom in BrowserUks eyes already ;-)

And by buggy I mean the following: I dont believe that you can consider a parser correct if it accepts something other than the specification it was intended to parse. Ie, I consider a parser to work correctly if and only if it parses a language specification correctly and nothing more.

Anyway heres my analysis of the different solutions. Additional stuff is welcome.

So first off I would say that when you test code you should test both positive and negative cases. Second off I can't feel bad about having the slowest solution as it quite frankly is the only one that correctly parses or rejects the data it recieves.

To quote from Code Complete (28.2 Code Tuning)

(emphasis added by me.)

However I have no doubt that with a bit of hacking all of the people involved here can fix their stuff and still have it faster than the P::RD approach, but this is a good example of why proper testing of both positive and negative cases is essential. Its also a good example of why P::RD is a cool tool. You are much more likely to produce a correct (but slow) result with it than without it, and to do so in much less time than with another approach.

A last point about all of this. You are reciving packets over a network connection. So I'm guessing that just the network part takes much longer than even the slowest solution. Thus the network time is going to swamp the parse time by a great deal, and to me would suggest theres no point in optimizing this.

I think having a read of the code complete article is a worthy use of time.

Cheers,

--- demerphq
my friends call me, usually because I'm late....


In reply to Re: Re: Parsing Text into Arrays.. by demerphq
in thread Parsing Text into Arrays.. by castaway

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.