perl was extremely slow and it is wiser to use the shell.
I find that hard to believe. Perl is slower than C, but if you're dealing with string manipulation, it's one of the fastest "scripting languages" around. I wouldn't be too surprised if a small awk script could do this task a bit faster than perl - awk was written for exactly this kind of task.

Anyway, you're pitting straight C against perl in your example - I don't see any awk or shell script - and for your particular code, it's not that surprising that the C code is faster - though I'd guess (blindly) that for this kind of task a perl program running at 1/5th of the speed of a C program is attainable.

But your C code isn't equivalent to the perl code: AFAIK the C code just reads 10,000 bytes, takes the 100th field in those 10,000 bytes, and then reads the next 10,000 bytes. If I read it correctly, it'll even fail to get the 100th field on the first line if the very first field is empty (my C is rusty). Update: my C is indeed rusty: your code will work correctly if all lines are less than 10K in length and they don't start with an empty field. Your perl code, on the other hand, reads the file by line and gets the 100th field for each line. As others have stated in this thread: one of the benefits of using perl vs C is that it just takes a lot less time to code a correct program in perl vs C - and computers don't get paid by the hour :-)
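For what it's worth, here is a rough sketch of what a C version would need to do to match the perl behaviour: read whole lines regardless of length, and count empty fields instead of skipping them. This is not your code. The names `nth_field` and `sum_nth_field` are made up for illustration, the comma delimiter in the usage note is an assumption (I don't know what your file uses), and `getline` is POSIX rather than standard C.

```c
#define _POSIX_C_SOURCE 200809L   /* for getline(); POSIX, not ISO C */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy the n-th (1-based) field of `line` into `out`.
 * Empty fields count, so in ",b,c" the first field is the empty string.
 * Returns 1 on success, 0 if the line has fewer than n fields. */
int nth_field(const char *line, char delim, size_t n,
              char *out, size_t outsz)
{
    const char *start = line;
    char stop[3] = { delim, '\n', '\0' };   /* characters that end a field */
    size_t len;

    if (n == 0 || outsz == 0)
        return 0;

    /* Step over the first n-1 delimiters, one field at a time. */
    for (size_t i = 1; i < n; i++) {
        const char *next = strchr(start, delim);
        if (next == NULL)
            return 0;                       /* ran out of fields */
        start = next + 1;
    }

    len = strcspn(start, stop);
    if (len >= outsz)
        len = outsz - 1;                    /* truncate rather than overflow */
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}

/* Sum the n-th field of every line read from `in`.
 * getline() grows its buffer as needed, so line length is not capped.
 * Lines that lack the n-th field are skipped. */
double sum_nth_field(FILE *in, char delim, size_t n)
{
    char *line = NULL;
    size_t cap = 0;
    char field[64];
    double total = 0.0;

    while (getline(&line, &cap, in) != -1) {
        if (nth_field(line, delim, n, field, sizeof field))
            total += strtod(field, NULL);
    }
    free(line);
    return total;
}
```

Calling `sum_nth_field(stdin, ',', 100)` would then be the rough counterpart of the line-by-line perl loop. Even as a sketch it is a fair bit more code than the perl, which is rather the point.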


In reply to Re: fast way to split and sum an input file by Joost
in thread fast way to split and sum an input file by egunnar
