in reply to Suggestions for optimizing this code...

First off: I’d simply switch to a less bozotic provider as soon as possible.

That said, for the principle of it,

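    use Fcntl;    # provides O_RDWR and O_CREAT

    my $rec;
    sysopen my $fh, "test.txt", O_RDWR | O_CREAT or die "$!";
    defined( sysread $fh, $rec, -s $fh ) or die "$!";

    my $offs = 0;
    while( $offs < length $rec ) {
        my $next_eol_offs = index $rec, "\n", $offs;
        # no trailing newline on the last line: take everything to the end
        $next_eol_offs = length( $rec ) - 1 if $next_eol_offs == -1;
        my $str = substr $rec, $offs, $next_eol_offs - $offs + 1;

        # work on $str; note that it includes the newline

        substr( $rec, $offs, $next_eol_offs - $offs + 1 ) = $str;
        $offs += length $str;
    }

    # sysseek rather than seek, since the handle is used with sysread/syswrite
    sysseek $fh, 0, 0 or die "$!";
    defined( syswrite $fh, $rec ) or die "$!";
    # drop any stale tail in case the work made the data shorter
    truncate $fh, length $rec or die "$!";
    close $fh or die "$!";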

Note that there are error checks in here that your own code did not include.

This will execute as few “I/O processes” as possible and consume as little memory as possible (well, it could consume a tiny bit less if you use substr as an lvalue instead of making a copy, but that is fraught with bugs), but at the cost of stupidly high CPU consumption and convoluted code. Doing it in a more natural way would consume minimal CPU and memory resources and do no more I/O than this way does, only it would stretch the I/O over more “I/O processes.”
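
For contrast, a rough sketch of what the more natural way could look like, assuming the work is a per-line transformation. The name `rewrite_lines` and its callback argument are made up for illustration; File::Temp and File::Basename are core modules. It reads and writes line by line through buffered handles and swaps the result into place with a rename.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: rewrite $path line by line, passing each line
# through the $transform callback.
sub rewrite_lines {
    my ( $path, $transform ) = @_;

    open my $in, '<', $path or die "Cannot read $path: $!";

    # Temp file in the same directory, so the final rename stays on one filesystem.
    my ( $out, $tmp ) = tempfile( DIR => dirname( $path ), UNLINK => 0 );

    while( my $line = <$in> ) {
        # $line includes the trailing newline, just as $str does above
        print {$out} $transform->( $line ) or die "Cannot write $tmp: $!";
    }

    close $in  or die "Cannot close $path: $!";
    close $out or die "Cannot close $tmp: $!";
    rename $tmp, $path or die "Cannot rename $tmp to $path: $!";
    return;
}
```

It would be called along the lines of `rewrite_lines( "test.txt", sub { uc $_[0] } )`. Note that the replacement file gets File::Temp's restrictive default permissions, so a real version may want to chmod it afterwards.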

I can’t imagine why any hoster would think forcing their customers to burn a ton of CPU is a good idea, unless either their tech dep’t is clueless (their use of the term “I/O process” makes me inclined to assume this) or their storage subsystem is seriously under-budgeted, so they’re forcing their customers to rewrite their code in a harder-to-maintain fashion to evade the I/O bottleneck by trading CPU time for I/O. (But even that is a far-fetched explanation, and I’m not sure whether doing the same amount of I/O in fewer syscalls would be any help. I vote “clueless.”)

Roughly.

Whatever the case, I’d run away from them instead of making my code a damn sight unreadable.

Update: fixed code per BrowserUk’s reply below.
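
For anyone wondering how a stray comma produces the error quoted in the reply below: with a comma instead of a semicolon after the syswrite arguments, the following close call is parsed as syswrite's length argument. The snippet below is only a small demonstration of that parse, using a scratch file from File::Temp rather than the original test.txt; the exact error text depends on the platform.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed automatically at exit.
my ( $fh, $path ) = tempfile( UNLINK => 1 );
my $rec = "some data\n";

# With a comma, close() becomes syswrite's third (length) argument:
#     syswrite( $fh, $rec, close( $fh ) )
# The arguments are evaluated before syswrite itself runs, so the handle
# is already closed by the time the write is attempted.
my $written = syswrite $fh, $rec, close $fh;

if( defined $written ) {
    print "wrote $written bytes\n";
}
else {
    print "syswrite failed: $!\n";
}
```

With warnings enabled this should also emit a "syswrite() on closed filehandle" warning, and an `or die "$!"` tacked onto the end would then fire on syswrite's failure rather than on close's.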

Makeshifts last the longest.

Re^2: Suggestions for optimizing this code...
by BrowserUk (Patriarch) on Jan 17, 2006 at 07:34 UTC

    Using your code above as is, except for adding a shebang line and using $ARGV[0] for the filename, it crunches with

    P:\test>523624-a 1000000.dat
    syswrite() on closed filehandle $fh at P:\test\523624-a.pl line 26.
    Bad file descriptor at P:\test\523624-a.pl line 26.

    Which I do not understand at all. Any thoughts?

    Update: Ok. I couldn't see it for looking, but you have a comma instead of a semicolon on the syswrite line.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.