in reply to split and sysread()

Well, $0 is the name of the program, and since your while loop has a regex as its condition, and that regex has only two sets of parentheses, at most $1 and $2 will be set.

The code doesn't make it clear to me at all that reading in fixed-length buffers is the right approach. Why not read in one line at a time, and then split each line using /[|]/ as the regex?
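
A minimal sketch of that line-at-a-time approach. The sample data and the in-memory filehandle are assumptions made only so the snippet runs on its own; a real script would open the actual file instead.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real pipe-delimited file.
my $data = "a|b|c\nd||f\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @rows;
while (my $line = <$fh>) {
    chomp $line;
    # A limit of -1 makes split keep trailing empty fields.
    my @fields = split /[|]/, $line, -1;
    push @rows, \@fields;
}
close $fh;

# Show the field count and contents of each line.
printf "%d fields: %s\n", scalar @$_, join(',', @$_) for @rows;
```

Without the -1 limit, split drops trailing empty fields, which matters if the last columns of a record can be empty.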

Abigail

Re: Re: split and sysread()
by relaxed137 (Acolyte) on Apr 18, 2003 at 23:54 UTC
    I'm sorry that I didn't make myself clear. The regex above only shows $1 and $2 because I deleted the references up to $31. Reading in one line at a time with Perl just takes too long (1.5 million lines, 14 to 20 minutes per file), while awk can do it in about 2 minutes. I'm trying to get the 14 minutes down as close to ~2 minutes as possible.
      Well, to speed up your regex as much as possible, you must make it so that there is as little possibility for backtracking as possible. Try something like:
      my $r = join "[|]" => ("([^|]*)") x 31;
      while (/^$r\n/mg) { ... }

      This (untested) code sets $1 through $31.
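
      For illustration, here is a small runnable sketch of the same pattern with 3 fields instead of 31. The buffer contents and the variable names $n, $buf and @rows are assumptions, not part of the original code.

```perl
use strict;
use warnings;

my $n = 3;    # the thread uses 31; 3 keeps the sketch short
my $r = join "[|]" => ("([^|]*)") x $n;

# Hypothetical buffer holding several complete lines.
my $buf = "a|b|c\nd||f\n";

my @rows;
# /m lets ^ match at the start of every line; /g in scalar context
# resumes after the previous match on each iteration.
while ($buf =~ /^$r\n/mg) {
    push @rows, [$1, $2, $3];
}

print join(',', @$_), "\n" for @rows;
```

      Note that [^|] also matches a newline, so a line with too few fields could make a match run on into the next line. If that is a concern, [^|\n] is a possible tightening.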

      Abigail