Re: split and sysread()

Well, $0 is the name of the program, and since your while loop has a regex as its condition, and that regex has just 2 sets of parentheses, at most $1 and $2 will be set.
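To make the point about $0 versus the capture variables concrete, here is a minimal sketch. The sample line and the two-field pattern are made up for illustration; they are not the original poster's data or regex.

```perl
use strict;
use warnings;

# Hypothetical two-field record; the real data is not shown in the thread.
my $line = "alpha|beta";

my @captured;
if ($line =~ /^([^|]*)[|]([^|]*)$/) {
    # Two sets of parentheses, so only $1 and $2 are set.
    @captured = ($1, $2);
}

# $0 is unrelated to the match: it holds the name of the running script.
print "program name: $0\n";
print "captures: @captured\n";
```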

The code doesn't make it clear to me at all that reading in fixed-length buffers is the right approach. Why not read in one line at a time, and then split each line using /[|]/ as the regex?
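A rough sketch of the line-at-a-time approach, assuming pipe-delimited records with one record per line. The sample data and the in-memory filehandle are stand-ins so the snippet runs on its own; with real input you would open the file by name instead.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real file.
my $data = "a|b|c\nd||f\n";

# Open a filehandle on the in-memory string.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @rows;
while (my $line = <$fh>) {
    chomp $line;
    # A limit of -1 keeps trailing empty fields instead of dropping them.
    my @fields = split /[|]/, $line, -1;
    push @rows, \@fields;
}
close $fh;

print join(',', @$_), "\n" for @rows;
```

Whether this is fast enough for 1.5 million lines is something only a benchmark on the real data can tell.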

Abigail


Re: Re: split and sysread() by relaxed137 (Acolyte) on Apr 18, 2003 at 23:54 UTC
I'm sorry that I didn't make myself clear. The regex above only uses $1 and $2 because I deleted the references up to $31. Reading in one line at a time with perl just takes too long (1.5 million lines, 14 to 20 minutes per file), and awk can do it in about 2 minutes. I'm trying to get the 14 minutes down as close to 2 minutes as possible.
Re: split and sysread() by Abigail-II (Bishop) on Apr 19, 2003 at 12:03 UTC
Well, to speed up your regex as much as possible, you must write it so that it leaves as little room for backtracking as possible. Try something like: `my $r = join "[\|]" => ("([^\|]*)") x 31; while (/^$r\n/mg) { ... }` This (untested) code sets `$1` through `$31`. Abigail
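Here is a scaled-down sketch of the same idea that runs on its own, with 3 fields instead of 31 and a made-up buffer. One deliberate difference from the snippet above: the character class also excludes the newline (`[^|\n]`), an assumption added here so that a field can never run across a line boundary.

```perl
use strict;
use warnings;

# 3 fields for illustration; the thread's data has 31.
my $n = 3;

# Each field is "anything but a pipe or newline", so there is only one
# way for the regex to match a line and little room for backtracking.
my $r = join '[|]', ('([^|\n]*)') x $n;

# Hypothetical buffer holding several complete lines.
my $buf = "a|b|c\nd||f\n";

my @rows;
while ($buf =~ /^$r\n/mg) {
    # With $n captures, $1 through $3 are set on every iteration.
    push @rows, [$1, $2, $3];
}

print join(',', @$_), "\n" for @rows;
```

If the buffer comes from fixed-size reads, the last line may be cut off partway through. It will not match, and would have to be carried over to the next read.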