sysread failure

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I seem to be making a mess of using sysread. I haven't been using Perl long, so I thought I'd write a simple routine that manipulates a file. The file is tab-delimited with 8 columns:

field1 field2 field3 field4 field5 field6 field7 field8

field6 and field7 are dates in this format, e.g.

Aug 11 2010 1:40PM

All I want to do is get this file into an array so I can manipulate it. I tried doing this:

local *IN;
open(IN, "/var/tmp/myfile.dat") or die "Can't read /var/tmp/myfile.dat: $! \n";

binmode(IN);

my $buf_e = '';
my $buf_d = '';

my $BLOCK_SIZE = 8192;
my @recs;

# Shove the contents of file into an array

while (sysread(IN, $buf_e, $BLOCK_SIZE, length($buf_e))) {

   push(@recs, [ $1, $2, $3, $4, $5, $6, $7, $8 ])
      while ($buf_d =~ s/^(\S+)\t+(\S+)\t+(\S+)\t+(\S+)\t+(\S+)\t+(\S+)\t+(\S+)\t+(\S+)\n//s);

}

close(IN);

print @recs;

But it's not working. No output is produced and I can't figure out why. Any help would be gratefully received.


Re: sysread failure by Fletch (Bishop) on Aug 11, 2010 at 13:26 UTC
I'm going to go out on a limb and guess it's because you read into `$buf_e` and then try to parse things from `$buf_d`. That aside, if the file's columns are separated by tabs, then reading line by line and using Text::CSV_XS or even `split` would make more sense. Using `sysread` for line-oriented input isn't the most obvious solution.
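As a rough sketch of the line-by-line approach suggested here, assuming the 8-column tab-delimited layout from the question: the sample rows and the `parse_tsv` helper name are made up for illustration, and the data is written to a temporary file only so the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a tab-delimited file into an array of array refs.
sub parse_tsv {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Can't read $path: $!\n";
    my @recs;
    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;    # skip blank lines
        # Split on single tabs; the -1 limit keeps trailing empty fields.
        push @recs, [ split /\t/, $line, -1 ];
    }
    close($in);
    return @recs;
}

# Made-up sample data in the layout described in the question.
my ($fh, $tmp) = tempfile(UNLINK => 1);
print {$fh} join("\t", 'a', 'b', 'c', 'd', 'e',
                 'Aug 11 2010 1:40PM', 'Aug 12 2010 9:05AM', 'h'), "\n";
print {$fh} join("\t", 1 .. 8), "\n";
close($fh);

my @recs = parse_tsv($tmp);
print join('|', @$_), "\n" for @recs;    # one record per line
```

Splitting on `/\t/` also copes with the date columns, which contain spaces and so would not be matched by `\S+` in the original regex. Note that `print @recs` on an array of array refs prints reference addresses, hence the explicit `join` here.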
Re: sysread failure by talexb (Chancellor) on Aug 11, 2010 at 13:57 UTC
To follow up on what brother Fletch said, `sysread` is really only something you'll need to use when munging a binary file. For any line-oriented file, just open the file with `open`, read it with the diamond (readline) operator, e.g. `while (<$fh>) { ... }`, and then `close` it, of course. And don't worry that you have to read 8K blocks to get decent throughput: the part of Perl that reads lines from files has had (from what I've heard) a great deal of attention paid to it over the last twenty or so years, and is probably about as fast as it could possibly be. Also, seeing a repeated pattern in a regular expression should be a sign that there's probably a better way. Again, brother Fletch has suggested `split`. Try it, I think you'll like it.
Re^2: sysread failure by ikegami (Patriarch) on Aug 11, 2010 at 15:35 UTC
"`sysread` is really only something you'll need to use when munging a binary file."
`read` is really only something you'll need to use when munging a binary file. `sysread` is only needed when you need unbuffered I/O (e.g. when you use `select`) or when you want partial reads from pipes and sockets.
"the part of Perl that reads lines from files has had (from what I've heard) a great deal of attention paid to it over the last twenty or so years"
Actually, I heard it's quite slow, in part due to the minuscule 4k buffer. That said, I'm not claiming that reading 8k chunks and breaking them down into lines on the user side is any faster.
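For the cases where `sysread` really is wanted, here is a minimal sketch of the block-reading loop from the question, with the data appended to and parsed out of the same buffer. It is an illustration under stated assumptions rather than a recommendation: the sample file, the single `$buf` variable, and the use of `split` in place of the original regex are all choices made for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Made-up sample file: three tab-delimited rows of eight fields each.
my ($fh, $tmp) = tempfile(UNLINK => 1);
print {$fh} join("\t", 1 .. 8), "\n" for 1 .. 3;
close($fh);

open(my $in, '<', $tmp) or die "Can't read $tmp: $!\n";
binmode($in);

my $BLOCK_SIZE = 8192;
my $buf        = '';
my @recs;

while (1) {
    # Append the next block after any partial line still held in $buf.
    my $n = sysread($in, $buf, $BLOCK_SIZE, length($buf));
    die "sysread failed: $!\n" unless defined $n;
    last if $n == 0;    # end of file

    # Strip each complete line from the front of the same buffer.
    while ($buf =~ s/^([^\n]*)\n//) {
        push @recs, [ split /\t/, $1, -1 ];
    }
}

# A final line without a trailing newline would still be sitting in $buf.
push @recs, [ split /\t/, $buf, -1 ] if length $buf;
close($in);

print scalar(@recs), " records\n";
```

Checking `defined $n` separately from `$n == 0` distinguishes a read error from end of file, which the bare `while (sysread(...))` form cannot do.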
Re^2: sysread failure by Anonymous Monk on Aug 11, 2010 at 14:04 UTC
"sysread is really only something you'll need to use when munging a binary file."
That is not a reason to use sysopen.
Related threads: open vs. sysopen; Why does File::Temp use sysopen?; sysopen vs. open; open vs sysopen
Re^3: sysread failure by ikegami (Patriarch) on Aug 11, 2010 at 15:30 UTC
sysread ne sysopen