It is more suited for the task, takes a bit less memory, and might even be faster if your stdio is stoned.

    sub BLOCKLENGTH () { 1 << 12 } # TIMTOWTDI =)
    $_ = '';
    while (sysread STDIN, $_, BLOCKLENGTH, length) {
        s///g; # you know
        # the fourth argument to substr will replace 0 .. -BLOCKLENGTH
        syswrite STDOUT, substr($_, 0, -BLOCKLENGTH, '');
    }
    syswrite STDOUT, $_;
Personally, I think that perl -ne 'BEGIN{ $\ = "\n"; $/ = "tag" } chomp; s/tag2/\t/g; print' < infile > outfile is the nicest way.
Update: I thought some explanation was appropriate.
The notion of what a line is is pretty flexible, and has to be (in computers generally, and in perl specifically). A line, traditionally, ended in a carriage return and a line feed. Windoze still uses that pair (CRLF). Classic MacOS uses only CRs, and UNIX-likes use only LFs. The one byte solution is somewhat simpler. But since you need to support two bytes in case they come, why not support everything?
Enter the concept of a record.
Treating a line as a record, with either a fixed length ($/ = \123), or one ending with a certain string ($/ = "\n" is for a record which is also a line on your native system), adds the flexibility to do something like you wanted quite easily. You're translating a record format that ends in a certain string into one that ends with newlines. $/ is the input record separator and $\ is the output record separator, BTW.
In reply to Re: Re: Re: search/replace very large file w/o linebreaks
by nothingmuch
in thread search/replace very large file w/o linebreaks
by Anonymous Monk