aixmike has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, can someone point me in the right direction? I have a stream of text (my sample is 4 MB long) that I need to read in and then write to a separate file with a maximum line length of 80 characters. TIA

Replies are listed 'Best First'.
Re: 80 characters long
by ikegami (Patriarch) on Sep 27, 2007 at 18:09 UTC
    What do you want to do with long lines? Wrap them? Text::Wrap.
      Not wrap them, but write them to separate lines (carriage return at column 81). Please, if what I am asking does not make sense, let me know and I will try to explain it another way.

        That's what wrapping means.
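For reference, a minimal sketch of the Text::Wrap route, assuming word-boundary wrapping is acceptable. The sample text is made up; the one detail worth noting is that Text::Wrap produces lines of at most $Text::Wrap::columns - 1 characters, so the setting below is 81 rather than 80.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Text::Wrap emits lines of at most $columns - 1 characters,
# so 81 gives an 80-character limit.
$Text::Wrap::columns = 81;

# Made-up sample input: 40 short words on one long line.
my $text    = join ' ', ('lorem') x 40;
my $wrapped = wrap('', '', $text);   # no indent on first or later lines

print $wrapped, "\n";
```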

        ...Although Text::Wrap will only break on a word boundary. Did you want to break at column 80 unconditionally, even if it's in the middle of a word?

        $text =~ s/(.{80})(?!$)/$1\n/gm;
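A small sketch of that substitution wrapped in a sub, in case it helps to see it on real input. The name hard_wrap and the width of 10 are assumptions chosen for illustration; for the original question the width would be 80 and $text would hold the slurped file (4 MB fits in memory comfortably).

```perl
use strict;
use warnings;

# hard_wrap is a hypothetical helper name, not something from the thread.
sub hard_wrap {
    my ($text, $width) = @_;
    # Insert a newline after every $width characters, unless that
    # position is already the end of a line (the (?!$) guard with /m).
    $text =~ s/(.{$width})(?!$)/$1\n/gm;
    return $text;
}

# Made-up sample; width 10 is easier to eyeball than 80.
my $sample = "abcdefghijklm\nABCDEFGHIJKLM\n";
print hard_wrap($sample, 10);
```

Lines that are already exactly the target width are left alone, because the lookahead refuses to add a second newline in front of an existing one.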
Re: 80 characters long
by swampyankee (Parson) on Sep 27, 2007 at 21:24 UTC

    Sounds like this guy has to write stuff to make an IBM mainframe or other hunk of old iron happy (been there, done that).

    What you could try (with no guarantees of success ;-) ) is to set local $/ = 80;. This will read the file in chunks of 80 bytes (note that the docs say bytes, but beware that "byte" and "character" are sometimes used interchangeably). You could also try working with sysread, which has an option to read a specific number of bytes.

    I'm sure somebody will (or has) come up with a way to do this in a single line with a regex, but my regex mojo is not in that realm.


    emc


      $/ doesn't help. (And don't forget the RHS needs to be a reference: local $/ = \80;)

      local $/ = \10;  # Easier to visualize than 80.
      print "$_\n" while <DATA>;
      __DATA__
      abcdefghijklm
      ABCDEFGHIJKLM

      outputs

      abcdefghij
      klm
      ABCDEF
      GHIJKLM

      instead of

      abcdefghij
      klm
      ABCDEFGHIJ
      KLM

      read (or unbuffered sysread) can do it with some coaxing.

      my $wrap_len = 10;
      my $buf = '';
      LOOP: for (;;) {
          while (length($buf) < $wrap_len+1) {
              my $rv = read(DATA, $buf, $wrap_len+1-length($buf), length($buf));
              die if not defined $rv;
              last LOOP if not $rv;
          }
          ($buf =~ s/^(.*)\n// || $buf =~ s/^(.{0,$wrap_len})//)
              and print("$1\n");
      }
      print($buf);
      __DATA__
      abcdefghijklm
      ABCDEFGHIJKLM

      The +1 avoids adding a \n before another \n.
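To see the effect of the +1, here is a sketch of the same loop as a sub taking arbitrary filehandles. The name rewrap_fh and the in-memory filehandles are assumptions for illustration only; the sample input deliberately starts with a line that is exactly as long as the width.

```perl
use strict;
use warnings;

# rewrap_fh is a hypothetical name; the loop follows the read-based reply above.
sub rewrap_fh {
    my ($in, $out, $width) = @_;
    my $buf = '';
    LOOP: for (;;) {
        # Keep one character more than the width in the buffer, so a
        # newline sitting right after a full-width line is seen in time.
        while (length($buf) < $width + 1) {
            my $rv = read($in, $buf, $width + 1 - length($buf), length($buf));
            die "read failed: $!" if not defined $rv;
            last LOOP if not $rv;   # end of input
        }
        # Emit an existing line if the buffer starts with one,
        # otherwise emit a full-width chunk.
        ($buf =~ s/^(.*)\n// || $buf =~ s/^(.{0,$width})//)
            and print {$out} "$1\n";
    }
    print {$out} $buf;   # whatever is left at end of input
}

# Made-up sample: first line is exactly 10 characters wide.
my $input  = "abcdefghij\nklmnopqrstuvw\n";
my $output = '';
open my $in,  '<', \$input  or die $!;
open my $out, '>', \$output or die $!;
rewrap_fh($in, $out, 10);
close $out;
print $output;
```

With only $width characters buffered, the first line would be printed with its own newline and the original newline would then show up as an empty line.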

Re: 80 characters long
by Anonymous Monk on Sep 29, 2007 at 03:19 UTC
    Perhaps something like
    perl -pe 's/(.{80})/$1\n/g'
    ?
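One caveat with that one-liner, sketched below with a width of 10 instead of 80: a line that is exactly one width long gets a newline inserted in front of its existing newline, which leaves a blank line in the output. Requiring at least one more character after the chunk, here with (?=.), is one possible guard; the (?!$) lookahead with /m from the earlier reply is another.

```perl
use strict;
use warnings;

# Exactly one full-width line (width 10 here for readability).
my $line = ('x' x 10) . "\n";

# The one-liner's substitution, unchanged apart from the width.
(my $plain = $line) =~ s/(.{10})/$1\n/g;

# Variant that only breaks when more text follows on the same line.
(my $guard = $line) =~ s/(.{10})(?=.)/$1\n/g;

printf "plain adds %d newline(s)\n", ($plain =~ tr/\n//) - 1;
printf "guard adds %d newline(s)\n", ($guard =~ tr/\n//) - 1;
```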