Becky has asked for the wisdom of the Perl Monks concerning the following question:

I've got a Perl program which reads a file, parses it and inserts the relevant information into an Access database. This all works fine except that Access doesn't like newline characters and replaces them all with a little box. This makes my data impossible to read in Access. On the advice of a colleague I tried substituting as follows:

if (!$input =~ /\r\n/){ $input =~ s/\n/\r\n/gi; }

but with no luck. Any other ideas?

Re: formatting newline characters
by princepawn (Parson) on Nov 05, 2002 at 15:40 UTC
    Perl is smart about \n: its value changes to whatever is used as the line ending on your operating system.

    There is no need for an if-then: just try the substitution, and if there is nothing to substitute it won't change anything.

    But more than likely you can just chomp $input.
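As a rough sketch of the unconditional substitution suggested above: the helper name to_crlf is made up for illustration, and the explicit octal escapes are my own choice so the result doesn't depend on what "\n" means on a given platform. The lookbehind leaves existing CRLF pairs alone, so no guard is needed. (As an aside, !$input =~ /\r\n/ parses as (!$input) =~ /\r\n/, because ! binds tighter than =~; $input !~ /\r\n/ is what was probably meant.)

```perl
use strict;
use warnings;

# Hypothetical helper: turn every bare LF into CRLF.
# The negative lookbehind skips LFs that already follow a CR,
# so running it on CRLF data (or data with no newlines) changes nothing.
sub to_crlf {
    my ($text) = @_;
    $text =~ s/(?<!\015)\012/\015\012/g;
    return $text;
}

my $input = "line one\012line two\015\012line three";
my $fixed = to_crlf($input);
```

Calling to_crlf a second time on $fixed should leave it unchanged, which is the reason the if-then can be dropped.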

Re: formatting newline characters
by fglock (Vicar) on Nov 05, 2002 at 15:59 UTC

    If your problem is to remove all newlines you may try this:

    $input =~ s/[\r\n]//g;
      No, the problem is that I really need to keep them but as newlines - Access keeps effectively ignoring them by converting them to boxes. The reason for the substitution was that I was told that Access might prefer \r\n to \n, but this doesn't seem to be the case. Thanks anyway.

        It's true that Access prefers \r\n, but only if you're on Unix. On Windows, Access prefers \n. The reason for the confusion is that \n isn't really a character; it's a LOGICAL newline. From perlport:

        Perl uses "\n" to represent the "logical" newline, where what is logical may depend on the platform in use. In MacPerl, "\n" always means "\015". In DOSish perls, "\n" usually means "\012", but when accessing a file in "text" mode, STDIO translates it to (or from) "\015\012", depending on whether you're reading or writing. Unix does the same thing on ttys in canonical mode. "\015\012" is commonly referred to as CRLF.
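To make the quoted passage concrete, here is a small sketch that spells the bytes out with octal escapes instead of relying on "\n". The variable names are just for illustration; the point is that "\015" and "\012" are fixed values, while "\n" is whatever the platform says it is.

```perl
use strict;
use warnings;

# Explicit octal escapes always name the same bytes on every platform.
my $CR   = "\015";        # carriage return
my $LF   = "\012";        # line feed
my $CRLF = "\015\012";    # the DOS/Windows line ending

printf "CR is %d, LF is %d, CRLF is %d bytes long\n",
    ord($CR), ord($LF), length($CRLF);

# The logical newline, by contrast, is platform-dependent.
printf "\"\\n\" on this perl is %d\n", ord("\n");
```

The first line of output should read "CR is 13, LF is 10, CRLF is 2 bytes long"; the second depends on the perl you run it with.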

        So in order to convert Unix newlines to DOS newlines, you need to ADD a CR before the LF at the end of each line:

        my $CR = "\015"; while (<>) { s/(?<!\015)\012/$CR\012/g; print; }

        -- Dan

        You can add the s flag to your regular expression, though note that it only makes . match newlines; \n matches a newline with or without it, and the g flag already replaces every occurrence.
        $input =~ s/\n/\r\n/gs;
        Also, you might try some sanity checks:
        print "contains CR's!\n" if $input=~/\r/s; print "contains NL's!\n" if $input=~/\n/s; ...
        Good luck!
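The sanity checks above could be taken a step further by counting each kind of line ending and making the control characters visible. This is only a sketch: count_line_endings and show_controls are hypothetical helper names, and the sample string stands in for whatever $input really holds.

```perl
use strict;
use warnings;

# Hypothetical helper: count CRLF pairs, bare CRs and bare LFs in a string.
sub count_line_endings {
    my ($text) = @_;
    # "my $n = () = ..." counts the matches returned in list context.
    my $crlf = () = $text =~ /\015\012/g;
    my $cr   = () = $text =~ /\015(?!\012)/g;
    my $lf   = () = $text =~ /(?<!\015)\012/g;
    return (crlf => $crlf, bare_cr => $cr, bare_lf => $lf);
}

# Hypothetical helper: replace CR and LF with visible markers.
sub show_controls {
    my ($text) = @_;
    $text =~ s/\015/<CR>/g;
    $text =~ s/\012/<LF>/g;
    return $text;
}

my $input = "one\012two\015\012three";
my %n = count_line_endings($input);
print "CRLF: $n{crlf}, bare CR: $n{bare_cr}, bare LF: $n{bare_lf}\n";
print show_controls($input), "\n";
```

Running this on the real data just before the database insert should show whether the substitution actually took effect.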
        Access uses a font (really a control method) in which \n (really, any control character) looks like a box. So Access stores your \n correctly; there is just no way to show two lines, which is evidently what you want.