in reply to Re: Can't get \n or other character/translation escapes to interpolate if originally read from a data file

So, to be able to search for this 2-line string in a document that uses Unix-type line endings:

Reach Holly Smith for help by sending an email
to hollysmith@nosuchdomain.com.

.... the character-specific way to do it (rather than using \n) would have been to store the “old string”~“new string” line in a __DATA__ block, like this, using the Unix-type newline character (code point 10, or 0A in hexadecimal):

__DATA__
Reach Holly Smith for help by sending an email\x{0A}to hollysmith@nosuchdomain.com.~For more information, contact Holly Smith.

...where the preceding line is a single line even though it almost certainly will wrap on this web page; is that what you mean?

I think I’m coming to see the point of being able to use \n in a regular expression: it avoids the need to specify which characters represent a newline in the operating system used to write and read the file, and it’s a memorable way to avoid having to look up the code point for those characters. Plus, manually entering a newline at the keyboard while creating a line-by-line data file would split the entry across two lines and gunk up the use of

while ( <DATA> )
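
A minimal sketch of that idea, assuming each data line is split on the ~ separator. The in-memory filehandle $fh is a stand-in introduced here for <DATA>, and the variable names are illustrative. The regex compiler itself interprets the \x{0A} escape held in the pattern variable, so the six literal characters from the data match a real newline:

```perl
use strict;
use warnings;

# Document text with a Unix newline between the two lines.
my $doc = "Reach Holly Smith for help by sending an email\n"
        . "to hollysmith\@nosuchdomain.com.\n";

# Stand-in for the __DATA__ block: single quotes keep \x{0A} as six
# literal characters, exactly as it would arrive from a file.
my $data = 'Reach Holly Smith for help by sending an email\x{0A}to hollysmith@nosuchdomain.com.~For more information, contact Holly Smith.' . "\n";

open my $fh, '<', \$data or die "opening in-memory data: $!";
while ( my $line = <$fh> ) {
    chomp $line;
    my ( $old, $new ) = split /~/, $line, 2;
    # $old is compiled as a regex, so \x{0A} matches code point 10.
    $doc =~ s/$old/$new/;
}
close $fh;

print $doc;
```

Because $old is treated as a pattern, the dots in it match any character; wrapping it in \Q...\E would make them literal but would also disable the \x{0A} escape.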

Re^3: Can't get \n or other character/translation escapes to interpolate if originally read from a data file
by AnomalousMonk (Archbishop) on Mar 16, 2021 at 21:58 UTC

    Update: davebaker changed this post (without citation!) while I was composing this reply.


    ... a data block or file that contains “old string”~“new string” lines:

    __DATA__
    ... an email\x{0A}to ....~For more information, ....
    (Is that what you mean?)

    No. (Well, at least that's not the point I would make. :)

    The point I would make is that the string you read from a __DATA__ or __END__ block or from a regular file is essentially the same as a single-quoted string defined in a script, and such a string can be used directly as a regex search pattern:

    Win8 Strawberry 5.8.9.5 (32) Tue 03/16/2021 17:26:59
    C:\@Work\Perl\monks\davebaker
    >perl -Mstrict -Mwarnings
    my $s = 'foo
    bar';
    print "A: >>$s<< \n";

    my $search = 'foo\nbar';  # note single quotes!
    print "B: >>$search<< \n";  # \n is '\n'

    my $replace = "hoo-ray";  # can be single/double quotes

    $s =~ s/$search/$replace/;  # no /g - one replacement only
    print "C: >>$s<< \n";
    ^Z
    A: >>foo
    bar<<
    B: >>foo\nbar<<
    C: >>hoo-ray<<

    If the search string/pattern is held in a file, the process is similar, except you usually need to chomp the string before you use it:

    Win8 Strawberry 5.8.9.5 (32) Tue 03/16/2021 17:28:26
    C:\@Work\Perl\monks\davebaker
    >type search.dat
    foo\nbar

    >perl -Mstrict -Mwarnings
    my $s = 'foo
    bar';
    print "A: >>$s<< \n";

    open my $fh, '<', 'search.dat' or die "opening: $!";
    chomp(my $search = <$fh>);
    print "B: >>$search<< \n";  # \n is essentially '\n'

    my $replace = "hoo-ray";

    $s =~ s/$search/$replace/;
    print "C: >>$s<< \n";
    ^Z
    A: >>foo
    bar<<
    B: >>foo\nbar<<
    C: >>hoo-ray<<

    I think that if you use '\n' (or the equivalent from a file) in a regex search pattern and if you use default I/O for reading all your files, then you will be able to do automatic text editing in an OS-agnostic way, at least across the Windows/*nix iron curtain. The '\n' sequence in a regex is the universal representation of a default newline.
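
One related point, sketched here under the assumption that the replacement text also comes from a file: the regex compiler interprets \n only on the search side. A replacement string is inserted verbatim, so a literal backslash-n there stays two characters unless it is translated explicitly. The %escape table and the variable names are hypothetical, chosen for illustration:

```perl
use strict;
use warnings;

my $s       = "foo\nbar";     # real newline between foo and bar
my $search  = 'foo\nbar';     # backslash + n, as read from a file
my $replace = 'hoo\nray';     # backslash + n, as read from a file

# Search side: the regex compiler turns \n into a newline match.
# Replacement side: $replace is inserted as-is, backslash included.
( my $raw = $s ) =~ s/$search/$replace/;

# Translate a small, explicit set of escapes in the replacement first.
my %escape = ( n => "\n", t => "\t", '\\' => '\\' );
( my $cooked = $replace ) =~ s/\\([nt\\])/$escape{$1}/g;
( my $fixed = $s ) =~ s/$search/$cooked/;

print "raw:   >>$raw<<\n";
print "fixed: >>$fixed<<\n";
```

An explicit lookup table keeps the translation limited to the escapes you choose to support, which is safer than evaluating file contents as Perl code.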

    (In general, I think use of qr// is definitely best practice for defining search regexes in a script, not single- or double-quoted strings, but if you're reading from a file, you're kinda stuck with what you've got.)
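
A short sketch of the qr// approach, with illustrative variable names. A pattern written in the script can be compiled with qr// directly, and a pattern string that arrived from a file can be wrapped in qr// once before use:

```perl
use strict;
use warnings;

# Pattern defined in the script: qr// compiles it once and keeps
# flags such as /x attached to the pattern itself.
my $rx = qr/ foo \n bar /x;

# A pattern arriving as a plain string (e.g. chomped from a file)
# can still be wrapped in qr// before use.
my $from_file = 'foo\nbar';
my $rx_file   = qr/$from_file/;

my $text = "foo\nbar";
( my $one = $text ) =~ s/$rx/hoo-ray/;
( my $two = $text ) =~ s/$rx_file/hoo-ray/;

print "$one $two\n";
```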


    Give a man a fish:  <%-{-{-{-<

Re^3: Can't get \n or other character/translation escapes to interpolate if originally read from a data file
by LanX (Saint) on Mar 16, 2021 at 13:03 UTC
    I'm not sure I understand your question.

    If you want OS-sensitive line breaks in DATA you'll need to translate them.

    I'd suggest using plain "enter" in the input and

    chomp(my @lines = <DATA>);  # drop each line's own newline first
    $output = join "\n", @lines;

    In case of OS problems when reading DATA by line, just adjust $/ before. (Never happened to me *)

    > Is that what you mean?

    I suggested using HERE-docs instead of DATA. They are interpolated by default.
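
A minimal sketch of the difference, with illustrative variable names: a here-doc with a double-quoted (or bare) terminator interpolates escapes such as \n, while a single-quoted terminator leaves them as literal characters, much like text read from DATA:

```perl
use strict;
use warnings;

# Double-quoted terminator: \n becomes a real newline.
my $interpolated = <<"END";
first\nsecond
END

# Single-quoted terminator: \n stays as backslash + n.
my $literal = <<'END';
first\nsecond
END

print "interpolated length: ", length($interpolated), "\n";
print "literal length:      ", length($literal), "\n";
```

Note that an interpolating here-doc also interpolates $ and @, so an email address such as the one in the example above would need its @ escaped as \@.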

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

    update

    *) Actually, this problem can't arise, because Perl reads DATA like its own code; it's the same filehandle. I.e., the script wouldn't run if there were any problems with OS-specific line endings.