marek1703 has asked for the wisdom of the Perl Monks concerning the following question:

Hello all!

This is my first question in this forum. I am seeking wisdom, but I feel bad because the question seems obvious. Nevertheless, I need your help:

Here is a code snippet to illustrate my attempt. How do I escape special characters with \ for a LaTeX file? (These are: # $ % & ~ _ ^ \ { })

I would be grateful for any hint
Thank you!
marek

#! /usr/bin/perl

=start comment

replace special LaTeX-characters in $text!
this are: # $ % & ~ _ ^ \ { }

=end comment

=cut

use warnings;
use strict;

my $date_old = "Samstag, 25.05.2013 ";
#my @special_characters = q/# $ % & ~ _ ^ \ { }/;

while( <DATA> ) {
    if (/^"([^"]{2,})","(\d)","(.+)"$/) {
        chomp;
        my $date_time = $1;
        my $who = $2;
        my $text = $3;
#       my $first = q'&';
#       my $second = q'%';
#       my $third = q'~';
#       my $fourth = q'$';
#       my @special_characters = ($first, $second, $third, $fourth);
        my @special_characters = q{# $ % & ~ _ \\ { } ^};
        foreach my $i (@special_characters) {
            $text =~ s/$special_characters[$i]/\\$&/g;
        }
#       $text =~ s![#$%&~_}{^]!\\$&!g;
        my ($date_new) = $date_time =~ /^([a-z]+, \d\d\.\d\d\.\d{4} )/i;
#       $date_time = ($date_new eq $date_old) ? ($date_time) =~ s/$date_old// : $date_time;
        if ($date_new eq $date_old) { $date_time =~ s/$date_old// }
        $date_old = $date_new;
        $who = ($who == 1) ? '\maaki ' : '\puh ';
#       print $out $_;
        print "\\datum{$date_time}\n\n$who $text\n\n";
    }
    else {next;}
}

__DATA__
"Sonntag, 26.05.2013 - 13:13:27","0","Lieber Herr % , & text text with some %special characters"
"Sonntag, 26.05.2013 - 13:35:47","1","Sehr gerne! ^ _~}# and here some other characters"
"Sonntag, 26.05.2013 - 13:49:15","0",":-))"
"Mittwoch, 05.06.2013 - 18:12:09","0","Besten Dank, & hat prima geklappt. {Greetings!}"
"Mittwoch, 05.06.2013 - 19:16:47","1","% _ text text!"
"Mittwoch, 05.06.2013 - 19:17:30","0","~Prima! #Danke!!"
"Samstag, 08.06.2013 - 09:21:09","1","_ 10. Jul Text text %"
"Samstag, 08.06.2013 - 10:08:45","0","Ja, osaka ffm lh 741 & ffm muc lh 114 abflug 16:00"

Replies are listed 'Best First'.
Re: Escape special characters for a LaTeX file
by BrowserUk (Patriarch) on Dec 14, 2014 at 10:34 UTC

    You're working much too hard :)

    #! perl -w
    use strict;

    while( <DATA> ) {
        s!([%&_~{}^#\\])!\\$1!g;
        print;
    }

    __DATA__
    "% & _ ~ { } ^ # \ "
    "Sonntag, 26.05.2013 - 13:13:27","0","Lieber Herr % , & text text with some %special characters"
    "Sonntag, 26.05.2013 - 13:35:47","1","Sehr gerne! ^ _~}# and here some other characters"
    "Sonntag, 26.05.2013 - 13:49:15","0",":-))"
    "Mittwoch, 05.06.2013 - 18:12:09","0","Besten Dank, & hat prima geklappt. {Greetings!}"
    "Mittwoch, 05.06.2013 - 19:16:47","1","% _ text text!"
    "Mittwoch, 05.06.2013 - 19:17:30","0","~Prima! #Danke!!"
    "Samstag, 08.06.2013 - 09:21:09","1","_ 10. Jul Text text %"
    "Samstag, 08.06.2013 - 10:08:45","0","Ja, osaka ffm lh 741 & ffm muc lh 114 abflug 16:00"

    C:\test>junk
    "\% \& \_ \~ \{ \} \^ \# \\ "
    "Sonntag, 26.05.2013 - 13:13:27","0","Lieber Herr \% , \& text text with some \%special characters"
    "Sonntag, 26.05.2013 - 13:35:47","1","Sehr gerne! \^ \_\~\}\# and here some other characters"
    "Sonntag, 26.05.2013 - 13:49:15","0",":-))"
    "Mittwoch, 05.06.2013 - 18:12:09","0","Besten Dank, \& hat prima geklappt. \{Greetings!\}"
    "Mittwoch, 05.06.2013 - 19:16:47","1","\% \_ text text!"
    "Mittwoch, 05.06.2013 - 19:17:30","0","\~Prima! \#Danke!!"
    "Samstag, 08.06.2013 - 09:21:09","1","\_ 10. Jul Text text \%"
    "Samstag, 08.06.2013 - 10:08:45","0","Ja, osaka ffm lh 741 \& ffm muc lh 114 abflug 16:00"
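One caveat when the output really goes into LaTeX: \~ and \^ are accent commands and \\ is a line break, so those three characters need a replacement command rather than a plain backslash, and $ is not in the character class above. A minimal sketch, assuming the text ends up in ordinary paragraph text; the names escape_latex and %LATEX_ESCAPE are made up for illustration and do not come from the thread:

```perl
use strict;
use warnings;

# Hypothetical lookup table (not from the thread): maps each LaTeX
# special character to a replacement that prints it literally.
my %LATEX_ESCAPE = (
    '#'  => '\#',
    '$'  => '\$',
    '%'  => '\%',
    '&'  => '\&',
    '_'  => '\_',
    '{'  => '\{',
    '}'  => '\}',
    '~'  => '\textasciitilde{}',
    '^'  => '\textasciicircum{}',
    '\\' => '\textbackslash{}',
);

sub escape_latex {
    my ($text) = @_;
    # One pass over the string, so the braces and backslashes that a
    # replacement introduces are never escaped a second time.
    $text =~ s/([#\$%&_{}~^\\])/$LATEX_ESCAPE{$1}/g;
    return $text;
}

print escape_latex('50% of A & B cost $5 ~ see C:\tmp'), "\n";
```

The single substitution with a lookup table keeps BrowserUk's one-regex idea; only the replacement side changes.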

      Wow! I did not expect an answer so fast :-)) Thank you as fast as possible from my iPad Marek
Re: Escape special characters for a LaTeX file
by RichardK (Parson) on Dec 14, 2014 at 10:22 UTC

    Welcome. You didn't say what's going wrong but changing

    else{next;}

    to

    else { print "skipping : $_ :\n"; }

    will give you a clue.

      And BTW, the
      else{next;}
      statement is pretty much useless in this context, since the program would go to the next DATA record anyway.
Re: Escape special characters for a LaTeX file
by AnomalousMonk (Archbishop) on Dec 14, 2014 at 14:19 UTC

    BrowserUk's approach to your escaping problem is, of course, the way to go. However, in the hope that you may be spared some future headaches, a couple of features of the OPed code other than those already noted deserve comment.



    my @special_characters = q{# $ % & ~ _ \\ { } ^};

    This statement assigns a single element that is a string to an array (which it also creates).

    c:\@Work\Perl>perl -wMstrict -MData::Dump -le
    "my @special_characters = q{# $ % & ~ _ \\ { } ^};
     dd \@special_characters;
     print 'number of elements in array: ', scalar @special_characters;
    "
    ["# \$ % & ~ _ \\ { } ^"]
    number of elements in array: 1
    See perlintro, List value constructors in perldata.
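For comparison, here is a small sketch of one way to get one array element per character: split the same string on whitespace. The variable name @one_element is only for illustration. (qw() would also build a list, but with a '#' in it Perl warns about a possible comment in the qw() list, so split is used here.)

```perl
use strict;
use warnings;

# q{} yields ONE string, so this array gets a single element.
my @one_element = q{# $ % & ~ _ \\ { } ^};

# Splitting that string on whitespace yields one element per character.
my @special_characters = split ' ', q{# $ % & ~ _ \\ { } ^};

print 'q{} alone:  ', scalar(@one_element), "\n";
print 'with split: ', scalar(@special_characters), "\n";
```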



    foreach my $i (@special_characters) {
         $text =~ s/$special_characters[$i]/\\$&/g;
     }

    This loops over the single element of the array, a string that is not numeric (i.e., doesn't "look like" a number; by default, Perl evaluates such a string to 0), and then uses the string as an index into the array. This should have produced a nice, fat warning.

    c:\@Work\Perl>perl -wMstrict -le
    "my @special_characters = q{# $ % & ~ _ \\ { } ^};
     ;;
     foreach my $i (@special_characters) {
       print qq{>>$special_characters[$i]<<};
       }
    "
    Argument "# $ % & ~ _ \\ { } ^" isn't numeric in array element at -e line 1.
    >># $ % & ~ _ \ { } ^<<
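If the loop form is wanted anyway, a hedged sketch of how it could look: the loop variable is the element itself (not an index), and \Q...\E quotes regex metacharacters such as $ ^ { } and \. The sub name escape_each is hypothetical, and the sketch follows the thread's plain-backslash convention for every character.

```perl
use strict;
use warnings;

my @special_characters = split ' ', q{# $ % & ~ _ \\ { } ^};

sub escape_each {
    my ($text) = @_;
    # Handle the backslash first, so that backslashes added for the
    # other characters are not escaped again.
    for my $char ('\\', grep { $_ ne '\\' } @special_characters) {
        # $char holds the character; \Q...\E makes it match literally.
        $text =~ s/(\Q$char\E)/\\$1/g;
    }
    return $text;
}

print escape_each('100% {ok} & done'), "\n";
```

A single character class, as in BrowserUk's reply, does the same work in one pass and avoids the ordering concern entirely.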

      This should have produced a nice, fat warning.
      The script never gets there; apparently /^"([^"]{2,})","(\d)","(.+)"$/ never matches because of carriage returns. Specifically, (.+)"$ can't match "\r\n (the line endings in the __DATA__ section).

      (perl -E '"foo\r\n" =~ /foo$/ or say "cant match!"').

      perl -i -pe 'tr/\r//d' latex.pl fixes that and the script starts to emit tons of warnings.

      I thought $ works on Windows with carriage returns but it seems it doesn't...

        I thought $ works on Windows with carriage returns but it seems it doesn't...

        It does. The OS-specific line terminator is translated somewhere in Perl's I/O layers to a common \n newline character (which, IIRC, is 0x0a linefeed), unless you're using binmode or some other form of raw read/write. The following code should execute identically (I believe) under any OS (I'm running this under Windoze 7):

        c:\@Work\Perl\monks\marek1703>perl -wMstrict -le
        "use Data::Dump qw(pp);
         ;;
         for my $s (qq{foo\r\n}, qq{foo\r}, qq{foo\n}, qq{foo}) {
           if ($s =~ /foo$/) {
             print 'matched: ', pp $s;
             }
           else {
             print 'no match: ', pp $s;
             }
           }
        "
        no match: "foo\r\n"
        no match: "foo\r"
        matched: "foo\n"
        matched: "foo"

        See definition of $ in Regular Expressions - Metacharacters and its general discussion in perlre and perlretut. The match failures are due to the \r stuck in the middle of things.

        Also consider the following, which I also expect would work the same on any OS (it certainly works under Windows):

        use warnings;
        use strict;

        while( <DATA> ) {
            if (/^"([^"]{2,})","(\d)","(.+)"$/) {
                chomp;
                my ($one, $two, $three) = ($1, $2, $3);
                print "matched: ``$_'' ('$one', '$two', '$three') \n";
            }
            else {
                print "no match: ``$_'' \n";
            }
        }

        __DATA__
        "Sonntag, 26.05.2013 - 13:13:27","0","Lieber Herr % , & text text with some %special characters"
        xyzzy
        "Mittwoch, 05.06.2013 - 18:12:09","0","Besten Dank, & hat prima geklappt. {Greetings!}"
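If a file does arrive with raw \r\n line endings (for example, a script saved on Windows and run elsewhere), one defensive option is to strip the line ending explicitly before matching, since chomp alone would leave the \r behind. A minimal sketch; parse_record is a hypothetical helper, not something from the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the three captured fields of a record,
# or an empty list if the line does not match.
sub parse_record {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop LF or CRLF before matching
    return $line =~ /^"([^"]{2,})","(\d)","(.+)"$/;
}

for my $line (qq{"Sonntag, 26.05.2013 - 13:13:27","0","text"\r\n},
              qq{"Sonntag, 26.05.2013 - 13:13:27","0","text"\n}) {
    my @fields = parse_record($line);
    print @fields ? "matched: @fields\n" : "no match\n";
}
```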

        Thank you again!

        Funnily enough, I had already tried something like BrowserUk's suggestion in my script, but had commented it out.
        Now I have put it back:

        $text =~ s![#$%&~_}{^]!\\$&!g;

        But one character is still not escaped: "%"
        Strange!
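A likely explanation (a reading of the pattern, not something confirmed in the thread): inside a regex, $% is interpolated as Perl's special variable $% (the format page number, normally 0), so the character class ends up containing neither $ nor %. Escaping the dollar sign keeps both characters literal. A minimal sketch; the sub name escape_class is made up for illustration:

```perl
use strict;
use warnings;

sub escape_class {
    my ($text) = @_;
    # \$ keeps the dollar sign literal, so the "$%" pair is no longer
    # read as a variable inside the character class.
    $text =~ s!([#\$%&~_}{^])!\\$1!g;
    return $text;
}

print escape_class('% _ text text!'), "\n";
```

BrowserUk's class avoids the problem in a different way: it simply contains no $ at all.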