This version translates your example well. IMHO only the two TODOs marked in the code are left to be covered.

/usr/bin/perl -w /tmp/plre2el.pl
Perlcode: "(.*?)"
Elispcode: \"\\(.*?\\)\"

use strict;
use warnings;
# version 0.2

$\="\n";

#--- flags
my $flag_interactive; # true => no extra escaping of backslashes

my $RE='\w*(a|b|c)\d\(';
$RE='\d{2,3}';
$RE='"(.*?)"';

print "Perlcode: $RE";

#--- hide pairs of backslashes
$RE=~s#\\\\#\0#g;
# TODO check for suitable long "hidesequence" instead of a simple \0

#--- TODO normalisation of needless escaping
# e.g. from /\"/ to /"/, since it's no difference in perl but might confuse elisp

#--- toggle escaping of 'backslash constructs'
my $bsc='(){}|';
$RE=~s#[$bsc]#\\$&#g; # escape them once
$RE=~s#\\\\##g;       # and erase double-escaping

#--- replace character classes
my %charclass=(
    w => 'word' , # TODO: emacs22 already knows \w ???
    d => 'digit',
    s => 'space'
);
my $kc=join "|",keys %charclass;
$RE=~s#\\($kc)#[[:$charclass{$1}:]]#g;

#--- unhide pairs of backslashes
$RE=~s#\0#\\\\#g;

#--- escaping for elisp string
unless ($flag_interactive){
    $RE=~s#\\#\\\\#g; # ... backslashes
    $RE=~s#"#\\"#g;   # ... quotes
}

print "Elispcode: $RE";
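To illustrate why the "hide pairs of backslashes" step in the script above matters, here is a minimal sketch. The helper name toggle_escaping and its $hide argument are hypothetical and only exist for this demonstration; the substitutions themselves are the ones from the script.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script): toggles the escaping of
# ( ) { } | and, if $hide is true, hides pairs of backslashes behind a
# NUL byte while doing so.
sub toggle_escaping {
    my ($re, $hide) = @_;
    $re =~ s#\\\\#\0#g if $hide;   # hide pairs of backslashes
    $re =~ s#[(){}|]#\\$&#g;       # escape the constructs once
    $re =~ s#\\\\##g;              # erase double-escaping
    $re =~ s#\0#\\\\#g if $hide;   # unhide pairs of backslashes
    return $re;
}

# Perl regex \\(a): a literal backslash followed by a group
my $re = '\\\\(a)';
print toggle_escaping($re, 1), "\n";   # prints \\\(a\)
print toggle_escaping($re, 0), "\n";   # prints \(a\)
```

Without hiding, the escaped literal backslash is swallowed by the "erase double-escaping" step, so the result would no longer match a backslash in front of the group.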

Cheers Rolf

Please note: XEmacs knows "raw strings", where escaping is not necessary, but I doubt that normal Perl RE syntax can be used from within GNU Emacs Lisp because of the quoting problem, so in general the conversion can only be done with Perl!

UPDATE:

TODO: Just noticed that I still need special treatment for escape sequences like \t and \n. I'll add this tomorrow.

Latest version, re_pl2el.pl:

use strict;
use warnings;
# version 0.3

$\="\n";

# TODO
# * wrap converter to function
# * testsuite

#--- flags
my $flag_interactive; # true => no extra escaping of backslashes

my $RE='\w*(a|b|c)\d\(';
$RE='\d{2,3}';
$RE='"(.*?)"';
$RE="\0".'\"\t(.*?)"';

print "Perlcode:\t $RE";

#--- encode all \0 chars as escape sequence
$RE=~s#\0#\\0#g;

#--- substitute pairs of backslashes with \0
$RE=~s#\\\\#\0#g;

#--- hide escape sequences of \t,\n,... with
#    corresponding ascii code
my %ascii=(
    t => "\t",
    n => "\n"
);
my $kascii=join "|",keys %ascii;
$RE=~s#\\($kascii)#$ascii{$1}#g;

#--- normalize needless escaping
# e.g. from /\"/ to /"/, since it's no difference in perl
# but might confuse elisp
$RE=~s#\\"#"#g;

#--- toggle escaping of 'backslash constructs'
my $bsc='(){}|';
$RE=~s#[$bsc]#\\$&#g; # escape them once
$RE=~s#\\\\##g;       # and erase double-escaping

#--- replace character classes
my %charclass=(
    w => 'word' , # TODO: emacs22 already knows \w ???
    d => 'digit',
    s => 'space'
);
my $kc=join "|",keys %charclass;
$RE=~s#\\($kc)#[[:$charclass{$1}:]]#g;

#--- unhide pairs of backslashes
$RE=~s#\0#\\\\#g;

#--- escaping for elisp string
unless ($flag_interactive){
    $RE=~s#\\#\\\\#g; # ... backslashes
    $RE=~s#"#\\"#g;   # ... quotes
}

#--- unhide escape sequences of \t,\n,...
my %rascii= reverse %ascii;
my $vascii=join "|",keys %rascii;
$RE=~s#($vascii)#\\$rascii{$1}#g;

print "Elispcode:\t $RE";
# TODO what's the elisp syntax for \0 ???
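The two open TODO items in the header (wrapping the converter into a function, and a testsuite) could look roughly like this. It is only a sketch: the name perl_re_to_elisp and the $interactive argument are assumptions, and the body just repeats the substitutions of version 0.3 with the key lists written out inline.

```perl
use strict;
use warnings;

# Sketch only: perl_re_to_elisp and $interactive are hypothetical names,
# the steps are the ones from version 0.3 above.
sub perl_re_to_elisp {
    my ($re, $interactive) = @_;

    $re =~ s#\0#\\0#g;                  # encode NUL chars as the escape \0
    $re =~ s#\\\\#\0#g;                 # hide pairs of backslashes

    my %ascii = (t => "\t", n => "\n"); # hide \t and \n as their ASCII chars
    $re =~ s#\\([tn])#$ascii{$1}#g;

    $re =~ s#\\"#"#g;                   # normalize needless escaping of "

    $re =~ s#[(){}|]#\\$&#g;            # toggle escaping of ( ) { } |
    $re =~ s#\\\\##g;                   # and erase double-escaping

    my %charclass = (w => 'word', d => 'digit', s => 'space');
    $re =~ s#\\([wds])#[[:$charclass{$1}:]]#g;

    $re =~ s#\0#\\\\#g;                 # unhide pairs of backslashes

    unless ($interactive) {             # escaping for an elisp string
        $re =~ s#\\#\\\\#g;
        $re =~ s#"#\\"#g;
    }

    $re =~ s#\t#\\t#g;                  # unhide \t and \n
    $re =~ s#\n#\\n#g;
    return $re;
}

# Usage with the sample regexes from the script
for my $re ('"(.*?)"', '\d{2,3}', '\w*(a|b|c)\d\(') {
    print "Perlcode:\t $re\n";
    print "Elispcode:\t ", perl_re_to_elisp($re), "\n";
}
```

For the first sample this gives the same Elispcode as the run shown at the top of the post. A testsuite could then simply compare the return value against the expected elisp string for each sample.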

In reply to Re^3: [emacs] converting perl regex into elisp regex by LanX
in thread [emacs] converting perl regex into elisp regex by LanX
