Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have to wrap every line of a file in <para> tags, except for lines that start with an opening <title> tag.

I tried it the following way, but it is not working. Can somebody help me with this? I am very new to Perl.

$file =~ s/(?![<title>](.*?)\n/<para>$1<\/para>\n/isg;

Replies are listed 'Best First'.
Re: Negative look ahead
by graff (Chancellor) on Mar 09, 2007 at 07:58 UTC
    Let me see if I can correctly rephrase the question, to make sure I understand:
    I need to place <para> ... </para> tags around every line of a file, except when the line happens to contain a <title> tag. The code I've tried is:
    s/(?!<title>(.*?)\n/<para>$1<\/para>/isg;

    If I have that right, you don't really need negative look-ahead. You just need to have a conditional statement that controls whether or not the regex substitution applies -- in fact, you don't really need a regex substitution at all.

    Since you say that the file is composed of lines, and the tags are supposed to be added on every line (except any with a "title" tag), it will be easier to handle the data line-by-line, rather than in a single "slurped" scalar:

    while (<>) {
        if ( not /^<title>/ ) {
            chomp;
            $_ = "<para>" . $_ . "</para>\n";
        }
        print;
    }
    (update: added ^ anchor to the regex, because the question mentioned looking for that tag at the start of a line)
Re: Negative look ahead
by Rhandom (Curate) on Mar 09, 2007 at 16:29 UTC
    So - so far all of the other responses have assumed you can read a line at a time. Sometimes you can't. Well - here is one possible option that will let you do what you want (assuming I know what you want) on a single string.
    my $str = "<html>\n<title>Hi</title>\n<body>\nSome other lines\nof stuff\n</body>\n</html>\n";

    $str =~ s{
        ^(.+)$
        (??{ lc(substr $^N, 0, 7) eq "<title>" ? '(?!)' : '(?=)' })
    }{<para>$1</para>}xgm;

    print $str;
    # prints
    # <para><html></para>
    # <title>Hi</title>
    # <para><body></para>
    # <para>Some other lines</para>
    # <para>of stuff</para>
    # <para></body></para>
    # <para></html></para>

    This uses the postponed regex feature, (??{ ... }), which runs a block of code during matching and uses its result as a pattern at that point. It takes whatever is in $^N (the text matched by the most recently closed group) and checks whether it begins with <title>. If it does, we put in '(?!)', which is a negative look-ahead that will always fail - otherwise we put in '(?=)', which is a positive look-ahead that will always succeed.

    Unfortunately we have to test whether each line begins with <title> using string functions rather than a nested regex. If we used a regex inside the (??{}) we would confuse the regex engine and in many (if not all) cases cause a segfault. But string functions often work just fine.
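
    For comparison, here is a minimal sketch of the same single-string substitution written with an ordinary negative look-ahead, which is what the original question was reaching for. The sample string is an assumption modelled on the example above, and $out is just an illustrative variable name. The /m flag lets ^ and $ match at each line boundary, and (?!<title>) rejects any line start that is followed by <title>.

```perl
use strict;
use warnings;

# Sample input: an assumption, modelled on the example earlier in this reply.
my $str = "<html>\n<title>Hi</title>\n<body>\nSome other lines\nof stuff\n</body>\n</html>\n";

# Copy the string, then wrap every non-empty line that does not start with <title>.
#   ^            start of a line (because of /m)
#   (?!<title>)  negative look-ahead: fail if the line starts with <title>
#   (.+)$        the rest of the line, captured as $1
# /i makes the tag test case-insensitive, /g applies it to every line.
(my $out = $str) =~ s{^(?!<title>)(.+)$}{<para>$1</para>}mgi;

print $out;
```

    Because (.+) needs at least one character, empty lines are left alone. Change it to (.*) if empty lines should be wrapped as well.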

    my @a=qw(random brilliant braindead); print $a[rand(@a)];
Re: Negative look ahead
by jesuashok (Curate) on Mar 09, 2007 at 09:41 UTC
    perl -ne 'chomp; !/^<title>/ ? print "<para>$_</para>\n" : print "$_\n"' <input_file>

      See also perlrun, if you want to wrap ++graff's code into a 1-liner, using further options.

      perl -ple '$_="<para>$_</para>" if !/^<title>/'