in reply to regex confusion

The problem is not in this line of code, which parses fine with my perl. It is likely to be something on the line before it. (Especially considering the mention in the error message of a multiline *-delimited string starting on line 257.)

Besides that, it would probably be better and easier to use HTML::Parser or HTML::TokeParser to do this, rather than dealing with the HTML manually.
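As a rough illustration of the module route, here is a minimal sketch that pulls the COLOR, SIZE and FACE attributes out of FONT tags with HTML::TokeParser. It assumes the HTML-Parser distribution is installed from CPAN. The sample markup and the @fonts list are made up for the example, and what you do with the attributes afterwards is up to you.

```perl
use strict;
use warnings;
use HTML::TokeParser;    # part of the HTML-Parser distribution on CPAN

# Hypothetical sample input, only for illustration.
my $html = q{<p><FONT COLOR="#ff0000" SIZE="3" FACE="Arial">Hello</FONT></p>};

# A reference to a scalar is parsed as a literal document.
my $parser = HTML::TokeParser->new(\$html)
    or die "could not create parser";

my @fonts;
while (my $tag = $parser->get_tag('font')) {
    # For a start tag, get_tag returns [tagname, \%attr, \@attrseq, text];
    # attribute names are lowercased by the parser.
    my $attr = $tag->[1];
    push @fonts, {
        color => $attr->{color},
        size  => $attr->{size},
        face  => $attr->{face},
    };
}

printf "color=%s size=%s face=%s\n", @{$_}{qw(color size face)} for @fonts;
```

The parser copes with attribute order, quoting style and case, which is exactly what the hand-written regex has to spell out by hand.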

Re: Re: regex confusion
by Anonymous Monk on Oct 07, 2001 at 08:45 UTC
    well, this is the line previous to the regex:
    $instream =~ s/<\/H[1-6]>\s*/\}\n/ig;
    and i can't use html parsing modules...i need to chew on the html manually to replace it with the correct coding.
      The /x switch allows white space and comments within a regular expression. This lets you break things into little pieces and comment out the pieces until the syntax errors go away.

      use strict;
      use warnings;
      use diagnostics;

      my $instream = 'empty';
      $instream =~ s/
          <FONT
          (.+?)
          COLOR \s?=\s? ('|")?
          (\#......) \2
          (.+?)
          SIZE \s?=\s? ('|")?
          ################
          (\d++) \5          # should be \d+, not \d++
          (.+?)
          FACE \s?=\s? ('|")?
          (.+?) \8
          [^>]*>
          \s*
      /genStyleCSF($3,$6,$9)
      /iegx;

      Also allow me to join the chorus suggesting a HTML parsing module....



      email: mandog

      The error you're seeing is a result of perl getting confused about which parts of your code are inside a substitution, and which are outside. Look for a regex, somewhere before line 257, where you're matching a slash, and forgot to escape it with a backslash.
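To see this failure mode in isolation, here is a small sketch that compiles two snippets with string eval: one with the slash left unescaped, one with it escaped. The snippets and variable names are invented for the demonstration, and the exact wording of the error depends on your perl version and on what follows the broken line.

```perl
use strict;
use warnings;

# Broken: the slash in </H1> ends the pattern early, so perl reads
# "<" as the pattern and "H1>" as the replacement.
my $broken = <<'END_BROKEN';
my $s = "<H1>Title</H1>";
$s =~ s/</H1>/}/;
$s;
END_BROKEN

# Fixed: the slash is escaped with a backslash.
my $fixed = <<'END_FIXED';
my $s = "<H1>Title</H1>";
$s =~ s/<\/H1>/}/;
$s;
END_FIXED

my $broken_result = eval $broken;
my $broken_error  = $@;

my $fixed_result = eval $fixed;
my $fixed_error  = $@;

print "broken snippet did not compile: $broken_error\n" if $broken_error;
print "fixed snippet returned: $fixed_result\n" unless $fixed_error;
```

Note that perl reports the problem where parsing finally falls apart, which can be well after the line that actually contains the unescaped slash.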

      When you're matching slashes inside a regex, it's helpful to use a different delimiter for the regex, as in m,</html>, or s!</H1>!}\n!. This saves you from having to use all those backslashes.
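Applied to the heading substitution quoted earlier in the thread, the alternate-delimiter form might look like the sketch below. The sample strings are made up; the point is only that both spellings do the same thing.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my $escaped   = "<H1>Title</H1>   Body";
my $alternate = $escaped;

# Slash delimiters: every literal slash needs a backslash.
$escaped =~ s/<\/H[1-6]>\s*/\}\n/ig;

# "!" delimiters: the slash is just an ordinary character.
$alternate =~ s!</H[1-6]>\s*!}\n!ig;

print "both forms give the same result\n" if $escaped eq $alternate;

# The same idea works for a plain match, here with "," as the delimiter.
my $page = "<html><body>x</body></html>";
print "found closing html tag\n" if $page =~ m,</html>,;
```

Pick a delimiter that does not occur in the pattern or the replacement, otherwise you are back to escaping.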

      But, as wog said, it really would be best to use an HTML parsing module.