in reply to Parsing using Regex and Lookahead

While you are on the subject of lookaheads, this

$code =~ s/\n//g;   # remove all the newlines
$code .= "\n";      # add one to the end

could be replaced by this

$code =~ s/\n(?=.)//g; # remove all but the last newline

I hope this is of interest.

Cheers,

JohnGG

Re^2: Recursive Regex
by JavaFan (Canon) on Mar 11, 2009 at 11:09 UTC
    Your replacement isn't equivalent to the code you claim it can replace. First, it will not remove a newline that is followed by another newline; you'd need (?=(?s:.)) for that, or the /s modifier. Second, the original code always leaves $code ending with a newline, regardless of whether it ended with one originally. With your replacement, $code has a trailing newline only if there was one originally.
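Both differences can be checked with a small script. This is only a sketch: the three helper names are made up for illustration, and the test strings are not from the original post.

```perl
use strict;
use warnings;

# Original approach: strip every newline, then append exactly one.
sub strip_then_append {
    my ($code) = @_;
    $code =~ s/\n//g;
    $code .= "\n";
    return $code;
}

# Lookahead without /s: "." does not match "\n", so a newline that is
# followed by another newline is left in place.
sub lookahead_plain {
    my ($code) = @_;
    $code =~ s/\n(?=.)//g;
    return $code;
}

# Lookahead with /s: "." matches any character, so only a newline at the
# very end of the string survives (if there was one).
sub lookahead_s {
    my ($code) = @_;
    $code =~ s/\n(?=.)//gs;
    return $code;
}

# "a\n\nb\n": the plain lookahead keeps the first of the doubled newlines.
print lookahead_plain("a\n\nb\n") eq "a\nb\n" ? "ok\n" : "not ok\n";
print lookahead_s("a\n\nb\n")     eq "ab\n"   ? "ok\n" : "not ok\n";

# "a\nb": only the original approach guarantees a trailing newline.
print strip_then_append("a\nb") eq "ab\n" ? "ok\n" : "not ok\n";
print lookahead_s("a\nb")       eq "ab"   ? "ok\n" : "not ok\n";
```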

      Good catch re. the s modifier, I missed that. Thanks for the correction.

      With regard to your second point, from the way the OP initialised $code I don't think always ending with a newline was the requirement s/he was addressing. For the more general case you are correct.

      Cheers,

      JohnGG

      So I'm thankful for the help!

      First off: deMize « he »

      Second, I'm not sure what I did was the best way of going about things. What I'm doing is just building a CMS insertion page with as little markup as possible, where each section is contained in its own div.

      Response:
      The first problem was getting each section into its own div, which the lookahead helped with. To accomplish this, it just ends the current div when it hits a new section (which may be a subsection in the future).

      The reason for removing the line breaks was that I was using a line break as the delimiter for the last section. There is probably a better way of doing that by matching the end of the string in the regex (maybe $), but I need to replace those line breaks with HTML breaks anyhow.
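One way to avoid the trailing delimiter is to let the lookahead accept either the next tag or the end of the string. The sketch below is not the OP's code (which isn't shown in this thread): it assumes tags look like [name], that bodies never contain a literal [word], and the helper name sections_to_divs is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each "[name]text" run in its own div.
sub sections_to_divs {
    my ($text) = @_;
    # Lazily capture the body up to the next "[name]" tag or the end of
    # the string (\z), so no artificial trailing delimiter is needed.
    $text =~ s{\[(\w+)\](.*?)(?=\[\w+\]|\z)}{
        my ($class, $body) = ($1, $2);
        $body =~ s/\s+\z//;       # trim trailing whitespace before closing
        $body =~ s/\n/<br>/g;     # remaining line breaks become HTML breaks
        qq{<div class="$class">$body</div>\n};
    }gse;
    return $text;
}

print sections_to_divs("[top]Top Data\n[bottom]Line 1\nLine 2");
```

Using \z rather than $ keeps the match anchored to the true end of the string, since $ can also match just before a final newline.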

      The problem:
      This means that the output is going to be linear, with no sub-divs. I might have to rethink that, because later I might want to have something like this:
      [section]
      [top]Top Data
      [middle]Mid Data
      [bottom]Bottom Data
      [section]
      [top]Top Data
      [bottom]Bottom Data

      Should result in:
      <div class="section">
        <div class="top">Top Data</div>
        <div class="middle">Mid Data</div>
        <div class="bottom">Bottom Data</div>
      </div>
      <div class="section">
        <div class="top">Top Data</div>
        <div class="bottom">Bottom Data</div>
      </div>

      What I plan to do is store either the subsections or the sections in an array. I could probably use help with a better algorithm.
      The whole purpose of this was so that I could quickly type the data into one input box, without building a whole intricate interface (that can come later).
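For the nested case, one possible approach is to collect the sections into an array of arrays first and render afterwards. This is a rough sketch under assumptions: [section] is the only container tag, every other [name] tag is a child of the most recent section, and parse_sections / render_sections are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical parser: "[section]" opens a new outer div; any other
# "[name]" tag becomes a child of the most recent section.
sub parse_sections {
    my ($text) = @_;
    my @sections;
    while ($text =~ /\[(\w+)\]\s*(.*?)\s*(?=\[\w+\]|\z)/gs) {
        my ($tag, $body) = ($1, $2);
        if ($tag eq 'section') {
            push @sections, [];                   # start a new, empty section
        }
        else {
            push @sections, [] unless @sections;  # tolerate a missing [section]
            push @{ $sections[-1] }, [ $tag, $body ];
        }
    }
    return \@sections;
}

# Render each section as an outer div wrapping one div per child.
sub render_sections {
    my ($sections) = @_;
    my $html = '';
    for my $section (@$sections) {
        $html .= qq{<div class="section">\n};
        $html .= qq{  <div class="$_->[0]">$_->[1]</div>\n} for @$section;
        $html .= "</div>\n";
    }
    return $html;
}

my $input = "[section]\n[top]Top Data\n[middle]Mid Data\n[bottom]Bottom Data\n"
          . "[section]\n[top]Top Data\n[bottom]Bottom Data\n";
print render_sections(parse_sections($input));
```

Keeping the parse and render steps separate means deeper nesting could later be added to the data structure without touching the regex.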




        What about something like:

        [section]
        top=Top Data
        middle=Mid Data
        bottom=Bottom Data
        [section]
        top=Top Data
        bottom=Bottom Data

        Now you've got a built-in qualitative difference that makes the parsing easier.
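With that layout, a line-by-line parser needs no lookahead at all, because a bracketed line can only be a section header and a key=value line can only be a child. A minimal sketch, assuming one item per line; parse_keyed is a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical line-based parser for the "[section]" / "key=value" layout.
# Returns an array of sections, each an array of [key, value] pairs.
sub parse_keyed {
    my ($text) = @_;
    my @sections;
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*\[section\]\s*$/) {
            push @sections, [];                   # header line: new section
        }
        elsif ($line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/) {
            push @sections, [] unless @sections;  # tolerate a missing header
            push @{ $sections[-1] }, [ $1, $2 ];  # ordered key/value pair
        }
        # anything else (blank lines, stray text) is ignored in this sketch
    }
    return \@sections;
}

my $parsed = parse_keyed("[section]\ntop=Top Data\nmiddle=Mid Data\n[section]\nbottom=Bottom Data\n");
printf "%d sections, first key: %s\n", scalar @$parsed, $parsed->[0][0][0];
```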