in reply to How do I match lines of 40 characters long in a block of text?

This really isn't as hard as everyone has made it out to be. You almost had it, but you need to anchor so as not to match partial lines. You also probably want to avoid matching the empty string. I'm guessing that you really want "1 to 4 lines each consisting of up to 40 characters followed by a newline."

/
  (               # Assuming you want to capture these lines.
    (?:           # Group each line.
      ^           # Beginning of the line.
      .{0,40}\n   # 0 to 40 characters followed by a newline.
    ){1,4}        # 1 to 4 lines. (0 will permit an empty match.)
  )               # Done capturing.
/mx;              # /m so that ^ anchor works, /x for comments.
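To see the anchoring at work, here is a minimal sketch that applies the same pattern to a made-up sample. The sample text and the variable names ($text, $re, $captured) are assumptions for illustration only. The first line is 50 characters long, so the match has to start at the second line rather than in the middle of the first.

```perl
use strict;
use warnings;

# Hypothetical sample: one 50-character line, then two short lines.
my $text = ("x" x 50) . "\nshort one\nshort two\n";

my $re = qr/
    (               # Capture the matched lines.
      (?:           # Group each line.
        ^           # Beginning of a line.
        .{0,40}\n   # 0 to 40 characters followed by a newline.
      ){1,4}        # 1 to 4 such lines.
    )
/mx;

# The over-long first line cannot match, and ^ refuses to start mid-line,
# so the capture begins at "short one".
my ($captured) = $text =~ $re;

if (defined $captured) {
    my @lines = split /\n/, $captured;
    print scalar(@lines), " line(s) captured\n";
}
```

Without the ^ anchor, the pattern would be free to pick up the last 40 characters of the long line.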
-sauoq
"My two cents aren't worth a dime.";

Re: Re: How do I match lines of 40 characters long in a block of text?
by BrowserUk (Patriarch) on Sep 25, 2002 at 21:40 UTC

    I think you are probably right about what he actually needs, re: 1 to 4 rather than 0 to 4, but there is a possibility that yours won't cater for: a string containing fewer than 40 chars but no newline.

    It's probably a spurious requirement, but trying to achieve it hung me up for ages.

    (Knowing you, you'll add a 4 character, positively backward, forward-looking, zero-width assertion to your regex and achieve that too:)


    Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
      but there is a possibility that yours won't cater for: a string containing fewer than 40 chars but no newline.

      You're right. Mine doesn't account for it. I guess I assumed they all would end with newlines. That was a bad assumption on my part. Of course, I might blame it on poorly stated requirements. :-)
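The gap is easy to reproduce. This is only a sketch with invented sample strings; $re is the earlier pattern written without /x, and the other variable names are assumptions.

```perl
use strict;
use warnings;

# The original pattern: every line must end in a newline.
my $re = qr/((?:^.{0,40}\n){1,4})/m;

# Hypothetical inputs that lack a final newline.
my $no_newline = "just one short line";
my $partial    = "line one\nline two";

# No newline anywhere, so there is no match at all.
my $matched_bare = ($no_newline =~ $re) ? 1 : 0;

# Only the first line ends in a newline, so the last line is dropped.
my ($got) = $partial =~ $re;

print "bare string matched: $matched_bare\n";
print "captured from partial: ", length($got), " characters\n";
```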

      (Knowing you, you'll add a 4 character, positively backward, forward-looking, zero-width assertion to your regex and achieve that too:)

      Nah... it should be easier than that. Use a $ to match the end of the line (not including the newline) and then \n? to match an optional newline. So, I tried that:

      /
        (               # Assuming you want to capture these lines.
          (?:           # Group each line.
            ^           # Beginning of the line.
            .{0,40}$\n? # 0 to 40 chars, an end-of-line and optional newline.
          ){1,4}        # 1 to 4 lines. (0 will permit an empty match.)
        )               # Done capturing.
      /mx;              # /m so that ^ anchor works, /x for comments.

      But that didn't work! I was vexed until I realized that it looks an awful lot like "match 0 to 40 characters followed by $\ followed by an optional n", where $\ is Perl's output record separator variable. So, then I tried:

      /
        (               # Assuming you want to capture these lines.
          (?:           # Group each line.
            ^           # Beginning of the line.
            .{0,40}$    # 0 to 40 characters followed by an end-of-line.
            \n?         # An optional newline.
          ){1,4}        # 1 to 4 lines. (0 will permit an empty match.)
        )               # Done capturing.
      /mx;              # /m so that ^ anchor works, /x for comments.

      And that worked like a charm.
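For anyone who wants to check it, here is a small sketch that runs the working pattern against a few invented strings. The inputs and variable names are assumptions, not part of the original question.

```perl
use strict;
use warnings;

my $re = qr/
    (
      (?:
        ^           # Beginning of a line.
        .{0,40}$    # 0 to 40 characters, then end-of-line.
        \n?         # An optional newline.
      ){1,4}
    )
/mx;

# A short string with no newline at all now matches in full.
my ($bare) = "just one short line" =~ $re;

# A final line without a newline is no longer dropped.
my ($two) = "line one\nline two" =~ $re;

# A 41-character line is still rejected.
my $too_long = (("x" x 41) =~ $re) ? 1 : 0;

print "bare: ", (defined $bare ? "matched" : "no match"), "\n";
print "two lines: ", (defined $two ? "matched" : "no match"), "\n";
print "41 characters matched: $too_long\n";
```

Note that the end-of-line anchor is followed by whitespace in the pattern, which is what keeps it from being read as the start of a variable name.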

      That additional requirement did make the whole exercise more fun. There is another workaround. Sometime before I actually figured out why it was breaking, I tried (?:\n|\Z) and that worked as well but I thought it was ugly. So, I'm left wondering whether there is a better way around it than using /x and whitespace.
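As a rough comparison of the two workarounds, the sketch below runs both patterns over a handful of invented strings and collects any input where their captures differ. The names $with_x, $with_alt, @samples and @mismatches are assumptions for illustration, and agreement on these few samples does not show the patterns are equivalent in general.

```perl
use strict;
use warnings;

# Workaround 1: /x plus whitespace between the anchor and the newline.
my $with_x = qr/((?:^.{0,40}$ \n?){1,4})/mx;

# Workaround 2: an explicit "newline or end of string" alternation.
my $with_alt = qr/((?:^.{0,40}(?:\n|\Z)){1,4})/m;

# Hypothetical inputs, with and without a trailing newline.
my @samples = (
    "line one\nline two",
    "line one\nline two\n",
    "solo",
);

# Collect any sample where the two captures disagree.
my @mismatches = grep {
    my ($cap_x)   = $_ =~ $with_x;
    my ($cap_alt) = $_ =~ $with_alt;
    !defined $cap_x || !defined $cap_alt || $cap_x ne $cap_alt;
} @samples;

print scalar(@mismatches), " mismatch(es)\n";
```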

      Thanks for making this so much more entertaining. :-)

      Update: This was my 300th node! :-)

      -sauoq
      "My two cents aren't worth a dime.";