in reply to Don't understand behavior of this split

As documented in split:

If the PATTERN contains parentheses, additional list elements are created from each matching substring in the delimiter.

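The parentheses in your pattern form a capturing group, so whatever that group captures for each delimiter (or undef when the optional group does not match) is returned along with the fields. You can get the behavior you expected by switching to a non-capturing group, (?:...):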

(?:<\/p>)?\s*<p>
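
To see the difference side by side, here is a minimal sketch. The sample string in $html is made up for illustration (it is not the data from the original question), and the capturing pattern is assumed to be the non-capturing one above with plain parentheses.

```perl
use strict;
use warnings;

# Hypothetical input, not taken from the original post.
my $html = '<p>one</p> <p>two</p><p>three</p>';

# Capturing group: each delimiter contributes an extra element,
# either the captured "</p>" or undef when the group did not match.
my @with_capture = split /(<\/p>)?\s*<p>/, $html;

# Non-capturing group: only the fields between delimiters come back.
my @without_capture = split /(?:<\/p>)?\s*<p>/, $html;

# Show undef explicitly so it does not trigger a warning when printed.
print join('|', map { defined $_ ? $_ : 'undef' } @with_capture), "\n";
# |undef|one|</p>|two|</p>|three</p>

print join('|', @without_capture), "\n";
# |one|two|three</p>
```

Both results start with an empty field because the first <p> matches at the very beginning of the string, and split keeps leading empty fields. The final </p> stays attached to the last field because no <p> follows it.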