Thanks to this thread, I've actually started thinking of some new solutions, which I will have to try out to see if they work.

I'm already having some problems, though, with the initial split on words instead of spaces, since \w+ also needs to include the Swedish characters ÅÄÖåäö.

Something along the lines of /\b[A-ZÅÄÖ][a-zåäö]+\b/ only matches some words and, as mentioned above, my regex skills are subpar.

Re^3: Preserve original text formatting.
by hippo (Archbishop) on Sep 10, 2015 at 16:43 UTC

    You can use a character class instead of a sequence, e.g.:

    $ perl -MTest::More -e 'ok (/\b\p{XPosixAlpha}+\b/, "$_ matches") for +(qw/A Z Å Ä Ö/); done_testing();'
    ok 1 - A matches
    ok 2 - Z matches
    ok 3 - Å matches
    ok 4 - Ä matches
    ok 5 - Ö matches
    1..5
    $
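
For the word split asked about earlier in the thread, the same character class can be sketched in script form. This is only a rough illustration, not the original poster's code: the helper names words_in and tokens_in and the sample sentence are made up here, and it assumes the script file is saved as UTF-8 (hence "use utf8") and that the input has already been decoded into Perl characters.

```perl
use strict;
use warnings;
use utf8;                               # string literals in this file are UTF-8
binmode(STDOUT, ':encoding(UTF-8)');    # print decoded characters as UTF-8

# Hypothetical helper: return every word in the text, where a word is a
# run of Unicode alphabetic characters, so Å Ä Ö å ä ö count as letters.
sub words_in {
    my ($text) = @_;
    return $text =~ /\p{XPosixAlpha}+/g;
}

# Hypothetical helper: split the text into alternating word and non-word
# tokens. Every character lands in exactly one token, so joining the
# tokens gives back the original text, spacing and punctuation included.
sub tokens_in {
    my ($text) = @_;
    return $text =~ /\p{XPosixAlpha}+|\P{XPosixAlpha}+/g;
}

my $sample = "Åsa och Örjan  åt äpplen.";
print join('|', words_in($sample)), "\n";    # prints Åsa|och|Örjan|åt|äpplen
```

Keeping the non-word tokens is what allows the original formatting to be put back after the words have been processed: join('', tokens_in($text)) returns the input unchanged.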