ultranerds has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm trying to convert stuff like:

== test==

..to:

==test==

...and convert:

== test ==

..to:

==test==

Example string is:

my $string = q| == Introduction== --sub header-- A Montréal, il y a [[vieux Montréal]]. [[Montréal]] [[testing weird tags]] [[test]] == Brief History== asdfasfd as df asf d asdf as fd asf dasfd ==Geography== ==Regions== ==Cities== ==Sights and Activites== ==Weather== ==Getting There== ==END Getting There== |;
..and I've tried quite a few things:

    $string =~ s|\=\=\s+?([a-z0-9_ &\[\]ÀÂÄàâäÇçÉÊÈËéêèëÏÌÎïìîÖÔÒöôòÜÛÙüûùA-Z?!;«»()"]+)\s+?\=\=|==$1==|ge;

...and:

    $string =~ s|\=\=([\s+])([a-z0-9_ &\[\]ÀÂÄàâäÇçÉÊÈËéêèëÏÌÎïìîÖÔÒöôòÜÛÙüûùA-Z?!;«»()"]+)([\s+])\=\=|==$2==|sig;

..and even:

    $string =~ s|\=\=\s+?([a-z0-9_ &\[\]ÀÂÄàâäÇçÉÊÈËéêèëÏÌÎïìîÖÔÒöôòÜÛÙüûùA-Z?!;«»()"]+)\s+?\=\=|FOO|sig;

..just to see if it matches.

None of these seem to do what I want though :/

Any suggestions?

BTW, the reason I have all the accented characters is that this is for a French site, so it needs to handle all of them :)

TIA

Andy

Re: Regex doesn't wanna find the spaces :/
by Corion (Patriarch) on Jun 24, 2009 at 10:25 UTC
    $string =~ s|\=\=\s+?([a-z0-9_ &\[\]ÀÂÄàâäÇçÉÊÈËéêèëÏÌÎïìîÖÔÒöôòÜÛÙüûùA-Z?!;«»()"]+)\s+?\=\=|==$1==|ge;

    In your input data there is no heading that has whitespace both after the opening == and before the closing ==, and \s+? requires at least one whitespace character on each side. Maybe you want \s* instead of \s+?. Note that the trailing ? is likely useless in your case anyway: you want to remove all whitespace next to the ==, so it doesn't make sense to make that match non-greedy.
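A minimal sketch of the \s* suggestion above. The helper name tidy_heading and the sample strings are made up for illustration, and [^=]+? is an assumed stand-in for the explicit character class in the question. The capture is lazy on purpose: a greedy class that includes a space would swallow the trailing padding itself and leave nothing for the final \s* to remove.

```perl
use strict;
use warnings;

# Hypothetical helper: strip padding inside "== heading ==" markers.
sub tidy_heading {
    my ($text) = @_;
    # \s* allows zero or more whitespace characters on either side, so a
    # heading padded on only one side still matches. [^=]+? is an assumed
    # simplification of the original character class; being lazy, it leaves
    # any trailing whitespace for the second \s*.
    $text =~ s/==\s*([^=]+?)\s*==/==$1==/g;
    return $text;
}

print tidy_heading('== test=='),  "\n";   # ==test==
print tidy_heading('==test =='),  "\n";   # ==test==
print tidy_heading('== test =='), "\n";   # ==test==
print tidy_heading('== Introduction== text == Brief History== more ==Geography=='), "\n";
```

Note that the replacement here uses only the g modifier. The e modifier would make Perl evaluate ==$1== as code, which is not what a plain string replacement needs.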

      Thanks, works perfectly :)
Re: Regex doesn't wanna find the spaces :/
by davorg (Chancellor) on Jun 24, 2009 at 10:30 UTC
      I tried that, and it didn't seem to work =)

      Cheers

      Andy
        You are mistaken. You probably didn't have the "?".
Re: Regex doesn't wanna find the spaces :/
by johngg (Canon) on Jun 24, 2009 at 14:43 UTC

    You can use look-around assertions to avoid the need for captures. You might want to anchor the assertions to the start and end of the string depending on your data.

    $string =~ s{(?: (?<===) \s+ | \s+ (?===) )}{}gx;

    I hope this is of interest.

    Cheers,

    JohnGG
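The look-around substitution above can be tried out with a short sketch like the following. The wrapper name strip_heading_padding and the sample strings are hypothetical. Because the pattern only looks at what sits next to ==, it also removes whitespace outside a heading, for example between two adjacent headings, which is presumably why anchoring may be needed depending on the data.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the look-around substitution.
sub strip_heading_padding {
    my ($text) = @_;
    # Delete whitespace that directly follows "==" (look-behind) or
    # directly precedes "==" (look-ahead). Nothing is captured.
    $text =~ s{(?: (?<===) \s+ | \s+ (?===) )}{}gx;
    return $text;
}

print strip_heading_padding('== test =='), "\n";   # ==test==

# Caveat: the space between two headings touches "==" as well, so it goes too.
print strip_heading_padding('==Weather== ==Getting There=='), "\n";
# ==Weather====Getting There==
```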