Though it seems to work, I have a hunch this isn't ideal.

The best way, of course, is to use a proper parser. In this case, the original Markdown is itself a Perl script, and the regex extracted from it looks like the following (modified slightly to put it into qr// form). There are now many Markdown variants and parsers, and probably several that are more robust than this "just a regex" approach, but as long as your Markdown input is simple enough, it should probably be OK. <update> There are caveats to this approach, though: for example, the following regex will also operate on code blocks! As usual, more representative sample data will result in more accurate solutions :-) </update>

my $g_nested_brackets;
$g_nested_brackets = qr{
    (?>                                 # Atomic matching
        [^\[\]]+                        # Anything other than brackets
      |
        \[
            (??{ $g_nested_brackets })  # Recursive set of nested brackets
        \]
    )*
}x;

my $anchors = qr{
    (                               # wrap whole match in $1
        \[
            ($g_nested_brackets)    # link text = $2
        \]
        \(                          # literal paren
            [ \t]*
            <?(.*?)>?               # href = $3
            [ \t]*
            (                       # $4
                (['"])              # quote char = $5
                (.*?)               # Title = $6
                \5                  # matching quote
            )?                      # title is optional
        \)
    )
}xs;
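To make the capture groups concrete, here is a minimal sketch that collects the link text ($2), href ($3) and optional title ($6) of every inline link in a string. The pattern is repeated (without comments) so the snippet runs on its own; the helper name extract_links and the sample input are made up for illustration and are not part of Markdown itself.

```perl
use strict;
use warnings;

# Same pattern as above, repeated so this snippet is self-contained.
my $g_nested_brackets;
$g_nested_brackets = qr{
    (?>
        [^\[\]]+
      |
        \[ (??{ $g_nested_brackets }) \]
    )*
}x;

my $anchors = qr{
    (
        \[ ($g_nested_brackets) \]      # link text = $2
        \(
            [ \t]* <?(.*?)>? [ \t]*     # href = $3
            ( (['"]) (.*?) \5 )?        # optional title = $6
        \)
    )
}xs;

# extract_links is a hypothetical helper: it returns one hashref per
# inline link found; title is undef when the link has no title.
sub extract_links {
    my ($markdown) = @_;
    my @links;
    while ( $markdown =~ /$anchors/g ) {
        push @links, { text => $2, href => $3, title => $6 };
    }
    return @links;
}

# Made-up sample input with nested brackets and an optional title.
my $sample = q{See [the [Perl] docs](<https://perldoc.perl.org> "perldoc") and [home](/index.html).};

for my $link ( extract_links($sample) ) {
    printf "text=%s href=%s title=%s\n",
        $link->{text}, $link->{href}, $link->{title} // '(none)';
}
```

Note that the nested [Perl] ends up inside the link text, which is the point of the recursive $g_nested_brackets pattern.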

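I've taken this regex and modernized it a bit, using named captures and a recursive (?&nested_brackets) subpattern in place of (??{ ... }) (both need Perl 5.10 or later), so that it only captures the things we're interested in: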

use warnings;
use strict;

my $anchors = qr{
    (?(DEFINE)
        (?<nested_brackets>
            (?> [^\[\]]+ | \[ (?&nested_brackets) \] )*
        )
    )
    \[ (?<text> (?&nested_brackets) ) \]
    \(
        (?<link>
            [ \t]* <? .*? >? [ \t]*
            (?: (?<titlequote>['"]) .*? \k<titlequote> )?
        )
    \)
}xs;

my $input = <<'END';
blah blah [click me](click me) more stuff blah [link here](link here) blah blah
END

my $expect = <<'END';
blah blah [click me](/click-me) more stuff blah [link here](/link-here) blah blah
END

(my $output = $input) =~ s{$anchors}{
    my ($t, $l) = @+{qw/ text link /};
    $l =~ s/\s+/-/g;
    "[$t](/$l)"
}ge;

use Test::More tests=>1;
is $output, $expect;
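As for the caveat about code blocks, one possible workaround is to split the input on code blocks first and run the substitution only on the parts in between. The following is only a rough sketch under a big assumption: that code blocks are always fenced by lines starting with three backticks. Indented code blocks and inline code spans are not handled. The names slugify_links, slugify_outside_code and $fenced are made up for this example.

```perl
use strict;
use warnings;

# Same pattern as above, repeated so this snippet is self-contained.
my $anchors = qr{
    (?(DEFINE)
        (?<nested_brackets>
            (?> [^\[\]]+ | \[ (?&nested_brackets) \] )*
        )
    )
    \[ (?<text> (?&nested_brackets) ) \]
    \(
        (?<link>
            [ \t]* <? .*? >? [ \t]*
            (?: (?<titlequote>['"]) .*? \k<titlequote> )?
        )
    \)
}xs;

# Hypothetical helper wrapping the substitution shown above.
sub slugify_links {
    my ($chunk) = @_;
    $chunk =~ s{$anchors}{
        my ($t, $l) = @+{qw/ text link /};
        $l =~ s/\s+/-/g;
        "[$t](/$l)"
    }ge;
    return $chunk;
}

# ASSUMPTION: a code block starts and ends with a line that begins with
# three backticks. Anything else (indented code, inline code) is missed.
my $fenced = qr{ ^ `{3} .*? ^ `{3} [^\n]* \n? }xms;

sub slugify_outside_code {
    my ($markdown) = @_;
    # With a capturing group, split keeps the separators: even indices
    # are ordinary text, odd indices are the fenced code blocks.
    my @parts = split /($fenced)/, $markdown;
    for my $i ( 0 .. $#parts ) {
        $parts[$i] = slugify_links( $parts[$i] ) if $i % 2 == 0;
    }
    return join '', @parts;
}

# Made-up sample: the link inside the fenced block should stay untouched.
my $fence  = '`' x 3;
my $sample = "see [click me](click me)\n$fence\n[not a link](leave me)\n$fence\n";
print slugify_outside_code($sample);
```

This still isn't a real parser, so the advice above stands: if the input is anything but simple, a proper Markdown parser is the safer choice.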

In reply to Re: Modifying muliple matched strings in text (updated) by haukex
in thread Modifying muliple matched strings in text by nysus
