Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

It's been way too long since the last time I did this, but I have text that looks like:

*this*

in a string, and I want to change it to:

<em>this</em>

I tried:

$text =~ s/\*(.{1,})\*/<em>$1<\/em>/g;

but the matched text tends to keep going past the second "*". I *think* I need to change the "." into a range of some sort, excluding the * character. I'm not terribly worried about nested *'s, and I'm using MacPerl so I'm pretty much unable to use HTML modules that might do this for me. Anyone have some ideas what I'm doing wrong?

- Brad

Replies are listed 'Best First'.
Re: How to bold text with regexp...
by JayBonci (Curate) on Apr 24, 2002 at 08:43 UTC
    No real need to worry about nested *'s. Remember that the * quantifier is "greedy" in a regular expression: when more than one match is possible, it takes the longest one, so .* runs on to the last star in the string. I used this:

    #!/usr/bin/perl -w
    use strict;

    my $txt = "foo *text* bar *bat* bounce";
    $txt =~ s/\*(.*?)\*/<em>$1<\/em>/g;
    print $txt."\n";

    Broken down:
    s/               substitute
    \*               a star
    (.*?)            then a group of arbitrary characters (as few as possible, see below)
    \*               then another star
    /                replace it with
    <em>$1<\/em>     what we just got, surrounded by <em>'s
    /g;              do this across the entire string


    What .*?\* will do for you is match all characters up to a star, and the ? tells it to stop at the first star it reaches rather than the last. Without the ?, your output would look like

    foo <em>text* bar *bat</em> bounce
    - instead of -
    foo <em>text</em> bar <em>bat</em> bounce

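To see the two behaviors side by side, here is a minimal sketch that runs the greedy and non-greedy patterns over the same sample string from this reply. The variable names $input, $greedy and $lazy are only for illustration.

```perl
#!/usr/bin/perl -w
use strict;

my $input = "foo *text* bar *bat* bounce";

# Greedy: .* grabs as much as it can, so the match runs from the
# first star to the last star in the string.
(my $greedy = $input) =~ s/\*(.*)\*/<em>$1<\/em>/g;

# Non-greedy: .*? grabs as little as it can, so each match stops
# at the nearest closing star.
(my $lazy = $input) =~ s/\*(.*?)\*/<em>$1<\/em>/g;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The greedy line should come out as one long <em> span and the lazy line as two short ones.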
    Good luck with it. Having MacPerl shouldn't make any difference.     --jb
Re: How to bold text with regexp...
by stephen (Priest) on Apr 24, 2002 at 08:40 UTC

    Simplest way is:

    $text =~ s/\*([^*]+)\*/<em>$1<\/em>/g;
    [^*]+ matches one or more characters that are not asterisks, so each match starts at an asterisk and stops at the next one.
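The negated class and the non-greedy .*? agree on ordinary input, but they can differ at the edges. The sketch below compares them on two made-up inputs: an empty pair of stars, and emphasized text that spans a line break. The sub names em_class and em_lazy are hypothetical helpers, not anything from the thread.

```perl
#!/usr/bin/perl -w
use strict;

# Negated character class: one or more non-star characters.
sub em_class {
    my $t = shift;
    $t =~ s/\*([^*]+)\*/<em>$1<\/em>/g;
    return $t;
}

# Non-greedy dot: zero or more characters, as few as possible.
sub em_lazy {
    my $t = shift;
    $t =~ s/\*(.*?)\*/<em>$1<\/em>/g;
    return $t;
}

for my $input ("a ** b", "a *two\nlines* b") {
    print "class: ", em_class($input), "\n";
    print "lazy:  ", em_lazy($input),  "\n\n";
}
```

With these patterns, [^*]+ leaves "**" alone because it needs at least one character, while .*? turns it into an empty <em></em>. In the other direction, [^*] matches a newline but . does not (unless the /s modifier is added), so only the class version handles the two-line case.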

    stephen

      If I understand the question, we should only change asterisks at the beginning and end of a word, rather than all asterisks two at a time... if that's the case, you might want something more like:
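      $text =~ s/\B\*([^*\s]+)\*\B/<em>$1<\/em>/g;
      which will only replace pairs of asterisks that wrap a single word, with no space characters between the asterisks and no letter or digit directly outside them. (Note \B rather than \b: an asterisk is not a word character, so \b\* would only match when a word character comes right before the asterisk.) I'm not sure if that's what the original poster meant, though.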
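How the anchors behave next to an asterisk is easy to get backwards, so here is a small sketch comparing \b and \B around the same pattern. The sub names em_b and em_B and the three sample strings are made up for the comparison. Since * is not a word character, \b\* needs a letter, digit or underscore right before the star, while \B\* needs there not to be one.

```perl
#!/usr/bin/perl -w
use strict;

# \b version: a word character must sit directly outside each star.
sub em_b {
    my $t = shift;
    $t =~ s/\b\*([^*\s]+)\*\b/<em>$1<\/em>/g;
    return $t;
}

# \B version: no word character may sit directly outside each star.
sub em_B {
    my $t = shift;
    $t =~ s/\B\*([^*\s]+)\*\B/<em>$1<\/em>/g;
    return $t;
}

for my $input ("say *this* now", "x*y*z", "2 * 3 * 4") {
    print "input: $input\n";
    print "  \\b:  ", em_b($input), "\n";
    print "  \\B:  ", em_B($input), "\n";
}
```

On these inputs the \B version converts "say *this* now" and leaves the other two alone, while the \b version converts only "x*y*z". Neither touches "2 * 3 * 4", because [^*\s]+ refuses whitespace.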
Re: How to bold text with regexp...
by Molt (Chaplain) on Apr 24, 2002 at 08:46 UTC

    Hi, in the absence of modules the following regexp seems to work. It will need changing if you're going to somehow allow literal *'s in the text, but I'd need to know how you intend to do this really before being able to do it.

    Just for further information, the secret is in the ?, which forces minimal, i.e. 'shortest possible', matching.

    #!/usr/bin/perl -w
    use strict;

    my $text = "This is in *Bold* and so is *This*!";
    $text =~ s/\*(.*?)\*/<em>$1<\/em>/g;
    print $text;
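On the point about literal *'s: the thread never says how they would be written, so the following is only a sketch under an invented convention, namely that a backslash before a star (\*) means a literal star. The sub name emphasize is hypothetical. It handles both cases in one left-to-right pass, so an escaped star is consumed before it can be mistaken for the start of a pair.

```perl
#!/usr/bin/perl -w
use strict;

# ASSUMPTION: \* marks a literal star. This convention is made up
# for illustration; the original question does not define one.
sub emphasize {
    my $t = shift;
    # Alternative 1: an escaped star; keep the star, drop the backslash.
    # Alternative 2: a *pair* with no stars or backslashes inside.
    $t =~ s{\\(\*)|\*([^*\\]+)\*}{ defined $1 ? $1 : "<em>$2</em>" }ge;
    return $t;
}

print emphasize('5 \* 3 is *fifteen*'), "\n";
```

This does not try to handle an escaped star inside an emphasized span; that would need a more permissive inner pattern.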
Re: How to bold text with regexp...
by Anonymous Monk on Apr 24, 2002 at 18:33 UTC
    Wow, thanks! All these suggestions are great, and were a big help. - Brad