in reply to Get chars between 2 markers using regular expressions

Update: Fixed a couple of problems as pointed out by sauoq and Random_Walk. Thanks guys :)

perlre is your friend here :)

For your first marker, you probably want to use a character class. So you might have something like: He[012].

For the middle part of your expression, it depends what you expect it to contain. If you're confident that it will only be alphanumeric characters and whitespace, then you could use ([\w\s]+)

\w denotes a word character (alphanumeric or underscore), and \s denotes any whitespace character. These are combined into a character class by the square brackets "[]", and the + quantifier means "match one or more". The whole lot is wrapped in parentheses because you want to "capture" the string.

The end part of the expression is easy, as you said it will always end with "~~".

So, putting it all together you get (untested):

m/He[012]([\w\s]+)~~/

The captured string will be available in the $1 variable afterwards.
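
As a minimal sketch of how that might be used (the sample line and the names $line and $captured are made up for illustration, not taken from the original question), it is worth checking that the match succeeded before reading $1, since $1 keeps its old value after a failed match:

```perl
use strict;
use warnings;

# Hypothetical sample data: marker "He1", some text, then the "~~" terminator.
my $line = 'He1 some text 42~~';

my $captured;
if ($line =~ m/He[012]([\w\s]+)~~/) {
    # $1 holds whatever the parenthesised group matched.
    $captured = $1;
    print "Captured: '$captured'\n";
}
else {
    print "No match\n";
}
```

Note that any whitespace directly after the first marker ends up in the capture, because \s is part of the character class.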

One point to note: if you have several such strings in a single line of data, then only the first match will be returned. You could capture all matches into a list by using the 'g' modifier, like so:

my @strings = m/He[012]([\w\s]+)~~/g;
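
A small sketch of the list-context form, assuming the data is held in a variable (here the hypothetical $data, with made-up contents) rather than in $_ as in the one-liner above:

```perl
use strict;
use warnings;

# Hypothetical line with two valid markers (He0, He2) and one that should not match (He9).
my $data = 'He0first~~ junk He2second item~~ He9skipped~~';

# In list context with /g, the match returns every captured group in order.
my @strings = $data =~ m/He[012]([\w\s]+)~~/g;

print "$_\n" for @strings;
```

The He9 entry is skipped because 9 is not in the [012] class.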

Hope this helps,
Darren :)

Replies are listed 'Best First'.
Re^2: Get chars between 2 markers using regular expressions
by sauoq (Abbot) on Dec 06, 2005 at 13:09 UTC

    Very good answer overall, but there are a couple nits.

    Firstly, you keep using a slash where you need a backslash; it isn't /w and /s but \w and \s.

    Secondly, you made the statement:

    If you have several such strings in a single line of data, then only the last match will be returned.
    That's incorrect in that it will be the first match returned, not the last one.

    -sauoq
    "My two cents aren't worth a dime.";
    
Re^2: Get chars between 2 markers using regular expressions
by Random_Walk (Prior) on Dec 06, 2005 at 13:08 UTC

    McDarren, try tipping your / slashes to \ slashes in the character class [\w\s]

    Cheers,
    R.

    Pereant, qui ante nos nostra dixerunt!