anelson has asked for the wisdom of the Perl Monks concerning the following question:

Greetings monks,

I need to take a scalar (let's call it $line for fun) . . . so
my $line = "foo blah ditty doo <SOME_THING BLAH=some_other_thing> wooba blah blah etc";
I need to capitalize "some_other_thing" in $line, without hosing up any of the rest of it. I'm fairly sure there is an easy way to do this with a regexp, but I can't for the life of me figure it out. Any help appreciated. Yours, -a

Replies are listed 'Best First'.
Re: Regular expression needed (maybe)
by Paladin (Vicar) on Jan 15, 2003 at 04:53 UTC
    Something like this perhaps?
    $line =~ s/BLAH=(\w+)/BLAH=\U$1/;
    Modify BLAH= and \w+ to your needs.
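
    As a minimal sketch of how that substitution behaves on the sample string from the question: the sub name upcase_blah is made up for illustration, and the explicit \E is an addition that marks where the upper-casing stops (without it, \U runs to the end of the replacement, which gives the same result here).

```perl
use strict;
use warnings;

# Hypothetical helper: upper-case the value that follows "BLAH=".
# \U upper-cases what follows it in the replacement; \E ends that effect.
sub upcase_blah {
    my ($line) = @_;
    $line =~ s/BLAH=(\w+)/BLAH=\U$1\E/;
    return $line;
}

my $line = "foo blah ditty doo <SOME_THING BLAH=some_other_thing> wooba blah blah etc";
print upcase_blah($line), "\n";
# prints: foo blah ditty doo <SOME_THING BLAH=SOME_OTHER_THING> wooba blah blah etc
```

    Only the first BLAH= in the string is touched, since there is no 'g' modifier.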
      Perfect! Thanks

      -a
Re: Regular expression needed (maybe)
by mojotoad (Monsignor) on Jan 15, 2003 at 05:45 UTC
    Paladin's answer is adequate, but somewhat tightly bound to the problem domain. Here's a more generic one. Note the 'g' modifier will make this one perform the transformation on every 'tag' in your string. If there is never more than one 'tag' then you can drop the 'g'. Assumes no newlines:

    $line =~ s/<([^=]+)=([^>]+)>/<$1=\U$2>/g;

    Both solutions provided thus far are still vulnerable to spurious '<' or '>' characters appearing outside of the tag construct in your string.
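
    A rough sketch of both points, using the generic pattern above wrapped in a made-up sub (upcase_tag_values is a hypothetical name, and the input strings are invented for illustration). The first call shows the 'g' modifier handling two 'tags'; the second shows stray '<' and '>' characters being mistaken for a tag.

```perl
use strict;
use warnings;

# Hypothetical helper around the generic pattern: upper-case everything
# between "=" and ">" in each <name=value> construct.
sub upcase_tag_values {
    my ($line) = @_;
    $line =~ s/<([^=]+)=([^>]+)>/<$1=\U$2\E>/g;
    return $line;
}

# Two well-formed 'tags': both values are upper-cased thanks to /g.
print upcase_tag_values("<A X=one> mid <B Y=two>"), "\n";
# prints: <A X=ONE> mid <B Y=TWO>

# No real tag at all, but the stray < ... = ... > still matches.
print upcase_tag_values("a < b, x=1 and y > z"), "\n";
# prints: a < b, x=1 AND Y > z
```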

    Matt

Re: Regular expression needed (maybe)
by Aristotle (Chancellor) on Jan 15, 2003 at 08:05 UTC
    A warning: if you're trying to create a semi general purpose solution for HTML or XML files, you should use a proper parser module rather than just a regex. Pattern matching against marked up text is a very fragile approach. For HTML, see HTML::TokeParser::Simple - there's even a bunch of useful examples in the documentation, and chances are you can just lift one of them and modify it to suit your needs.

    Makeshifts last the longest.

      A warning: if you're trying to create a smei general purpose solution for HTML or XML files, you should use a proper parser module rather than just a regex.

      No offense, but you are beginning to sound like a broken record about this. Instead of repeating the same thing over and over again, why don't you just provide a link to one of your other posts about the subject?

      (By the way, I disagree with you. Most of the time, you would want to use a module, but what he is trying to do is simple enough for a regex to handle.)

        And, brother, I disagree with your disagreement. A module represents a black box that has been extensively tested. Equally important, it contains within it extensive error-checking and error-handling.

        The latter is crucial to the success of any serious development because a mis-typed character will stymie a developer for hours. Those are wasted hours. Wasted hours are wasted dollars.

        A lot of parsing is simple enough for a regex to handle. In fact, regexes are mini-parsers. But, once you start dealing with parsing things that have to balance, that's not simple at all. Much better to leave that kind of work to the experts who are kind enough to give me stuff that works free of charge.
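
        To illustrate the balancing problem, here is a small sketch (the sub name first_group and the sample inputs are invented for this example). A simple pattern that grabs "everything up to the next closing paren" works on flat input but cuts a nested group short:

```perl
use strict;
use warnings;

# Hypothetical helper: return the text inside the first (...) group,
# using a naive pattern that knows nothing about nesting.
sub first_group {
    my ($text) = @_;
    return $text =~ /\(([^)]*)\)/ ? $1 : undef;
}

print first_group("f(abc)"), "\n";
# prints: abc

# With nesting, the match stops at the first ")" it sees.
print first_group("f(a(b)c)"), "\n";
# prints: a(b
```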

        Be Lazy - let other people do the work for you.

        ------
        We are the carpenters and bricklayers of the Information Age.

        Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

        That's why I added my disclaimer "if you're trying to create a semi general purpose solution" (sorry about the typo). Very simple things can be done with a regex. Also if he knows exactly what his data looks like and this is a one-off job a regex is quite likely to suffice. I assert that unless you have some experience to make a good call it's safer to err on the side of using a parser for X?(HT)?ML where a regex might have sufficed, though.

        But in this case I have no idea if he is even parsing markup at all or just something that happens to look like it. So instead of giving a possibly ill-advised suggestion I chose to just raise awareness about the issue and leave it at that. Sorry to sound like a broken record, but that's because I'm responding to the broken record of a question that keeps coming up.

        Makeshifts last the longest.