Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have just started to learn Perl and I'm working on some string manipulation and pattern matching.
I would like to know how to return the word that follows a word I specify.
For example, in the following string, I would like to return the word 'red' when I search for 'color: '.

| shape: square | color: red | size: 320x320 | id: 0001

I am reading a file into an array and looking at each line. The lines differ: a line could be plain text or just whitespace (the line I actually want is in the footer of the first page). Here is the code I have so far...

use strict;
my @file = <STDIN>;
my $word = "color: ";
for my $lineno ( 0 .. $#file ) {
    if ($file[$lineno] =~ /\b$word\b/) {
    }
}

Thank you for your help

Re: return a word next to the word you give
by liz (Monsignor) on Oct 16, 2003 at 12:19 UTC
    my $word = "color: ";
    my $lineno = 0;
    while (<STDIN>) {
        $lineno++;
        if (m#$word(\w+)#) {
            # $1 should now contain the color
        }
    }

    There's no need to slurp the whole file into memory if you are sure that there will never be a linebreak between the color: and the actual color name.

    Also, if you're sure that $word will never change its value, you should postfix the regexp with "o" for efficiency reasons:

    if (m#$word(\w+)#o) {

    but only if you're 100% sure that the contents of $word will not change!

    Hope this helps.

    Liz

      What's /o going to win you here? As running it with -Dr will show, Perl is not going to recompile the regex anyway. And what's with the $lineno++? Did $. break? And since the regex doesn't contain any /'s, why m## and not just //?

      Abigail

        What's /o going to win you here?

        I guess I could ask the same question about the use of quotemeta() in Re: return a word next to the word you give. What's that going to win you there? It's clear there are no characters with a special meaning in "color: ", are there? But you're trying to teach a newbie a good idiom, even though it doesn't make any sense, performance-wise or otherwise, in this particular case.

        The reason I mentioned /o is that compiling the same regexp again and again when it is not necessary can be a severe performance penalty. So I too am trying to teach something to a newbie.

        As running it with -Dr will show, Perl is not going to recompile the regex anyway.

        Has Perl suddenly become psychic? Does Perl now know when and when not to recompile regular expressions with embedded variables? Then why isn't /o deprecated? Just for those few cases where you want to be able to change the variable and still keep the initial regex?
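For what it's worth, perlop describes both behaviours being argued about here: as far as I know, a pattern with an interpolated variable is only recompiled when the interpolated string actually changes, while /o tells Perl to keep the first compiled pattern for good. A minimal sketch of the second point, using the line from the original question (the sub names match_plain and match_once are made up for this illustration):

```perl
use strict;
use warnings;

# Hypothetical helpers: the same match, without and with /o.
sub match_plain { my ($word, $text) = @_; return $text =~ /$word(\w+)/  ? $1 : undef }
sub match_once  { my ($word, $text) = @_; return $text =~ /$word(\w+)/o ? $1 : undef }

my $line = "| shape: square | color: red | size: 320x320 | id: 0001";

my (@plain, @once);
for my $word ("color: ", "shape: ") {
    push @plain, match_plain($word, $line);   # pattern follows $word
    push @once,  match_once($word, $line);    # pattern frozen at first use
}

print "plain: @plain\n";   # plain: red square
print "once:  @once\n";    # once:  red red
```

The second call to match_once still searches for "color: ", which is why /o is only safe when $word really never changes.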

        And what's with the $lineno++? Did $. break?

        Good point. That could have been $..
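A sketch of the same loop with $. in place of the hand-rolled counter. The sample text and the in-memory filehandle are assumptions made only so that the snippet runs on its own; with real input you would read from STDIN as before.

```perl
use strict;
use warnings;

# Assumed sample input: two filler lines, then the footer line.
my $text = "some header text\n\n| shape: square | color: red | size: 320x320 | id: 0001\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my $word = "color: ";
my @found;
while (<$fh>) {
    # $. is the line number of the filehandle that was last read
    push @found, [ $., $1 ] if m#$word(\w+)#;
}
close $fh;

print "line $_->[0]: $_->[1]\n" for @found;   # line 3: red
```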

        And since the regex doesn't contain any /'s, why m## and not just //?

        Well, in my experience, I find myself trying to match "/" much more often than I find myself trying to match "#". And I like to be as consistent as I can, so I prefer using m## almost all of the time over using // most of the time.

        Furthermore, I prefer a visual feedback about whether I'm trying to match or replace a string, so even with // I personally prefer to write m//.

        Even though this slipped in without much thought from my end (as it sort of is engrained in my fingers nowadays), it also teaches the newbie that there is more than one way to do it.

        Liz

Re: return a word next to the word you give
by Abigail-II (Bishop) on Oct 16, 2003 at 12:17 UTC
    No need to slurp it all into memory. Use something like (untested):
    use strict;
    use warnings;

    my $word = quotemeta "color: ";
    while (<>) {
        print "Found '$1'\n" while /$word(\w+)/g;   # Find all.
    }
    __END__

    Abigail
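A small sketch of why quotemeta can matter once the search word is not as tame as "color: ". The label "a.b: " and the sample line are invented for this illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical line in which the label itself contains a regex metacharacter.
my $line   = "| axb: wrong | a.b: right |";
my $raw    = "a.b: ";
my $quoted = quotemeta $raw;   # backslash-escapes the '.', ':' and space

# Unquoted, '.' matches any character, so "axb: " matches first.
my ($unsafe) = $line =~ /$raw(\w+)/;
# Quoted, only the literal label "a.b: " matches.
my ($safe)   = $line =~ /$quoted(\w+)/;

print "unquoted: $unsafe\n";   # unquoted: wrong
print "quoted:   $safe\n";     # quoted:   right
```

Writing /\Q$raw\E(\w+)/ inside the pattern has the same effect as calling quotemeta beforehand.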

Re: return a word next to the word you give
by Roger (Parson) on Oct 16, 2003 at 12:26 UTC
    You could use the built-in capture variable $1 to extract the word next to the one you are after -

    use strict;
    my @file = <STDIN>;
    my $word = "color:";   # no space after colon
    for my $lineno ( 0 .. $#file ) {
        if ($file[$lineno] =~ /\b$word\s*(\w+)/) {   # notice the capture brackets
            my $value = $1;   # $value here is 'red'.
            ...
        }
    }

    Another variant: if there may be several color attributes on the line and you want to capture all of them in an array, match with /g in list context.

    my @colors = $file[$lineno] =~ /\b$word\s*(\w+)/g;
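
A self-contained sketch of that list-context match. The line with two color attributes is made up for this example:

```perl
use strict;
use warnings;

# Hypothetical line with two color attributes.
my $line = "| shape: square | color: red | color: blue | id: 0001";
my $word = "color:";

# In list context, /g returns every capture from every match.
my @colors = $line =~ /\b$word\s*(\w+)/g;

print "@colors\n";   # red blue
```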
Re: return a word next to the word you give
by greenFox (Vicar) on Oct 16, 2003 at 15:14 UTC
    Others have answered the question you asked, but I am going to propose a different solution, which may be overkill for what you are doing (i.e. if you only ever want the color, or if you can't guarantee the format of the line):

    use strict;
    while (<FILE>) {
        chomp;
        my %line = map { split /\s*:\s*/ } split /\s*\|\s*/;
        print "color = $line{color}\n";
        #or
        for my $key (keys %line) {
            print "$key = $line{$key}\n";
        }
    }

    The line my %line = map { split /\s*:\s*/ } split /\s*\|\s*/; does all the work. Reading from the far right, we first split the line into chunks, using the pipe (plus optional whitespace) as a separator. map then passes the chunks one at a time to another split, which breaks each chunk into a pair at the colon separator (again with optional whitespace). The result is a list like (shape,square,color,red,size,320x320,id,0001), which we use to populate the hash. I don't know your experience as a programmer, so this may seem very complicated. That's OK: read the man pages, look at other people's code, and don't be afraid to experiment.
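
To see the intermediate list for yourself, here is a stand-alone sketch run on the sample line from the question. The @pairs array is an extra variable added here only to make the flattened list visible:

```perl
use strict;
use warnings;

my $line = "| shape: square | color: red | size: 320x320 | id: 0001";

# Split on pipes first, then split each chunk at its colon.
# The empty chunk before the leading pipe splits into nothing.
my @pairs = map { split /\s*:\s*/ } split /\s*\|\s*/, $line;
my %line  = @pairs;

print join(",", @pairs), "\n";    # shape,square,color,red,size,320x320,id,0001
print "color = $line{color}\n";   # color = red
```

A line of plain text or whitespace will not produce a color key, so in a real loop you would want to check exists $line{color} before printing.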

    --
    Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho