Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

hello
I have the following data:
TEST:foo:bar:123:A:B
I need to extract the word foo.
I could do this:
my ($test, $foo, $bar, $num, $A, $B) = split /:/, $_;
print $foo;
but there must be an easier way.

Replies are listed 'Best First'.
Re: regex needed
by pfaut (Priest) on Mar 14, 2003 at 14:17 UTC

    You have delimited data and want to extract one component. split is usually the correct way to deal with delimited data.

    You don't like entering all those extra variables when you only want one of the values? Then assign to an array and extract the component you want from the array.

    my @array = split /:/;   # the $_ is implied
    my $foo = $array[1];

    You can make this shorter by eliminating the temporary array.

    my $foo = (split /:/)[1];
    --- print map { my ($m)=1<<hex($_)&11?' ':''; $m.=substr('AHJPacehklnorstu',hex($_),1) } split //,'2fde0abe76c36c914586c';
Re: regex needed
by broquaint (Abbot) on Mar 14, 2003 at 14:18 UTC
    Here's a few options
    my $str = 'TEST:foo:bar:123:A:B';
    print 'split: ', (split ':', $str)[1], $/;
    print 'substr: ', substr($str, 5, 3), $/;
    print 'regex: ', $str =~ /:([a-z]+)/, $/;
    print 'grep: ', grep(/^f/, split /:/, $str), $/;
    __output__
    split: foo
    substr: foo
    regex: foo
    grep: foo

    HTH

    _________
    broquaint

Re: regex needed
by Chady (Priest) on Mar 14, 2003 at 14:18 UTC
    is foo always the second item?
    print (split /:/)[1];

    Update: I just noticed that this does not actually run; it gives a syntax error. This, however, works:

    print '', (split /:/)[1];
    Can anyone explain this a bit? Does it have anything to do with list vs. scalar context?
    He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

    Chady | http://chady.net/

      Operator precedence and print's hunger for a filehandle get in the way. perl parses that as

      (print (split /:/))[1];

      And therefore thinks that (split /:/) should be a filehandle (not sure about that) and then you're taking a slice of the whole mess (which definitely won't fly). Extra parentheses are all you need.

      print( (split /:/)[1] );

      print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'
      Another way to keep the parentheses from being grabbed by print is to add a + in front of them, like this:
      print +(split /:/)[1];
      perlfunc explains why this happens: Perl built-in functions can be called with or without parentheses, so when perl sees parentheses there, it associates them with the function call, not with the creation of a list.
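      A minimal sketch putting the working forms from this subthread side by side, assuming $_ holds the sample line from the original question. The variable $field is only an illustrative name.

      ```perl
      use strict;
      use warnings;

      $_ = 'TEST:foo:bar:123:A:B';

      # The outer parentheses belong to print; the inner ones build the list to slice.
      print( (split /:/)[1] );
      print "\n";

      # The unary plus keeps print from taking the parentheses as its argument list.
      print +(split /:/)[1];
      print "\n";

      # Assigning the slice first avoids the parsing question altogether.
      my $field = (split /:/)[1];
      print "$field\n";
      ```

      Each of the three prints the second field on its own line.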

      -- Mike

      --
      just,my${.02}

      I believe that the
      print (split /:/)[1];
      is not working because perl first evaluates print (split /:/), which returns a scalar value, in this case "1" for a successful print. So it is trying to take a list element of a scalar, which does not work. It is the same as trying 1[0];
Re: regex needed
by bart (Canon) on Mar 14, 2003 at 18:46 UTC
    There's no need to split into so many parts. The following will reduce the amount of work actually done:
    (undef, my $foo) = split /:/;
    as it will split into 3 parts, throwing away the first (leading) and third (rest) parts. Compare:
    $foo = (split/:/)[1];
    will split into as many parts as there are columns. So will
    @data = split/:/;
    $foo = $data[1];

    Do you insist on a regex? I don't know if anybody gave you one already, as everybody seems to be focussing on split... Here is one:

    ($foo) = /:(.*?):/;
    Note that your requirement makes this easy... making me feel like this is homework. If you wanted another column, like the fourth one, the regex would look really messy.
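    To illustrate the point about later columns, here is a rough sketch of one possible regex for the fourth field next to the equivalent split slice. The pattern and the variable names are just one way to write it, not something from the original question.

    ```perl
    use strict;
    use warnings;

    my $str = 'TEST:foo:bar:123:A:B';

    # Skip three "field plus colon" groups, then capture the next field.
    my ($by_regex) = $str =~ /^(?:[^:]*:){3}([^:]*)/;

    # The split version only needs a different subscript.
    my $by_split = (split /:/, $str)[3];

    print "regex: $by_regex\n";
    print "split: $by_split\n";
    ```

    Changing the column means editing the repeat count in the regex but only the index in the split version.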
      it will split into 3 parts

      I'm not sure there's any guarantee perl will do that optimization for you. If you want three parts, just tell it so:

      $foo = (split/:/,$_,3)[1];
      Update: You seem to be right about the optimization. The following benchmark shows nearly identical times:
      cchan.acsys.com.1.256% cat aaa3.pl
      use Benchmark;
      my $count = 1000000;
      our $str = "a:b:c:d:e:ff:g:h:i:j:k:l:m:n:o:p:q:r:s:t:u:v:w:x:y:z";
      timethese($count, {
          'slice' => sub { (undef, my $foo) = split /:/,$str; },
          'count' => sub { my $foo = (split/:/,$str,3)[1]; },
      });
      Benchmark: timing 1000000 iterations of count, slice...
      count: 2 wallclock secs ( 3.18 usr + 0.01 sys = 3.19 CPU) @ 313479.62/s (n=1000000)
      slice: 3 wallclock secs ( 3.04 usr + 0.06 sys = 3.10 CPU) @ 322580.65/s (n=1000000)
      However, giving the split limit explicitly is clearer IMHO.
        I'm not sure there's any guarantee perl will do that optimization for you.
        Yes, it's guaranteed. I'll just have to find the proper documentation for you... Ah, here it is, in perldoc -f split:
        When assigning to a list, if LIMIT is omitted, Perl supplies a LIMIT one larger than the number of variables in the list, to avoid unnecessary work.
        Therefore, with two scalars on the left-hand side (one variable and one undef), Perl will split into 3 parts.
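        A small sketch of what a three-part split looks like, using the sample line from the original question. The implicit limit cannot be seen directly from the assigned values, so the first statement passes LIMIT explicitly to show the shape of the result; @parts is just an illustrative name.

        ```perl
        use strict;
        use warnings;

        my $str = 'TEST:foo:bar:123:A:B';

        # Explicit LIMIT of 3: the third element keeps the unsplit remainder.
        my @parts = split /:/, $str, 3;
        print join(' | ', @parts), "\n";

        # Two targets on the left-hand side: per the perldoc passage quoted above,
        # split is given a limit of 3 here as well.
        (undef, my $foo) = split /:/, $str;
        print "$foo\n";
        ```

        The first print shows the remainder "bar:123:A:B" left in one piece, which is the work the limit saves.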
Re: regex needed
by CukiMnstr (Deacon) on Mar 14, 2003 at 14:27 UTC
    You don't need a regex for that... if you only want the second field in the string,

    my $foo = (split ':', $string)[1];
    will do the trick.
    split() returns a list, so you can explicitly set a list context and only keep the element you want with a subscript.
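    As a sketch of the same subscript idea, a list slice can also take several indices at once. The sample string is the one from the original question, and $num is only an illustrative name.

    ```perl
    use strict;
    use warnings;

    my $string = 'TEST:foo:bar:123:A:B';

    # One subscript keeps a single element of the list split returns.
    my $foo = (split ':', $string)[1];

    # Several subscripts keep several elements, in the order given.
    my ($second, $num) = (split ':', $string)[1, 3];

    print "$foo $second $num\n";
    ```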

    hope this helps,

Re: regex needed
by robartes (Priest) on Mar 14, 2003 at 14:21 UTC
    There are many, many ways to do this, but here's two of them:
    my $foo=(split /:/, $_)[1];
    (my $foo)=/[^:]*:([^:]*):[^:]*/;
    Undoubtedly, by the time I submit this, there will be countless others posted as well.

    CU
    Robartes-

Re: regex needed
by OM_Zen (Scribe) on Mar 14, 2003 at 16:19 UTC
    Hi,

    This is one way to do this:

    if ($_ =~ /^TEST:([^:]*)/) { print $1; }
    __END__
    foo