Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I need a regular expression that will split a single row into 5 columns, e.g.:

0x4300004 Universe Collects 0x4300021 1.2.13.44

Above is my single line, and I need to reference each column by number from an array.

Any suggestions?

Replies are listed 'Best First'.
Re: RegEx needed
by bart (Canon) on Nov 18, 2002 at 23:12 UTC
    split. As in
    $_ = "0x4300004 Universe Collects\t0x4300021 1.2.13.44";
    @data = split;
    The effect is the same as
    @data = /(\S+)/g;
    Do note the tab in there, and the double space. It works for any other whitespace sequence.
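
    The equivalence described above can be checked with a short script. This is only a sketch: the sample line is the one from the question with a tab and a double space added, and the names $line, @by_split and @by_match are made up for illustration.

```perl
use strict;
use warnings;

# Sample line from the question, with a double space and a tab added
my $line = "0x4300004  Universe Collects\t0x4300021 1.2.13.44";

# split ' ' breaks the string on runs of whitespace (same as a bare split on $_)
my @by_split = split ' ', $line;

# A global match in list context returns every run of non-whitespace
my @by_match = $line =~ /(\S+)/g;

# Columns are then addressed by zero-based index
print "column $_: $by_split[$_]\n" for 0 .. $#by_split;
print "both approaches agree\n" if "@by_split" eq "@by_match";
```

    Either array then gives access by number, e.g. $by_split[3] for the fourth column.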
Re: RegEx needed
by nothingmuch (Priest) on Nov 18, 2002 at 22:59 UTC
    @array = /(\S+)/g

    Should do it, but there are much prettier examples of column extractions in perldoc perlpacktut.

    -nuffin
    zz zZ Z Z #!perl
Re: RegEx needed
by cLive ;-) (Prior) on Nov 19, 2002 at 06:36 UTC
    I can't believe the last 4 answers were all wrong, yet all agreed with each other! Your second term contains a space, so forget split. Let's assume your second term may or may not contain a space (you don't specify - a larger data set than one line might help here!), and that you have the two hex values (going by the 0x prefix) and two floats (but let's assume they may be integers) in the relevant places. It might look something like this:
    $line =~ /(0x[0-9a-fA-F]{7})  # hex
              \s+                 # space of some kind
              (.*)                # anything (greedy)
              \s+                 # space of some kind
              (0x[0-9a-fA-F]{7})  # hex
              \s+                 # space of some kind
              (\d+\.?\d?\.)       # float, followed by a period
              \s+                 # space of some kind
              (\d+\.?\d?)         # float
             /x;                  # allow these comments
    @array[0..4] = ($1,$2,$3,$4,$5);
    1) yes, the float match is rough. 2) I haven't run this - I'm not on my box. I can't remember array slice assignment off hand.

    But I'm sure you get the idea.
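
    A variant that does match the sample line, offered as a sketch only. It assumes the two 0x fields are hex, that the name between them may contain spaces, and that the trailing 1.2.13.44 is a single dotted token rather than two floats. The names $line and @fields are illustrative.

```perl
use strict;
use warnings;

my $line = "0x4300004 Universe Collects 0x4300021 1.2.13.44";

# In list context a successful match returns its captures directly
my @fields = $line =~ m{
    ^\s*
    (0x[0-9a-fA-F]+)   # first hex value
    \s+
    (.+?)              # name, possibly containing spaces (non-greedy)
    \s+
    (0x[0-9a-fA-F]+)   # second hex value
    \s+
    (\d+(?:\.\d+)*)    # dotted number such as 1.2.13.44
    \s*$
}x;

if (@fields) {
    print "$_: $fields[$_]\n" for 0 .. $#fields;
} else {
    warn "line did not match\n";
}
```

    Under these assumptions the line yields four fields, with "Universe Collects" kept together as one of them.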

    .02

    cLive ;-)

Re: RegEx needed
by Chief of Chaos (Friar) on Nov 19, 2002 at 08:46 UTC
    Hi,
    my suggestion on your problem :
    #!/usr/local/bin/perl -w
    use strict;

    my @dat;
    while (<DATA>) {
        chomp;
        if (/(0x[a-zA-Z0-9]+)\s
             ([a-zA-Z0-9 ]+)\s
             (0x[a-zA-Z0-9]+)\s
             ([0-9\.]+)/x) {
            print "1. $1 \n";
            print "2. $2 \n";
            print "3. $3 \n";
            print "4. $4 \n";
            @dat = ($1,$2,$3,$4);
        }
    }
    __DATA__
    0x4300004 Universe Collects 0x4300021 1.2.13.44
Re: RegEx needed
by Anonymous Monk on Nov 18, 2002 at 23:17 UTC
    Thanks for the help guys!
Re: RegEx needed
by Anonymous Monk on Nov 18, 2002 at 22:58 UTC
    This is what I'm currently trying to do
    foreach (@data) { ($1, $2, $3, $4, $5) = m/($_\s)/g; print $2; }
      You shouldn't assign to the numbered variables. It won't work and, if I remember correctly, will even raise an error. The regex engine sets them itself after a successful match; you only get the captures back as a list when you use m// in list context.

      You can do something like
      m/(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/;

      That is closer to what you tried, and $1 .. $5 will have the correct values in them. But it is redundant.

      You also placed $_ inside the parens for some odd reason. That attempts to find the string within itself. I think you meant

      $_ =~ m//g
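
      To make the points above concrete, here is a small sketch. The variable names ($line, $c1 .. $c5, $ok, $err) are made up for illustration, and eval is used only to trap the error so the script keeps running.

```perl
use strict;
use warnings;

my $line = "0x4300004 Universe Collects 0x4300021 1.2.13.44";

# Assigning to a capture variable fails at run time; eval traps the error
my $ok  = eval { $1 = 'x'; 1 };
my $err = $@;
print "assignment to \$1 failed: $err" unless $ok;

# In list context the match returns its captures, so they can be
# assigned to ordinary variables instead
my ($c1, $c2, $c3, $c4, $c5) =
    $line =~ /(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/;

print "second column: $c2\n" if defined $c2;
```

      If the match fails, the list is empty and all five variables stay undefined, which is why the last line checks before printing.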

      Anyway, best of luck.

      -nuffin
      zz zZ Z Z #!perl