Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, All,

I almost always use the split function to parse a line of text and then define a variable based on the split result. Here's an example:-

@line_as_list = split (/\s+/,$rptfile[$ln]);
$width = $line_as_list[3];

@rptfile is the file I have slurped in, and $ln is the current line number in that file.

For the type of parsing I typically do, the above procedure is usually fine. Obviously, though, this is more difficult to manage when the text I need to locate is not in a consistent location.

What is the easiest way to define a variable after a keyword is found? Here's an example string:-

Startpoint: flop_a (clocked by t_clk)

I want to quickly assign flop_a to a variable and t_clk to another variable, but sometimes there are extra words after "flop_a" and before "clocked", so splitting by \s+ is too unpredictable. The variables I need will always follow "Startpoint:" and "clocked by", though.

My line names are always $rptfile[$ln], where $ln is the current line number I am working on (I often have to bounce back to previous line numbers).

Thank You!

Re: Parsing text without split
by Roy Johnson (Monsignor) on May 27, 2005 at 21:30 UTC
    Extract from a regex match: ($flop_a, $t_clk) = $rptfile[$ln] =~ /Startpoint: (\w+).*?clocked by (\w+)/;

    Caution: Contents may have been coded under pressure.
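
A minimal runnable sketch of the list-context match above, assuming a small hypothetical @rptfile array in place of the slurped report. The names $startpoint, $clock and @found are illustrative and not from the original post. In list context a failed match returns an empty list, so the assignment itself can serve as the test.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slurped report file.
my @rptfile = (
    'Startpoint: flop_a (clocked by t_clk)',
    'Startpoint: flop_b (rising edge-triggered flip-flop clocked by sys_clk)',
    'Endpoint: flop_c',
);

my @found;
for my $ln (0 .. $#rptfile) {
    # In list context the match returns the captured groups,
    # or an empty list when the line does not match.
    if (my ($startpoint, $clock) =
            $rptfile[$ln] =~ /Startpoint:\s+(\w+).*?clocked by\s+(\w+)/) {
        # Keep the line number so earlier lines can be revisited later.
        push @found, [$ln, $startpoint, $clock];
        print "line $ln: startpoint=$startpoint clock=$clock\n";
    }
}
```

Because (\w+) stops at the first non-word character, any extra words between the name and "clocked by" are skipped by the lazy .*? rather than captured.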

      That has the same problem as split... It will fail for variables with spaces in them.

      If you can rely on the parentheses, it would work better as:

      @line_as_list = $rptfile[$ln] =~ /Startpoint: ([^(]+).*?clocked by ([^ +)]+)\)/;

      Update: Doh! Never mind. I misread the OP's specs.

        I took his meaning to be that the "extra words" were not part of the desired capture. You can capture multiple words with spaces by changing (\w+) to ([\w\s]+).

        Caution: Contents may have been coded under pressure.
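
A rough sketch of the ([\w\s]+) variant, for the case where the names themselves may contain spaces. The helper name parse_startpoint and the sample lines are hypothetical. Since [\w\s]+ is greedy it also picks up trailing whitespace, so this sketch trims the captures afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (startpoint, clock), or an empty list
# when the line does not match.
sub parse_startpoint {
    my ($line) = @_;
    my ($startpoint, $clock) =
        $line =~ /Startpoint:\s+([\w\s]+).*?clocked by\s+([\w\s]+)/
        or return;
    # The character class also matches whitespace, so strip any
    # leading or trailing spaces from both captures.
    s/^\s+|\s+$//g for $startpoint, $clock;
    return ($startpoint, $clock);
}

print join('|', parse_startpoint('Startpoint: flop a (clocked by t clk)')), "\n";
print join('|', parse_startpoint('Startpoint: flop_a (clocked by t_clk)')), "\n";
```

The trade-off is that any extra words sitting between the name and the opening parenthesis are captured as part of the name, so this form only fits when those words are wanted.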