Re: Problem matching a backslash in a regular expression

Re^2: Problem matching a backslash in a regular expression by graff (Chancellor) on Sep 17, 2005 at 00:13 UTC
Note that you can re-edit the text of anything you post here to update it. For example, you can still add "<code>" and "</code>" around the snippet of perl code in the node that I'm replying to here. (And people like to be told when a node has been updated, so include a little extra comment to say you added code tags.)

So, if the data you need to put together is coming in on multiple lines, there are a few different ways to deal with this, and the method you'll like best will depend on other factors, like: how much other data is there in the input file? how much of that other data are you using? do you need to pull lots of occurrences of the target info from this one file? do you have a lot of files you need to process this way? do the patterns of information vary from one instance to the next? ... and so on.

For a start, let's suppose the input consistently has this sort of layout:

set_input_clock \d+ -someParam -otherParam "paramValue" \
find(string,"quoted_string")

That's two lines of data, and you want the digit string, the params and param value, and the quoted string from the next line.

One way would be (updated, based on a more detailed sample of data you list later in this thread):

my $record = '';

while (<>) {  # read a line of input data
    s/\s+$/ /;  # normalize all line-final whitespace to " "
    $record .= $_;
    if ( /\\ $/ ) { # if line ended with a backslash
        $record =~ s/\s*\\ / /;  # remove it from the record,
        next;                    # and move on to add next line
    }
#    if ( $record !~ /find/ ) { # if record doesn't have this
#        next;                     # try adding next line
#    }

    # now just figure out what to do with $record -- e.g.
    my @tokens = split ' ', $record;
    $record = '';  # in case there's another record coming

    print "\n", join( ' ', @tokens ), "\n";

    # just guessing here, about what might be useful...

    my %params = ();
    $params{cmd} = shift @tokens;
    $params{amount} = shift @tokens if ( $params{cmd} =~ /set/ );

    my $lastparam;
    while ( @tokens ) {
        $_ = shift @tokens;
        if ( /^-/ ) {
            $params{$_} = '';
            $lastparam = $_;
        }
        elsif ( /find\W+port\W+(\w+)/ ) {
            $params{port} = $1;
            last;
        }
        else {
            $params{$lastparam} = $_;
        }
    }
    for ( sort keys %params ) {
        print "$_ => $params{$_}\n";
    }
    print "\n";
}

Another update: looking again at your later reply, about how you need to keep track of clocks that are created and what their various parameters are, I think you'll be able to see how to maintain your overall "%clocks" hash in a manner similar to how I create the "%params" hash for each record in the code above.
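For what it's worth, here is one rough way the "%clocks" idea might look when built on the same approach. This is only a sketch under some guesses of mine: the sub names join_continued_lines and parse_record, the sample lines in @sample, and the choice to key %clocks by the port name are all made up for illustration, since I don't know what actually identifies a clock in your data. It also skips a value token that shows up before any "-param", rather than storing it under an undefined key.

```perl
use strict;
use warnings;

# Join backslash-continued lines into complete one-line records.
sub join_continued_lines {
    my @lines = @_;                     # copy, so the caller's data is untouched
    my @records;
    my $record = '';
    for my $line (@lines) {
        $line =~ s/\s+$//;              # drop trailing whitespace and newline
        if ( $line =~ s/\s*\\$// ) {    # trailing backslash: record continues
            $record .= "$line ";
            next;
        }
        $record .= $line;
        push @records, $record if $record =~ /\S/;
        $record = '';
    }
    push @records, $record if $record =~ /\S/;    # unterminated last record
    return @records;
}

# Turn one record into a hash of parameters, same logic as %params above.
sub parse_record {
    my ($record) = @_;
    my @tokens = split ' ', $record;
    my %params = ( cmd => shift @tokens );
    $params{amount} = shift @tokens if $params{cmd} =~ /set/;

    my $lastparam;
    for my $token (@tokens) {
        if ( $token =~ /^-/ ) {                         # a "-param" flag
            $params{$token} = '';
            $lastparam = $token;
        }
        elsif ( $token =~ /find\W+port\W+(\w+)/ ) {     # the port lookup
            $params{port} = $1;
            last;
        }
        elsif ( defined $lastparam ) {                  # value for the last flag
            $params{$lastparam} = $token;
        }
    }
    return \%params;
}

# Hypothetical sample input; real data would come from <> instead.
my @sample = (
    qq{set_input_clock 10 -someParam -otherParam "paramValue" \\\n},
    qq{find(port,"clk_a")\n},
    qq{set_input_clock 25 -someParam find(port,"clk_b")\n},
);

# Assumption: each clock is identified by its port name.
my %clocks;
for my $record ( join_continued_lines(@sample) ) {
    my $params = parse_record($record);
    next unless defined $params->{port};
    $clocks{ $params->{port} } = $params;
}

for my $port ( sort keys %clocks ) {
    print "$port: amount=$clocks{$port}{amount}\n";
}
```

If a later command changes an existing clock, you could merge the new record's keys into $clocks{$port} instead of overwriting the whole entry; which of those you want depends on what the commands in your file mean.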
