bory has asked for the wisdom of the Perl Monks concerning the following question:


Re: How do I extract some records from a textfile
by Melly (Chaplain) on Oct 08, 2003 at 14:52 UTC

    Not completely sure what you want here, but you don't get a "mama" because none of the strings you are matching against are isolated by spaces (e.g. you get 1696-R2.2, not 1696), and even if that weren't a problem, the case matching would fail as well (e.g. 162x <> 162X).

    Here's some code that does something. Whether it's what you want, or whether it just gives you some ideas about how to get what you want, I leave up to you.

    my $data_file = 'diana.txt';
    my @identifiers = qw(1696 162x 1640);
    open(FILE, $data_file) || die("Could not open file: $!");
    foreach $line (<FILE>) {
        chomp $line;
        foreach $identifier (@identifiers) {
            if ($line =~ /$identifier/i) {
                print "$line match\n";
            }
        }
    }
    close FILE;
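
The two failure modes mentioned above can be seen side by side in a minimal sketch. The original question's code isn't visible here, so the exact-token comparison is only a guess at what it did; the sample line and the helper names `token_match` and `substring_match` are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample line; the real contents of diana.txt are not shown in this thread.
my $line = 'Equipment: 1696-R2.2 162X';

# True (non-zero) if some whitespace-separated token equals $id exactly.
sub token_match {
    my ($line, $id) = @_;
    return scalar grep { $_ eq $id } split ' ', $line;
}

# 1 if $id occurs anywhere in the line, ignoring case; 0 otherwise.
sub substring_match {
    my ($line, $id) = @_;
    return $line =~ /\Q$id\E/i ? 1 : 0;
}

for my $id (qw(1696 162x 1640)) {
    printf "%-5s token=%d substring=%d\n",
        $id, token_match($line, $id), substring_match($line, $id);
}
```

On this sample the exact-token test finds nothing, because "1696" only appears inside "1696-R2.2" and "162x" differs in case from "162X", while the case-insensitive substring test finds both. The `\Q...\E` quoting is an addition not present in the code above; it only matters if an identifier ever contains regex metacharacters.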
    Tom Melly, tom@tomandlu.co.uk
Re: How do I extract some records from a textfile
by graff (Chancellor) on Oct 09, 2003 at 01:45 UTC
    Let's suppose that the file data you're using as input are consistent, and always have the form you've shown. At the least, let's assume they consist of one or more "records", where each record contains multiple lines, exactly three of which begin with the labels "Equipment:", "solved:" and "Lab:".

    Based on your question, the strings to capture are on the lines with the "Equipment" and "solved" labels; they may be fixed length, and/or match a particular pattern expressible by a regex. So maybe something like this (not tested):

    ...
    while (<FILE>) {
        if ( /^Equipment:.*?(\d+)-(\S+)/ ) {
            ( $str1, $str2 ) = ( $1, $2 );
        }
        elsif ( /^solved:\s+(\S+)\s+"([^"]+)/ ) {
            ( $str3, $str4 ) = ( $1, $2 );
            print "Found: str1= $str1 str2= $str2 str3= $str3 str4= $str4\n";
        }
    }
    If you don't get what that's doing, read up on perl regular expressions (man pages "perlretut", "perlre", and numerous other reference works). Basically, the "if" is looking for the line that starts with "Equipment:" and contains a string of digits followed immediately by a dash and a run of non-whitespace characters; these latter two things are retained for later use (as $1 and $2) via the capturing parentheses. Likewise, the "elsif" is looking for the "solved" line, and capturing the first non-whitespace string after the label, along with anything enclosed in double-quotes.
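
As a rough check of what those two patterns capture, here is a small self-contained sketch that runs them over an in-memory record instead of a file. The record contents and the `parse_record` helper are invented for illustration; only the three labels and the "1696-R2.2" style of identifier come from this thread.

```perl
use strict;
use warnings;

# Hypothetical record; the real file layout is assumed, not known.
my @lines = (
    'Equipment: router 1696-R2.2',
    'solved: yes "replaced fan"',
    'Lab: north',
);

# Applies the two patterns to one record's lines and returns the four captures.
sub parse_record {
    my @record = @_;
    my ( $str1, $str2, $str3, $str4 );
    for (@record) {
        if ( /^Equipment:.*?(\d+)-(\S+)/ ) {
            # digits before the dash, then the non-whitespace run after it
            ( $str1, $str2 ) = ( $1, $2 );
        }
        elsif ( /^solved:\s+(\S+)\s+"([^"]+)/ ) {
            # first word after the label, then the double-quoted text
            ( $str3, $str4 ) = ( $1, $2 );
        }
    }
    return ( $str1, $str2, $str3, $str4 );
}

my @found = parse_record(@lines);
print join( '|', @found ), "\n";    # prints 1696|R2.2|yes|replaced fan
```

The "Lab:" line matches neither pattern and is simply skipped. If a record lacks an "Equipment:" or "solved:" line, the corresponding values stay undefined, so real input would need a check before printing.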