Your "example based" description of the problem ("suppose this and that") is not a practical or usable specification of the task. What do you really need to do with all these strings once you extract them? Do they need to be organized into some sort of data structure? Do they need to be integrated or summarized in some way? (I'm not asking you to tell me these things -- you just need to be clear in your own mind about what the goals are.)

Apart from knowing what you intend to do with the data, you need to have some glimmer of a rule-based solution to parsing the file contents. Some chunks are delimited by double-quotes, some begin with "./", some begin with upper-case letters, some begin with digits, and so on.
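To make the idea of rules concrete, here is a rough sketch of sorting chunks by how they begin. The category names, the helper subs classify_chunk and split_chunks, and the sample line are all made up for illustration; the real rules depend on what your file actually contains.

```perl
use strict;
use warnings;

# Hypothetical classifier: one test per chunk type mentioned above.
# The category names are invented for this sketch.
sub classify_chunk {
    my ($chunk) = @_;
    return 'quoted' if $chunk =~ /^"[^"]*"$/;   # delimited by double-quotes
    return 'path'   if $chunk =~ m{^\./};       # begins with "./"
    return 'name'   if $chunk =~ /^[A-Z]/;      # begins with an upper-case letter
    return 'number' if $chunk =~ /^\d/;         # begins with a digit
    return 'other';
}

# Break a line into double-quoted chunks and whitespace-separated words.
sub split_chunks {
    my ($line) = @_;
    return $line =~ /("[^"]*"|\S+)/g;
}

# Made-up sample line, not taken from the real file.
my $line = 'Label: "some text" ./dir/file 42 etc';
for my $chunk ( split_chunks($line) ) {
    printf "%-8s %s\n", classify_chunk($chunk), $chunk;
}
```

The order of the tests matters: the first rule that matches wins, so put the most specific patterns first.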

Work out the rules that describe the patterns and the order of occurrence for the parts you want to extract. Work through each line of the file, figure out what type of information it has, and what you need to do upon recognizing it (save it to a particular variable or array, print something, ignore it and move on, etc).
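One way to write down "recognize the line, then act on it" is an ordered table of pattern/action pairs. This is only a sketch: the line labels are the ones that seem to appear in your data, while %record, @rules, handle_line and the sample lines are names and values I made up.

```perl
use strict;
use warnings;

my %record = ( labels => [] );

# Ordered rules: the first pattern that matches decides the action.
my @rules = (
    [ qr/^\s*$/,              sub { } ],                                   # blank line: ignore it
    [ qr/^Equipment:\s*(.*)/, sub { $record{equipment} = $_[0] } ],        # save to a variable
    [ qr/^Label:\s*(.*)/,     sub { push @{ $record{labels} }, $_[0] } ],  # save to an array
    [ qr/^solved:\s*(.*)/,    sub { print "solved line: $_[0]\n" } ],      # print something
);

# Returns 1 if some rule recognized the line, 0 otherwise.
sub handle_line {
    my ($line) = @_;
    for my $rule (@rules) {
        my ( $pattern, $action ) = @$rule;
        if ( my @captures = $line =~ $pattern ) {
            $action->(@captures);
            return 1;
        }
    }
    return 0;
}

# Made-up sample lines, not taken from the real file.
for my $line ( 'Equipment: WDM abc', '', 'Label: "x"', 'unknown stuff' ) {
    warn "unrecognized line: $line\n" unless handle_line($line);
}
```

Lines that no rule recognizes are reported rather than silently dropped, which helps you find the cases your rules do not cover yet.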

Since the line-initial labels seem to divide the file into different kinds of information, and each division may be a variable number of lines, you probably want a "state variable" that keeps track of which division you're in at each line -- here's a start:

    sub equipSub {
        if ( / WDM (\w+)-(R\d+\.\d+) / ) {
            return $1, $2;
        }
        return;
    }

    sub solveSub {
        # similar code to locate and return strings in this part of data
    }

    sub labelSub {
        # similar code to locate and return strings in this part
    }

    sub munge_last_record {
        # do whatever you need to do with the stuff you extracted from a record
    }

    my $state = '';
    my %statesubs = (
        'equip' => \&equipSub,
        'solve' => \&solveSub,
        'label' => \&labelSub,
    );
    my %recdata;

    while (<FILE>) {
        if ( /^(Equipment:)/ ) {    # start of new record
            if ( $state eq 'label' ) {    # were we reading a previous record?
                munge_last_record( \%recdata );    # do something with prev. record now
            }
            %recdata = ();
            $state = "equip";
        }
        elsif ( /^solved:/ ) {    # time to change state
            die "Oops! bad state transition at $." unless $state eq 'equip';
            $state = "solve";
        }
        elsif ( /^Label:/ ) {    # another state change
            die "Oops! bad transition at $." unless $state eq 'solve';
            $state = "label";
        }
        next unless $state;    # nothing to do before the first record starts
        my @tokens_found = $statesubs{$state}->();    # the subs read the line from $_
        if ( @tokens_found ) {
            # what you need to do here is up to you
            # (it is bound to depend on the current state, so maybe
            # you just want to do more stuff in the statesubs)
            push @{$recdata{$state}}, [ @tokens_found ];
        }
    }
    munge_last_record( \%recdata );    # finish off the last record

Maybe the data structure being created there in %recdata is more complicated than you need (or at least not exactly what would suit you). Good luck working out the rest...
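
In case the shape of %recdata is not obvious: each state name maps to a list of token lists, one per line that yielded tokens. Below is a sketch of one thing munge_last_record could do with it, namely flatten each state's tokens into a single line of text. The sample tokens are invented, and returning a list of strings is just my assumption about what might be useful to you.

```perl
use strict;
use warnings;

# Made-up contents in the shape the loop above builds:
# state name => [ [tokens from one line], [tokens from another], ... ]
my %recdata = (
    equip => [ [ 'abc', 'R1.2' ] ],
    label => [ [ 'first' ], [ 'second', 'third' ] ],
);

# One possible munge_last_record: join all tokens collected for each state.
sub munge_last_record {
    my ($rec) = @_;
    my @lines;
    for my $state ( sort keys %$rec ) {
        my @tokens = map { @$_ } @{ $rec->{$state} };    # flatten the list of lists
        push @lines, "$state: " . join( ', ', @tokens );
    }
    return @lines;
}

print "$_\n" for munge_last_record( \%recdata );
```

If you only ever get one set of tokens per state, a plain hash of arrays (without the extra nesting) would be simpler.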