I'll suggest an alternative, but first I'd like to point out that the input data appears to consist of fixed-length records. Having looked at the cited page, there seem to be three basic types of data lines -- one has digits in columns 1-5, the other two don't; among the latter, there are a few that are "category" headings (e.g. "SYSTEMS ENGINEERING", "COMPUTER SCIENCE", etc.), and the rest are "detail" records about a given course/section. (Actually the latter type probably breaks down into two or three sub-types, presenting different sorts of information.)
Fixed-width data can be handled either with regex matching (using things like / (.{5}) (.{4}) (.{4})/), or with unpack. The latter is really simpler (even though it seems more complicated when you look it up in the "perlfunc" man page). It would go something like this, in your case:
    my %courses;
    my $mnemonic;   # this is the correct spelling :)

    while (<COD>) {   # let's use $_, shall we?
        next unless ( /\S/ );   # skip blank lines
        my @fields;
        my ($id, $rest) = unpack("xA5xA*", $_);   # break line into 2 pieces
        if ( $id =~ /^\d{5}$/ ) {   # it's the start of a record
            ($mnemonic, @fields) = unpack("A4xA4xA4xA2xA2xA28A*", $rest);
            # work out what to do with @fields; $mnemonic will retain
            # its current value till the next one is encountered,
            # so sub-records after this one will be added to the
            # correct hash element.
        }
        elsif ( $rest =~ /^\d+-\d+/ ) {   # it's a sub-record
            my ($time,$days,$bldg,$room,$end) = unpack("A9xA6xA4xA4xA*", $rest);
            # you need to work out what to do with $end,
            # and push stuff into the current $courses{$mnemonic} structure
        }
        else {
            # do something else with (or ignore) other stuff
        }
    }

I hope that will get you started. Note that by using "unpack", the "DAYS" portion of the sub-records is always taken from the same six columns, some of which happen to be spaces ("M W F " vs. " T R " etc.). (The "A" format strips trailing spaces from each field; use "a6" instead of "A6" if you want all six characters kept.) You could get the same result with a suitable regex instead of unpack, but plain-old split will do it wrong. Personally, I think this is one situation where unpack is relatively easier to do than a regex; it's just a natural for fixed-length ASCII records.
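To make the unpack / regex / split difference concrete, here is a minimal sketch. The two sub-record lines are made up for illustration (they are not copied from the real COD file); I'm only assuming the layout implied by the "A9xA6xA4xA4" template, with a single space between fields.

```perl
use strict;
use warnings;

# Hypothetical sub-records: 9-char time, 6-char days, 4-char bldg, room.
# Built with join so the column widths are easy to verify by eye.
my $mwf = join ' ', "0900-0950", "M W F ", "OLS ", "120";
my $tr  = join ' ', "1100-1215", " T R  ", "OLS ", "120";

# unpack with "A6": columns are honoured, trailing spaces are stripped.
my ($time, $days_a, $bldg, $room) = unpack("A9xA6xA4xA4", $tr);

# unpack with "a6": the days field comes back as all six characters.
my (undef, $days_raw) = unpack("A9xa6", $tr);

# The equivalent regex: fixed-width groups separated by single spaces.
my ($re_time, $re_days, $re_bldg, $re_room) =
    $tr =~ /^(.{9}) (.{6}) (.{4}) (.+)$/;

# Plain split on whitespace: the days field falls apart, and the number
# of fields depends on how many days the section meets.
my @split_mwf = split ' ', $mwf;
my @split_tr  = split ' ', $tr;

print "A6:    [$days_a]\n";      # [ T R]
print "a6:    [$days_raw]\n";    # [ T R  ]
print "regex: [$re_days]\n";     # [ T R  ]
print "split: ", scalar(@split_mwf), " fields vs ",
      scalar(@split_tr), " fields\n";   # 6 fields vs 5 fields
```

With split you can no longer tell which column a given "T" or "R" came from, whereas unpack (or the fixed-width regex) keeps each day letter in its own position.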
In reply to Re: Parsing COD text help
by graff
in thread Parsing COD text help
by dimmesdale