in reply to Parsing COD text help
Well, enough excuses. I came up with something that seems to work pretty well. I guess I should explain why I am doing this. I want to make a program that will ask me what classes I want to take and then tell me all the possible schedule combinations (if any) I can have with those classes. I already finished the schedule-combinations part in Java (which I did to teach myself the language, because C/C++, Perl, Scheme, and Q/PBASIC aren't good enough for UVa -- but that's another discussion!).
Anyways, the code, for those interested. I ended up using HTML::TokeParser::Simple, which I refrained from at first, not having used it before and wanting to get something tested as quickly as possible (a misapplication of laziness, I suppose), but the lovely examples helped me through it ...
edit: removed readmore

#!/usr/bin/perl
use strict;
use warnings;

use HTML::TokeParser::Simple;
use Data::Dumper;

# $courses{mneumonic}{sectID} =
#   [ "Section number", "Credit", "CurrEnroll", "MaxEnroll",
#     [start time], [end time], [days], [location], [instructor] ]
#
use constant SECT_NUMBER  => 0;
use constant CREDIT_HOURS => 1;
use constant CURR_ENROLL  => 2;
use constant MAX_ENROLL   => 3;
use constant START_TIME   => 4;
use constant END_TIME     => 5;
use constant DAYS         => 6;
use constant LOCATION     => 7;
use constant INSTRUCTOR   => 8;

my $file   = 'APMA.txt';
my $stream = HTML::TokeParser::Simple->new( $file );
my ($class, $title);
my (%courses, $mneumonic, $sectID);

# Flag to tell program if last $title was a match against /Day/
# If so, location follows next (no consistent marker otherwise)
my $wasJustDays = 0;

while( my $t = $stream->get_token ) {
    # The "|| ''" avoids an "uninitialized value" warning on anchors
    # that have no href attribute
    if( $t->is_start_tag( 'a' )
        and ( $t->return_attr( 'href' ) || '' ) =~ m/course_nbr/ ) {
        # And thus begins a new Course ...
        $mneumonic = $stream->get_text( '/a' );
    }
    elsif( $t->is_start_tag( 'span' ) ) {
        $class = $t->return_attr( 'class' );
        $title = $t->return_attr( 'title' );
        if( defined $title and not defined $class ) {
            # These would be all the rest of the fields,
            # ... Schedule number, credit hours, etc.
            if ($title =~ /Schedule Number/) {
                $sectID = $stream->get_text( '/span' );
            }
            elsif ($title =~ /Section Number/) {
                $courses{$mneumonic}{$sectID}->[SECT_NUMBER] =
                    $stream->get_text( '/span' );
            }
            elsif ($title =~ /Credit Hours/) {
                $courses{$mneumonic}{$sectID}->[CREDIT_HOURS] =
                    $stream->get_text( '/span' );
            }
            elsif ($title =~ /Time/) {
                # Match in list context: both values are undef when the text
                # has no "start-end" pair, instead of stale $1/$2 from an
                # earlier successful match
                my ($start, $end) =
                    $stream->get_text( '/span' ) =~ /(\d+)-(\d+)/;
                push @{$courses{$mneumonic}{$sectID}->[START_TIME]}, $start;
                push @{$courses{$mneumonic}{$sectID}->[END_TIME]},   $end;
            }
            elsif ($title =~ /Day/) {
                push @{$courses{$mneumonic}{$sectID}->[DAYS]},
                    $stream->get_text( '/span' );
                $wasJustDays = 1; # See note at variable declaration
            }
            elsif ($title =~ /Instructor/) {
                push @{$courses{$mneumonic}{$sectID}->[INSTRUCTOR]},
                    $stream->get_text( '/span' );
            }
            elsif ($title =~ m<Enrollment:Authorized/Actual>) {
                # Authorized (max) comes first, actual (current) second
                my ($max, $curr) =
                    $stream->get_text( '/span' ) =~ m<(\d+)/(\d+)>;
                $courses{$mneumonic}{$sectID}->[MAX_ENROLL]  = $max;
                $courses{$mneumonic}{$sectID}->[CURR_ENROLL] = $curr;
            }
            else {
                if ($wasJustDays == 1) {
                    push @{$courses{$mneumonic}{$sectID}->[LOCATION]},
                        $title . ": " . $stream->get_text( '/span' );
                    $wasJustDays = 0; # See note at variable declaration
                }
            }
        }
        elsif( defined $class and $class eq 'title' ) {
            # This is the name of the course; e.g., Linear Algebra
            # Ignore for the time being ...
            #print $stream->get_text( '/span' ), "\n";
        }
    }
}
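
For anyone wanting to consume the resulting structure, here is a minimal sketch of walking %courses with the same index constants. The sample data (course name, section ID, times, rooms, instructor) is made up purely for illustration, and section_summaries is a hypothetical helper, not part of the script above. It also assumes that the days, start-time and end-time arrays line up one entry per meeting, which the parser only guarantees if every Day span comes with a matching Time span.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same field indices as in the parser above
use constant SECT_NUMBER  => 0;
use constant CREDIT_HOURS => 1;
use constant CURR_ENROLL  => 2;
use constant MAX_ENROLL   => 3;
use constant START_TIME   => 4;
use constant END_TIME     => 5;
use constant DAYS         => 6;
use constant LOCATION     => 7;
use constant INSTRUCTOR   => 8;

# Hypothetical sample data, invented for illustration only
my %courses = (
    'APMA 101' => {
        '12345' => [
            '1', '3', '20', '30',
            [ '0900', '1400' ],              # start times, one per meeting
            [ '0950', '1515' ],              # end times
            [ 'MWF',  'T' ],                 # days
            [ 'Room: 101', 'Room: 202' ],    # locations
            [ 'Staff' ],                     # instructors
        ],
    },
);

# Hypothetical helper: returns one summary line per section,
# sorted by course and then by section ID
sub section_summaries {
    my ($courses) = @_;
    my @lines;
    for my $mneumonic ( sort keys %$courses ) {
        for my $sectID ( sort keys %{ $courses->{$mneumonic} } ) {
            my $s    = $courses->{$mneumonic}{$sectID};
            my $days = $s->[DAYS] || [];
            my @meetings;
            for my $i ( 0 .. $#$days ) {
                my $start = $s->[START_TIME][$i];
                my $end   = $s->[END_TIME][$i];
                # A missing time is shown as "?"
                push @meetings, sprintf( '%s %s-%s',
                    $days->[$i],
                    defined $start ? $start : '?',
                    defined $end   ? $end   : '?' );
            }
            push @lines, sprintf( '%s [%s] section %s, %s/%s enrolled: %s',
                $mneumonic, $sectID, $s->[SECT_NUMBER],
                $s->[CURR_ENROLL], $s->[MAX_ENROLL],
                join( '; ', @meetings ) );
        }
    }
    return @lines;
}

print "$_\n" for section_summaries( \%courses );
```

Returning the lines instead of printing them inside the loop keeps the helper easy to reuse, for example to feed the list of meetings into a schedule-conflict check later.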