in reply to Re: Re: Admin-ing a file?
in thread Admin-ing a file?

Actually, there's a preference you can set in your profile that will at least make replies show up. Anyhow, yeah, this doesn't look like the easiest format to parse, because your data is all mixed into the HTML. The easiest way to deal with this, IMO, is to parse the whole thing into memory, and write it all back out if there are changes. Unfortunately, this is also slow, but at least the code will be clean. Here's code to do it:
#!/usr/bin/perl

my $data;

open(DATF, "pt.dat") || die "Could not read datfile: $!\n";
{local $/;$_ = <DATF>;}   # Slurp the whole file into $_
close(DATF);

# Now we're gonna chop out all the extra HTML and stuff we don't need
tr/\n//d;
#$data =~ s/<[^>]+>/:/g;
s/\Q<!--START:-->\E//g; # You might need to tinker w/ whitespace here
s/\Q<!--END:-->\E//g;
s/<\/?table[^>]*>//g;
s/\Q&nbsp;\E//g;
s/<\/?tr>//g;
s/<a href[^>]+>//g;
s/<\/a>//g;
s/<br>//g;
s/<\/td>//g;

# And now we'll parse it.
my $in_record=-1; # So $in_record will hit zero on the 1st loop
while(length $_) # While we haven't eaten it all
{
  s/^\s*//; # Remove leading whitespace
  if(s/^<\/?td>//){next;} # Remove leading <td> tags
#  print "D: $_\n";
  if(s/^Signed on:(\w+ \d+, \d+)//)
  { # Begins an entry
    $in_record++;
    $record_date[$in_record] = $1;
    next;
  }
  if(s/^Name:([^<]+)//) {$record_name[$in_record] = $1; next;}
  if(s/^Email: ([^<]*)//) {$record_email[$in_record] = $1;next;}
  if(s/^Country:([^<]*)//) {$record_country[$in_record] = $1;next;}
  if(s/State\/Province\/Territory:([^<]*)//) {$record_place[$in_record] = $1;next;}
  if(s/City:([^<]*)//) {$record_city[$in_record] = $1;next;}
  if(s/Site rating:([^<]*)//) {$record_rating[$in_record] = $1;next;}
  if(s/Comments:([^<]*)//) {$record_comments[$in_record] = $1;next;}
  if(s/(\w+):[^<]*//)
  {
    print "Add a new handler to remove or parse leading elements of this: $1\n";
    next;
  }
  # Nothing matched, so bail out rather than loop forever
  warn "Could not parse remaining data: " . substr($_, 0, 40) . "\n";
  last;
}
With that, you can then easily write functions to return or modify arbitrary elements in the record structures, and when someone wants to update it, you'll write it back out in your formatted way. I would suggest, though, that you get this into a database ASAP -- it'd be too easy for malformed HTML to make it impossible to import cleanly later, and if you use code much like mine above, your site will be really slow. I hope this helps.
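
For what it's worth, here's a rough sketch of what those accessor functions and the write-back step might look like. It assumes parallel arrays like the ones the parser fills in; the sample data, the function names (get_field, set_field, format_records, write_records) and the output layout are all made up for illustration, so you'd need to adapt the layout to whatever your real pt.dat looks like.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parallel arrays in the style of the parser; this sample data is invented.
my @record_date     = ('Apr 24, 2003');
my @record_name     = ('Alice');
my @record_comments = ('Nice site');

# Map a field name to its array so one accessor covers every field.
my %field = (
    date     => \@record_date,
    name     => \@record_name,
    comments => \@record_comments,
);

# Return one field of record number $i.
sub get_field {
    my ($i, $name) = @_;
    my $array = $field{$name} or die "Unknown field: $name\n";
    return $array->[$i];
}

# Overwrite one field of record number $i.
sub set_field {
    my ($i, $name, $value) = @_;
    my $array = $field{$name} or die "Unknown field: $name\n";
    $array->[$i] = $value;
    return;
}

# Render every record as text. The layout here is a simplified guess,
# not the real file format.
sub format_records {
    my $out = '';
    for my $i (0 .. $#record_date) {
        $out .= "<!--START:-->\n";
        $out .= "Signed on:" . get_field($i, 'date') . "<br>\n";
        $out .= "Name:" . get_field($i, 'name') . "<br>\n";
        $out .= "Comments:" . get_field($i, 'comments') . "<br>\n";
        $out .= "<!--END:-->\n";
    }
    return $out;
}

# Write to a temp file first, then rename it over the original, so a
# crash halfway through doesn't leave a truncated data file behind.
sub write_records {
    my ($path) = @_;
    open(my $fh, '>', "$path.tmp") or die "Could not write $path.tmp: $!\n";
    print {$fh} format_records();
    close($fh) or die "Could not close $path.tmp: $!\n";
    rename("$path.tmp", $path) or die "Could not replace $path: $!\n";
    return;
}

# Example: an admin edits a comment, then we show what would be written.
set_field(0, 'comments', 'Edited by an admin');
print format_records();
```

To actually save the changes you'd call write_records("pt.dat") after the edit. Keeping the field lookup in one hash means adding email, country and so on is just one more line in %field.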

Re: Re: Re: Re: Admin-ing a file?
by stonecolddevin (Parson) on Apr 25, 2003 at 03:30 UTC
    I agree. This flat file crap isn't cutting it. MySQL is so unbelievably better. Thanks, I'll take a look at that and I'll get this into a db soon.