I am but a poor and simple coder, late wandered out of the dark wood of error where I was blinded by my service to the Whore of Redmond. I have come to a time and place where the powers that be command that I forge for them from the living Perl a script that might subdivide their bloated SGML files and render up a set of named subfiles that may be used to populate a set of TEXT fields in a great library of SQL.
Though I have labored a full fifth-score hours on this task, poring over many a camel-embossed tome of distracted wisdom, I have come to realize that my work is in vain and I must look to the almighty for my salvation, for my childish reason cannot comprehend the sublimity that is the way of the Perl. Therefore, I do remit my opus to thee, in the hope that my crass idiocy might reveal itself to thine holy eyne and that the goal of my project might be completed without a further waste of time.
Herein lies the code, built upon the backbone of a program that thou hast helped one of my co-workers build:
#!/usr/bin/perl -w
#Purpose: To Take a DOS file wildcard and thus take all the matching custom SGML files in the working directory and subdivide them into new files whose names are the id's of the divs in those original files.

print "What file(s) do you want to run this program on?\n";
$TheFile=<STDIN>;
chomp ($TheFile);

our $lines = "";
our @InFileNames;
our @OutFileNames;
our @OutFileContent;

#open file and get all text
sub OpenFile {
    open(FILE, @_) or $lines = "";
    local $/ = undef;
    $lines = <FILE>;
    #remove blank lines
    $lines =~ s/\n{2}/\n/gms;
    close(FILE);
}

#add ¥ to closing div tags
sub MarkClose {
    $lines =~ s/(<\/div>)/\$1¥/gms;
}

#open output.txt for appending and write results to it
sub FileAppend {
    my $Outfile = ">>" . $_[1] . ".bsd";
    my $Content = $_[2];
    open(FILE, $Outfile) or die "Can't open $Outfile.\n";
    print FILE $Content;
    print FILE "\n";
    close FILE;
}

#Create an array containing all file in the directory matching the glob.
sub GetInFileList {
    my $FileDef = @_;
    @InFileNames = glob($FileDef);
}

#Populate an array with the contents of the id attribute of every <div> tag in the input file.
sub GetOutFilesList {
    @OutFileName = $lines =~ m/<div[^>]*>/gms;
    foreach $OutFile (@OutFileName){
        $OutFile =~ s/<div type=[^>]* id="([^>]*)">/$1/gms;
        $OutFile =~ s/\./_/gms;
    }
}

#Subdivides the File into the subfiles.
sub GetOutFileContent {
    my $LinesString = @_;
    @OutFileContent = split /¥/, $LinesString;
}

### Does the job
&GetInFileList($TheFile);
if (@InFileNames > 0){
    for ($i = 0, $i < @InFileNames, $i++) {
        &OpenFile($InFileNames[$i]);
        &MarkClose;
        &GetOutFilesList;
        &GetOutFileContent($lines);
        for ($j = 0, $j < $OutFileNames, $j++){
            &FileAppend($OutFileNames[$j], $OutFileContent[$j]);
        }
    }
}

#be nice and say it's done
print "Program Finished\n";
edited: Sun Mar 9 14:47:58 2003 by jeffa - title change (orig is now first para)
In reply to Need help with subdividing SGML files by DukeLeto