Re: Need help with subdividing SGML files
by BrowserUk (Patriarch) on Mar 07, 2003 at 20:46 UTC
|
Thoust elegant beseeching falleth not upon ears deaf, but, pray tell, in what way wouldest thou havest aid thee?
(Er...like er.. you know man, like ...What's the problem?)
Examine what is said, not who speaks.
1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
3) Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke.
| [reply] |
|
|
As near as I can tell the first array is never populated. There's probably a considerable number of other fudge-ups.
| [reply] |
|
|
Sorry, but "the first array" means nothing. Is that the first array mentioned in the program, reading from top to bottom? Or the first array that is used in the runtime order?
For anyone who doesn't have a set of SGML files with the particular set of <div> tags that your program is looking for, the work involved in divining the format of those files and mocking up data so that they can try to run your program is considerable.
I think that you should consider putting as much effort into describing the problem as you did into your flowery request for help; you might then give us enough information upon which to begin to advise you.
Try adding a few print statements to your program and work out what it is or is not doing. Come back with a clear description of the problem and you might get some more help.
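For what it's worth, here is a minimal sketch of the print-statement approach suggested above. The sample string in $lines is made up for illustration (it is not the poster's data), and Data::Dumper is the core module for dumping a structure's exact contents.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the real input: two <div> sections in one string.
my $lines = qq{<div type="chapter" id="c.1">one</div>\n<div type="note" id="n.2">two</div>\n};

# Collect every opening <div ...> tag, as the script under discussion tries to do.
my @tags = $lines =~ m/<div[^>]*>/g;

# warn writes to STDERR, so diagnostics stay separate from normal output.
warn "found " . scalar(@tags) . " div tags\n";

# Dumper shows exactly what the array holds, including empty or undefined slots.
warn Dumper(\@tags);
```

If the count printed is zero, the array really is not being populated and the regex (or the file read before it) is the place to look.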
| [reply] [d/l] |
|
|
Re: Need help with subdividing SGML files
by hawtin (Prior) on Mar 07, 2003 at 20:56 UTC
|
| [reply] |
|
|
Hmmm... Oh yeah, unlike TextPad, this animal won't stop for line breaks.
| [reply] |
Re: Need help with subdividing SGML files
by tall_man (Parson) on Mar 07, 2003 at 21:04 UTC
|
This be not a role-playing game, Sir DukeLeto. Flowery language and especially flowery but meaningless article titles doth not contribute to our desire to help thee.
Firstly, thou shouldst "use strict". It will find for thee many problems.
Secondly, thy argument-passing skills want improvement.
Instead of:
my $LinesString = @_;
thou shouldst have
my $LinesString = shift;
or
my ($LinesString) = @_;
When using $_[1] notation, remember that the first argument startest at $_[0]
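To make the difference concrete, here is a small sketch of the three forms above plus the $_[N] indexing. The sub names and file names are invented for the example.

```perl
use strict;
use warnings;

# Scalar assignment puts @_ in scalar context: the result is the argument count.
sub count_args { my $n = @_; return $n; }

# shift removes and returns the first argument.
sub first_by_shift { my $first = shift; return $first; }

# List assignment (note the parentheses) copies the arguments positionally.
sub first_by_list { my ($first) = @_; return $first; }

# $_[0] is the first argument, so $_[1] is the second.
sub second_by_index { return $_[1]; }

print count_args("a.sgm", "b.sgm"), "\n";       # prints 2, not a.sgm
print first_by_shift("a.sgm", "b.sgm"), "\n";   # prints a.sgm
print first_by_list("a.sgm", "b.sgm"), "\n";    # prints a.sgm
print second_by_index("a.sgm", "b.sgm"), "\n";  # prints b.sgm
```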
| [reply] [d/l] [select] |
|
|
My apologies for the flowery language. I wanted to establish my non-moronicity whilst proclaiming it. Seems to have backfired.
I wanted to evoke an image of a penitent former MS programmer pleading at the gates of the monastery. I forgot that my favorite screen name can evoke the image of some D&D Paladin.
(I thought the fifth-score hour was good. BTW.)
| [reply] |
|
|
Speaking as a former MS programmer, the standard tone for those of us who have left the Borg is somewhere around "Those Bastards! Those Evil Sons of..." you get the idea. I always thought of them as more of a demon horde (managers and above) fleshed out with the soul-less (employees, sacrificing everything in their lives to keep the stock vesting at all costs) and the damned (permatemps). But
Dude, don't worry too much about tone. The Monk metaphor permeates the site, but one reference per post is sufficient.
As for your screen name, I thought it was a Dune reference. Duke Leto Atreides, right?
-Logan
"What do I want? I'm an American. I want more."
| [reply] |
|
|
|
|
With respect to your suggestion of inadequate understanding of Perl variable passing, I plead guilty.
I have to admit though, based on my work with the synoptic languages, this format is VERY odd.
| [reply] |
Re: Need help with subdividing SGML files
by allolex (Curate) on Mar 07, 2003 at 21:44 UTC
|
This is slightly OT, so please forgive me that.
I think the advice about use strict; is truly excellent. And perhaps use warnings; for good measure. But what I'm really writing about is a point of grammar. Kids today just have horrible grammar, using a second person singular form to address a group, hmph! Here are the correct forms/functions:
- Thou forgivest me mine abhorrent code. (Talking to one person - subject)
- You forgive me my bad breath. (Talking to a group - still subject)
- I shall strike thee over the head with the Camel. (object of 'strike')
That Camel an animal/mine animal, a beast/my beast. Get it?
BTW, these are just example sentences, I do not have chronic halitosis, I have no plans of violence towards you, and I do not own anything but a paperback camel (as opposed to the kind that would flatten you), so it's unlikely to be fatal anyway.
Update: A nifty link I found addressing this off-topic. :)
--
Allolex
| [reply] [d/l] [select] |
|
|
This is completely off topic, and your grammar lesson is wrong. "Thou" is second person familiar, not second person singular. And "you" is second person formal, NOT second person plural. Neither has a plural form. Shakespearean characters often berate crowds in the thou form. I'll admit, if I had thought about it... addressing a crowd in the familiar is condescending and abusive, especially if your screen name has a patent of nobility. On the other hand, the King James Bible addresses the deity in the familiar all the time, and religious inferiority was the relationship I was shooting for.
| [reply] |
Re: Need help with subdividing SGML files
by DukeLeto (Novice) on Mar 17, 2003 at 17:28 UTC
|
Well, once I figured out how to work the debug mode, it was all downhill from there. I assume you all had a good laugh about the "I just keep getting a 'DB(n)' Prompt that won't take my input." Oh well, you're only ignorant once.
The program is now giving a reasonable approximation of its intended functionality. I'm attaching the code below so everyone can point out the silly looking, rough-hewn parts.
Of course, in the words of the immortal Beeblebrox, "Hey! Don't knock it, it worked."
#!/usr/bin/perl -w
#Purpose: To Take a DOS file wildcard and thus take all the matching custom SGML files in the working directory and subdivide them into new files whose names are the id's of the divs in those original files.
use strict;
print "Enter the name of a file containing the list of files you want to work on.\n";
our $lines = "";
our @InFileNames;
our @OutFileNames;
our @OutFileExtensions;
our @OutFileContent;
my $i = 0;
my $j = 0;
my $k =0;
my $TheFile = <STDIN>;
chomp ($TheFile);
#open file and get all text
sub OpenFile {
open(FILE, $_[0]) or $lines = "";
local $/ = undef;
$lines = <FILE>;
#remove blank lines
$lines =~ s/\n{2}/\n/gms;
close(FILE);
}
#add ¥ before each opening div tag, then drop the marker at the very start of the text
sub MarkClose {
$lines =~ s/(<div type)/¥$1/gms;
$lines =~ s/\A¥//gms;
}
#open output.txt for appending and write results to it
sub FileAppend {
my $Outfile = ">>" . $_[0] . "." . $_[1];
my $Content = $_[2];
open(FILE, $Outfile) or die "Can't open $Outfile.\n";
print FILE $Content;
print FILE "\n";
close FILE;
}
#Create an array containing the input file names, read one per line from the list file.
sub GetInFileList {
my $FileDef = $_[0];
open (FILE, $FileDef) or die "That isn't a valid file, Wesley!";
local $/ = undef;
$lines = <FILE>;
#remove blank lines
$lines =~ s/\n{2}/\n/gms;
close(FILE);
@InFileNames = split /\n/, $lines;
#If the program can't give a list to Muhammed, then Muhammed will give a list to the program.
}
#Populate an array with the contents of the id attribute of every <div> tag in the input file.
sub GetOutFilesList {
$k = 0;
@OutFileNames = $lines =~ m/<div type[^>]*>/gms;
while ($k < (scalar(@OutFileNames))){
$OutFileNames[$k] =~ s/<div type="[^"]*" id="([^"]*)"[^>]*>/$1/gms;
$OutFileNames[$k] =~ s/\./_/gms;
$k = $k + 1;
}
$k = 0;
@OutFileExtensions = $lines =~ m/<div type[^>]*>/gms;
while ($k < (scalar(@OutFileExtensions))){
$OutFileExtensions[$k] =~ s/<div[1-9]? type="([^"]*)" id="[^"]*"[^>]*>/$1/gms;
$OutFileExtensions[$k] =~ s/\./_/gms;
$k = $k + 1;
}
}
#Subdivides the File into the subfiles.
sub GetOutFileContent {
my $LinesString = $_[0];
@OutFileContent = split /¥/, $LinesString;
}
### Does the job
&GetInFileList($TheFile);
$i =0;
while ($i < (scalar(@InFileNames))) {
&OpenFile($InFileNames[$i]);
&MarkClose();
&GetOutFilesList;
&GetOutFileContent($lines);
$j=0;
while ($j < (scalar(@OutFileNames))){
&FileAppend($OutFileNames[$j], $OutFileExtensions[$j], $OutFileContent[$j]);
$j = $j + 1;
}
$i = $i + 1;
}
#be nice and say it's done
print "Program Finished\n";
(Edited to reflect the code that actually DID work.)
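As a possible simplification, and not what the posted script does: instead of inserting a ¥ marker and splitting on it, split can cut the text directly in front of each opening tag with a zero-width lookahead. The sketch below assumes, like the script, that every section starts with <div type="..." id="...">. The sub name split_divs and the sample input are invented for illustration.

```perl
use strict;
use warnings;

# Split a document into sections, each starting at an opening <div type=...> tag.
# Returns a list of [id, type, content] array references.
sub split_divs {
    my ($text) = @_;
    my @sections;
    # The lookahead matches *before* each tag without consuming it,
    # so every chunk keeps its own opening tag and no marker is needed.
    for my $chunk (split /(?=<div type)/, $text) {
        # Skip any text that precedes the first <div>.
        next unless $chunk =~ /\A<div type="([^"]*)" id="([^"]*)"/;
        my ($type, $id) = ($1, $2);
        tr/./_/ for $type, $id;    # dots become underscores, as in the script
        push @sections, [$id, $type, $chunk];
    }
    return @sections;
}

my @sections = split_divs(
    qq{<div type="poem" id="p.1">first</div>\n<div type="note" id="n.2">second</div>\n}
);

# One "id.type" name per section, matching the script's output file naming.
printf "%s.%s\n", $_->[0], $_->[1] for @sections;
```

Keeping id, type and content together in one structure also avoids having three parallel arrays that must stay in step.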
| [reply] [d/l] |
Re: Need help with subdividing SGML files
by DukeLeto (Novice) on Mar 12, 2003 at 18:49 UTC
|
#!/usr/bin/perl -w
#Purpose: To Take a DOS file wildcard and thus take all the matching custom SGML files in the working directory and subdivide them into new files whose names are the id's of the divs in those original files.
use strict;
print "What file(s) do you want to run this program on?\n";
our $lines = "";
our @InFileNames;
our @OutFileNames;
our @OutFileContent;
my $i = 0;
my $j = 0;
my $TheFile = <STDIN>;
chomp ($TheFile);
#open file and get all text
sub OpenFile {
open(FILE, $_[0]) or $lines = "";
local $/ = undef;
$lines = <FILE>;
#remove blank lines
$lines =~ s/\n{2}/\n/gms;
close(FILE);
}
#add ¥ to closing div tags
sub MarkClose {
$lines =~ s/(<\/div>)/$1¥/gms;
}
#open output.txt for appending and write results to it
sub FileAppend {
my $Outfile = ">>" . $_[0] . ".bsd";
my $Content = $_[1];
open(FILE, $Outfile) or die "Can't open $Outfile.\n";
print FILE $Content;
print FILE "\n";
close FILE;
}
#Create an array containing all file in the directory matching the glob.
sub GetInFileList {
my $FileDef = $_[0];
@InFileNames = glob($FileDef);
}
#Populate an array with the contents of the id attribute of every <div> tag in the input file.
sub GetOutFilesList {
my $OutFile;
@OutFileNames = $lines =~ m/<div[^>]*>/gms;
foreach $OutFile (@OutFileNames){
$OutFile =~ s/<div type=[^\s]* id="([^>]*)">/$1/gms;
$OutFile =~ s/\./_/gms;
}
}
#Subdivides the File into the subfiles.
sub GetOutFileContent {
my $LinesString = $_[0];
@OutFileContent = split /¥/, $LinesString;
}
### Does the job
&GetInFileList($TheFile);
for ($i = 0, $i < @InFileNames, $i++) {
&OpenFile($InFileNames[$i]);
&MarkClose();
&GetOutFilesList;
&GetOutFileContent($lines);
for ($j = 0, $j < @OutFileNames, $j++){
&FileAppend($OutFileNames[$j], $OutFileContent[$j]);
}
}
#be nice and say it's done
print "Program Finished\n";
The script now dies at a very specific point: On line 13 or possibly 18, where it gives the following error message:
Use of uninitialized value in open at ##program name censored## line 18, <STDIN> line 1.
I take this to mean that the array entitled @InFileNames has no contents, because the glob function used to fill it on line 45 didn't behave as I thought it would.
Also, the debug mode behaved oddly when it reached the <STDIN> line, it prompted me for input with the line DB(1), and then prompted me again with DB(2) when I had given it its input, and so on ad infinitum.
| [reply] [d/l] [select] |
|
|
Why do you bother wasting your time using strict, if all you are going to do is name every undeclared variable at the top of your program? It's pointless.
There are still two lines (at least) in your updated program that contain simple syntax errors that will prevent your program from doing anything like what you want it to do.
Look up the syntax of perl's for statements.
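For readers following along, a sketch of the distinction being hinted at. The file names are made up. A C-style for separates its three parts with semicolons; with commas, the parentheses hold an ordinary three-element list and Perl treats the statement as a foreach over that list.

```perl
use strict;
use warnings;

my @files = ("a.sgm", "b.sgm");   # hypothetical file names

# C-style loop: initialisation; condition; increment, separated by semicolons.
my @c_style;
for (my $i = 0; $i < @files; $i++) {
    push @c_style, $files[$i];
}

# With commas this is a foreach over a three-element list, so the body runs
# once per list element, regardless of how many files there are.
my $runs = 0;
my $i;
for ($i = 0, $i < @files, $i++) {
    $runs++;
}

# Usually the index is not needed at all.
my @foreach_style;
for my $file (@files) {
    push @foreach_style, $file;
}

print "C-style visited: @c_style\n";          # C-style visited: a.sgm b.sgm
print "comma version ran $runs times\n";      # comma version ran 3 times
```

In the comma form the index is not advanced between passes, so indexing the array with it inside the body will not walk the list. That may be what produced the "uninitialized value in open" warning reported above, rather than glob misbehaving, though that is a guess without the original input.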
| [reply] |
|
|
1) Because I'm coming from a Visual Basic background where the strict definition of Variables at the top of the procedure/package is considered good practice.
2) The debug program seemed to be prompting me to explicitly define all of the variables, so that's what I did.
I take it that the syntax errors are so simple and that I've offended your sensibilities so severely that you can't be bothered to point them out.
You seem to be implying that I'm using for and foreach incorrectly. I can't see how that would be. Perhaps you think the answer is so embarrassing that I would rather figure it out for myself than live in the shame of having it explained to me. Also, you've twice declaimed that you don't know what I want, so it seems odd that you're so sure now.
To put it bluntly, I find your tone to be insulting and if you don't want to give constructive criticism, I can do without your help.
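On the strict question itself, a hedged illustration of what is probably meant: use strict pays off most when variables are declared with my in the smallest scope that needs them and subs receive and return their data, rather than everything being declared with our at the top. The sub name div_ids and the sample input below are invented for the example.

```perl
use strict;
use warnings;

# Returns the id attribute of every <div ...> tag, with dots replaced by underscores.
sub div_ids {
    my ($text) = @_;    # lexical: visible only inside this sub
    my @ids = $text =~ /<div[^>]*\bid="([^"]*)"/g;
    tr/./_/ for @ids;
    return @ids;
}

# The caller keeps its own lexical copy; nothing is shared through globals.
my @ids = div_ids(qq{<div type="poem" id="p.1">x</div><div type="note" id="n.2">y</div>});
print "@ids\n";    # p_1 n_2
```

With this layout a mistyped variable name inside a sub is a compile-time error under strict, instead of silently reading or clobbering a global.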
| [reply] |
|
|
|
|
Re: Need help with subdividing SGML files
by DukeLeto (Novice) on Mar 07, 2003 at 23:16 UTC
|
Thank you to everyone who has contributed, and we won't have the slightest idea whether the problem is solved until Wednesday at the earliest: ("Weekend" . "Jury Duty")
How would I go about activating the debug mode from the (Windows) command line?
| [reply] |
|
|
perl -d yourscript.pl
h h
See perldebug for more details.
| [reply] [d/l] |