Yes, I have such files. Parsing XML is not as easy as it might seem, which is why there are XML modules on CPAN.
For example, your program seems to have problems with nested tags like <ul><li><ul><li>1.1</li></ul></li></ul>.
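To illustrate why nesting is hard, here is a minimal sketch. It assumes a naive non-greedy regex, which is not necessarily what your program does; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $xml = '<ul><li><ul><li>1.1</li></ul></li></ul>';

# Naive approach (an assumption for illustration): match from the first
# <ul> to the first </ul>. The capture stops at the INNER closing tag.
my ($naive) = $xml =~ m{<ul>(.*?)</ul>};
print "naive match: $naive\n";    # prints: naive match: <li><ul><li>1.1</li>

# Counting nesting depth by hand does find the matching close here, but
# real XML (attributes, comments, CDATA, entities) needs far more than this.
my ($depth, $end) = (0, undef);
while ($xml =~ m{<(/?)ul>}g) {
    $depth += $1 ? -1 : 1;        # closing tag lowers the depth, opening raises it
    if ($depth == 0) {
        $end = pos $xml;          # offset just past the matching </ul>
        last;
    }
}
print "outer list ends at offset $end\n";
```

Even the depth counter only covers this one tag in this one shape, which is the practical argument for using a real parser from CPAN.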
So let me just make some general remarks about your program:
- Post your code here; that way more people will have a look at it. Only point to some other place for really big lumps of code. You can 'hide' long code behind <readmore> tags.
- Use a consistent style of indenting; it improves the readability of your code. I would strongly suggest aligning each closing brace with the statement that opened its block, as follows:
    if ($read) {
        if ($more) {
            # do something
        }
    }
- Have a look at the Getopt::... modules, especially Getopt::Long and Getopt::Declare (this one will satisfy your needs for command line processing in every respect, and I really mean *every* :). They make your command line processing much easier to code and thus to read and understand.
- Enable warnings, either by specifying
    #!/usr/bin/perl -w
    # or
    use warnings;
  This assists you in locating possible errors.
- Don't use global variables when you can avoid them. Things start getting really messy once your programs grow. Use my, especially for variables that are only used locally (e.g. $line).
- You can simplify/optimize some of the following code (this is not a complete list!):
    foreach $line (@File_pre_format) {
        $file_pre_format .= $line;
    }
    # better:
    $file_pre_format = join '', @File_pre_format;

    $file_pre_format =~ s/\n//g;
    $file_pre_format =~ s/\t//g;
    # better:
    $file_pre_format =~ s/\n|\t//g;
    # or in this simple case even better:
    $file_pre_format =~ tr/\n\t//d;
- And translate your error messages from German into English ;-)
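
To make the Getopt::Long, warnings and my suggestions above more concrete, here is a minimal sketch. The option names (--input, --indent, --verbose) and the sample argument list are hypothetical, since I don't know which switches your program really needs. In a real script you would call GetOptions(...) and let it work on @ARGV.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical sample arguments, standing in for @ARGV so the sketch
# can be run as it is.
my @args = ('--input', 'page.xml', '--indent', '4', '--verbose', 'extra');

# Lexical (my) variables with defaults instead of globals.
my %opt = (input => undef, indent => 2, verbose => 0);

GetOptionsFromArray(
    \@args,
    'input=s'  => \$opt{input},      # string value
    'indent=i' => \$opt{indent},     # integer value
    'verbose!' => \$opt{verbose},    # flag, also accepts --noverbose
) or die "Usage: $0 --input FILE [--indent N] [--verbose]\n";

print "input=$opt{input} indent=$opt{indent} verbose=$opt{verbose}\n";
print "remaining arguments: @args\n";
```

Getopt::Long ships with Perl, so nothing has to be installed. Whatever it does not recognise as an option is left in the array for your program to handle.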
-- Hofmator