Re: XML::Twig Help
by marto (Cardinal) on Jul 06, 2010 at 15:08 UTC
|
use strict;
use warnings;
in your code (see Use strict warnings and diagnostics or die and Use strict and warnings). You also aren't chomping user input, and you don't check that your file open worked.
Update: It may be worth spending some time working through some of the tutorials under Getting Started with Perl, or reading http://learn.perl.org.
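A minimal sketch of the two points above, chomping input and checking that open worked. The helper names read_line and open_for_reading are made up for illustration, and an in-memory string stands in for STDIN so the example runs on its own:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read one line from a filehandle and strip the newline.
sub read_line {
    my ($fh) = @_;
    my $line = <$fh>;
    return unless defined $line;    # nothing left to read
    chomp $line;
    return $line;
}

# Hypothetical helper: open a file for reading, or die saying why it failed.
sub open_for_reading {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open '$path' for reading: $!\n";
    return $fh;
}

# An in-memory handle stands in for STDIN here (assumption for the demo).
my $fake_stdin = "input.xml\n";
open(my $in, '<', \$fake_stdin) or die "Can't open in-memory handle: $!\n";
my $name = read_line($in);
print "Got '$name'\n";    # no trailing newline inside the quotes
```

In a real script you would call read_line(\*STDIN) and pass the result to open_for_reading.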
| [reply] [d/l] |
Re: XML::Twig Help
by toolic (Bishop) on Jul 06, 2010 at 15:28 UTC
|
$t->parsefile('$file');
Another issue is that single quotes create a literal string. You are passing the string $file to the function, instead of the contents of the $file variable. You really want:
$t->parsefile($file);
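To see the difference without involving XML::Twig at all, here is a small sketch (data.xml is just a placeholder name):

```perl
use strict;
use warnings;

my $file = 'data.xml';    # hypothetical file name

my $single = '$file';     # literal text: a dollar sign followed by "file"
my $double = "$file";     # interpolated: the contents of the $file variable

print "single: $single\n";    # prints: single: $file
print "double: $double\n";    # prints: double: data.xml
```

The double quotes around a lone variable are redundant, which is why plain $file is what you want in the parsefile call.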
| [reply] [d/l] [select] |
Re: XML::Twig Help
by roboticus (Chancellor) on Jul 06, 2010 at 15:42 UTC
|
shravnk:
Wow, I reviewed the code but I'm somewhat at a loss. There are so many things to say that it's difficult to tell where to start. Well, let's see. When I looked it over I found about 12 items in 10 lines of code.
First, I suggest that when you find yourself in the weeds, you should try using the strict, warnings and/or diagnostics modules. Then try to understand and resolve each message they produce. Next, whenever you use a module, it's often very instructive to start with a working example program and make changes incrementally to it, rather than doing it all at once. Making one change at a time to a working example will also help you learn exactly what you're doing.
I was thinking of doing a walkthrough item by item, but I don't have that sort of time. Here's a list of issues that you'll need to address:
- Always check your open statements for success
- Be sure you spell a variable name consistently throughout the program
- Double-check the argument lists of functions (hint: reread the docs for XML::Twig's new function)
- Single-quoted strings don't interpolate
- Arrays are 0 based, meaning an @array of two elements contains $array[0] and $array[1]
Other things you'll want to address at some point:
- Use the three-argument form of open
- Use the perl debugger to step through your program and verify that the variables hold what you think they do.
- When you rely on user input, be sure to check it for sanity before blindly using the value. When you check it, be sure to display a useful error message. For example, what happens when the user just presses the Enter key?
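As a rough sketch of the sanity-checking item (and of the 0-based indexing mentioned earlier), something like the following would do. The helper name parse_two_names and the wording of the message are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: split a line of user input into exactly two file
# names, dying with a useful message on anything else (including the user
# just pressing Enter).
sub parse_two_names {
    my ($line) = @_;
    $line = '' unless defined $line;
    chomp $line;
    my @fields = split ' ', $line;    # split on any run of whitespace
    die "Expected two file names: <input_file> <output_file>\n"
        unless @fields == 2;
    # Arrays are 0 based: the two names are $fields[0] and $fields[1].
    return ($fields[0], $fields[1]);
}

my ($in, $out) = parse_two_names("in.nml out.xml\n");
print "input=$in output=$out\n";
```

Calling parse_two_names("\n") dies with the usage message instead of carrying on with undefined values.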
I think I'll stop here and let you get some items corrected before continuing...
...roboticus
| [reply] |
|
|
#!/usr/bin/perl
print "Enter an input file, space, and output file\n";
use strict;
use warnings;
chomp(my $enter = <STDIN>);
$_ = $enter;
unless (m#.+\.nml .+\.xml#)
{print "Unreadable input";
}
my @fields = split(/ /, $enter, 9999);
use XML::Twig;
open (FILE, $fields[0]) or die "Can't open file";
my $file = <FILE>;
my $t= XML::Twig->new( twig_roots => { "djn-geo" => 1});
$t->parsefile($file);
open(STDOUT, ">$fields[1]");
$t->print;
Still not sure why the XML::Twig line isn't working, I copied it verbatim from an example, and just changed the tag
Thanks a lot,
shravnk | [reply] [d/l] |
|
|
shravnk:
So you're prompting the user for a filename, and verifying that it contains ".nml" or ".xml". If it doesn't contain that, you print an error message and continue on. Hmmm....
You really want to stop your program when you find a problem, so you should use the die function rather than print on line 8.
Another problem is that with the expression you've given "foo.xml.abc.def" is a valid filename.
After you open the file, you then read the first line of the file, and treat it as a filename to pass to parsefile. Unless your file contains the name of the XML file you're trying to process, that's going to be a bit of a problem...
Also, you have *another* open statement you haven't checked yet...
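One possible way to tighten the check, as a sketch only: anchor the pattern to the whole string and forbid spaces inside the names. The sub name looks_valid is made up, and whether you want to forbid spaces in file names is your call:

```perl
use strict;
use warnings;

# Hypothetical check: exactly "<something>.nml <something>.xml", nothing else.
# \A and \z anchor the match to the whole string; \S+ forbids embedded spaces.
sub looks_valid {
    my ($input) = @_;
    return $input =~ m{\A\S+\.nml \S+\.xml\z} ? 1 : 0;
}

for my $input ('in.nml out.xml', 'in.nml foo.xml.abc.def') {
    printf "%-24s %s\n", $input, looks_valid($input) ? 'ok' : 'rejected';
}
```

In your program you would then write die "Unreadable input\n" unless looks_valid($enter); so that the script stops instead of continuing with bad input.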
...roboticus
| [reply] |
Re: XML::Twig Help
by mirod (Canon) on Jul 06, 2010 at 16:23 UTC
|
The error you get comes from the fact that you need to quote djn-geo, as it doesn't look like a bareword to Perl.
That said there are many other problems with your code. I suspect what you want is something like this (untested):
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
# get the user input, die if it doesn't look right
chomp( my $enter = <STDIN>);   # strip the newline so it doesn't end up in the output file name
my @fields = split(/ /, $enter);
if( @fields != 2) { die "usage: $0 <input_file> <output_file>\n"; }
# no need to open the file, quote djn-geo
my $t= XML::Twig->new( twig_roots => { 'djn-geo' => 1});
$t->parsefile($fields[0]);
# print_to_file was added in XML::Twig 3.23
$t->print_to_file( $fields[1]);
| [reply] [d/l] |
|
|
mirod:
This has been enormously helpful, and your code works when I process one tree at a time. However, the XML doc that I am trying to work with contains hundreds of trees with the same format. How can I get the XML::Twig module to iterate over many trees within one document? (This will be my last question, I promise)
Thanks again
shravnk
| [reply] |
|
|
A proper XML file should not have more than 1 root element, if that is what you mean by "many trees". Otherwise, mirod has a few posts here on PM about how to process XML in chunks. When I started with XML::Twig, I read all his posts here and that answered every question. If you have perldoc, you can look at the twig page and search for flush and purge. I haven't been able to reach xmltwig.com from here for the last week, so I can't link those examples.
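In the meantime, here is a rough sketch of chunked processing with a handler and purge. The sample XML is invented (I don't know what your real document looks like), and it is parsed from a string just to keep the example self-contained:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample: one root element wrapping several djn-geo elements.
my $xml = <<'XML';
<doc>
  <item><djn-geo>US</djn-geo></item>
  <item><djn-geo>FR</djn-geo></item>
</doc>
XML

my @seen;
my $t = XML::Twig->new(
    twig_handlers => {
        # Called once for each djn-geo element, as soon as it is fully parsed.
        'djn-geo' => sub {
            my ($twig, $elt) = @_;
            push @seen, $elt->text;
            $twig->purge;    # free the memory used by what has been parsed so far
        },
    },
);
$t->parse($xml);    # use $t->parsefile($file) for a file on disk

print "$_\n" for @seen;
```

The handler runs for every matching element, so hundreds of them in one document are handled one at a time. If you want the processed XML written out as you go rather than discarded, flush is the counterpart of purge.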
| [reply] |
Re: XML::Twig Help
by AndyZaft (Hermit) on Jul 06, 2010 at 15:24 UTC
|
Can you show the XML that gets opened by that Twig? First, let's make sure that whatever gets opened actually has a djn-geo tag in there. This error comes from the set_handler sub, so you could also try just opening the XML without any handlers set and print it. Maybe what you are opening is not what you think you are opening. | [reply] |
|
|
#!/usr/bin/perl
use strict;
use warnings;
chomp(my $enter = <STDIN>);
my @fields = split(/ /, $enter, 9999);
use XML::Twig;
open (FILE, $fields[1]);
my $file = <FILE>;
my $t= XML::Twig->new( twig_roots => { 'djn-geo' => 1});
$t->parsefile($file);
open(STDOUT, ">$fields[2]");
$t->print;
And I get the error:
Unsuccessful stat on filename containing newline at C:/Perl/site/lib/XML/Twig.pm
line 684, <FILE> line 1.
Couldn't open version="1.0" encoding="ISO-8859-1"?>
:
Invalid argument at C:\Documents and Settings\odeab\My Documents\Exec_Extract\tw
iggie.pl line 10
at C:\Documents and Settings\odeab\My Documents\Exec_Extract\twiggie.pl line 10
| [reply] [d/l] |
|
|
Well, before we go further: Twig will tell you that it is not well-formed, likely because <\c> is not a valid tag. If your XML is filled with tags like that, you will run into problems even if you manage to open the files themselves.
From your code I assume you are trying to type in a few XML filenames and then process them. First problem: if you only have 1 XML file as a test, you need to open $fields[0], otherwise open will fail with an uninitialized value. Then, once you fix that, you are trying to open a file that's named after the first line of the first file. You also don't need to open a file for XML::Twig, the module can do it itself. I would suggest starting simple: take 1 XML file and create a simple perl script that opens it and does something with it.
Something along the lines of:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
my $file = "yourxmlfilename.xml";
my $t= XML::Twig->new( twig_roots => { 'djn-geo' => 1});
$t->parsefile($file);
$t->print;
Based on one more assumption from your code, namely that you are trying to write the output to the second file given in the input: you can redirect later, or at the command line, although I'm not sure how that behaves on Windows nowadays. In any case you can also use $t->print_to_file($anotherfilename), among many other ways. | [reply] [d/l] [select] |