in reply to Segmentation fault from XML::Twig::Elt

The script ran through when I commented out the paste() line, that's how.
As you advised, I updated my XML:: module stack, but I get the same error after that. So I reduced the script to a standalone test case, and here it is. It segfaults after the first sitemap:
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);
use XML::Twig;
use URI::Escape;

my @urls;
for(1..150000){
    push(@urls, {
        url => "http://www.example.com/some/really/interesting/file.html",
        priority => 0.5,
    } );
}

for(1..3){
    my $outfile = "test_sitemap". $_ .".xml";
    print "Working on $outfile, map $_ of 3 (".@urls." remaining URLs)\n";

    ## this sitemap
    my $cnt;
    my $sitemap;
    $sitemap = Sitemap->new(
        type => 'html',
        file => $outfile
    );

    # build the xml structure inside out
    foreach my $i (1..49500){
        my $url = shift(@urls);
        last if @urls == 0;
        next if $url->{url} eq '';

        ## add
        $sitemap->add({
            loc => $url->{url},
            priority => $url->{priority}
        });
        $cnt++;
    }

    my $output = $sitemap->output();
    print " Count: ".($sitemap->{cnt}->{html}||0)." html urls, ".($sitemap->{cnt}->{video}||0)." video urls.\n";
    $sitemap = undef;

    # check length
    if(length($output) < 10485760){
        print " This sitemap file is OK: (uncompressed under 10MB) ".(length($output)) ." bytes\n";
    }else{
        print " This sitemap file is TOO BIG (uncompressed over 10MB).\n";
    }

    open(my $DATEI, ">$outfile") or die "Could not open output file: $!";
    binmode($DATEI);
    print $DATEI $output;
    close($DATEI);
    $output = undef;
    print " $cnt URLs added to sitemap $outfile.\n";

    print " Doing gzip compress... ";
    system "gzip -f9 $outfile"; # f forced overwrite, compression level 9/best
    print "done.\n";

    if(-s "$outfile.gz" < 10485760){
        print " This sitemap file is OK (compressed under 10MB) ". (-s "$outfile.gz") ." bytes.\n";
    }else{
        print " This sitemap file is TOO BIG (compressed over 10MB).\n";
    }
}

package Sitemap;

use strict;
use warnings;
use XML::Twig;
use open OUT => ':utf8';
use HTML::Entities qw(encode_entities);
use Data::Dumper;

sub new {
    my ($class, %cfg) = @_;

    my $self = bless {
        xml => undef,
        type => $cfg{type} || 'html', # sitemaps are either "video" or "html"
        cnt => undef,
    }, $class;

    if($cfg{type} eq 'video'){
        # init XML::Twig for a video-sitemap
        $self->{xml} = XML::Twig::Elt->new('urlset', {
            'xmlns' => 'http://www.sitemaps.org/schemas/sitemap/0.9',
            'xmlns:video' => 'http://www.google.com/schemas/sitemap-video/1.1',
        });
    }else{
        # init XML::Twig for standard sitemap
        $self->{xml} = XML::Twig::Elt->new('urlset', {
            'xmlns' => 'http://www.google.com/schemas/sitemap/0.84',
            'xmlns:xsi' => 'http://www.w3.org/2001/XMLSchema-instance',
            'xsi:schemaLocation' => join(' ',
                'http://www.google.com/schemas/sitemap/0.84',
                'http://www.google.com/schemas/sitemap/0.84/sitemap.xsd',
            ),
        });
    }

    return $self;
}

sub add {
    my $self = shift;
    my $ref = shift;

    if($self->{type} eq 'video'){
        $self->add_video($ref);
    }else{
        $self->add_html($ref);
    }
}

sub add_html {
    my $self = shift;
    my $ref = shift;

    # step over elements in fixed order
    my @elements;
    foreach my $key (qw(loc changefreq lastmod priority)){
        if($key eq 'loc'){
            if($ref->{loc}){
                push(@elements, XML::Twig::Elt->new('loc', {}, $ref->{loc}) );
            }else{
                print "Error: 'loc' is a mandatory value and missing!\n"
            }
        }elsif($key eq 'priority'){
            if($ref->{priority}){
                push(@elements, XML::Twig::Elt->new('priority', {}, $ref->{priority}) );
            }else{
                print "Error: 'priority' is a mandatory value and missing!\n"
            }
        }
    }

    # wrap these sub-elements into an "url"-level
    my $elt = XML::Twig::Elt->new('url', {}, @elements);
    undef(@elements);

    # and add this url-sub-element nest to the xml-document
    $elt->paste(last_child => $self->{xml});
    $self->{cnt}->{html}++;
}

sub add_video {
    my $self = shift;
    my $ref = shift;
    ## omitted
}

sub output {
    my $self = shift;
    my $file = shift;

    $self->{xml}->set_pretty_print('indented');
    my $header = '<?xml version="1.0" encoding="UTF-8"?>'."\n";

    if($file){
        open(my $fh, ">$file") or die "Could not open output file: $!";
        binmode($fh);
        print $fh $header . $self->{xml}->sprint();
        close($fh);
        return;
    }else{
        return $header . $self->{xml}->sprint();
    }
}

sub DESTROY {
    my $self = shift;
    undef($self->{xml});
    undef($self);
}

1;

First I thought it was a memory leak problem, that's why I do all these manual/forced undefs, btw. Please don't mock me if it's a really stupid typo or so...

Re^2: Segmentation fault from XML::Twig::Elt
by Anonymous Monk on Nov 07, 2011 at 08:20 UTC

    To clarify, I do not believe this is an XML::Twig::Elt bug; after all, it is pure Perl.

    It is probably a perl bug, or a C/XS bug in XML::Parser::Expat, Scalar::Util, XSLoader, Encode...

    In any case, this is not good :/

    If you're on Linux, you might give strace a shot to try to pinpoint this further.

Re^2: Segmentation fault from XML::Twig::Elt
by Anonymous Monk on Nov 07, 2011 at 08:13 UTC

    I can confirm a serious problem. I get a premature end of the program. There is no exception (no segfault message); the program just ends for no apparent reason before it's finished.

    It has something to do with Scalar::Util::weaken.

    I've whittled the proof down to

    #!/usr/bin/perl --
    $Devel::Trace::TRACE = 0;   # Disable
    use strict;
    use warnings;
    use XML::Twig;

    Main( @ARGV );
    warn "main who\n";
    exit( 0 );

    sub Main {
        my $self = {};
        $self->{xml} = XML::Twig::Elt->new('urlset', {
            'xmlns' => 'http://www.google.com/schemas/sitemap/0.84',
            'xmlns:xsi' => 'http://www.w3.org/2001/XMLSchema-instance',
            'xsi:schemaLocation' => join(' ',
                'http://www.google.com/schemas/sitemap/0.84',
                'http://www.google.com/schemas/sitemap/0.84/sitemap.xsd',
            ),
        });

    #~ <url>
    #~ <loc>http://www.example.com/some/really/interesting/file.html</loc>
    #~ <priority>0.5</priority>
    #~ </url>

    #~ for( 0 .. 149500 ){
    #~ for( 0 .. 49500 ){
    #~ for( 0 .. (49500-39500) ){
    #~ for( 0 .. (10000 - 5000) ){
    #~ for( 0 .. ( 5000 - 200 ) ){
    #~ for( 0 .. ( 5000 - 140 ) ){
    #~ for( 0 .. ( 5000 - 170 ) ){
    #~ for( 0 .. ( 5000 - 180 ) ){
    #~ for( 0 .. ( 5000 - 190 ) ){
    #~ for( 0 .. ( 5000 - 183 ) ){
    #~ for( 0 .. ( 5000 - 185 ) ){
    #~ for( 0 .. ( 5000 - 183 ) ){
        for( 0 .. 4817 ){
            my @elements;
            push(@elements, XML::Twig::Elt->new('loc', {}, 'http://www.example.com/some/really/interesting/file.html' ) );
            push(@elements, XML::Twig::Elt->new('priority', {}, '0.5' ) );
            my $elt = XML::Twig::Elt->new('url', {}, @elements);
            $elt->paste(last_child => $self->{xml});
            warn "main who $_\n" if $_ > ( 49497 - 3);
        }

        binmode STDOUT;
    #~ print $self->{xml}->sprint();# FAIL,incomplete
        $self->{xml}->set_pretty_print('indented');
        my $header = '<?xml version="1.0" encoding="UTF-8"?>'."\n";
        $self->{xml}->print_to_file('2'); # NO cutoff XML s, but still exits on undef;
    #~ $self->{xml}->purge;
    #~ Can't call method "purge_up_to" on an undefined value at C:/perl/site/5.14.1/lib/XML/Twig.pm line 8087.
    #~ print $header;
    #~ $self->{xml}->print(); # XML is cut off
    #~ print $self->{xml}->sprint(); # XML is cut off, no closing
    #~ print $self->{xml}->flush; # fail
    #~ Can't call method "flush_up_to" on an undefined value at C:/perl/site/5.14.1/lib/XML/Twig.pm line 8082.

        warn "about to undef ";
        $Devel::Trace::TRACE = 1;   # Enable
        undef $self->{xml};
        warn "after undef ";
        return;
    }
    __END__

    The Devel::Trace output shows sub XML::Twig::Elt::DESTROY checking whether to break circular references manually, 4820 times (versus 4817 children), and skipping each time because we're using Scalar::Util::weaken:

    undef $self->{xml};
    >> C:/perl/site/5.14.1/lib/XML/Twig.pm:7534: { my $elt= shift;
    >> C:/perl/site/5.14.1/lib/XML/Twig.pm:7535: return if( $XML::Twig::weakrefs);
    >> C:/perl/site/5.14.1/lib/XML/Twig.pm:7534: { my $elt= shift;
    >> C:/perl/site/5.14.1/lib/XML/Twig.pm:7535: return if( $XML::Twig::weakrefs);
    ...
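
For readers unfamiliar with weak references, here is a minimal sketch of the general mechanism the trace refers to. It does not use XML::Twig at all: the Node class, its fields, and the counter are hypothetical names made up for illustration. It only shows why a DESTROY method has nothing to break manually once the child-to-parent link has been weakened with Scalar::Util::weaken.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $destroyed = 0;    # counts DESTROY calls (illustration only)

package Node;         # hypothetical tree node, not XML::Twig::Elt

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, children => [] }, $class;
}

sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;    # strong link: parent -> child
    $child->{parent} = $self;
    Scalar::Util::weaken($child->{parent}); # weak link: child -> parent
    return $child;
}

sub DESTROY { $destroyed++ }

package main;

my $parent_is_weak;
{
    my $root = Node->new('urlset');
    $root->add_child( Node->new("url$_") ) for 1 .. 3;
    $parent_is_weak = isweak( $root->{children}[0]{parent} ) ? 1 : 0;
}   # $root goes out of scope here; the weak back-links do not keep it alive

print "parent link is weak: $parent_is_weak\n";
print "nodes destroyed: $destroyed\n";
```

Because the back-link is weak, the whole tree is released as soon as the last strong reference to the root goes away, so no manual cycle breaking is needed. This says nothing about why the process dies during that release.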

    Adding local $XML::Twig::weakrefs; before the undef doesn't make the problem go away.

    I'm using perl 5.14.1 on win32:
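
As a side note on what that attempt does: local on a package variable temporarily gives it an undefined (false) value until the enclosing block ends, so a check like the "return if" seen in the trace would no longer return early. The sketch below shows only this general Perl behaviour; My::Module, its $weakrefs variable, and cleanup_mode are hypothetical stand-ins, not XML::Twig code.

```perl
use strict;
use warnings;

package My::Module;   # hypothetical stand-in for a module with a package flag

our $weakrefs = 1;

sub cleanup_mode {
    return $weakrefs ? 'skip manual cleanup' : 'break cycles manually';
}

package main;

my @seen;
push @seen, My::Module::cleanup_mode();      # flag is true

{
    local $My::Module::weakrefs;             # undef (false) until the block ends
    push @seen, My::Module::cleanup_mode();  # flag is false in here
}

push @seen, My::Module::cleanup_mode();      # original value restored

print "$_\n" for @seen;
```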

    $ perl -d:Modlist -e " use strict; use warnings; use XML::Twig; "
    Carp                 1.23
    Config
    DynaLoader           1.13
    Encode               2.44
    Encode::Alias        2.15
    Encode::Config       2.05
    Encode::Encoding     2.05
    Exporter             5.65
    Exporter::Heavy      5.65
    File::Basename       2.82
    File::Glob           1.12
    File::Spec           3.33
    File::Spec::Unix     3.33
    File::Spec::Win32    3.33
    List::Util           1.23
    Scalar::Util         1.23
    UNIVERSAL            1.08
    XML::Parser          2.41
    XML::Parser::Expat   2.41
    XML::Twig            3.39
    XSLoader             0.15
    base                 2.16
    bytes                1.04
    constant             1.21
    feature              1.20
    overload             1.13
    utf8                 1.09
    vars                 1.02
    warnings             1.12
    warnings::register   1.02

      I can reproduce the problem on win32 (perl 5.12.3), but not on Linux (perl 5.12.3, 5.12.4 or 5.14.2). Since it's the same code on all 3 versions, I would think it's a bug somewhere in Scalar::Util::weaken or in the core, but only on Windows. That is a bit annoying, since I don't develop on Windows; I only use it to test modules before a release.

      Honestly, I am not sure where to go from here. Any ideas?

        Most of this thread is already beyond my knowledge of the internal workings,
        but I can contribute that I got this error on a:
        Debian 4.0 box with Perl v5.10.0 built for i486-linux-gnu-thread-multi with these module versions: and on
        Ubuntu 11.04 with Perl v5.10.1 (*) built for i686-linux-gnu-thread-multi.