mksaad has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am using the Parse::MediaWikiDump package to process static XML dumps of three language editions of Wikipedia. For each page in English, the script tries to get the corresponding pages in the other two languages. The script runs correctly and gets the required pages, but after a while I get a segmentation fault and the process does not complete. The Perl debugger points to the line:

while(defined($frPage = $frPages->next)) {

which causes the segmentation fault.

I have read a lot about the possible causes of a segmentation fault (a memory limit, a bug in the package, etc.), but I have not found a specific way to tell which one applies here.

It is very strange that this statement worked thousands of times and then caused a segmentation fault. I am afraid it is either a memory problem or a bug in the CPAN Parse::MediaWikiDump package.

Any comments or tips will be appreciated.

thanks

best regards,

Motaz

Replies are listed 'Best First'.
Re: Parse::MediaWikiDump segmentation fault
by Khen1950fx (Canon) on Jan 03, 2012 at 12:45 UTC

    The documentation says that Parse::MediaWikiDump is being retired and to start using MediaWiki::DumpFile and/or MediaWiki::DumpFile::Compat immediately.

    This software is being RETIRED - MediaWiki::DumpFile is the official successor to Parse::MediaWikiDump and includes a compatibility library called MediaWiki::DumpFile::Compat that is 100% API compatible and is a near perfect standin for this module. It is faster in all instances where it counts and is actively maintained. Any undocumented deviation of MediaWiki::DumpFile::Compat from Parse::MediaWikiDump is considered a bug and will be fixed.
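Since the quoted documentation describes MediaWiki::DumpFile::Compat as API compatible, the switch might be as small as changing which module is loaded. The following is only a sketch under that assumption: each_page is a hypothetical helper name, the dump file path comes from the command line, and MediaWiki::DumpFile must already be installed for the bottom part to run.

```perl
use strict;
use warnings;

# Walk any pages iterator that exposes next(), as the Parse::MediaWikiDump
# pages object does, and call $callback on each page.
# Returns the number of pages read.
sub each_page {
    my ($pages, $callback) = @_;
    my $count = 0;
    while (defined(my $page = $pages->next)) {
        $callback->($page);
        $count++;
    }
    return $count;
}

# Only runs when a dump file is given on the command line.
if (@ARGV) {
    # Assumption: MediaWiki::DumpFile is installed. Loading the Compat
    # module provides the Parse::MediaWikiDump class names.
    require MediaWiki::DumpFile::Compat;
    my $frPages = Parse::MediaWikiDump->new->pages($ARGV[0]);
    my $n = each_page($frPages, sub { print $_[0]->title, "\n" });
    print STDERR "read $n pages\n";
}
```

The loop body is the same shape as the one in the question, so if the crash disappears after the swap, that would point at the old parser rather than at the script.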

      Thanks. I tried to install the successor package but could not. I installed Parse::MediaWikiDump with the following command:
      # sudo apt-get install libparse-mediawikidump-perl
      What is the package name for the successor software? Thanks.
Re: Parse::MediaWikiDump segmentation fault
by grondilu (Friar) on Jan 03, 2012 at 12:15 UTC
    Have you checked the memory usage by running top in another console during execution?
      I watched the memory usage with GNOME System Monitor and it was only about 30 MB.
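An external monitor only samples now and then, so one option is to have the script report its own memory and the last page it read. The sketch below assumes Linux, where /proc/self/status contains a VmRSS line; rss_kb and progress_line are hypothetical helper names, not part of either module.

```perl
use strict;
use warnings;

# Return this process's resident set size in kB, or undef if unavailable.
# Assumption: Linux, where /proc/self/status has a "VmRSS: <n> kB" line.
sub rss_kb {
    open(my $fh, '<', '/proc/self/status') or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Format one progress line: page count, last title read, and memory use.
sub progress_line {
    my ($count, $title) = @_;
    my $rss = rss_kb();
    return sprintf("%d pages, last title: %s, RSS: %s\n",
        $count, $title, defined $rss ? "$rss kB" : 'unknown');
}
```

Inside the loop from the question this could be called as print STDERR progress_line($count, $frPage->title) if ++$count % 1000 == 0; STDERR is unbuffered by default, so the last line printed should survive the crash and show roughly where in the dump it happened.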