What you're missing is a grasp of "event-driven" programming. It's a distinct style of programming, just as OOP is (though the two are not mutually exclusive; XML::Twig is both OO and event-driven). An event-driven parser is a good example of this programming model. (It is also the norm in GUI programming.) Such a parser has a core functionality (namely, parsing text according to some syntax), but the programmer can customize it by "registering" subroutines with the parser, each associated with a specific parsing event (e.g. finding a closing tag). The parser then invokes these pre-registered subroutines, with a pre-specified set of arguments, at the appropriate times during the parsing. The subroutines one registers with the parser are called "callbacks" or "handlers" [1].
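To make the idea of registering callbacks concrete, here is a minimal sketch of an event-driven "parser" in plain Perl. Everything in it is made up for illustration: the ToyParser package, its handlers argument, and the word and end_of_line events are hypothetical and have nothing to do with how XML::Twig is implemented. The only point is that the parser, not your code, decides when the callbacks run.

```perl
use strict;
use warnings;

# Hypothetical toy parser, for illustration only: it splits text into
# words and fires one event per word and one at the end of each line.
package ToyParser;

sub new {
    my ($class, %args) = @_;
    # handlers maps an event name to a code reference (the callback)
    return bless { handlers => $args{handlers} || {} }, $class;
}

sub parse {
    my ($self, $text) = @_;
    for my $line (split /\n/, $text) {
        $self->_fire(word => $_) for split ' ', $line;
        $self->_fire(end_of_line => $line);
    }
    return;
}

# Look up the callback registered for this event, if any, and invoke it
# with a fixed set of arguments: the parser object and the event data.
sub _fire {
    my ($self, $event, $data) = @_;
    my $cb = $self->{handlers}{$event} or return;
    $cb->($self, $data);
    return;
}

package main;

my @log;
my $parser = ToyParser->new(
    handlers => {
        word        => sub { my ($p, $word) = @_; push @log, "word:$word" },
        end_of_line => sub { push @log, "eol" },
    },
);

# This single call sets everything in motion; the callbacks above are
# never called directly by our own code.
$parser->parse("a b\nc");
print "$_\n" for @log;
```

Note that the two anonymous subs never call each other. They only share @log, and the order of its entries is determined by the input text.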

The subs topic and extpage are two such handlers. The parser invokes topic whenever it finishes parsing a Topic section, and extpage whenever it finishes parsing an ExternalPage section. Each receives two arguments from the parser: the XML::Twig object and the XML element that the parser just finished parsing. (This answers your first question.)

These two subroutines run separately from each other; in other words, neither of them calls the other one. This rules out direct communication between the two subs. One way around this is for them to communicate through shared variables (here, %links). In this case indirect communication is necessary, since extpage cannot backtrack over the XML to see what links, if any, were found by topic. In the code I wrote, only the keys of %links are used; saving the actual link objects as the values for these keys is just there for some potential future use. The code would work just as well if those values were all 1, say.
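Here is a stripped-down sketch of that kind of communication through a shared hash. It is not the code from earlier in the thread: the sample XML, the element and attribute names (link, resource, about, Title) and the @kept array are assumptions made up so that the example is self-contained. The XML::Twig calls used are twig_handlers, parse, children, att, first_child_text and purge.

```perl
use strict;
use warnings;
use XML::Twig;

my %links;   # shared state: written by topic(), read by extpage()
my @kept;    # titles of pages that an earlier Topic linked to

my $twig = XML::Twig->new(
    twig_handlers => {
        Topic        => \&topic,
        ExternalPage => \&extpage,
    },
);

# Made-up sample data. parse() takes a string; parsefile() takes a file name.
$twig->parse(<<'XML');
<RDF>
  <Topic id="Top/A"><link resource="http://a.example/"/></Topic>
  <ExternalPage about="http://a.example/"><Title>Page A</Title></ExternalPage>
  <ExternalPage about="http://x.example/"><Title>Page X</Title></ExternalPage>
</RDF>
XML

# Called by the parser each time a Topic element has been fully parsed.
sub topic {
    my ($twig, $topic) = @_;
    # Only the keys matter here, so the value is simply 1.
    $links{ $_->att('resource') } = 1 for $topic->children('link');
    $twig->purge;   # free the memory used by what has been parsed so far
}

# Called by the parser each time an ExternalPage element has been fully parsed.
sub extpage {
    my ($twig, $page) = @_;
    push @kept, $page->first_child_text('Title')
        if exists $links{ $page->att('about') };
    $twig->purge;
}

print "$_\n" for @kept;
```

Only "Page A" is kept: by the time extpage sees it, topic has already recorded its URL in %links, while no Topic ever mentioned the URL of "Page X".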

Note that these two subroutines run multiple times during the parsing operation. This is a key point. It is not the case that all the calls to topic happen first, and then all the calls to extpage. The calls to the two subs are interleaved, following the order in which the Topic and ExternalPage sections appear in the file.
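If you want to see this ordering for yourself, a small trace like the following can help. Again this is only a sketch with made-up sample data (the id and about attribute values and the @order array are assumptions); each handler just records that it was called.

```perl
use strict;
use warnings;
use XML::Twig;

my @order;   # records which handler ran, in the order the parser ran them

my $twig = XML::Twig->new(
    twig_handlers => {
        # $_[0] is the twig, $_[1] is the element that was just parsed
        Topic        => sub { push @order, 'topic:'   . $_[1]->att('id') },
        ExternalPage => sub { push @order, 'extpage:' . $_[1]->att('about') },
    },
);

# Made-up sample data with Topic and ExternalPage elements mixed together.
$twig->parse(<<'XML');
<RDF>
  <Topic id="T1"/>
  <ExternalPage about="P1"/>
  <ExternalPage about="P2"/>
  <Topic id="T2"/>
  <ExternalPage about="P3"/>
</RDF>
XML

print join(' ', @order), "\n";
```

The trace follows document order, so the second call to the Topic handler happens after two calls to the ExternalPage handler, not before them.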

...coz I'm really doesn't know how these two subroutine connect with each other or what is the run order of them?

The parser takes care of invoking the subroutines at the right time during the parsing; in this case, they get invoked once the parser finishes parsing a Topic or ExternalPage section, respectively. This all happens as the result of the call to $twig->parsefile('./sample.xml'); it is this call that sets off the whole sequence of events that ultimately causes the handlers to be invoked by the parser.

[1] Sometimes they are also called "hooks", although I have also seen the term "hook" used to refer to the places in the source code for the parser (or whatever) where the callbacks are invoked. You can think of these "hooks" as places provided by the author of the parser where the programmer using the parser can "hang" custom code.

Update: The first chapter of Higher-Order Perl (HOP) has a nice discussion of callbacks.

the lowliest monk


In reply to Re^5: Memory errors while processing 2GB XML file with XML:Twig on Windows 2000 by tlm
in thread Memory errors while processing 2GB XML file with XML:Twig on Windows 2000 by nan
