PerlMonks  

So here is the XML::Twig version (warning: not tested, I can't test it this week, so let's see how many bugs you find in there!)

#!/bin/perl -w
use strict;
use XML::Twig;

my $MAIN_INDEX   = "links_main.html"; # main index, linked to categories
my $INDEX_SUFFIX = "_links.html";     # used to generate the various files per category
my $MAIN_TITLE   = "My links";        # title for the main index
my $INDEX_TITLE  = "Links for %s";    # printf format for low level index titles

my $twig = XML::Twig->new;
$twig->parsefile( './links.xml');     # load the xml doc in memory
my @link = $twig->root->children( 'link');

# first let's get the categories
my %category;
$category{ $_->att( 'category') }++ foreach (@link);

# put the categories in an array, sorted by number of links in descending
# order (ties broken by name, so that the order is well defined)
my @category = sort { $category{$b} <=> $category{$a} || $a cmp $b } keys %category;

# generate the main link page
open( MAIN, ">$MAIN_INDEX") or die "$0 cannot open $MAIN_INDEX: $!";

# I know I coulda used CGI.pm...
print MAIN qq{<html><head><title>$MAIN_TITLE</title></head>
<body><h1>$MAIN_TITLE</h1>
<ul>};

foreach my $category (@category) {
    printf MAIN qq{<li><a href="%s">%s<small> (%s links)</small></a></li>\n},
        category_file( $category), $category, $category{$category};
}

print MAIN qq{</ul></body></html>};
close MAIN;

# now let's create the categories

# it will be easier if we sort the links by category,
# in the same order as the @category list
my %rank;
@rank{@category} = (0 .. $#category);  # category name => position in @category

# Hi [merlyn]!
@link = map  { $_->[1] }
        sort { $a->[0] <=> $b->[0] }
        map  { [ $rank{ $_->att( 'category') }, $_ ] } @link;

foreach my $category (@category) {
    my $category_file = category_file( $category);
    open( INDEX, ">$category_file") or die "$0 cannot open $category_file: $!";
    my $title = sprintf $INDEX_TITLE, $category;
    print INDEX qq{<html><head><title>$title</title></head>
<body><h1>$title</h1>
<ul>};

    # as the links are ordered we know the links for the
    # current category are at the beginning of @link;
    # peek before shifting so the next category keeps its first link
    while( @link && $link[0]->att( 'category') eq $category) {
        my $link = shift @link;
        printf INDEX qq{<li><a href="%s">%s</a> %s</li>\n},
            $link->att( 'url'), $link->att( 'name'), $link->att( 'description');
    }

    print INDEX qq{</ul>
<hr><p align="center"><a href="$MAIN_INDEX">$MAIN_TITLE</a></p></body></html>};
    close INDEX;
}

sub category_file {
    my $category = shift;
    return lc( $category) . $INDEX_SUFFIX;
}

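This design does not really allow for a different way of sorting the categories: the loop that writes each category page relies on the links being sorted in exactly the same order as the category list. You would also need to modify it slightly if you want to have next/previous index links.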
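If you do want to choose the category order freely, one option is to group the links per category first and only then decide on the order. Here is a minimal sketch of that idea. It uses plain hashes as stand-ins for the twig elements, and the sample data and the sub names group_by_category and ordered_categories are made up for the illustration, so treat it as a starting point rather than a drop-in replacement.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the <link> elements: plain hashes using
# the same attribute names the script reads through att().
my @links = (
    { category => 'Search', name => 'Engine A', url => 'http://a.example.org/' },
    { category => 'Perl',   name => 'Perl site', url => 'http://p.example.org/' },
    { category => 'Search', name => 'Engine B', url => 'http://b.example.org/' },
);

# Group the links: category name => array ref of links in that category.
sub group_by_category {
    my @items = @_;
    my %by_category;
    push @{ $by_category{ $_->{category} } }, $_ for @items;
    return \%by_category;
}

# Return the category names in the order defined by a comparator.
# The comparator receives the grouped hash and two category names.
sub ordered_categories {
    my ($by_category, $compare) = @_;
    return sort { $compare->($by_category, $a, $b) } keys %$by_category;
}

my $groups = group_by_category(@links);

# Most links first (as in the script), with ties broken by name.
my @by_count = ordered_categories($groups, sub {
    my ($g, $x, $y) = @_;
    return @{ $g->{$y} } <=> @{ $g->{$x} } || $x cmp $y;
});

# A different order with no other change: alphabetical.
my @by_name = ordered_categories($groups, sub { $_[1] cmp $_[2] });

print "by count: @by_count\n";
print "by name:  @by_name\n";
```

With the links grouped like this, each category page can be written from $groups->{$category} directly, and the previous/next category for index links is just the neighbouring entry in whichever ordered list you picked.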

Now the ObNoE (Obligatory Note on Encodings; yes, I know it starts like obnoxious ;--). As you seem to have sites from various countries in your link list, I am pretty sure your system will break as soon as you include an accented description. If you have accented characters in a non-UTF-8 encoding in your original XML file (most likely latin1, aka ISO-8859-1 if my memory serves me well; that's what most Western sites use), you will have to add an XML declaration at the top of your document, something like <?xml version="1.0" encoding="ISO-8859-1"?>. This also means that you will not be able to mix encodings in the same file (like including a link to a Japanese site with a Shift-JIS encoded description). The output will be UTF-8 encoded. I hope your browser can display it; otherwise you will have to convert everything back to whatever your favourite encoding is, or use the KeepEncoding option when you create the XML::Twig object (if you are using a 1-byte encoding like latin1). Welcome to the beautiful world of XML encoding!
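For the "convert everything back" route, here is a small sketch using the core Encode module. That module is my choice for the illustration and is not used by the script above, and the helper name utf8_to is made up. It only shows the byte-level conversion of one string; in the script you would apply it to the text before printing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert a UTF-8 byte string to another encoding (ISO-8859-1 by default).
sub utf8_to {
    my ($utf8_bytes, $target) = @_;
    $target ||= 'ISO-8859-1';
    my $chars = decode('UTF-8', $utf8_bytes);  # bytes -> Perl characters
    return encode($target, $chars);            # characters -> target bytes
}

# "cafe" with an e-acute, as UTF-8 bytes: the accented letter
# takes two bytes (0xC3 0xA9).
my $utf8   = "caf\xC3\xA9";
my $latin1 = utf8_to($utf8);

printf "UTF-8:  %d bytes\n", length $utf8;
printf "latin1: %d bytes\n", length $latin1;
```

This only works for characters that exist in the target encoding, which is the same restriction as above: a description that needs characters outside latin1 cannot survive the trip back.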


In reply to Re: XML Manipulation by mirod
in thread XML Manipulation by larsen
