Broken headlines

Re: Broken headlines
by Aristotle (Chancellor) on Oct 01, 2003 at 14:09 UTC

What XML generators are currently available on PerlMonks?

An RDF feed for the Monastery. It's a little broken, but it should be easy enough for you to parse with perl. Be warned, your newsreader will probably not like this feed.

Makeshifts last the longest.

Re: Re: Broken headlines

by Juerd (Abbot) on Oct 01, 2003 at 16:38 UTC

An RDF feed for the Monastery. It's a little broken, but it should be easy enough for you to parse with perl.

I am using Perl, specifically XML::RSS. Besides, broken XML is not XML. This site doesn't use XML; it uses something that happens to look like it.

"Don't parse XML with an XML parser, use regexes!". I guess it must be very hard to generate correct XML. After all -- and XML barbie concurs -- XML is *hard*!

I will use an extra Perl script. Not to parse the XML, because that would be extremely silly. But to try to make valid XML from the string I get.

Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

Re^3: Broken headlines

by jpfarmer (Pilgrim) on Oct 20, 2003 at 17:03 UTC

Would you consider sharing the script you're using to re-format the XML?

Re: Re^3: Broken headlines

by Juerd (Abbot) on Oct 21, 2003 at 06:48 UTC

Re: Broken headlines
by PodMaster (Abbot) on Oct 01, 2003 at 20:08 UTC

Try using HTML::Parser

Re: cblast35

MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
** The third rule of perl club is a statement of fact: pod is sexy.

Re: Re: Broken headlines

by Juerd (Abbot) on Oct 01, 2003 at 21:40 UTC

Try using HTML::Parser

That would mean patching XML::RSS. I'm currently just filtering perlmonks' data to try to make it valid. With success, so far.

PerlMonks--. If you want people to use Perl to parse your headlines, then don't make it look like XML! Plain colon-separated fields would do a much better job.
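
For illustration, a minimal sketch of what reading such a colon-separated headline line could look like in Perl. The field layout (node id, author, title) and the sub name parse_headline are assumptions made up for this example; the site offers no such format.

```perl
use strict;
use warnings;

# Hypothetical record layout: node_id:author:title
# The limit of 3 keeps any colons inside the title intact.
sub parse_headline {
    my ($line) = @_;
    chomp $line;
    my ($id, $author, $title) = split /:/, $line, 3;
    return { id => $id, author => $author, title => $title };
}

my $h = parse_headline("12345:someone:Re: Broken headlines\n");
print "$h->{id} by $h->{author}: $h->{title}\n";
```

The split limit matters because reply titles such as "Re: Broken headlines" contain colons themselves.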

Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

Re^3: Broken headlines (gift)

by tye (Sage) on Oct 02, 2003 at 06:10 UTC

I'd be happy to take down the broken RSS feed or refund your purchase price if it bothers you that much. Or feel free to just not use it.

[ I didn't write it. There is no infrastructure in place for pmdev to even look at the code. I thank whoever did write it for donating the effort, even if they didn't manage to get it perfect.

If you looked around a bit you'd probably notice that lots of the XML feeds used to have similar problems, and that these could be resolved by adding a character encoding header.
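
A minimal sketch of that kind of fix applied on the consumer side, assuming the feed arrives as a string without an XML declaration and that its bytes are ISO-8859-1. Both are assumptions, and the sub name add_encoding_decl is made up for this example.

```perl
use strict;
use warnings;

# Prepend an XML declaration naming the encoding, unless the
# document already starts with one. The default encoding here is
# a guess, not something the feed is known to use.
sub add_encoding_decl {
    my ($xml, $encoding) = @_;
    $encoding = 'ISO-8859-1' unless defined $encoding;
    return $xml if $xml =~ /\A\s*<\?xml\b/;
    return qq{<?xml version="1.0" encoding="$encoding"?>\n} . $xml;
}

print add_encoding_decl("<rdf:RDF>...</rdf:RDF>\n");
```

Without a declared encoding, an XML parser has to assume UTF-8 or UTF-16, so bytes in any other encoding make the document fail to parse.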

The whole site is volunteers. I thank those who volunteer answers and discussions. It is a "gift" culture and I donate to it because it is often fun and I often enjoy helping people or producing something of some value.

Someone on pmdev could probably write code that produces similar but better output, based on the current RSS output and perhaps on how the new XML feeds are done. (Those feeds don't handle control characters properly. Such characters are only sent by non-conforming clients and may already be filtered from node titles, but this should, of course, still be fixed.) I'd be grateful if someone cared to volunteer to do that, though I certainly don't feel I'm due such a contribution, and I understand some of the frustration of trying to do it. That would certainly motivate me to raise the priority on trying out such code and replacing the broken feed.
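
A minimal sketch of the control-character filtering mentioned above, assuming it is applied to node titles before they are written into the feed. XML 1.0 allows only tab, line feed and carriage return below U+0020. The sub name strip_control_chars is made up for this example and is not the site's code.

```perl
use strict;
use warnings;

# Remove characters below U+0020 that XML 1.0 forbids,
# keeping tab (\x09), line feed (\x0A) and carriage return (\x0D).
sub strip_control_chars {
    my ($text) = @_;
    $text =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]//g;
    return $text;
}

print strip_control_chars("Broken\x01 head\x0Blines\n");
```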

Frankly, ranting is anything but motivating for me. ]

tye

Re: Re^3: Broken headlines (gift)

by Juerd (Abbot) on Oct 02, 2003 at 06:53 UTC

Re^5: Broken headlines (gift)

by tye (Sage) on Oct 02, 2003 at 15:44 UTC

Re^5: Broken headlines (gift)

by Aristotle (Chancellor) on Oct 02, 2003 at 07:41 UTC