Re^3: Extracting tagged data from a XML file

Here is a naughty one-liner to extract all occurrences of IP addresses between <IP-ADDRESS> tags, in any case and across lines. It is very inefficient for a large file, as it reads the whole file into memory. Change the word data to the name of your file.

perl -le 'local$/;open F,data;$_=<F>;s/\n//g;while(/<ip-address>(.+?)<\/ip/gi){print $1}'

Here is the same, but to grab IP addresses from either <IP-ADDRESS> or <IP-NEIGHBOUR> tags.

perl -le 'local$/;open F,data;$_=<F>;s/\n//g;while(/<ip-(neighbour|address)>(.+?)<\/ip/gi){print $2}'

update

For some reason I got marked -1 on this. If anyone can explain what I did wrong here, I'd love to know. I realise the code is naughty for eating the file in one gulp, but if the file is small this can't do much harm, and it makes for a very simple solution to the possible problem of the addresses being broken across line breaks.

Anyway, looking at this thread, the OP appears to have changed his mind: he no longer wants to capture the <IP-NEIGHBOUR> addresses, and he wants only unique addresses returned. I will update this space soon with a version that is more memory-friendly, and possibly redeem myself in the eyes of the monastery.

further update

OK, I have had the error of my ways pointed out: thou shalt not parse XML with regexps. I shall stop sinning now; no further unholy code shall follow.
R.

Re^4: Extracting tagged data from a XML file by davorg (Chancellor) on Aug 31, 2004 at 12:11 UTC
"if anyone can explain what I did wrong here I'd love to know"

You tried to parse XML with regular expressions.

-- <http://www.dave.org.uk>
"The first rule of Perl club is you do not talk about Perl club." -- Chip Salzenberg