vandale.pl

Category:	Web Stuff
Author/Contact Info	Juerd
Description:	Because the popular gnuvd is broken, I made this quick hack to query the Van Dale website for dictionary lookups. It's a quick hack, so no production quality here ;) Oh, and please don't bother me with Getopt or HTML::Parser: Don't want to use Getopt because I don't like it, and can't use HTML::Parser because http://www.vandale.nl/ has a lot of broken HTML, and because regexes are easier (after all, it's a quick hack because I can't live without a Dutch dictionary). This probably isn't of much use to foreigners :) Update (200306081719+0200) - works with vandale.nl html updates now.
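#!/usr/bin/perl -w
use strict;
use LWP::Simple;

# Split the command line into switches and words to look up.
my (@switches, @woorden);
while (@ARGV) {
    $_ = shift;
    if (/^--$/) {
        # Everything after "--" is a word; stop so nothing is processed twice.
        push @woorden, @ARGV;
        last;
    } elsif (/^-/) {
        push @switches, $_;
    } else {
        push @woorden, $_;
    }
}

my $all = grep /^(?:-\w*a|--all)$/, @switches;

if (grep /^(?:-\w*h|--help)$/, @switches) {
    print qq{
Usage: $0 [options] word ...
options:
    -a  --all   List all matches
    -h  --help  Display usage information
\n};
    exit 0;
}

for my $woord (@woorden) {
    # Percent-encode every non-word character for use in the query string.
    $woord =~ s/(\W)/sprintf '%%%02x', ord $1/ge;
    my $page = get "http://www.vandale.nl/opzoeken/woordenboek/?zoekwoord=$woord";
    unless (defined $page) {
        # LWP::Simple's get returns undef when the request fails.
        warn "Could not fetch the page for $woord\n";
        next;
    }
    # Strip one entry (headword plus its <DD> definitions) per iteration.
    while ($page =~ s{<B><BIG>(.*?)</font>.*?((?:<DD>.*?</DD>)+)}{}si) {
        my ($woord, $betekenis) = ($1, $2);
        for ($woord, $betekenis) {
            s[</dd>][\n]gi;          # one definition per line
            s/<.*?>//g;              # drop the remaining tags
            s/´/'/g;
            s/&#(\d+);/chr $1/ge;    # decode numeric entities
        }
        $betekenis =~ s/^/ /gm;      # indent the definitions
        print "$woord\n$betekenis\n";
        last if not $all;
    }
}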
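For anyone who wants to see what the extraction loop does without hitting the network, here is a minimal sketch that runs the same kind of substitution over a made-up HTML fragment. The fragment, the sample entry, and the `extract_entries` helper are all assumptions for illustration; the real vandale.nl markup is not reproduced here, and the pattern assumes lazy `.*?` quantifiers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical HTML fragment, loosely shaped like what the script expects:
# a headword inside <B><BIG>...</font>, followed by one or more <DD>...</DD>.
my $sample = join '',
    '<B><BIG>huis</BIG></B></font><DL>',
    '<DD>(het ~)</DD><DD>1 woning, geb&#111;uw</DD>',
    '</DL>';

# extract_entries is an illustrative helper, not part of vandale.pl.
# It returns a list of [headword, definitions] pairs.
sub extract_entries {
    my ($page, $all) = @_;
    my @entries;
    # Each successful substitution removes one entry from $page.
    while ($page =~ s{<B><BIG>(.*?)</font>.*?((?:<DD>.*?</DD>)+)}{}si) {
        my ($word, $meaning) = ($1, $2);
        for ($word, $meaning) {
            s[</dd>][\n]gi;          # one definition per line
            s/<.*?>//g;              # drop the remaining tags
            s/&#(\d+);/chr $1/ge;    # decode numeric entities
        }
        push @entries, [$word, $meaning];
        last unless $all;
    }
    return @entries;
}

for my $entry (extract_entries($sample, 1)) {
    my ($word, $meaning) = @$entry;
    print "$word\n$meaning\n";
}
```

Passing a false second argument returns at most the first entry, which mirrors what the script does when `--all` is not given.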

(jeffa) Re: vandale.pl (with Getopt::Declare) by jeffa (Bishop) on Mar 30, 2002 at 20:38 UTC
Regarding option-parsing modules: this is not to bug you into using them, but rather an option for others to consider. I thought to myself, "hmmmm ... let's use TheDamian's Getopt::Declare" and proceeded to RTFM. I had always wanted to learn this module, and now seemed like the time. After about 40 minutes of racking my brain (:D) i finally came up with this:

`#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use Getopt::Declare;

# -h, -v, --help, --version are included
# and these are tabs - not spaces!
my $spec = q(
	-a		List all matches
	--all		[ditto]
);

my $args = Getopt::Declare->new($spec);
my $all  = $args->{'--all'} || $args->{'-a'};

for my $woord ($args->unused) {
    # insert for loop block innards from code above
}`

But that is 40 minutes well spent, because now i see the power of this module. And thanks to the Von Neumann bottleneck of having to retrieve the page from the Internet, the fact that Getopt::Declare is slower than the option parsing code above is negligible.

P.S. i also have no qualms about using regexes to parse HTML, just as long as the coder understands how to use the CPAN HTML parsers. Sometimes using regexes really is easier. Sometimes.

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
Re: (jeffa) Re: vandale.pl (with Getopt::Declare) by danger (Priest) on Mar 30, 2002 at 21:51 UTC
In the interest of tmtowtdi, here's an alternate Getopt::Declare scenario that sets up the $all and @words variables in action blocks. Here I decided to make the search words required (but without an option description), and just allow for -a as an abbrev. of -all (instead of using the --all version):

`#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use Getopt::Declare;
use vars qw/$all @words/;

my $opts = Getopt::Declare->new(<<'EOS');
	-a[ll]		List all matches
			{$all = 1}
	<terms:s>...	[required]
			{@words = @terms}
EOS

for my $word (@words) {
    # insert fetch code ...
    # ...
    last unless $all;
}
__END__`
Re: vandale.pl by cztmonk (Monk) on Jul 18, 2012 at 10:07 UTC
When I use this code, there is no output...
Re^2: vandale.pl by marto (Cardinal) on Jul 18, 2012 at 10:14 UTC
This post is ten years old, and the code was last updated nine years ago. It's likely the site in question has changed substantially in that time.
Re^3: vandale.pl by cztmonk (Monk) on Jul 18, 2012 at 11:03 UTC
You are right, that was a stupid remark.
Re^4: vandale.pl by marto (Cardinal) on Jul 18, 2012 at 11:54 UTC


	PerlMonks