Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I do not understand this code:
while ( <> ) { @D2{ map { my $l = lc; exists $stopwords{$l}?():$l } split /\W+/ } = (); }
Can anyone explain what it means, especially the ?():$l part? Thanks.

Code tags - dvergin 2003-02-04
Title edit by tye.   $title =~ s/[:ONE:]/[:ELL:]/ by dvergin.

Re: questions
by dvergin (Monsignor) on Mar 05, 2003 at 08:23 UTC
    Let's speculate on a context that might help make sense of this snippet:
    #!/usr/bin/perl -w
    use strict;

    # List all non-common words in a block of text
    my %stopwords = (a => 1, an => 1, in => 1, the => 1);
    my %D2;

    while ( <DATA> ) {
        @D2{ map { my $l = lc; exists $stopwords{$l} ? () : $l } split /\W+/ } = ();
    }

    print "$_\n" for sort keys %D2;

    __DATA__
    This is a demo of an interesting example in the post from PerlMonks.
    Prints:
    demo
    example
    from
    interesting
    is
    of
    perlmonks
    post
    this

    In answer to your specific question,

        exists $stopwords{$l}?():$l

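    uses the ?: ternary (conditional) operator to say: if $l exists as a key in the %stopwords hash, return nothing (specifically, an empty list); otherwise return $l itself. Because this expression is the last statement in the map block, an empty list simply drops that word from map's result. (Note that it is dollar-ell, not dollar-one.)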
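
    To see the filtering effect in isolation, here is a minimal sketch. The sub name interesting_words and the sample stopword list are my own inventions for illustration; they are not from the original snippet.

    ```perl
    use strict;
    use warnings;

    # Hypothetical stopword list, for illustration only.
    my %stopwords = map { $_ => 1 } qw(a an in the);

    # Return the lowercased words of a string, minus the stopwords.
    # Returning () from the map block contributes nothing to the result,
    # so map doubles as a filter here.
    sub interesting_words {
        my ($text) = @_;
        return map {
            my $word = lc;
            exists $stopwords{$word} ? () : $word;
        } split /\W+/, $text;
    }

    my @words = interesting_words("The cat sat in a box");
    print "@words\n";    # prints "cat sat box"
    ```

    The same result could be had with grep followed by map; putting the ternary inside map just does both steps in one pass.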

    Considering my imagined context for this snippet, we have a list of common (uninteresting) words, usually called stopwords, that we want to ignore. So we read through the file (I used the DATA filehandle) a line at a time. For each line we split out the words, force each one to lowercase, drop the stopwords, and then use the remaining words as the keys of a hash slice on the hash called %D2. Every key in that slice (whether new or pre-existing at that point) has its value set to undef (I'm skipping over some nuances here). When we have finished with all the data, the %D2 hash contains exactly one entry for every "interesting" word in the data set, so duplicates collapse automatically.
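
    The hash-slice assignment can be tried on its own. This is only a sketch with made-up keys and a made-up hash name (%seen), but it shows the two effects that matter here: the keys come into existence with undef values, and repeated keys collapse into one entry.

    ```perl
    use strict;
    use warnings;

    my %seen;

    # Assigning an empty list to a hash slice creates every listed key
    # and leaves each value undef; the repeated key 'demo' is stored once.
    @seen{ qw(demo post demo example) } = ();

    my @unique = sort keys %seen;
    my $state  = defined $seen{demo} ? "defined" : "undef";

    print "@unique\n";    # prints "demo example post"
    print "$state\n";     # prints "undef"
    ```

    Since the values are undef, test for membership with exists rather than by checking the value for truth.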

    Update: Trying to stay focused on the stated problem, I avoided the issue of the variable name $l (dollar-ell). Nkuvu is correct to point this out.

    I would also add that this code will return odd results for text containing words with apostrophes or hyphens: "isn't" and "good-by" will be reported as: "isn", "t", "good", "by".

    So this code is not recommended for Real Text in a situation where accuracy of results matters.
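
    If you wanted to keep such words whole, one possible approach is to match the words instead of splitting on the gaps. The following is only a sketch: the sub name words_with_punct is hypothetical, the pattern handles ASCII letters only, and real text will still have cases it gets wrong.

    ```perl
    use strict;
    use warnings;

    # Extract lowercased words, treating an apostrophe or hyphen that sits
    # between letters as part of the word (ASCII letters only in this sketch).
    sub words_with_punct {
        my ($text) = @_;
        return map { lc $_ } $text =~ /[A-Za-z]+(?:['-][A-Za-z]+)*/g;
    }

    my @words = words_with_punct("Isn't it time to say good-by?");
    print join(",", @words), "\n";    # prints "isn't,it,time,to,say,good-by"
    ```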

    ------------------------------------------------------------
    "Perl is a mess and that's good because the problem space is also a mess." - Larry Wall

      Is it just me, or is $l (that's an ell) a really bad variable name? Especially in Perl, where $1 (that's a one) is a valid variable...

Re: What does ?():$l mean?
by jryan (Vicar) on Mar 06, 2003 at 07:24 UTC
    Also of note is the hash slice with the empty-list assignment; this is actually a somewhat obfuscated way to remove all duplicated non-stopwords in the data. I wouldn't recommend using this method; it's probably not very efficient, and it's definitely not very obvious.
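
    For comparison, here is a sketch of a more explicit way to get the same set of words. The sub name unique_words and the sample data are made up for illustration, and I make no claim about its speed relative to the slice version; the point is only that each step is spelled out.

    ```perl
    use strict;
    use warnings;

    # Hypothetical stopword list, for illustration only.
    my %stopwords = map { $_ => 1 } qw(a an in the);

    # Collect each non-stopword once, in order of first appearance.
    sub unique_words {
        my (@lines) = @_;
        my (%seen, @unique);
        for my $line (@lines) {
            for my $word (map { lc $_ } split /\W+/, $line) {
                # Skip empty fields (from leading punctuation) and stopwords.
                next if $word eq '' || exists $stopwords{$word};
                push @unique, $word unless $seen{$word}++;
            }
        }
        return @unique;
    }

    my @result = unique_words("The demo in the post", "a post from PerlMonks");
    print "@result\n";    # prints "demo post from perlmonks"
    ```

    Unlike the hash-slice version, this keeps the words in the order they first appear, which may or may not be what you want.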