szabgab has asked for the wisdom of the Perl Monks concerning the following question:

I have been trying to use Net::Google to search using Hebrew text with no success. While searching for the word פרל (Perl) in Hebrew I get back a bunch of results in Hebrew, but they are unrelated to the subject.

Has anyone had any success using Net::Google or any other module to search Google with non-English text?
Is there some special voodoo I need to do?

I am especially interested in Hebrew, Arabic, or other languages with non-Latin characters.

Re: Net::Google and non-English languages
by pg (Canon) on Sep 05, 2005 at 19:45 UTC

    Here is the issue. I don't believe that you really have a word in Hebrew that means Perl, the programming language. I don't know Hebrew, but most likely that's just a mimic of the pronunciation. It is best not to translate such names.

      Using Google directly does show us the correct results; if nothing else, this is the family name of several people in Israel.

      Anyway, as you guessed, the word is the phonetic transcription of the sound of "Perl". (something about stop banging your head against the wall ?)

Re: Net::Google and non-English languages
by graff (Chancellor) on Sep 05, 2005 at 21:20 UTC
    Maybe you could show us a little bit of your code, or at least make it clear how you are specifying the search terms. If you're not using actual utf8 text data, that might be the problem, since according to the man page for Net::Google that's what is needed.

    There are lots of ways to get that wrong...

      The code is the one I saw in the Synopsis of Net::Google:

      use Net::Google;
      use constant LOCAL_GOOGLE_KEY => "********************************";

      my $google = Net::Google->new(key=>LOCAL_GOOGLE_KEY);
      my $search = $google->search();

      use utf8;
      my $str = "&#1508;&#1512;&#1500;";
      $search->query($str);
      map { print $_->URL()."\n"; } @{$search->results()};

      I even checked with the is_utf8 function of Encode whether the string is in utf8, and it reported true.
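A minimal sketch of what that is_utf8 check does and does not show, assuming only the core Encode module. The variable names are illustrative, and the word is written with code point escapes so the snippet does not depend on how the file is saved. A true result means the string is held as characters internally; the UTF-8 octets that go over the wire are what encode() produces, and is_utf8 is false for those.

```perl
use strict;
use warnings;
use Encode qw(encode is_utf8);

# The Hebrew word as three characters (pe, resh, lamed).
my $chars = "\x{5E4}\x{5E8}\x{5DC}";

# The same word as UTF-8 octets, i.e. the form sent over the network.
my $octets = encode('UTF-8', $chars);

printf "characters: length=%d is_utf8=%d\n",
    length($chars), is_utf8($chars) ? 1 : 0;
printf "octets:     length=%d is_utf8=%d\n",
    length($octets), is_utf8($octets) ? 1 : 0;
```

So a true is_utf8 result only says the string is stored as characters; it does not say whether the module encodes it correctly before sending.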

      Actually, gaal and I have even run tcpdump on the port, and it seemed that the string was sent out correctly.

      ps. It seems that the Monastery (or my browser ?) turned the real Hebrew string above into some strange representation.

        I've never used Net::Google myself (or the Google SOAP API either), so I hope others can help more than I can. (E.g. I don't know whether your "local google key" value is literally a string of asterisks, or whether you're just using that to mask some actual private string assigned to you somehow by google.)

        Anyway, does your "ps" comment at the end refer to the string being assigned to "$str" in that snippet, and does this explain why you have  "פרל" there instead of actual utf8 data? Have you tried doing it like this:

        my $str = "\x{05E4}\x{05E8}\x{05DC}";
        If not, try that. (Just a shot in the dark..)
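A small sketch of how the entity form relates to real characters, assuming only core Perl and Encode. The numbers in &#1508;&#1512;&#1500; are decimal code points, while Perl's \x{...} escape takes hexadecimal, so decimal 1508, 1512 and 1500 correspond to U+05E4, U+05E8 and U+05DC. The helper name decode_decimal_entities is hypothetical and handles only decimal entities.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: replace decimal entities such as &#1508;
# with the characters they stand for.
sub decode_decimal_entities {
    my ($text) = @_;
    $text =~ s/&#(\d+);/chr($1)/ge;
    return $text;
}

my $str = decode_decimal_entities('&#1508;&#1512;&#1500;');

# Show each character as a hexadecimal code point.
my $code_points = join ' ', map { sprintf 'U+%04X', ord } split //, $str;

# Show the UTF-8 octets that would be sent for this string.
my $utf8_hex = join ' ', map { sprintf '%02X', ord }
               split //, encode('UTF-8', $str);

print "$code_points\n";   # U+05E4 U+05E8 U+05DC
print "$utf8_hex\n";      # D7 A4 D7 A8 D7 9C
```

Comparing the second line of output with a tcpdump capture is one way to confirm that the query left the machine as UTF-8.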