PerlMonks  

Pattern Matching in Cygwin Perl vs. Win32 Perl

by InsolentFlunkey (Beadle)
on Aug 26, 2008 at 22:57 UTC ( [id://707001]=perlquestion )

InsolentFlunkey has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks -

I wrote a simple script that allows the user to enter a pattern of letters, searches through a list of words in a file, and outputs any words that match the pattern (sort of like a crossword solver). Here's the code:

#!/usr/bin/perl
use strict;

print "Enter word to be solved.\n";
print "Use a dash (-) to indicate a single missing letter.\n";
print "Use a star (*) to match 0 or more letters.\n";
print "Use a plus (+) to match 1 or more letters.\n";
print " WORD --> ";
chomp( my $word = <STDIN> );

$word =~ s/\-/\\w/g;
$word =~ s/\*/\\w*/g;
$word =~ s/\+/\\w+/g;

open WORDLIST, "< WORD.LST" || die "Cannot open WORD.LST: $!\n";
while (<WORDLIST>) {
    chomp( my $newWord = $_ );
    if ( $newWord =~ m/^$word$/i ) {
        print $newWord . "\n";
    }
}
close WORDLIST;

When I run the script on Windows (ActiveState Perl), it seems to work perfectly. However, when I run it under Cygwin I get no results, no matter what pattern I enter - even if I type the exact word.

My concern isn't as frivolous as it seems. I'm developing a much larger script for work that uses a good bit of RegEx pattern matching. It will eventually run on a Unix server, but I'm running it through Cygwin for alpha testing. I need to be sure I'm getting accurate results before I muck up the data on our dev servers.

So - am I missing something painfully obvious, or has anyone else noticed any issues with Perl on Cygwin?

Replies are listed 'Best First'.
Re: Pattern Matching in Cygwin Perl vs. Win32 Perl
by ikegami (Patriarch) on Aug 26, 2008 at 23:01 UTC
    open WORDLIST, "< WORD.LST" || die "Cannot open WORD.LST: $!\n";

    means

    open WORDLIST, ("< WORD.LST" || die "Cannot open WORD.LST: $!\n");

    which is the same as

    open WORDLIST, "< WORD.LST";

    (since "< WORD.LST" is true). Try

    open WORDLIST, "< WORD.LST" or die "Cannot open WORD.LST: $!\n";
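To see the precedence difference in action, here is a minimal sketch that tries both forms on a file assumed not to exist. The file name and the two `$died_*` variables are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical file name, assumed to be absent from the current directory.
my $missing = 'no-such-file-707001.lst';

# With ||, the die is attached to the (always true) mode-and-name string,
# so a failed open goes unnoticed and the eval block completes.
my $died_with_pipes =
    eval { open my $fh, "< $missing" || die "never reached\n"; 1 } ? 0 : 1;

# With the low-precedence "or", die sees the return value of open itself.
my $died_with_or =
    eval { open my $fh, '<', $missing or die "Cannot open: $!\n"; 1 } ? 0 : 1;

print "|| form died: $died_with_pipes\n";
print "or form died: $died_with_or\n";
```

Only the second form reports the failure, which is why the original script could run to the end without printing any error.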

    If that doesn't help, try printing out $word and $newWord using Data::Dumper and $Data::Dumper::Useqq = 1;. You might have line-ending issues.
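As a rough illustration of the Data::Dumper suggestion, the sketch below dumps a chomped line that originally ended in CRLF. The sample string is an assumption standing in for a line read from WORD.LST, and `$Data::Dumper::Terse` is set only to keep the output short.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;    # print control characters as escapes
$Data::Dumper::Terse = 1;    # leave out the "$VAR1 = " prefix

# Assumed sample: a line as a Unix-style perl would read it from a CRLF file.
my $line = "apple\r\n";
chomp( my $newWord = $line );    # removes the "\n" only

my $dump = Dumper($newWord);
print $dump;                     # the leftover CR shows up as \r
```

Without Useqq the carriage return would be printed raw and would be invisible on most terminals.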

      Also, leave the newline off the end of the die() message. It suppresses line number information. According to perlfunc, die:

      If the last element of LIST does not end in a newline, the current script line number and input line number (if any) are also printed, and a newline is supplied.

      Here is my favorite file opener. I have used it for a couple of years, and it routinely catches a lot of weird errors caused by my typing skills (or lack thereof).

      • It doesn't suppress line number info,
      • It uses the three argument style for open,
      • It uses one of those cool quote-like operators (qq{}),
      • It uses the lower precedence or operator to avoid extra parentheses
      • It uses a scalar ($fh) instead of a glob (FH) to hold the file handle and allows me to avoid local when passing file handles,
      • It quotes the file name in the error message in case something wonky is going on with white space (or I screwed up the file name), and
      • It looks good in my editor's syntax highlighter. :)
      open my $fh, '<', $filename or die qq{Cannot open "$filename": $!};
      open my $fh, '>', $filename or die qq{Cannot open "$filename": $!};
      open my $fh, '>>', $filename or die qq{Cannot open "$filename": $!};
      HTH,
      Charles

        I disagree completely with your first point. If you need to know on what line an I/O error occurred, something is horribly wrong with your error handling. Users must not have to dig into a program to find and address the cause of errors under their control (as opposed to a programming error).

        I agree with the other changes, but I didn't want to venture far from the topic until the OP's problem became known. How I write it:

        open(my $fh_log, '>>', $qfn_log) or die(qq{Cannot open log file "$qfn_log": $!\n});
Re: Pattern Matching in Cygwin Perl vs. Win32 Perl
by mr_mischief (Monsignor) on Aug 26, 2008 at 23:26 UTC
    There's a good chance you have a path mismatch between Cygwin and AS Perl. When you start Cygwin, you'll rarely be in the directory you were in when you ran cygwin.bat to get into the environment.

    If you fix the error in your code that ikegami pointed out, you might find that the resulting error message is what points this problem out to you. This is a good example of why code that is more portable is more likely to be correct.

Re: Pattern Matching in Cygwin Perl vs. Win32 Perl
by ysth (Canon) on Aug 27, 2008 at 02:21 UTC
      chomp removes anything which is in $/, which is set to "\n" by default. So it ought to remove both CR and LF on Windows, I think.
        So it ought to remove both CR and LF on Windows, I think.

        Isn't it better to know?

        C:\>perl -le"print $/" | od -tacx1
        0000000  cr  nl  cr  nl
                 \r  \n  \r  \n
                 0d  0a  0d  0a
        0000004

        C:\>perl -le"print unpack q,H*,,$/" | od -tacx1
        0000000   0   a  cr  nl
                  0   a  \r  \n
                 30  61  0d  0a
        0000004

        C:\>
        Furthermore, perlport says
        In most operating systems, lines in files are terminated by newlines. Just what is used as a newline may vary from OS to OS. Unix traditionally uses \012, one type of DOSish I/O uses \015\012, and Mac OS uses \015.

        Perl uses \n to represent the "logical" newline, where what is logical may depend on the platform in use. In MacPerl, \n always means \015. In DOSish perls, \n usually means \012, but when accessing a file in "text" mode, STDIO translates it to (or from) \015\012, depending on whether you're reading or writing. Unix does the same thing on ttys in canonical mode. \015\012 is commonly referred to as CRLF.
        But "\n" is not both CR and LF. It is a single character, and is LF on almost all systems. On Windows, it can be translated to and from CRLF when reading/writing a file, but that happens during the IO, not at any other time.
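A small sketch of that point, assuming a perl whose default I/O layers do no CRLF translation (as on Unix, and apparently on the OP's Cygwin setup). The in-memory file and the sample words stand in for WORD.LST and are not from the original post.

```perl
use strict;
use warnings;

# "\n" is a single character, so the default $/ has length 1.
my $sep_len = length $/;

# Assumed sample data: a word list saved with Windows (CRLF) line endings.
my $data = "apple\r\nbanana\r\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @words;
while ( my $line = <$fh> ) {
    chomp $line;           # strips the trailing "\n" and nothing else
    push @words, $line;    # each word still ends in "\r"
}
close $fh;

# The anchored match fails because of the leftover "\r".
my $matches = grep { /^apple$/i } @words;
print "length of \$/: $sep_len, matches for 'apple': $matches\n";
```

This reproduces the symptom from the question: every word is read, yet no pattern matches, not even the exact word.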
Re: Pattern Matching in Cygwin Perl vs. Win32 Perl
by InsolentFlunkey (Beadle) on Aug 27, 2008 at 16:44 UTC
    Thanks to everyone for the replies and information! I had already confirmed that opening the dictionary file wasn't the problem. I could print all of the words that the script read from WORD.LST, so I knew that part was working.

    Looks like ysth and Anonymous Monk (man, that guy has a lot of writeups! ;o) ) had the right idea. It hadn't occurred to me that chomp on Cygwin would try to remove the Unix newline character, but the file had Windows newline characters. I replaced chomp with:

    $newWord =~ s/[\r\n]//g;
    and it works like a charm on both Cygwin and Windows.
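A slightly narrower variant of this substitution, as a sketch: it removes only one line ending at the very end of the string and leaves any other characters alone. The helper name `strip_eol` and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: strip a single trailing LF or CRLF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# Assumed sample lines: CRLF-terminated, LF-terminated, and unterminated.
my @clean = map { strip_eol($_) } ( "apple\r\n", "banana\n", "cherry" );
print join( ',', @clean ), "\n";    # apple,banana,cherry
```

Either form works for a word list. The anchored version only matters if a stray CR or LF in the middle of a line should be preserved.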

    Thanks again for everyone's help!

Re: Pattern Matching in Cygwin Perl vs. Win32 Perl
by BrowserUk (Patriarch) on Aug 26, 2008 at 23:18 UTC

    Update: Don't upvote this post. It is absolute rubbish! Though no more so than the other "reasons" suggested in this thread!

    Is the name of the file definitely WORD.LST? Not maybe word.lst or Word.lst or some other variation of case?

    I'm not sure how closely Cygwin emulates POSIX, but is it possible that it will only open the file if the case matches exactly?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      If the issue is opening word.lst, would it help to first use a small version of word.lst (containing say 5 words), and print them out as you read the file in? That way you can tell if you are reading in things to match in the first place... Just a thought.

      tubaandy
Re: Pattern Matching in Cygwin Perl vs. Win32 Perl
by hexcoder (Curate) on Aug 27, 2008 at 20:40 UTC
    Alternatively, you can set $/ explicitly to "\r\n" for a global effect, or use the form
    open my $fh, q{<:crlf}, 'WORD.LST' or die "cannot open file:$!\n";
    for an effect limited to that file.
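Both options can be tried without a real file. The sketch below uses an in-memory file with assumed sample data, and again assumes a perl that does not already translate CRLF by default (on native Windows perl, the second option would see lines that have already been translated).

```perl
use strict;
use warnings;

# Assumed sample data standing in for a CRLF-terminated WORD.LST.
my $data = "apple\r\nbanana\r\n";

# Option 1: the :crlf layer converts CRLF to "\n" on this handle only.
open my $fh, '<:crlf', \$data or die "Cannot open in-memory file: $!";
chomp( my @via_layer = <$fh> );
close $fh;

# Option 2: change the record separator; local confines it to this block.
my @via_sep;
{
    local $/ = "\r\n";
    open my $raw, '<', \$data or die "Cannot open in-memory file: $!";
    chomp( @via_sep = <$raw> );    # chomp now removes the whole "\r\n"
    close $raw;
}

print join( ',', @via_layer ), "\n";
print join( ',', @via_sep ),   "\n";
```

Using local keeps the changed $/ from leaking into other reads, such as the chomp on STDIN input in the original script.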

Node Type: perlquestion [id://707001]
Approved by ikegami
Front-paged by tye