Regular Expression

optikool has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Regular Expression by Zaxo (Archbishop) on May 18, 2003 at 05:41 UTC
`while (my $words = <DICT>) { chomp $words; push @matches, $words if $words =~ /a$/ }` or just `my @matches = grep { chomp; /a$/ } <DICT>;` Update: With CountZero's caveat, the second example will work fine on Windows. That is Perl's internal grep, not the Unix system utility of the same name. Perl's chomp removes whatever $/ is from the end of a string; by default that is the native newline. After Compline, Zaxo
Re: Re: Regular Expression by CountZero (Bishop) on May 18, 2003 at 07:20 UTC
The `grep` solution slurps the whole dictionary file as one list and can make for a huge process if the dictionary is large. CountZero "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
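To make the difference between the two approaches concrete, here is a minimal sketch that runs both against the same data. The word list and the in-memory filehandle are assumptions made so the example is self-contained; the thread itself reads from dictionary.txt through the bareword handle DICT.

```perl
use strict;
use warnings;

# Hypothetical in-memory word list standing in for dictionary.txt.
my $data = "alpha\nbeta\ngamma\ndelta\nomega\nzen\n";

# Streaming: the while loop holds only one line in memory at a time.
open(my $dict, '<', \$data) or die "in-memory open: $!";
my @streamed;
while (my $word = <$dict>) {
    chomp $word;
    push @streamed, $word if $word =~ /a$/;
}
close($dict);

# Slurping: <$dict> in list context reads every line before grep starts.
open($dict, '<', \$data) or die "in-memory open: $!";
my @slurped = grep { chomp; /a$/ } <$dict>;
close($dict);

print "streamed: @streamed\n";
print "slurped:  @slurped\n";
```

Both produce the same matches; the only difference is how much of the file sits in memory at once, which is the caveat raised above.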
Re: Re: Regular Expression by optikool (Novice) on May 18, 2003 at 21:54 UTC
Thanks for the information, Zaxo. The first example cuts down a lot of the code I had planned to use. I can't really use the second example because I need this to work on both Windows and Unix, though eventually it will be Unix only. However, the words are still missing when the search is started. Here is the code: `use strict; use warnings; my @matches = (); open (DICT, "dictionary.txt") or die "Dictionary.txt: $!\n"; while (my $words = <DICT>) { chomp $words; push @matches, $words . " 1" if $words =~ /a$/; push @matches, $words . " 2" if $words =~ /.[i].[i].[i].[i].[i]./gi; push @matches, $words . " 3" if $words =~ /[^aeiou]/gi; my $reverse = reverse($words); push @matches, $words . " 4" if $words eq $reverse; } close (DICT); foreach (@matches) { print $_; }`
Re: Re: Re: Regular Expression by graff (Chancellor) on May 19, 2003 at 01:43 UTC
Regarding this issue: "I need this to work on both windows and unix". The safest way to remove line terminations across platforms is: `s/[\r\n]+//;` For that matter, you could probably just do `s/\s+//g;` based on the assumption that the only white space to be found in your dictionary file is the line breaks.
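A small sketch of why the line-ending substitution matters, assuming the dictionary might contain Windows-style CRLF endings while being read where `$/` is "\n". The sample words are made up, and anchoring the pattern with `\z` is my own variation on the substitution above, not part of the original suggestion.

```perl
use strict;
use warnings;

# Hypothetical sample lines: Unix (LF), Windows (CRLF), and no terminator.
my @raw = ("pizza\n", "pasta\r\n", "salsa");

my @clean;
for my $line (@raw) {
    # Strip any trailing run of CR/LF characters, whatever the platform.
    (my $word = $line) =~ s/[\r\n]+\z//;
    push @clean, $word;
}

# With chomp alone, a CRLF line keeps its "\r", so /a$/ no longer matches.
my $crlf = "pasta\r\n";
chomp $crlf;
my $chomp_only_matches = ($crlf =~ /a$/) ? 1 : 0;

print "cleaned: @clean\n";
print "chomp-only match on CRLF line: $chomp_only_matches\n";
```

A leftover "\r" is one possible way words could go "missing" from the `/a$/` matches, though nothing in the thread confirms that this was the actual cause.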
Re: Re: Re: Regular Expression by hangmanto (Monk) on May 19, 2003 at 00:17 UTC
You might want to check to see if the words are there inside of the while loop. A simple print statement like `print "----$words-----\n";` inserted after the chomp statement may detect the problem.
Re: Regular Expression by Abigail-II (Bishop) on May 18, 2003 at 09:23 UTC
Could you please show us the code that is failing? (And keep it short.) Because `/a$/i` is supposed to match strings that end with `"a\n"`. And that chomping removes the entire word doesn't make any sense. I think you made a mistake in your code, and are blaming the builtins for it. Abigail
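The point about `$` can be checked directly. This is just an illustrative sketch with a made-up word; it compares `$`, which also matches before a final newline, with `\z`, which matches only at the very end of the string.

```perl
use strict;
use warnings;

# $ matches at the end of the string or just before a final "\n".
my $with_newline    = ("pizza\n"   =~ /a$/)  ? 1 : 0;
my $without_newline = ("pizza"     =~ /a$/)  ? 1 : 0;

# \z matches only at the true end, so the unchomped line fails.
my $strict_end      = ("pizza\n"   =~ /a\z/) ? 1 : 0;

# Only one trailing newline is tolerated by $ (without /m).
my $two_newlines    = ("pizza\n\n" =~ /a$/)  ? 1 : 0;

print "a\$ on 'pizza\\n':    $with_newline\n";
print "a\$ on 'pizza':      $without_newline\n";
print "a\\z on 'pizza\\n':   $strict_end\n";
print "a\$ on 'pizza\\n\\n': $two_newlines\n";
```

So `/a$/` should behave the same whether or not the line has been chomped, as long as the terminator is a plain "\n".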
Re: Regular Expression by arthas (Hermit) on May 18, 2003 at 18:25 UTC
chomp() shouldn't really remove the whole word; even a weird setting of $/ wouldn't make that happen. You can try to use chop(), which trims ONLY the last character no matter what it is, but really, a look at the code would be a better choice: it's probably not chomp that eats your words. ;-) Michele.
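For comparison, a short sketch of how chomp and chop differ, using a made-up word. The custom `$/` value is only there to show that chomp removes a trailing copy of the separator and nothing more.

```perl
use strict;
use warnings;

# chomp removes a trailing $/ (default "\n") and returns how much it removed.
my $chomped = "pizza\n";
my $first   = chomp $chomped;   # removes the newline
my $second  = chomp $chomped;   # nothing left to remove

# chop always removes the last character, whatever it is.
my $chopped = "pizza\n";
chop $chopped;                  # removes "\n"
chop $chopped;                  # removes the final "a"

# With an unusual $/, chomp strips only that trailing string.
my $custom = "pizza";
{
    local $/ = "za";
    chomp $custom;
}

print "chomp twice: '$chomped' (removed $first, then $second)\n";
print "chop twice:  '$chopped'\n";
print "custom \$/:   '$custom'\n";
```

Calling chomp repeatedly is harmless, while each extra chop eats a real letter, which is one reason chomp is the usual choice for stripping line endings.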
Re: Re: Regular Expression by optikool (Novice) on May 19, 2003 at 07:08 UTC
Thanks a lot for everybody's input. I'm not sure why the chomp function was not working correctly, but it is working correctly now, which solved most of the other problems I was having. I have one more question: is there a way to match words with double letters, e.g. keenness or mississippi? Also, is there a way to match words whose letters are in alphabetical order, e.g. abby or abit? Thanks so much for your help. =0)
Re: Re: Re: Regular Expression by antirice (Priest) on May 19, 2003 at 07:59 UTC
For double letters: `if ($word =~ /(.)\1/) ...` All letters in alphabetical order (not regex): `if (lc($word) eq join "", sort split //, lc($word)) ...` antirice The first rule of Perl club is - use Perl. The ith rule of Perl club is - follow rule i - 1 for i > 1
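Here is a runnable sketch of both checks, wrapped in two helper subs. The names `has_double_letter` and `is_alphabetical` are hypothetical, and restricting the backreference to `[a-z]` with `/i` (instead of `.`) is my own narrowing so that repeated punctuation or spaces do not count as double letters.

```perl
use strict;
use warnings;

# True if any letter is immediately followed by the same letter.
sub has_double_letter {
    my ($word) = @_;
    return $word =~ /([a-z])\1/i ? 1 : 0;
}

# True if the letters already appear in sorted (alphabetical) order.
sub is_alphabetical {
    my ($word) = @_;
    my $lower = lc $word;
    return $lower eq join('', sort split //, $lower) ? 1 : 0;
}

# Sample words taken from the question, plus "perl" as a negative case.
for my $word (qw(keenness mississippi abby abit perl)) {
    printf "%-12s double=%d alphabetical=%d\n",
        $word, has_double_letter($word), is_alphabetical($word);
}
```

Sorting the letters and comparing is easy to read; a regex along the lines of `/^a*b*c*...z*$/i` would also work but is longer to write out.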