Comparing hash keys and values with Regular Expressions

SayWhat?! has asked for the wisdom of the Perl Monks concerning the following question:

Hello wise ones!

I'm currently trying to compare my hash's keys and values by means of a regex. Here is a sample of my input (a tab-separated, two-column text file):

vriendelik    aardig
polisieman    agent
net-net    amper
gedierte    beest
naak    bloot
homoseksueel    flikker
bronstig    geil
menskop    hoofd
toedraai    inpakken
dierekop    kop
onskuldig    onnozel
perdeteler    paardenfokker
dennebol    pijnappel

as well as my code thus far:

#!/usr/bin/perl -w
use strict;
use warnings;
use open ':utf8';
use autodie;

open NONMATCHINPUT, "<OutputNonMatchedWords.txt";
open OIC, ">OutputIdenticalCognates.txt";
open ONIC, ">OutputNonIdenticalCognates.txt";
open ONC, ">OutputNonCognates.txt";

my %nonmatchhash;

while (my $line = <NONMATCHINPUT>)
{
    chomp $line;

    #split the line on tab
    my ($nonmatchhashkeys, $nonmatchhashvalues) = split /\t/, $line;
    $nonmatchhash{$nonmatchhashkeys} = $nonmatchhashvalues;

    #if the values of the hash are exactly the same as the keys of the hash
    if ($nonmatchhash{$nonmatchhashkeys} = $nonmatchhash{$nonmatchhashvalues})
    {
        #print both key and value to OutputIdenticalCognates.txt, separated by a tab
        print OIC "$nonmatchhashkeys\t$nonmatchhashvalues\n";
    }

    #assign each key in the hash to $AfrColumn1token
    foreach my $AfrColumn1token(keys %nonmatchhash)
    {
        
        #if the Afrikaans word ($AfrColumn1token) contains: anything, followed by 'agtig', followed by 'e' or 'er' or 'ste' (optional), at the end of the string
        if ($AfrColumn1token =~ /(.*)(agtig)(e|er|ste)?$/)
        {
            #then, by using a foreach, assign each value in the hash to $DutColumn2token
            foreach my $DutColumn2token (values %nonmatchhash)
            {
                #And then, if the Dutch word ($DutColumn2token) contains: anything, followed by 'achtig', followed by 'e' or 'er' or 'ste' (optional), at the end of the string
                if ($DutColumn2token =~ /(.*)(achtig)(e|er|ste)?$/)
                {
                    #print it to OutputNonIdenticalCognates.txt
                    print ONIC "$AfrColumn1token\t$DutColumn2token\n";
                }
            }
        }
        else
        {    
            #else, print it to OutputNonCognates.txt
            print ONC "$AfrColumn1token\t$DutColumn2token\n";
        }
    }
}

I want to check if the hash's key consists of that which I entered into the regex. If that is true, I want to do a similar check with the hash's values - again with a regex.

To explain:

if $AfrColumn1token consists of:
anything, followed by 'agtig', followed by 'e' or 'er' or 'ste' (optional), at the end of the string,
#then
check to see if $DutColumn2token consists of: anything, followed by 'achtig', followed by 'e' or 'er' or 'ste' (optional), at the end of the string.
#then
that particular key and value must be written to OutputNonIdenticalCognates.txt,
else write the pair to OutputNonCognates.txt.

I then want to repeat the complete foreach loop eleven (11) times, because I have 11 different rules I need to implement in my program.

What I would like to know now: is that foreach allowed in Perl? If so, what is wrong with it, because the output files OutputNonIdenticalCognates.txt and OutputNonCognates.txt are empty. If it's not allowed, how can I change it so it does the same thing I'd like it to do?

Thank you in advance! :)


Replies are listed 'Best First'.
Re: Comparing hash keys and values with Regular Expressions by CountZero (Bishop) on Jun 30, 2012 at 17:41 UTC
This looks strange:

`if ($nonmatchhash{$nonmatchhashkeys} = $nonmatchhash{$nonmatchhashvalues})`

You are using the "`=`" assignment operator in a test. If you want to test if key and value are equal you should use "`eq`".

By the way: the Dutch word for "onskuldig" is "onnozel" (with a "z" rather than an "s").

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics
Re^2: Comparing hash keys and values with Regular Expressions by SayWhat?! (Novice) on Jun 30, 2012 at 18:12 UTC
Hi there! I tested `if ($nonmatchhash{$nonmatchhashkeys} = $nonmatchhash{$nonmatchhashvalues})` with both '=' and 'eq', and the output was the same both times: correct, which is exactly what I wanted it to be. But the thing is, my problem actually lies at the 'foreach'. But thanks anyway, and thanks for the correction of 'onnozel'. :)
Re^3: Comparing hash keys and values with Regular Expressions by CountZero (Bishop) on Jun 30, 2012 at 18:21 UTC
Yet, you are doing totally different things:

`=` assignment operator
`==` numeric test for equality
`eq` string test for equality

CountZero
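To make the difference between the three operators concrete, here is a minimal sketch. The helper names (`same_word`, `numeric_equal`, `assign_as_test`) are made up for illustration and do not appear anywhere in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: string equality, the test the original code needs.
sub same_word {
    my ( $left, $right ) = @_;
    return $left eq $right ? 1 : 0;
}

# Hypothetical helper: numeric equality. Non-numeric strings both become 0,
# so two different words compare as "equal". Warnings are silenced here
# only to keep the demonstration quiet.
sub numeric_equal {
    my ( $left, $right ) = @_;
    no warnings 'numeric';
    return $left == $right ? 1 : 0;
}

# Hypothetical helper: an assignment used as a test. The result depends
# only on whether the assigned value is true, not on any comparison.
sub assign_as_test {
    my ($value) = @_;
    my $copy;
    return ( $copy = $value ) ? 1 : 0;
}

print "eq, different words: ", same_word( 'naak', 'bloot' ),     "\n";
print "==, different words: ", numeric_equal( 'naak', 'bloot' ), "\n";
print "=,  any true value:  ", assign_as_test('bloot'),          "\n";
```

Only `eq` reports that 'naak' and 'bloot' differ; the other two report "true" for reasons unrelated to whether the words match.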
Re^3: Comparing hash keys and values with Regular Expressions by ww (Archbishop) on Jun 30, 2012 at 19:52 UTC
"with both '=' and 'eq', and the output was the same both times: correct" is NOT a comprehensive test of the truth or falsity of the advice you were given... which is itself correct, while your usage is NOT. You initially addressed your question to "wise ones." It would be consistent to carefully consider their answers, rather than applying your logically inadequate testing and then knocking out an answer that denies their wisdom.
Re^3: Comparing hash keys and values with Regular Expressions by Anonymous Monk on Jun 30, 2012 at 19:24 UTC
with both '=' and 'eq', and the output was the same both times: correct, which is exactly what I wanted it to be.

That is what they all say:

my %foo;
my %bar = ( 1, 2 );

if( $foo{1} = $bar{1} ){
    warn "assignment is assignment";
}

if( $foo{1} = 'any true value' ){
    warn "assign any true value, expression is true";
}
warn "foo eq bar ", int ( $foo{1} eq $bar{1} );
__END__
assignment is assignment at - line 5.
assign any true value, expression is true at - line 9.
foo eq bar 0 at - line 11.

Just because the output is the same doesn't mean much; even a broken clock is right twice a day.

Think of it a different way: you have a problem you can't solve and you're asking for help in solving it --- maybe, just maybe, those you're asking for help know something you don't know.
Re^3: Comparing hash keys and values with Regular Expressions by ww (Archbishop) on Jun 30, 2012 at 20:07 UTC
Update: Created on the basis of my own mistaken belief that the initial reply had hit the bit-bucket thanks to some error on my part. Ignore till reaped.

You addressed your op to "wise ones". You have answers from amongst the wisest. It would be consistent if you were to heed the response (and now, responses) from those from whom you sought answers... rather than relying on a logically insufficient test of your mistake as a rebuttal.
Re^3: Comparing hash keys and values with Regular Expressions by SayWhat?! (Novice) on Jun 30, 2012 at 20:31 UTC
CountZero: OK, I admit I made a mistake when saying the output was the same. Sorry 'bout that, but that's how we learn, right? After testing that piece of code again with '=', I got the output I desired. But when I tested it with 'eq', I got the following message as well as an empty output:

`Useless use of String eq in void context`

I also got this error message for every single line of my code:

`Use of uninitialized value in String eq`

So what's that all about then? :s And how can I correct it by using 'eq' instead of '='?
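One plausible reading of the "uninitialized value" message, sketched under assumptions: in the posted code, `$nonmatchhash{$nonmatchhashvalues}` looks the Dutch word up as if it were a key, and for a pair such as vriendelik/aardig no key 'aardig' exists, so the lookup yields undef. Comparing the key with the value directly avoids that lookup. The helper name `is_identical_pair` and the 'water' pair are invented for illustration.

```perl
use strict;
use warnings;

# Sample data: the first pair is from the question, the second is a
# made-up identical pair added so that both outcomes are visible.
my %nonmatchhash = (
    vriendelik => 'aardig',
    water      => 'water',
);

# Hypothetical helper: true when the Afrikaans word and the Dutch word are
# the same string. Undefined input counts as "not identical" instead of
# triggering a warning.
sub is_identical_pair {
    my ( $afr, $dut ) = @_;
    return 0 unless defined $afr && defined $dut;
    return $afr eq $dut ? 1 : 0;
}

for my $afr ( sort keys %nonmatchhash ) {
    my $dut = $nonmatchhash{$afr};

    # Compare the key with its own value; do not use the value as a key.
    my $verdict = is_identical_pair( $afr, $dut ) ? 'identical' : 'different';
    print "$afr\t$dut\t$verdict\n";
}
```

In the original loop the equivalent test would compare `$nonmatchhashkeys` with `$nonmatchhashvalues`, the two variables already produced by the `split`.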
Re^4: Comparing hash keys and values with Regular Expressions by ww (Archbishop) on Jun 30, 2012 at 21:24 UTC
Re^4: Comparing hash keys and values with Regular Expressions by Anonymous Monk on Jun 30, 2012 at 21:25 UTC
Re: Comparing hash keys and values with Regular Expressions by zentara (Cardinal) on Jun 30, 2012 at 16:11 UTC
Look at "dialup spam removal with Net::POP3" and see how it builds what are called pre-compiled regexes from an array of strings. Aristotle shows improved code in his response. The idea is to make the precompiled regexes one time from your strings, then loop through your data only one time, checking all lines against the precompiled regexes.

I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh
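A minimal sketch of that idea using `qr//`, assuming each rule can be written as a pair of suffix strings. Only the agtig/achtig pair comes from the question; the helper name `matching_rule` and the test words 'geelagtig' and 'geelachtig' are illustrative assumptions.

```perl
use strict;
use warnings;

# Suffix pairs as plain strings: [ Afrikaans, Dutch ]. Only this first pair
# is taken from the question; further rules would be appended to the list.
my @suffix_pairs = ( [ 'agtig', 'achtig' ] );

# Compile every pair exactly once, before any data is read.
my @compiled = map {
    my ( $afr, $dut ) = @$_;
    [ qr/\Q$afr\E(?:e|er|ste)?$/, qr/\Q$dut\E(?:e|er|ste)?$/ ];
} @suffix_pairs;

# Hypothetical helper: index of the first rule that both words satisfy,
# or -1 when no rule applies.
sub matching_rule {
    my ( $afr_word, $dut_word ) = @_;
    for my $i ( 0 .. $#compiled ) {
        my ( $afr_re, $dut_re ) = @{ $compiled[$i] };
        return $i if $afr_word =~ $afr_re && $dut_word =~ $dut_re;
    }
    return -1;
}

print matching_rule( 'geelagtig', 'geelachtig' ), "\n";
print matching_rule( 'naak',      'bloot' ),      "\n";
```

Because the patterns live in one array, the data file only needs to be read once and each line is checked against every compiled rule in turn.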
Re: Comparing hash keys and values with Regular Expressions by Kenosis (Priest) on Jun 30, 2012 at 22:49 UTC
You've received some excellent advice about your code. At the risk of confusing the issue (and I apologize in advance if it does), consider the following:

use Modern::Perl;
use open ':utf8';
use autodie;

my (%nonmatchhash, @hashvalues);

open my $NONMATCHINPUT, '<', 'OutputNonMatchedWords.txt';
do { /(.*)\t(.*)/; $nonmatchhash{$1} = $2; push @hashvalues, $2 } for <$NONMATCHINPUT>;
close $NONMATCHINPUT;

while ( my ( $key, $value ) = each %nonmatchhash ) {
    given ($key) {
        when (/\A$value\z/) {
            say "key '$key' eq value '$value'";
        }
        when (/(.*)(agtig)(e|er|ste)?$/) {
            say "$key\t$_" for grep /(.*)(achtig)(e|er|ste)?$/, @hashvalues;
        }
        default {
            say "No match for key '$key'";
        }
    }
}

From what I could gather from your code, it looks like you're taking an action after testing for three conditions, viz., 1) key/value equality, 2) matching a key, then a value under that key, and 3) no match. The script above handles these three cases, and may assist your coding decisions.

Hope this helps!
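For the eleven rules mentioned in the question, one possible shape is a single pass that sorts each pair into one of the three categories. This is a sketch under assumptions, not the method of anyone in the thread: it tests each Afrikaans word against its own Dutch partner (as the question's prose describes) and not against every value in the hash, and the names `classify_pair`, `@rules`, and the sample lines other than naak/bloot are invented for illustration.

```perl
use strict;
use warnings;

# One rule from the question as [ Afrikaans pattern, Dutch pattern ].
# The remaining ten rules would be added to this list.
my @rules = (
    [ qr/agtig(?:e|er|ste)?$/, qr/achtig(?:e|er|ste)?$/ ],
);

# Hypothetical helper: returns 'identical', 'cognate' or 'noncognate'.
sub classify_pair {
    my ( $afr, $dut ) = @_;
    return 'identical' if $afr eq $dut;
    for my $rule (@rules) {
        my ( $afr_re, $dut_re ) = @$rule;
        return 'cognate' if $afr =~ $afr_re && $dut =~ $dut_re;
    }
    return 'noncognate';
}

# Stand-in for the tab-separated input file.
my @lines = ( "naak\tbloot", "water\twater", "geelagtig\tgeelachtig" );

my %bucket;
for my $line (@lines) {
    my ( $afr, $dut ) = split /\t/, $line;
    next unless defined $dut;    # skip lines without a tab
    push @{ $bucket{ classify_pair( $afr, $dut ) } }, $line;
}

for my $kind ( sort keys %bucket ) {
    print "$kind: $_\n" for @{ $bucket{$kind} };
}
```

In a real script each bucket would be written to its own output file in place of the final `print` loop.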