Dear Perl Monks,

I am trying to do a pattern matching exercise. What I am trying to do:

Pick words from a file ($wordfile - each line contains a main term and its synonyms separated by a space) and see if those words (or their synonyms) appear in sentences of another file ($textfile - each line contains textID and associated sentences).

Below is my code.

1. I am getting the following message. When I comment out "use warnings" in the header, the message goes away. What does it mean, and how do I get rid of it?

main::ReadDataInHash() called too early to check prototype at D:\wordmatch.pl line 19.
2. I would also like to know how I can make my program run faster. Is there a way to avoid looping over every key of the hash %{$List1Ref} in the line "foreach my $p (sort keys (%{$List1Ref}))"? Is there a faster approach?

$wordfile

SSN3 CDK8 GIG2 NUT7 RYE5 SRB10 UME5
.
.

$textfile

17170106|Perturbation of the activity of replication origin by meiosis specific transcription.|We have determined the activity of all ARSs on the Saccharomyces cerevisiae chromosome VI as chromosomal replication origins in pre-meiotic S-phase by neutral/neutral 2D gel-electrophoresis. The comparison of origin activity of each origin in mitotic and pre-meiotic S-phase showed that one of the most efficient origins in mitotic S-phase, ARS605, was completely inhibited in pre-meiotic S-phase. ARS605 is located within the ORF of Msh4 gene that is transcribed specifically during an early stage of meiosis. Systematic analyses of relationships between Msh4 transcription and ARS605 origin activity revealed that transcription of Msh4 inhibited the ARS605 origin activity by removing ORC from ARS605. Deletion of UME6 {{UME6}}, a transcription factor responsible for repressing Msh4 during mitotic S-phase, resulted in inactivation of ARS605 in mitosis. Our finding is the first demonstration that the transcriptional regulation on the replication origin activity is related to changes in cell physiology. These results may provide insights into changes in replication origin activity in embryonic cell cycle during early developmental stages.
.
.

Thank you very much.

Raj

My code:

#!/usr/bin/perl
use warnings;
use strict;

if ($#ARGV != 4) {
    print "usage: run batch file 'run' not this one\n";
    exit;
}

my $wordfile = $ARGV[0];
my $textfile=$ARGV[3];
my $OutPutFile=$ARGV[4];

open (IF1,"$wordfile")|| die "cannot open the file";
open (PF, "$textfile")|| die "cannot open the file";
open (OF,">$OutPutFile")|| die "cannot open the file";

my $List1Ref=ReadDataInHash (*IF1);

while (my $line=<PF>) {
    chomp($line);
    my @arrAbs=split (/\|/,$line);
    my $ID=$arrAbs[0];
    my $Title=$arrAbs[1];
    my $Abs=$arrAbs[2];
    @arrAbs=split (/\./,$Abs);
    print OF"$ID|";
    for (my $SentenceNumber=0;$SentenceNumber<=$#arrAbs ;$SentenceNumber++) {
        my $i=$SentenceNumber+1;
        print OF "<".$i.">";
        my $Sentence=$arrAbs[$SentenceNumber];
        my @arrAbsSen=split (' ',$Sentence);
        foreach my $word(@arrAbsSen) {
            #to match terms in the list, stored in %{$List1Ref}.
            if (exists(${%{$List1Ref}}{uc($word)})) {
                print OF "$word ";
            }
            else {
                foreach my $p (sort keys (%{$List1Ref})) {
                    if (exists(${%{${%{$List1Ref}}{$p}}}{uc($word)})) {
                        print OF "mainterm:$p:matchedterm:$word ";
                        last;
                    }
                }
            }
        }
        @arrAbsSen=();
    }
    print OF "\n";
    @arrAbs=();
}

sub ReadDataInHash() {
    my $x = shift;
    my %list1=();
    while (my $line =<$x>) {
        chomp $line;
        my @arr=split /\s/,$line;
        for (my $i=0;$i<=$#arr ;$i++) {
            if ($i==0) {
                $list1{$arr[$i]}={};
            }
            else{
                ${%{$list1{$arr[0]}}}{$arr[$i]} = 1;
            }
        }
    }
    return {%list1};
}
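
A possible direction for both questions, offered as a sketch and not a definitive fix. The empty parentheses in "sub ReadDataInHash()" are a prototype saying the sub takes no arguments. The call on line 19 is compiled before perl has seen that definition, so the prototype cannot be checked, and that is what the warning reports. Dropping the "()" from the definition avoids it. For speed, the word list can be stored as a single flat hash that maps every term (main term or synonym) to its main term, so each word costs one hash lookup instead of a loop over all main terms. The names read_terms_as_index and tag_sentence and the sample sentence are hypothetical, and the sketch assumes each term appears on only one line of the word file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Defined without the empty "()" prototype, so calling it with one
# argument raises no prototype warning.
# Returns a reference to a flat hash: upper-cased term => main term.
# Assumption: each term occurs on only one line of the word file.
sub read_terms_as_index {
    my ($fh) = @_;
    my %main_term_for;
    while (my $line = <$fh>) {
        chomp $line;
        my ($main, @synonyms) = split ' ', $line;
        next unless defined $main;    # skip blank lines
        for my $term ($main, @synonyms) {
            $main_term_for{ uc $term } = $main;
        }
    }
    return \%main_term_for;
}

# Tags the words of one sentence with a single hash lookup per word.
# Main terms are reported as-is, synonyms in the same
# "mainterm:...:matchedterm:..." form as the original program.
sub tag_sentence {
    my ($lookup, $sentence) = @_;
    my @hits;
    for my $word (split ' ', $sentence) {
        my $main = $lookup->{ uc $word };
        next unless defined $main;
        if (uc($word) eq uc($main)) {
            push @hits, $word;
        }
        else {
            push @hits, "mainterm:$main:matchedterm:$word";
        }
    }
    return join ' ', @hits;
}

# Demo with an in-memory filehandle standing in for $wordfile.
my $words = "SSN3 CDK8 GIG2 NUT7 RYE5 SRB10 UME5\n";
open my $word_fh, '<', \$words or die "cannot open in-memory word list: $!";
my $index = read_terms_as_index($word_fh);
close $word_fh;

# Hypothetical sentence, not taken from the text file.
print tag_sentence($index, 'Deletion of CDK8 or SSN3 had no effect'), "\n";
# prints: mainterm:SSN3:matchedterm:CDK8 SSN3
```

The trade-off is that the flat hash no longer records which synonyms belong to which main term as a nested structure. If that grouping is needed elsewhere, both hashes can be built in the same pass over the word file.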

In reply to Error: "called too early to check prototype" and is word search using nested hash optimal? by newbio
