in reply to list lines not found in config (while+if)

Cheers for the help, guys, but I'm still not getting it. Truth be told, I hadn't tried the script with the examples I gave you; I just gave those for ease of explanation. But when a few of you said that it worked for you, I copied back the inputs I'd written, tried it, and I get no output at all. My actual file1 and file2 are 8000 queries and 12000 lines to select from,
so:
my script + my (8000) inputs = print everything
my script + made up (5) inputs = print nothing
kennethk's script + my (8000) inputs = print everything
kennethk's script + made up (5) inputs = print nothing

Seeing as two computers are getting two different results, is there an overall problem? I know a bad workman blames his tools, but could it be?
What's the chance of me missing some module/update/package? (Clutching at straws here.)

my actual files go to the tune of:

File1

    GP_MASA_01F04_c
    GP_MASA_38C02_c
    GP_MASA_33B06_c
    GP_MASA_24D04_c
    GP_MASA_35A04_c
    ...to 9000 lines

File2 (is a .csv file)

    'GP_MASA_01F04_c',681,'ACCACACATCATCTGACTTACGTACGTACG......
    'GP_MASA_38C02_c',273,'ACATCCTTCACAGAAGTTTGT.............
    'GP_MASA_33B06_c',288,'ACATACTAACACGGTCTTT...............
    .....to 12400 lines

Also, I intend to have a go with all the other scripts and tips you kind, kind people have put up here, but it's the middle of the night and I'm falling asleep where I'm sitting!

Thank you again for all the help.

Re^2: list lines not found in config (while+if)
by GrandFather (Saint) on Apr 08, 2009 at 00:43 UTC

    Taking your sample data and original code, I've generated the following sample code. Note that I added strictures and cleaned up a few other aspects of your code. I also removed the first line of your reference file so that at least one "missing" line would be reported.

    use strict;
    use warnings;

    my $File1 = <<END_FILE1;
    GP_MASA_38C02_c
    GP_MASA_33B06_c
    GP_MASA_24D04_c
    GP_MASA_35A04_c
    END_FILE1

    my $File2 = <<END_FILE2;
    'GP_MASA_01F04_c',681,'ACCACACATCATCTGACTTACGTACGTACG......
    'GP_MASA_38C02_c',273,'ACATCCTTCACAGAAGTTTGT.............
    'GP_MASA_33B06_c',288,'ACATACTAACACGGTCTTT...............
    END_FILE2

    my $match = 0;

    open my $dataIn, '<', \$File2;
    while (<$dataIn>) {
        chomp ($_);
        my $dataLine = $_;

        open my $refIn, '<', \$File1;
        while (<$refIn>) {
            chomp ($_);
            my $str = $_;
            if ($dataLine =~ /$str/) {
                $match = 1;
            }
        }
        close ($refIn);

        if ($match == 0) {
            print "$dataLine\n";
        }
        $match = 0;
    }

    Prints:

    'GP_MASA_01F04_c',681,'ACCACACATCATCTGACTTACGTACGTACG......

    Although reparsing the reference file for each line of the data file is exceedingly nasty, the code works. Maybe you can update the sample to demonstrate where you are seeing a problem?
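    One way to avoid rereading the reference file for every data line is to load the reference IDs into a hash once and then do a single pass over the data. The following is only a minimal sketch, not the original poster's script: the sub name `lines_not_in_reference` is hypothetical, and it assumes each data line begins with the ID in single quotes, as in the sample above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the data lines whose leading quoted ID
# does not appear in the reference list.
sub lines_not_in_reference {
    my ($ref_text, $data_text) = @_;

    # Build the lookup table once; ignore lines that are empty after
    # trimming whitespace (including a stray carriage return).
    my %wanted;
    for my $id (split /\n/, $ref_text) {
        $id =~ s/^\s+|\s+$//g;
        $wanted{$id} = 1 if length $id;
    }

    my @missing;
    for my $line (split /\n/, $data_text) {
        # Assumption: the ID is the first field, wrapped in single quotes.
        my ($id) = $line =~ /^'([^']+)'/ or next;
        push @missing, $line unless $wanted{$id};
    }
    return @missing;
}

my $File1 = <<'END_FILE1';
GP_MASA_38C02_c
GP_MASA_33B06_c
GP_MASA_24D04_c
GP_MASA_35A04_c
END_FILE1

my $File2 = <<'END_FILE2';
'GP_MASA_01F04_c',681,'ACCACACATCATCTGACTTACGTACGTACG......
'GP_MASA_38C02_c',273,'ACATCCTTCACAGAAGTTTGT.............
'GP_MASA_33B06_c',288,'ACATACTAACACGGTCTTT...............
END_FILE2

print "$_\n" for lines_not_in_reference($File1, $File2);
```

    Because this compares whole IDs with an exact hash lookup instead of a regex, a blank or odd line in the reference file cannot accidentally match every data line.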


    True laziness is hard work
      I've gone through and put prints throughout the script and I've finally found the problem! It's the bloody matching string: there is something in my reference file that matches every time it reads the data file.
      I did a:
          if ($s_line =~ /$str/) {
              print "$str -- match\n";
          }
          else {
              print "$str -- no match\n";
          }
      on my test data and it showed that it's matching an empty line, so there must be something in my reference file that's matching every time (I've already checked and it's not empty lines). I'm going to look for it now so I won't keep you hanging on, but cheers for all the improvement suggestions. I'm going to go through them all when I have more time and no deadline to catch.
      Cheers for the help!
      In hindsight, I should have given better examples at the start. Sorry. Will do next time.
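
      For hunting down a reference line that prints as "empty" yet still matches, it can help to list every line that has no visible characters and show what it really contains. This is only a sketch under assumptions: the sub name `suspicious_patterns` is hypothetical, and a stray carriage return from Windows line endings is one possible culprit, not something the thread confirms.

```perl
use strict;
use warnings;

# Hypothetical helper: report reference lines with no visible characters,
# rendering any non-printable bytes as \xNN so they can be seen.
sub suspicious_patterns {
    my (@lines) = @_;
    my @report;
    my $n = 0;
    for my $line (@lines) {
        $n++;
        (my $copy = $line) =~ s/\n\z//;    # strip the trailing newline only
        next if $copy =~ /\S/;             # has visible content, so skip it
        (my $shown = $copy) =~ s/([^\x20-\x7e])/sprintf '\\x%02x', ord $1/ge;
        push @report, "line $n: [$shown]";
    }
    return @report;
}

# Example input: one normal ID, one line holding only a carriage return,
# and one truly empty line.
print "$_\n" for suspicious_patterns("GP_MASA_38C02_c\n", "\r\n", "\n");
```

      Once the offending lines are found, skipping them before the match, and writing the test as `/\Q$str\E/` so that any regex metacharacters in the ID are taken literally, should make the comparison more predictable.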