dhudnall has asked for the wisdom of the Perl Monks concerning the following question:

I have a file that contains a list of all the table names that I found when searching through SQL files. I have a file that contains ALL the tables ALONG WITH the database that they belong to as well. What I want to do is compare the file that contains JUST the tables I found in the SQL files to the file that contains the tables ALONG WITH the database. If there is a match, I want to pull the ENTIRE line (database, table) so that I can then upload the file into an excel spreadsheet. The code that I have used to pull the tables from the SQL files is as follows:
#This will open the file that will contain the list of tables open(TABLES, '<DBtest.txt') or die "The file DBnames.txt could not be +opened\n"; #This will open the file that will contain the list of SQL statements +to be parsed open(DOUT, '<databaseJoinDan.txt') or die "The file databaseJoinDan.tx +t could no t be opened\n"; #This will open the file that will be written to containing the list o +f tables #in the SQL files open(USED, '>>tableUsed.txt') or die "The file tablesUsed.txt could no +t be writt en to\n"; #This will convert the files into lists to be searched @tableList=<TABLES>; @fileList=<DOUT>; @tableUsed=<USED>; #declarations $Y=0; $T=0; #This will search through the files and see if the tables exist in the + file chomp @tableList; chomp @fileList; for $Y (@tableList){ for $T (@fileList){ if($T =~ /$Y/i){ print USED "$Y\n"; } } }#end of the for loop close(DOUT); close(TABLES); close(USED);
-Therefore, after executing this code I have a list of tables but what I REALLY WANT is a the DB along with the table name. The file that contains an ENTIRE listing of all DB names along with their tables is DatabaseTables.txt. I just want to be able to search through that file using the tables that I found in the first search to pull the ENTIRE listing (DB & TABLENAME). Please help if you can.

Replies are listed 'Best First'.
Re: comparing 2 files
by jdporter (Paladin) on Jun 19, 2007 at 18:00 UTC
    fgrep -i -w -f DBtest.txt databaseJoinDan.txt > tableUsed.txt

    Perl code to do something like this (but without all the complex options of grep) could be as follows:

    # find_strings.pl use strict; use warnings; my $patfile = shift; my @pats = do { local @ARGV = ( $patfile ); <> }; chomp @pats; my $pat = join '|', @pats; while (<>) { /\b($pat)\b/i and print; }

    With the above code, invocation would be slightly different from the fgrep solution.

    perl find_strings.pl DBtest.txt databaseJoinDan.txt
    (for example).

Re: comparing 2 files
by graff (Chancellor) on Jun 20, 2007 at 03:19 UTC
    You didn't happen to mention how the information in the second file was formatted (the one that has table names "ALONG WITH the database that they belong to"). If that second file is arranged as two "columns" (space-separated strings) per "row" (line), you might want to use hashes to make things easier:
    #!/usr/bin/perl use strict; use warnings; my $table_list_file = shift; # first command-line arg my $table_db_list_file = shift; # second arg my %tables; open( I, "<", $table_list_file ) or die "$table_list_file: $!"; while (<I>) { chomp; $tables{$_} = undef; } open( I, "<", $table_db_list_file ) or die "$table_db_list_file: $!"; while (<I>) { my ( $table_name, $db_name ) = split; # or maybe it's the other wa +y around? if ( $tables{$table_name} ) { # seen in first file? print; } }

    And here's a shameless plug for an old post of mine: cmpcol would do this directly, given the following command line, if the files are like what I've assumed above:

    cmpcol -i -l2 table.list table_db.list
    In both cases suggested here, you can redirect STDOUT on the command line to create the output file that you want:
    script.file [-options] file1 file2 > output.list
      Hi, Thank you for getting back to me on my problem. I tried inputting your code into a separate file and compiling and I keep getting the following error and I don't know what it means:
      Global symbol "$tables_db_list_file" requires explicit package name a +t ./playFunction.pl line 15. Execution of ./playFunction.pl aborted due to compilation errors.
      -I saw where you had in two parts of the program and I didn't know if it needed to be TWO different file handles or what. The way I have my program currently running is I call a file called counter15.pl that does lots of stuff (pulling SQL statements, performing various calculations and the SQL files that I separated the SQL statements into, pulling database joins etc.) The last thing I did in my program was I wrote code in which a file that contained a list of ONLY TABLES (not with DB name) was loaded. Then I parsed EACH of my SQL files to see if any instances of the tables were in any of the SQL files. If they were, then I just printed the table name that was found. NOW, I have a list of all the tables names in a separate file called, tableUsed.txt and what I need is a list of table names ALONG WITH the database name. YES, to answer your question in the DB & table name file, the DB & table name are separated by a (space). I just want to eventually have a FINAL file that contains a listing of the databases and table names in which I will upload into an excel spreadsheet. I am VERY new to PERL, infact they made me learn it in a week, therefore I am no expert by any means but the code you have given me should work but again I have that error and I don't know why. Please get back to me when you have time. Thank you so very much for your help. Dan
        Wow. Now this is turning into sort of a train wreck.
        I tried inputting your code into a separate file and compiling and I keep getting the following error and I don't know what it means:
        Global symbol "$tables_db_list_file" requires explicit package name at + ./playFunction.pl line 15.

        The error message means that you made a spelling error when you tried to recreate my sample code. (Why didn't you just download or copy/paste it?) My code does not contain a variable called "$tables_db_list_file" (note the extra "s" in "tables_"), but your version of the code does, and since "use strict" is in there, that is what is causing the error. Just take that "s" out and it should compile.

        As for that long paragraph of yours in italics, I don't know what to say. Your original post in this thread talked about reading and comparing two files. That's what my code does.

        Don't panic. Spend some time reading manual pages. When you don't know how to interpret an error message, paste the msssage into a google search box (at least, parts that do not include specific names of your variables or files), because it'll turn up somewhere on the web with a decent explanation. Probably the best place to search first is in the perldiag manual page.