Foggy Bottoms has asked for the wisdom of the Perl Monks concerning the following question:

Hey monks! I'd like to develop a really quick and useful search tool capable of finding files that contain specific content (much like the Windows search tool; the script would have to run on Windows-based systems). Hence I came up with 2 options. Thanks for your time and patience...
Happy he who, like Ulysses, has made a fine voyage,
Or like that man who won the Fleece,
And then came home, full of experience and wisdom,
To live among his kin the rest of his days!

J. du Bellay, Angevin poet

Replies are listed 'Best First'.
Re: Search tool
by tachyon (Chancellor) on Jul 02, 2003 at 13:54 UTC

    If you want FAST, you don't want to automate the native Windows search function. It is, in a word, woeful. First, it recurses the directory tree for every search, and second, you can only (AFAIK) pass it a drive or list of drives to search; thus if your target is C:\something\stuff_here\ you will search everything else on C:\ for no good reason.

    You can get a recursive search in a couple of lines with File::Find, but if you want SPEED you recurse the tree periodically, store the results in a database structure, and search your DB to find your files. All you need to do is update the database periodically. This is the *nix approach of excellent tools like locate.
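
    For the "couple of lines with File::Find" part, here is a minimal sketch of recursing a tree and grepping file contents. The subroutine name and the sample path in the comment are illustrative, not from any particular module:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use File::Find;

        # Walk a directory tree and return the full paths of files whose
        # contents match a pattern. (find_in_files is a made-up name.)
        sub find_in_files {
            my ($dir, $pattern) = @_;
            my @hits;
            find(sub {
                return unless -f $_;              # skip directories etc.
                open my $fh, '<', $_ or return;   # unreadable? move on
                while (my $line = <$fh>) {
                    if ($line =~ /$pattern/) {
                        push @hits, $File::Find::name;
                        last;                     # one hit per file is enough
                    }
                }
                close $fh;
            }, $dir);
            return @hits;
        }

        # e.g. my @files = find_in_files('C:\\something\\stuff_here', qr/foo/);
        #      print "$_\n" for @files;

    This is the slow path tachyon is warning about, of course: it re-reads every file on every search.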

    Locate lives in the GNU findutils package, and you can get a Win32 port of it from here, amongst other places. For blinding speed you won't do a lot better. To be frank, you will never use Win32 native search again. Get a port of grep while you are at it, and then all you need to do is:

        # update the locate DB
        C:\>locate -u
        # find whatever something you want....
        C:\>locate some | grep thing
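
    The locate idea ("recurse periodically, search the DB") is easy to sketch in pure Perl, too. A minimal version using a flat file of paths as the "database" (subroutine and file names are illustrative):

        use strict;
        use warnings;
        use File::Find;

        # Rebuild the index: walk the tree once, dump every path to a file.
        # Run this periodically (e.g. from a scheduled task), not per search.
        sub update_index {
            my ($root, $index) = @_;
            open my $out, '>', $index or die "can't write $index: $!";
            find(sub { print $out "$File::Find::name\n" }, $root);
            close $out;
        }

        # Search the index instead of re-walking the tree -- much faster.
        sub locate_in_index {
            my ($index, $pattern) = @_;
            open my $in, '<', $index or die "can't read $index: $!";
            my @hits;
            while (my $line = <$in>) {
                chomp $line;
                push @hits, $line if $line =~ /$pattern/;
            }
            close $in;
            return @hits;
        }

    Like locate itself, this only indexes file names; the trade-off is that results can be stale until the next rebuild.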

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Although I love locate too, I don't see it being very helpful to the OP, as it only searches an index of file names, not the contents of the files.

      Keeping a searchable index of file contents isn't trivial (searching it is even less so) and the index itself could easily grow huge. Depending on how often files are updated, keeping the index from growing stale may be an issue, especially if searches must be able to find new data as soon as it is available.

      Recursing and searching file contents might really be best in his case. Getting a port of grep, as you suggested, might well help though.

      -sauoq
      "My two cents aren't worth a dime.";
      

        If he wants to search content, swish-e is hard to pass up. Native C indexing, stemming, and all the goodies, with a Perl API wrapper to format the results to boot, so you can make it look like whatever you want using the language we love. There is a Win32 port for this too....

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print