S_Shrum has asked for the wisdom of the Perl Monks concerning the following question:

Background
=====================================

Ok...the good news is that I wrote an AnyData::FORMAT module called Directory that allows me (for the time being) to access directories and return various file information (path, basename, ext, size, etc.) like database results. This works great with the DBI for listing files.

What I'm trying to do
=====================================

Here is my problem. Now that I have successfully returned all the files from a specified folder, I want to use a regex to filter the returned results. SQL::Statement supports an operator called RLIKE that allows Perl-based regexes, so I'm going that way. The only problem is that I don't know how to create the expression; go figure.

Example
=====================================

I have a list of files like so:

bluff_park_000712-001.jpg
~bluff_park_000712-001.jpg
~bluff_park_000712-001.jpg.caption

This constitutes one complete set of which I have thousands. The first is the full-size image, the second is a thumbnail, and the third is a text file with a caption for the thumbnail.

When I display to my users the flying site location names, "Bluff Park" is what they will see. I need a way to take "Bluff Park", lower case it, replace <space> with <underscore> and add a <tilde> to the front of it...all in the regex (if possible). That is just setup. Then I need to take that and run it against all the entries to have it return only those that begin with that:

Evaluate all filenames for those starting with ~bluff_park

This should return:

~bluff_park_000712-001.jpg
~bluff_park_000712-001.jpg.caption

As for the extension portion, I've got that part figured out.

TIA

======================
Sean Shrum
http://www.shrum.net

Replies are listed 'Best First'.
(podmaster) Re: Need help with tricky regex creation
by PodMaster (Abbot) on May 14, 2002 at 08:33 UTC
    If I understand this correctly, you wanna turn Bluff Park into ~bluff_park, and then get all the files beginning with ~bluff_park?

    For the first part, use perlfunc:lc, use y/// and other goodies from perlop (like qw or q).

    If you want it all in one regex, look up what .* is, and look into the /e modifier in perlre.

    For the 2nd part, use File::Find, or file globbing, or look into perlfunc:grep and what ^ and $ mean in a regex.
     

    Look ma', I'm on CPAN.


    ** The Third rule of perl club is a statement of fact: pod is sexy.
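
The hints above (lc, y///, grep and the ^ anchor) can be put together roughly as follows. This is only a minimal sketch that filters a plain Perl list; the sub names site_prefix and files_for_site are made up for illustration, and it does not touch DBI or SQL::Statement at all.

```perl
use strict;
use warnings;

# Turn a display name such as "Bluff Park" into the thumbnail prefix "~bluff_park".
sub site_prefix {
    my ($name) = @_;
    my $prefix = lc $name;    # lc returns a lowercased copy
    $prefix =~ tr/ /_/;       # tr/// and y/// are the same operator
    return "~$prefix";
}

# Keep only the file names that start with that prefix.
sub files_for_site {
    my ($name, @files) = @_;
    my $prefix = site_prefix($name);
    # \Q...\E quotes the prefix so it is matched literally
    return grep { /^\Q$prefix\E/ } @files;
}

my @files = (
    'bluff_park_000712-001.jpg',
    '~bluff_park_000712-001.jpg',
    '~bluff_park_000712-001.jpg.caption',
);

print "$_\n" for files_for_site('Bluff Park', @files);
```

With the three sample names from the question, only the two entries that start with ~bluff_park are printed.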
Re: Need help with tricky regex creation
by csotzing (Sexton) on May 14, 2002 at 10:59 UTC
    It sounds like you're just interested in the regex stuff? Can you do this?--
    $site = "Bluff Park";
    $site = lc($site);      # converts to lowercase; lc returns the result, so assign it back
    $site =~ s/\s/_/g;      # substitutes spaces with _
    foreach (@entries) {
        if (/^~$site/) {
            # have a match!
        }
    }

    If you don't want to lowercase (lc) the expr, you could just ignore case in the regex:
    /^~$site/i
    Hope that helps.

      I think everyone is missing the finer detail...

      The regex isn't going to be written in a script...it's being passed via CGI to be handled by a Perl script...I am using the RLIKE operator of SQL::Statement, which allows a Perl regex to be passed in an SQL WHERE expression:

           http://myserver.com/cgi-bin/myscript.pl?where=Site RLIKE '^bluff'

      This works to return all Site entries that begin with 'bluff'. Building on this idea:

           ...myscript.pl?where=Site RLIKE 'm/^~Bluff Park/i'

      The sample above seems (logically, based on my limited XP with regex) like it would almost work, but it does not handle the SPACE to UNDERSCORE substitutions. It looks for matches beginning with the defined text string and ignores case (the '/i', if I am reading the camel book right).

      So what I need to know is whether there is a way to include an s/// call in the regex, or via some other function, to get the SPACES replaced with UNDERSCORES?

      TIA

      ======================
      Sean Shrum
      http://www.shrum.net
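
A match pattern only tests a string; it cannot rewrite its own text the way s/// does. One way around that is to write the pattern so it accepts either a space or an underscore between words, using a character class, and to turn on case-insensitivity with the inline (?i) modifier. The sketch below builds such a pattern in plain Perl. The sub name site_pattern is made up for illustration, and whether RLIKE hands the string to Perl's regex engine unchanged is an assumption that is not verified here.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex source string from a display name.
# "Bluff Park" becomes (?i)^~Bluff[ _]Park
sub site_pattern {
    my ($name) = @_;
    # Split on whitespace and quote each word so it is matched literally.
    my @words = map { quotemeta($_) } split ' ', $name;
    # (?i) ignores case; [ _] accepts a space or an underscore between words.
    return '(?i)^~' . join('[ _]', @words);
}

my $pattern = site_pattern('Bluff Park');
my $re      = qr/$pattern/;

for my $file ('~bluff_park_000712-001.jpg', 'bluff_park_000712-001.jpg') {
    printf "%-28s %s\n", $file, ($file =~ $re ? 'match' : 'no match');
}
```

The thumbnail name matches and the full-size name (no leading tilde) does not. If the pattern string can be used as-is, the same text could be tried as the RLIKE operand in place of 'm/^~Bluff Park/i'.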

        Is this the part where I get to say Danger Will Robinson?

        Don't forget to untaint that or you are going to hurt yourself. Passing arbitrary user input into SQL is like eating things given to you by random strangers. Don't do it.
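
As a rough illustration of that warning, the script could accept only a site name from the query string, check it against a whitelist, and build the WHERE clause itself instead of taking a whole clause from the URL. The sketch below covers the checking step only. The sub name clean_site_name and the allowed character set are assumptions chosen for this example, not anything from the original script.

```perl
use strict;
use warnings;

# Hypothetical whitelist check: allow letters, digits, spaces, underscores
# and hyphens, 1 to 64 characters. Returning the captured group ($1) is also
# how a value is untainted when the script runs under -T.
sub clean_site_name {
    my ($raw) = @_;
    return unless defined $raw;
    return $1 if $raw =~ /\A([A-Za-z0-9 _-]{1,64})\z/;
    return;    # anything else is rejected
}

for my $input ('Bluff Park', "x' OR 1=1 --") {
    my $site = clean_site_name($input);
    print defined $site ? "accepted: $site\n" : "rejected: $input\n";
}
```

Only a value that passes the check would then be lowercased, have its spaces replaced, and be placed into the query.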
Re: Need help with tricky regex creation
by S_Shrum (Pilgrim) on May 14, 2002 at 06:49 UTC

    Oops....part of the text dropped out:

    Section: Example

    I need a way to take "Bluff Park", lower case it, replace SPACE with UNDERSCORE and add a TILDE to the front of it...all in the regex (if possible).

    Sorry 'bout that...shoulda caught it in the preview.

    ======================
    Sean Shrum
    http://www.shrum.net