Glob filespec

bangers has asked for the wisdom of the Perl Monks concerning the following question:

I appreciate this is a little off topic, but a colleague and I have googled for most of this morning with no joy. This is my last port of call, so any help would be greatly appreciated.

I have a process running on Debian which uses glob to scan for incoming files of a certain type. Currently we use a file spec something like `*_{process,read}_*`. This works great on files with names like:
abc_read_today.dat

The trouble is that, for operational reasons, the format is going to change so that a file could now be called:
abc_leave_today+abc_read_today.dat

We only want to process a file if it has 'read' or 'process' before the first '+'. Does anyone have any ideas on this?

We have looked into doing a glob on the file spec, then splitting the file name on the '+' and doing a Perl regex on the file spec. This is the plan of last resort, as I am not 100% happy that we can reliably convert the file specs to Perl regexes.

As I said, sorry that this isn’t strictly Perl, but it is in relation to a Perl process.

Re: Glob filespec by blazar (Canon) on May 03, 2006 at 11:16 UTC
glob plainly emulates shell globbing, and does not work with regexen. You can either use File::Find (or its relatives File::Find::Rule and File::Finder) even if you do not need to recurse, or just opendir, readdir and grep on filenames yourself. Or else, now that I think of it, shouldn't `*_{process,read}[+_]*` work? Well, not exactly, because it would give false positives if the matched `"+"` were not the first one. Maybe it's enough for you, anyway...
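The opendir, readdir and grep route mentioned above could look roughly like the following. This is only a sketch: the helper names `is_wanted` and `wanted_files` are made up for illustration, and it assumes the marker always appears as `_process_` or `_read_`, as in the sample file names.

```perl
use strict;
use warnings;

# Hypothetical helper: true if '_process_' or '_read_' occurs
# before the first '+' in the file name.
sub is_wanted {
    my ($name) = @_;
    my ($head) = split /\+/, $name, 2;    # text before the first '+'
    return (defined($head) && $head =~ /_(?:process|read)_/) ? 1 : 0;
}

# Hypothetical helper: list the wanted plain files in one directory,
# without glob and without recursing.
sub wanted_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @files = sort grep { -f "$dir/$_" && is_wanted($_) } readdir $dh;
    closedir $dh;
    return @files;
}

# Example call (the directory is an assumption):
# print "$_\n" for wanted_files('.');
```

Splitting on the first '+' keeps the check independent of whatever follows it in the name.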
Re^2: Glob filespec by bangers (Pilgrim) on May 03, 2006 at 16:19 UTC
Thanks for your suggestions. Unfortunately File::Find etc. won’t work, as we don’t want to change several 1,000 file specs (sorry if some of the restrictions seem arbitrary, but there are good reasons for them). In the end we decided to use the file spec to pull back a superset of what we wanted. We then converted any '*' into '(.*?)' and did a regex. If $1 contains a '+' then we exclude the file, e.g. `my $spec = ‘*_{process,read}_*’; my $reg = $spec; $reg =~ s/\*/(.*?)/g; my @use; for my $file ( glob $spec ) { $file =~ m/$reg/; push @use, $file unless $1 =~ /\+/; }` Note: that’s a simplification of the code, which works; I haven’t tested or run the code above. It’s just for illustration here. I suppose in the end it was a PERL question after all.
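For readers following along, the spec-to-regex idea described above might be sketched like this. It is not the poster's production code: `glob_to_regex` and `filter_by_first_star` are hypothetical names, and the converter only understands `*`, `?` and non-nested `{a,b}` alternation, which is a small subset of what glob accepts.

```perl
use strict;
use warnings;

# Hypothetical converter for a small subset of glob syntax:
# '*', '?' and non-nested '{a,b}'. Everything else is literal.
sub glob_to_regex {
    my ($spec) = @_;
    my $re        = '';
    my $in_braces = 0;
    for my $ch (split //, $spec) {
        if    ($ch eq '*')               { $re .= '(.*?)' }
        elsif ($ch eq '?')               { $re .= '.' }
        elsif ($ch eq '{')               { $re .= '(?:'; $in_braces = 1 }
        elsif ($ch eq '}' && $in_braces) { $re .= ')';   $in_braces = 0 }
        elsif ($ch eq ',' && $in_braces) { $re .= '|' }
        else                             { $re .= quotemeta $ch }
    }
    return qr/\A$re\z/;
}

# Keep names that match the spec and whose first '*' capture
# contains no '+', i.e. the marker sits before the first '+'.
sub filter_by_first_star {
    my ($spec, @files) = @_;
    my $re = glob_to_regex($spec);
    return grep { $_ =~ $re && (!defined($1) || $1 !~ /\+/) } @files;
}

# In real use the list would come from glob, e.g.:
# my @use = filter_by_first_star($spec, glob $spec);
```

Because `{process,read}` is translated to `(?:process|read)` rather than copied verbatim, the braces keep their alternation meaning on the regex side.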
Re^3: Glob filespec by blazar (Canon) on May 04, 2006 at 10:18 UTC
"Thanks for your suggestions. Unfortunately File::Find etc. won’t work, as we don’t want to change several 1,000 file specs (sorry if some of the restrictions seem arbitrary, but there are good reasons for them)."

To be fair, I don't understand your concerns, since I don't have the slightest idea what you mean by "to change several 1,000 file specs". I suspect that you, in turn, misunderstood the suggestion about File::Find.

`my $spec = ‘*_{process,read}_*’;`

Please use real single quotes: what are you using as an editor?!?

`my @use; for my $file ( glob $spec ) { $file =~ m/$reg/; push @use, $file unless $1 =~ /\+/; }`

This won't work, since `{process,read}` does not do what you seem to think it does in a regex. You probably want

`my @use = grep /^[^+]*?_(?:process|read)_/, glob $spec;`

But then you should be aware that you're duplicating your efforts, performing two very similar pattern matches one after the other. Although I'm a big advocate of glob, and I often see people do unnecessary opendirs and readdirs, in this case I feel like suggesting that you follow that path...

"I suppose in the end it was a PERL question after all."

No, it was not a "PERL" question, since there's no such thing. Check `perldoc -q 'difference between "perl" and "Perl"'` and, while you're there, PERL as shibboleth and the Perl community.
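As a rough illustration of the grep-over-glob filter discussed in this reply, the snippet below applies an anchored pattern to a list of names. The function name `first_segment_only` is made up, and the hard-coded sample names stand in for what `glob $spec` would return.

```perl
use strict;
use warnings;

# Keep names where '_process_' or '_read_' appears before the first '+'.
# [^+]* cannot cross a '+', so the marker must lie in the first segment.
sub first_segment_only {
    return grep { /\A[^+]*_(?:process|read)_/ } @_;
}

# Sample names from the thread stand in for the result of glob($spec).
my @candidates = (
    'abc_read_today.dat',
    'abc_leave_today+abc_read_today.dat',
);
print "$_\n" for first_segment_only(@candidates);
```

Anchoring at the start of the name is what prevents a later `_read_` segment from producing a false positive.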