in reply to Re^2: Reliable glob?
in thread Reliable glob?

Hi Rob,

I took a look at the C source for bsd_glob and it does indeed truncate the input pattern in all cases, regardless of what options you use. That said, you're hitting an unusually short maximum buffer size. But that size is compiled into the C code and is not changeable at runtime. So as you suggest above you'll have to find some way to work around this if you're trying to support such platforms.

If you don't mind a CPAN dependency, there are several Perl-only glob implementations on CPAN you could explore. I tried out Text::Glob::Expand and it handled your input string without a problem. I even added a second, trailing brace expansion to make it longer, and it still worked:

use Text::Glob::Expand;

my $x = 'cff_updated/1_lib/{A3DWE.1.Solexa-142587.splice.fastq,A3DWE.1.Solexa-142588.splice.fastq,A3DWE.1.Solexa-142589.splice.fastq,A3DWE.1.Solexa-142590.splice.fastq,A3DWE.1.Solexa-142594.splice.fastq,A3DWE.1.Solexa-142595.splice.fastq,A3DWE.1.Solexa-142596.splice.fastq,A3DWE.1.Solexa-142597.splice.fastq,A3DWE.1.Solexa-142598.splice.fastq,A3DWE.1.Solexa-142599.splice.fastq,A3DWE.1.Solexa-142600.splice.fastq,A3DWE.1.Solexa-142602.splice.fastq,A3DWE.1.Solexa-142603.splice.fastq,A3DWE.1.Solexa-142605.splice.fastq,A3DWE.1.Solexa-142606.splice.fastq,A3DWE.1.Solexa-142607.splice.fastq,A3DWE.1.Solexa-142608.splice.fastq,A3DWE.1.Solexa-142609.splice.fastq,A3DWE.1.Solexa-142610.splice.fastq,A3DWE.1.Solexa-142611.splice.fastq,A3DWE.1.Solexa-142612.splice.fastq,A3DWE.1.Solexa-142613.splice.fastq,A3DWE.1.Solexa-142614.splice.fastq,A3DWE.1.Solexa-142615.splice.fastq,A3DWE.1.Solexa-142616.splice.fastq,A3DWE.1.Solexa-142617.splice.fastq,A3DWE.1.Solexa-142618.splice.fastq,A3DWE.1.Solexa-142619.splice.fastq,A3DWE.1.Solexa-142621.splice.fastq}{.drp,.fna,.lib}';

my @y = map { $_->text } @{ Text::Glob::Expand->parse($x)->explode };

print "Number of items: ", scalar @y, $/, join($/, @y);

In any case, hope you find a relatively painless way to deal with this. Cheers.

Re^4: Reliable glob?
by hepcat72 (Sexton) on Oct 28, 2014 at 15:02 UTC
    Cool. Thanks for the C-code lookup! Glad I'm not crazy. I have been trying to limit dependencies, but I could look at that module to see how they handle the '{}' expressions. My code should handle multiple occurrences, and assuming that there are no spaces and no nested expressions, I think it should theoretically work in every case. I'm not 100% on that though. Well, it doesn't handle escaped curlies, but I'm not even sure a filename could contain them... Whoops, yes it can. I just renamed a file to "tmpdelete{test}.txt". I dragged it to my terminal and it pasted it with escape characters. I guess I should make a minor edit to my code:

    #Keep updating an array to be the expansion of a file pattern to
    #separate files
    my @expanded = ($nospace_string);

    #If there exists an unescaped '{X,Y,...}' pattern in the string
    if($nospace_string =~ /(?<!\\)\{.+?(?<!\\)\}/)
    {
        #While the first element still has a '{X,Y,...}' pattern
        #(assuming everything else has the same pattern structure)
        while($expanded[0] =~ /(?<!\\)\{.+?(?<!\\)\}/)
        {
            #Accumulate replaced file patterns in @buffer
            my @buffer = ();
            foreach my $str (@expanded)
            {
                #If there's a '{X,Y,...}' pattern, split on ','
                if($str =~ /(?<!\\)\{(.+?)(?<!\\)\}/)
                {
                    my $substr     = $1;
                    my $before     = $`;
                    my $after      = $';
                    my @expansions = split(/,/,$substr);
                    push(@buffer,map {$before . $_ . $after} @expansions);
                }
                #Otherwise, push on the whole string
                else
                {push(@buffer,$str)}
            }

            #Reset @expanded with the newly expanded file strings so
            #that we can handle additional '{X,Y,...}' patterns
            @expanded = @buffer;
        }
    }

    #Pass the newly expanded file strings through
    return(wantarray ? @expanded : [@expanded]);


    Although, I just tested that nested expressions are possible too, so I would definitely like to check out that module.
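    For reference, nesting and escaped braces can be handled without any CPAN dependency by scanning character by character and tracking brace depth instead of relying on a regex. What follows is only a minimal sketch under a few assumptions: `expand_braces` is a hypothetical helper name (it is not how Text::Glob::Expand does it), a group with a single alternative such as `{X}` has its braces stripped just like the code above, backslash escapes are left in the output untouched, and a pattern whose first `{` is never closed is returned unchanged.

```perl
use strict;
use warnings;

# Expand '{X,Y,...}' groups, including nested ones, using core Perl only.
# Hypothetical helper for illustration. Call it in list context.
sub expand_braces {
    my ($pattern) = @_;

    my $depth = 0;    # current brace nesting level
    my $open;         # offset of the first unescaped '{'
    my @cuts;         # offsets of the commas belonging to that group
    my $len = length $pattern;

    for (my $i = 0; $i < $len; $i++) {
        my $c = substr($pattern, $i, 1);
        if ($c eq '\\') {
            $i++;     # skip whatever character is being escaped
        }
        elsif ($c eq '{') {
            $open = $i if $depth == 0;
            $depth++;
        }
        elsif ($c eq ',' && $depth == 1) {
            push @cuts, $i;
        }
        elsif ($c eq '}' && $depth > 0) {
            next if --$depth > 0;    # still inside a nested group

            # Closed the outermost group: split it at its own commas
            my $prefix = substr($pattern, 0, $open);
            my $suffix = substr($pattern, $i + 1);
            my @alts;
            my $start = $open + 1;
            for my $cut (@cuts, $i) {
                push @alts, substr($pattern, $start, $cut - $start);
                $start = $cut + 1;
            }

            # Recurse so nested and later groups get expanded as well
            return map { expand_braces($prefix . $_ . $suffix) } @alts;
        }
    }

    # No complete unescaped group left: nothing to expand
    return $pattern;
}

# Usage: one nested group followed by a second group
my @files = expand_braces('lib/{a,b{1,2}}{.drp,.fna}');
print "$_\n" for @files;
```

    Each recursive call removes exactly one pair of unescaped braces, so the recursion always terminates, and the results come out in the same left-to-right order a shell would produce.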