in reply to Fastest way to minimally check that file contains perl code?
For me it is absolutely enough to know that, say, file XXX is a perl source file with a probability of 80%. ... I tried file and ohlohcount utilities as well, but its assumptions is VERY inaccurate.
How inaccurate is file exactly? What test cases did it have trouble with? I would have thought that if you don't need accuracy, something like checking the first couple of lines against a regex that looks for the shebang line and/or some use statements might be enough for ~80%, but I think you'll need to be more specific in your question for more specific answers.

Only perl can parse Perl; I think solutions like PPI will likely be slower. You could try PPR:
use PPR;

my $perl_code = <<'END';
#!/usr/bin/env perl
use warnings;
use strict;
print "Hello, World!\n";
END

if ( $perl_code =~ m{ \A (?&PerlDocument) \Z $PPR::GRAMMAR }x ) {
    print "It looks like it could parse as Perl\n";
}
else {
    print "It doesn't look like Perl\n";
}
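For comparison, the cheaper heuristic mentioned above (checking the first few lines for a shebang or some use statements) might be sketched roughly like this. The function name looks_like_perl, the 20-line default, and the particular patterns are my own assumptions for illustration, not anything tested against your files, so expect false positives and negatives:

```perl
use strict;
use warnings;

# Hypothetical helper: guess whether a string is Perl source by looking
# only at its first few lines. The patterns are illustrative assumptions.
sub looks_like_perl {
    my ($text, $max_lines) = @_;
    $max_lines //= 20;    # assumed default, tune for your data

    # Split off at most $max_lines lines; the extra field holds the rest.
    my @lines = split /\n/, $text, $max_lines + 1;
    pop @lines if @lines > $max_lines;
    return 0 unless @lines;

    # A shebang mentioning perl on the very first line.
    return 1 if $lines[0] =~ /^#!.*\bperl/;

    # Otherwise, a "use strict" or "use warnings" near the top.
    for my $line (@lines) {
        return 1 if $line =~ /^\s*use\s+(?:strict|warnings)\b/;
    }
    return 0;
}

my $sample = "#!/usr/bin/env perl\nuse strict;\nprint qq{hi\\n};\n";
print looks_like_perl($sample) ? "probably Perl\n" : "probably not Perl\n";
```

This only reads a handful of lines, so it should be much faster than a full PPR match, at the cost of missing scripts and modules that have neither a shebang nor those pragmas near the top.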