PerlMonks  

Re^3: Efficient regex search on array table

by kcott (Archbishop)
on Dec 16, 2022 at 13:58 UTC ( [id://11148923] )


in reply to Re^2: Efficient regex search on array table
in thread Efficient regex search on array table

Thank you for your kind words. By the way, instead of "failings" (negative), think "opportunities for improvement" (positive).

Take a look at "perlperf - Perl Performance and Optimization Techniques". There's a lot of information on benchmarking and profiling tools. Use these to determine what's fast and what's slow, where bottlenecks occur, and so on. This is a much better approach than going on gut-feeling, anecdotal evidence, and the like.
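
As a rough illustration of that kind of measurement, here is a minimal sketch using the core Benchmark module. The test data and the two subroutine names are made up for the example; they are not taken from your code, so substitute your own data and candidate routines.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 1,000 short strings, one in ten containing the keyword.
my @rows = map { $_ % 10 ? "row $_: nothing here" : "row $_: by the fence" } 1 .. 1000;

# Candidate 1: plain substring search with index().
sub count_with_index {
    my $n = 0;
    for my $row (@rows) { $n++ if index($row, 'fence') > -1 }
    return $n;
}

# Candidate 2: the same search written as a regex match.
sub count_with_regex {
    my $n = 0;
    for my $row (@rows) { $n++ if $row =~ /fence/ }
    return $n;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    index => \&count_with_index,
    regex => \&count_with_regex,
});
```

The numbers will vary with your perl version, hardware, and data, which is exactly why it is worth running this against something close to your real input rather than trusting a general rule.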

My $work often involves dealing with biological data (which tends to be measured in GB rather than MB). Functions which return large datasets are a red flag to me; references to such data are nearly always a better choice.
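
To show the difference, here is a small sketch. Both loader functions and the record format are hypothetical, invented for this example. Returning the array itself copies every element to the caller; returning a reference hands back a single scalar that points at the same data.

```perl
use strict;
use warnings;

# Hypothetical loader that returns the whole list.
sub load_records_as_list {
    my @records = map { "record $_" } 1 .. 100_000;
    return @records;     # every element is copied to the caller
}

# Hypothetical loader that returns a reference to the array.
sub load_records_as_ref {
    my @records = map { "record $_" } 1 .. 100_000;
    return \@records;    # only one reference is returned; no element copying
}

my @copy = load_records_as_list();
my $ref  = load_records_as_ref();

print scalar(@copy), "\n";    # element count via the copied array
print scalar(@$ref), "\n";    # element count via the reference
print $ref->[0], "\n";        # access a single element through the reference
```

With 100,000 short strings the cost is small; with data measured in GB, the extra copy can matter in both time and memory.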

I had thought that queries like "(hollow log)|(fence)" would result in regexes like "/(?:hollow log|fence)/". There was no indication that anything more complex was involved. Your new information indicates that's not the case. For your plain keyword searches, I'd still recommend index(); when you need anchors (^, \b, etc.) and the like, regexes are probably the correct approach.
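
Here is a sketch of the two approaches side by side. The sample rows, the keyword list, and the helper matches_any_keyword() are assumptions made up for illustration; I haven't seen your actual table or query parser.

```perl
use strict;
use warnings;

# Hypothetical sample rows; the real table is not shown in the thread.
my @rows = (
    'The fox hid in a hollow log',
    'A bird sat on the fence',
    'Nothing relevant here',
    'The yard was fenced in',
);

my @keywords = ('hollow log', 'fence');

# Plain keywords: index() returns -1 when the substring is absent.
sub matches_any_keyword {
    my ($text, $keywords) = @_;
    for my $kw (@$keywords) {
        return 1 if index($text, $kw) > -1;
    }
    return 0;
}

my @by_index = grep { matches_any_keyword($_, \@keywords) } @rows;

# The same search as one precompiled alternation.
# quotemeta() escapes any regex metacharacters in the keywords.
my $alt = join '|', map { quotemeta($_) } @keywords;
my $re  = qr/(?:$alt)/;
my @by_regex = grep { $_ =~ $re } @rows;

# Where a regex earns its keep: \b restricts the match to the whole word,
# so "fenced" is not matched.
my @whole_word = grep { /\bfence\b/ } @rows;

print "index:      $_\n" for @by_index;
print "regex:      $_\n" for @by_regex;
print "whole word: $_\n" for @whole_word;
```

The first two searches select the same rows (including "fenced", which contains the substring "fence"); only the \b version excludes it. Which of the first two is faster on your data is something to benchmark rather than assume.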

I recommend you change "Use PERL regex" to "Use Perl regex": Perl is the language; perl is the program; PERL is not a thing. :-)

Good luck with your continued optimisation efforts; and, of course, do ask if you need further help.

— Ken