fieroboom has asked for the wisdom of the Perl Monks concerning the following question:
Hello wonderful monks, I've gleaned LOADS of great information here on my journey of Perl, but I have a question of my own now... I have a script that (among many other things) reads in a file containing a list of words, then builds a regex string from those words. Here is the code I have now, which works fine, but I wonder if there's a more direct way to do this, rather than filehandle -> array -> join & map array to scalar...
    my $blacklist_file = 'BlacklistWords.txt';
    open(BLIST, "<$blacklist_file") or die("Can't open $blacklist_file for reading!!\n\n");
    my @blacklist_words = <BLIST>;
    close(BLIST);
    chomp(@blacklist_words);
    my $blacklist_regex = join "|" => map { "(?:$_)" } @blacklist_words; # Create a regex from blacklisted words
    print "blacklist regex:\n$blacklist_regex\n\n";
    exit;
Here is an example of the regex string I'm after:
    blacklist regex:
    (?:LOL)|(?:XviD-RUBY)|(?:WEB-DL)|(?:H264)|(?:BluRay)|(?:x264)|(?:YIFY)|(?:DVDRip)|(?:MP3)|(?:ENG)|(?:DvDripaXXo)|(?:BRRiP)|(?:XviD)|(?:AbSurdiTy)|(?:WEBRip)|(?:XviDETRG)|(?:XviD-ILLUMINATI)|(?:XviDExtraTorrentRG)|(?:AC3-3LT0N)|(?:XViD-PLAYNOW)|(?:XVIDSSB)|(?:XViD-SSB)|(?:BDRip)|(?:XviD-3LT0N)|(?:KillerRG)|(?:XviD-AMIABLE)|(?:x264-AVS720)|(?:XviD-NEUTRINO)|(?:3Li)|(?:DTS)|(?:x2643Li)|(?:GAZ)|(?:XviD-AWESOMENESS)|(?:XviDSCREAM)|(?:UnKnOwN)|(?:DVDRip_XviD)|(?:AZnTX)|(?:HDTV)|(?:x264LOL)|(?:ettv)|(?:R5)|(?:x264-LOL)|(?:PROPER)|(?:x264-2HD)|(?:XviD-AFG)|(?:x264-mSD)|(?:P2PDL)|(?:x264-DHD)|(?:PublicHD)|(?:x264-MiNDTHEGAP)|(?:hdtv-lol)|(?:xvid-xor)|(?:psychodrama)|(?:hdtv_xvid-fov)|(?:repack-lol)|(?:rerip)|(?:xvid-ctu)|(?:Lo-Fi)|(?:X264-DIMENSION)|(?:_evid)|(?:TorrentDay)|(?:XviD-MOMENTUM)
Basically just a list of non-capturing groups. Of course, I suppose I could make it a single non-capturing group for a little more efficiency, but that's another subject... Anyway, the question is, am I doing this the most PERLitically correct way, or is there a better way to go from <BLIST> to $blacklist_regex? Thanks so much!
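For illustration, here is a minimal sketch of one way to go from a filehandle to a compiled regex in a single helper, using the single non-capturing group mentioned above. It is not taken from any reply in this thread. The helper name regex_from_fh, the sample words, and the sample title are hypothetical, and escaping each word with quotemeta and sorting longest-first are assumptions about what the blacklist is meant to do (match the words literally, and prefer 'XviD-RUBY' over 'XviD').

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): build one compiled
# regex directly from an already-open filehandle.
sub regex_from_fh {
    my ($fh) = @_;

    # Read every line, strip the newlines, and drop empty lines.
    chomp(my @words = <$fh>);
    @words = grep { length } @words;

    # With no words, return a pattern that can never match, because an
    # empty alternation would match everywhere.
    return qr/(?!)/ unless @words;

    # Assumption: longer words should win, so try 'XviD-RUBY' before 'XviD'.
    @words = sort { length($b) <=> length($a) || $a cmp $b } @words;

    # quotemeta escapes regex metacharacters so each word matches literally;
    # a single (?:...) wraps the whole alternation.
    my $alternation = join '|', map { quotemeta($_) } @words;
    return qr/(?:$alternation)/;
}

# Demo on an in-memory filehandle so the sketch runs on its own. A real
# script would open 'BlacklistWords.txt' with a three-argument open instead.
my $sample = "LOL\nXviD\nXviD-RUBY\nx264\n";
open(my $fh, '<', \$sample) or die "Can't open in-memory file: $!";
my $blacklist_regex = regex_from_fh($fh);
close($fh);

# Strip blacklisted words from a made-up title.
(my $title = 'Some.Show.S01E01.XviD-RUBY') =~ s/$blacklist_regex//g;
print "$title\n";
```

The array is still there, but it stays inside the helper, and qr// hands back a compiled pattern that can be interpolated into other matches. If the words in the file are meant to be regex fragments rather than literal strings, the quotemeta call would have to go.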
Re: Best approach to creating a regex from a filehandle
by choroba (Cardinal) on May 18, 2014 at 19:15 UTC

Re: Best approach to creating a regex from a filehandle
by NetWallah (Canon) on May 18, 2014 at 21:06 UTC

Re: Best approach to creating a regex from a filehandle
by toolic (Bishop) on May 18, 2014 at 19:15 UTC
by smls (Friar) on May 18, 2014 at 20:26 UTC
by toolic (Bishop) on May 18, 2014 at 22:15 UTC

Re: Best approach to creating a regex from a filehandle
by fieroboom (Novice) on May 21, 2014 at 12:12 UTC