Regex stored in a scalar

OtakuGenX has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.

Re: Regex stored in a scalar
by Laurent_R (Canon) on Aug 21, 2015 at 19:26 UTC

A regex should be between / / marks, so something like this:
$line =~ /$regex/;

s/\),\(/\)\n\(/

So if you want to make substitutions you probably want to capture two inputs from the user or from the command line, the searched pattern and the substitution (not tested).

my $regex = <STDIN>;
chomp $regex;
my $subst = <STDIN>;
chomp $subst;
while (<INPUTFILE>) {
    s/$regex/$subst/gi;   # works on $_; $subst is inserted as typed, so a typed \n stays two characters
    print;
}

Update: Oh, BTW, if you're used to doing it in vi, you might consider sed. You should feel at home.

Update 2: I crossed out the first part of my answer, as it was at least very incomplete, as kindly pointed out by AnomalousMonk. Only the second part was really relevant to the OP problem.


Re^2: Regex stored in a scalar

by kroach (Pilgrim) on Aug 22, 2015 at 10:49 UTC

$regex = qr/$regex/

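A minimal sketch of the qr// idea, assuming the pattern arrives as a plain string from the user. The helper name compile_pattern is made up for illustration; wrapping qr// in a block eval only serves to turn a malformed pattern into a readable error instead of a crash in the middle of the loop.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a user-supplied pattern once, up front.
sub compile_pattern {
    my ($source) = @_;
    my $re = eval { qr/$source/ };          # dies inside eval if the pattern is malformed
    die "Invalid pattern '$source': $@" unless defined $re;
    return $re;
}

my $re = compile_pattern('\),\(');          # same text a user would type
for my $line ('(abc),(def)', 'no match here') {
    print "matched: $line\n" if $line =~ $re;
}
```

The compiled object can be used on its own ($line =~ $re) or interpolated into a larger pattern or an s///.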

Re^3: Regex stored in a scalar

by Laurent_R (Canon) on Aug 22, 2015 at 11:35 UTC

Yes, you're probably right, ++, this was just a quick additional note for speed, not much to do with the OP question.


Re^2: Regex stored in a scalar

by AnomalousMonk (Archbishop) on Aug 22, 2015 at 14:34 UTC

A regex should be between / / marks ...

The =~ operator is sufficiently DWIMic that it will take any string as a regex. It will even take all the qr// regex modifiers if they are embedded as (?adlupimsx-imsx) extended patterns.

c:\@Work\Perl>perl -wMstrict -le
"my $regex = '(?xms) ((.) \2{2,})';
 ;;
 for my $s (qw(aeiou aeeiou aeiiiou aeioooou)) {
   print qq{match: captured '$1'} if $s =~ $regex;
   }
"
match: captured 'iii'
match: captured 'oooo'

... but that won't work ...

The first comment I made is actually rather trivial in the face of your second point; the substitution s/\),\(/\)\n\(/ is, indeed, a substitution and not a regex, and bang, the whole endeavor hits a brick wall.

Give a man a fish: <%-{-{-{-<


Re^3: Regex stored in a scalar

by Laurent_R (Canon) on Aug 22, 2015 at 16:17 UTC

AnomalousMonk

if (m{pattern}) { # ...

You're absolutely right, the first part of my comment stood for correction.


Re^2: Regex stored in a scalar

by girarde (Hermit) on Aug 22, 2015 at 14:33 UTC

sed


Re^3: Regex stored in a scalar

by Laurent_R (Canon) on Aug 22, 2015 at 16:32 UTC

sed

awk

I think that, in general, tests are required to decide the best way to go (if it matters at all, e.g. if your files to be processed are really so large that it will make a significant difference for you).


Re^2: Regex stored in a scalar

by itsscott (Sexton) on Aug 25, 2015 at 02:53 UTC

I do like your sed suggestion. I have a snippet I've used for 20 years to process either one or many files, and it's fast and low load. I've changed tens of thousands of files on a server in mere moments (after extensive testing, of course, to save restoring!).

This command will find and replace the string 'old' with 'new' in all files with the htm/html extension, recursively from where you run the command. Be careful, there's no undo! Use your regex as usual. Hopefully someone will find this snippet useful; I sure have, thousands of times!

find . -name '*.htm*' -type f | xargs sed -i 's/old/new/g'


Re^2: Regex stored in a scalar

by poj (Abbot) on Aug 22, 2015 at 16:35 UTC

The OP has )\n( as the substitution. I can't see how to enter that with my $subst = <STDIN>;
poj


Re^3: Regex stored in a scalar

by Laurent_R (Canon) on Aug 22, 2015 at 17:02 UTC

\n

$ perl -de 42 foo\\n

Loading DB routines from perl5db.pl version 1.33
Editor support available.

Enter h or `h h' for help, or `man perldebug' for more help.

main::(-e:1):   42
  DB<1> $c = shift;

  DB<2> x $c
0  'foo\\n'
  DB<3> $c =~ s|\\n|\n|;

  DB<4> x $c
0  'foo
'
  DB<5>

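The same conversion can be sketched outside the debugger. This assumes the replacement text reaches the script exactly as typed (a backslash followed by n, not a newline); the helper name unescape and its small escape table are invented for illustration and only cover \n and \t.

```perl
use strict;
use warnings;

# Hypothetical helper: turn backslash escapes typed by a user into real characters.
my %escape = (n => "\n", t => "\t");

sub unescape {
    my ($text) = @_;
    # \n and \t become control characters; any other \X becomes plain X.
    $text =~ s/\\(.)/exists $escape{$1} ? $escape{$1} : $1/ge;
    return $text;
}

my $subst = unescape('\)\n\(');             # what <STDIN> would deliver after chomp
(my $line = '(abc),(def),(ghi)') =~ s/\),\(/$subst/g;
print "$line\n";
```

With that in place the replacement can be read from STDIN like the pattern, and each "),(" is split across two lines.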

Re^4: Regex stored in a scalar

by AnomalousMonk (Archbishop) on Aug 22, 2015 at 18:15 UTC

Re^5: Regex stored in a scalar

by Laurent_R (Canon) on Aug 22, 2015 at 18:53 UTC

Re: Regex stored in a scalar
by Anonymous Monk on Aug 21, 2015 at 19:36 UTC

If the only thing you're doing inside the loop is one or two regex substitutions, and the way you describe it, the scripts sound like they're throwaways, you may want to look at the -e, -p and maybe also -i switches in perlrun, i.e. write one-liners:

$ cat foo.txt 
one
two
three
$ perl -wMstrict -pe 's/^t(?!h)/th/; s/(.)\1/$1/g' foo.txt > bar.txt
$ cat bar.txt
one
thwo
thre
$ perl -wMstrict -pe 's/th/ph/g' -i.bak bar.txt
$ cat bar.txt
one
phwo
phre
$ cat bar.txt.bak 
one
thwo
thre

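For anyone wondering what -p actually does: perlrun describes it as wrapping the -e code in a loop that reads each line into $_, runs the code, and prints $_. A rough sketch of that loop as a plain script follows; the helper name filter_lines and the in-memory input are assumptions made so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a code block to each line of a handle, roughly as -p would.
sub filter_lines {
    my ($in, $code) = @_;
    my @out;
    while (defined(my $line = <$in>)) {
        local $_ = $line;   # -p exposes the current line as $_
        $code->();          # the body given to -e
        push @out, $_;      # -p prints $_ after each iteration
    }
    return @out;
}

my $data = "one\nthwo\nthre\n";
open my $in, '<', \$data or die "Cannot open in-memory file: $!";
print filter_lines($in, sub { s/th/ph/g });
```

The -i switch adds the "write the result back to the file, optionally keeping a backup" part, which this sketch leaves out.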

Re: Regex stored in a scalar
by BillKSmith (Monsignor) on Aug 22, 2015 at 03:43 UTC

use strict;
use warnings;
my $string = '(abc),(def),(ghi)';
my $substitution = 's/\),\(/\)\n\(/gi'; 
$_ = $string;
eval "$substitution";
print;

OUTPUT:

(abc)
(def)
(ghi)

Bill


Re^2: Regex stored in a scalar

by BillKSmith (Monsignor) on Aug 22, 2015 at 18:53 UTC

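use strict;
use warnings;
my $filein = shift or die "usage: $0 FILE\n";   # input file name (was never set in the original)
my $regex = <STDIN>;                  # Entering s/\),\(/\)\n\(/gi
chomp $regex;
open(my $in, '<', $filein) or die "Cannot open $filein: $!";
while (my $line = <$in>) {
    #$line =~ $regex;                 # would treat the whole string as a pattern
    eval "\$line =~ $regex";          # string eval runs whatever was typed: trusted input only
    print $line;
}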

Bill

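If string eval feels too risky, the typed command can be taken apart by hand instead. The following is only a sketch under stated assumptions: the command has the form s/PATTERN/REPLACEMENT/FLAGS with / as the delimiter, only the g and i flags are supported, and the replacement is literal text apart from backslash escapes (no $1). The names parse_subst and apply_subst are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: split "s/pat/rep/flags" into its three parts.
sub parse_subst {
    my ($cmd) = @_;
    my ($pat, $rep, $flags) =
        $cmd =~ m{\A s / ((?:[^/\\]|\\.)*) / ((?:[^/\\]|\\.)*) / ([gi]*) \z}x
        or return;
    return ($pat, $rep, $flags);
}

# Hypothetical helper: apply the parsed command to one line without eval.
sub apply_subst {
    my ($line, $cmd) = @_;
    my ($pat, $rep, $flags) = parse_subst($cmd)
        or die "Not an s/// command: $cmd\n";
    $rep =~ s/\\(.)/$1 eq 'n' ? "\n" : $1/ge;       # \n -> newline, \) -> )
    my $re = $flags =~ /i/ ? qr/$pat/i : qr/$pat/;
    if ($flags =~ /g/) { $line =~ s/$re/$rep/g } else { $line =~ s/$re/$rep/ }
    return $line;
}

print apply_subst('(abc),(def),(ghi)', 's/\),\(/\)\n\(/gi'), "\n";
```

Anything that does not look like an s/// command is rejected rather than executed, which is the main point of doing it this way.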

Re: Regex stored in a scalar
by atcroft (Abbot) on Aug 22, 2015 at 06:15 UTC

I wanted to do something similar to this recently, but with the left and right-hand patterns stored in a database. The problem I ran into, however, was if I tried to use capture variables, such as the following (contrived) example:

Read more... (1240 Bytes)

Any suggestions?


Re^2: Regex stored in a scalar

by Athanasius (Cardinal) on Aug 22, 2015 at 07:41 UTC

Hello atcroft,

The only way I can find to do this is to pull the substitution apart into its component steps and perform these separately:

#! perl
use strict;
use warnings;

my $c = q{asdfghjk};

my @regex =
(
    { lh => q{(gh)}, rh => q{__$1__}, }, 
    { lh => q{(h_)}, rh => q{_h!$1!}, }, 
);

print q{Original: }, $c, "\n";

for my $i (0 .. $#regex)
{
    if ($c =~ /$regex[$i]{lh}/)
    {
        my $s =  $1;
        my $d =  $regex[$i]{rh};
           $d =~ s/\$1/$s/;
           $c =~ s/$regex[$i]{lh}/$d/;
    }
}

print q{Final: }, $c, "\n";

Output:

17:37 >perl 1352_SoPW.pl
Original: asdfghjk
Final: asdf__g_h!h_!_jk

17:39 >

This is far from elegant, and I keep thinking there must be a simpler way involving s///ee — but I haven’t found it.

Anyway, hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re^3: Regex stored in a scalar

by AnomalousMonk (Archbishop) on Aug 22, 2015 at 13:55 UTC

One way to one step per regex:

c:\@Work\Perl>perl -wMstrict -le
"my $c = q{asdfghjk};
 print qq{    original: '$c'};
 ;;
 my @regex = (
   { lh => q{(gh)}, rh => q{__$1__}, },
   { lh => q{(h_)}, rh => q{_h!$1!}, },
   );
 ;;
 for my $hr_s (@regex) {
   $c =~ s[ (?-x)$hr_s->{lh}]{ qq{qq{$hr_s->{rh}}} }xmsgee;
   print qq{intermediate: '$c'};
   }
 ;;
 print qq{       final: '$c'};
"
    original: 'asdfghjk'
intermediate: 'asdf__gh__jk'
intermediate: 'asdf__g_h!h_!_jk'
       final: 'asdf__g_h!h_!_jk'

s///e

s///ee

eval

AnonyMonk

here

Re: Evaluating $1 construct in literal replacement expression

Give a man a fish: <%-{-{-{-<


Re^2: Regex stored in a scalar ( s///eeval )

by Anonymous Monk on Aug 22, 2015 at 09:22 UTC

$_ = 'foo';
$left  = '(.)(.)';
$right = '$1$2$2$1';
s{$left}{"qq{$right}"}ee;
print "$_\n";
s{$left}{eval "qq{$right}"}e;
print "$_\n";
__END__
foofo
foofofo

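The first /e evaluates the double-quoted string "qq{$right}", producing the string qq{$1$2$2$1}.

The second /e evaluates that qq{$1$2$2$1} after the match, when $1 and $2 are set, and the result is substituted into the original string.

A string eval is still eval, so arbitrary code could be executed.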

So, to make it safer, instead of eval ... use some form of String::Interpolate/String::Interpolate::RE
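
For illustration only, here is a hand-rolled sketch of the eval-free idea; it is not how String::Interpolate works internally. It assumes the replacement template uses only $1, $2, ... placeholders, and the function name replace_with_template is made up. Captures are read from the @- and @+ offset arrays, so no user-supplied text is ever executed.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every match of $pattern in $text, expanding
# $1, $2, ... in $template from the captures of each match.
sub replace_with_template {
    my ($text, $pattern, $template) = @_;
    my ($out, $last) = ('', 0);
    while ($text =~ /$pattern/g) {
        my ($start, $end) = ($-[0], $+[0]);     # save offsets before the inner s///
        my @cap = ('');                         # index 0 is a placeholder
        for my $n (1 .. $#+) {
            push @cap, defined $-[$n] ? substr($text, $-[$n], $+[$n] - $-[$n]) : '';
        }
        (my $rep = $template) =~ s/\$(\d+)/defined $cap[$1] ? $cap[$1] : ''/ge;
        $out .= substr($text, $last, $start - $last) . $rep;
        $last = $end;
    }
    return $out . substr($text, $last);
}

print replace_with_template('foo', '(.)(.)', '$1$2$2$1'), "\n";
print replace_with_template('asdfghjk', '(gh)', '__$1__'), "\n";
```

Escapes such as \n in the template are left untouched here; they would need their own handling.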


Re^3: Regex stored in a scalar ( s///eeval )

by Athanasius (Cardinal) on Aug 23, 2015 at 03:52 UTC

Thanks Anonymous Monk and AnomalousMonk,

So, the technique is to doubly double-stringify the RHS before doubly evaluating it! Analogous to the trick of using @{ [...] } to interpolate a function-returned list into a string.
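
For readers who have not met the @{ [...] } trick, a tiny sketch (the function parts is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical function returning a list.
sub parts { return ('gh', 'jk') }

# [ parts() ] builds an anonymous array and @{ ... } dereferences it inside
# the string, so its elements are joined with $" (a single space by default).
my $text = "Found: @{[ parts() ]}";
print "$text\n";
```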

I like String::Interpolate (the module, not its documentation!):

#! perl
use strict;
use warnings;
use String::Interpolate qw( interpolate );

my $c = q{asdfghjk};

my @regex =
(
    { lh => q{(gh)}, rh => q{__$1__}, }, 
    { lh => q{(h_)}, rh => q{_h!$1!}, }, 
);

print q{Original: }, $c, "\n";

for my $i (0 .. $#regex)
{
    $c =~ s/ $regex[$i]{lh} / interpolate($regex[$i]{rh}) /ex;
}

print q{Final: }, $c, "\n";

Output:

13:37 >perl 1352_SoPW.pl
Original: asdfghjk
Final: asdf__g_h!h_!_jk

13:37 >

Cheers,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re: Regex stored in a scalar
by 1nickt (Canon) on Aug 21, 2015 at 18:57 UTC

What is in $regex?

try

$line =~ /$regex/;

The way forward always starts with a minimal test.


Re: Regex stored in a scalar
by james28909 (Deacon) on Aug 21, 2015 at 19:43 UTC

use strict;
use warnings;

print "Enter left side for search: ";
my $LeftSide = <STDIN>;
print "Enter right side replacement: ";
my $RightSide = <STDIN>;

chomp($LeftSide);
chomp($RightSide);

while(<DATA>){    
    print if s/$LeftSide/$RightSide/g;
    #print "Replaced \"$LeftSide\" with \"$RightSide\" at: line $.: $_" if ($_ =~ s/$LeftSide/$RightSide/g);
}

__DATA__
hello
this. is line 2
line, 3 ),(
this is test for line 4 )
testing line ,5
now testing line (6)

script.pl \),\n\(

C:\Users\James\Desktop>test.pl \),\( \)\n\(
Replaced "\),\(" with "\)\n\(" at: line 3 "line, 3 \)\n\("

Posting some example input would be helpful :)


Re^2: Regex stored in a scalar

by Laurent_R (Canon) on Aug 21, 2015 at 20:11 UTC

Hum, I was just going to say something to the effect that your script was not very useful, but as I hit the reply button, I saw your edited version. This is indeed much more to the point.


Re^3: Regex stored in a scalar

by james28909 (Deacon) on Aug 21, 2015 at 20:15 UTC

I tested it, then noticed that he was indeed trying to search and replace. Did a ninja edit; I was also about to change the ARGV's to <STDIN>'s and chomp them as well.


Re^2: Regex stored in a scalar

by poj (Abbot) on Aug 21, 2015 at 20:20 UTC

__DATA__
(123),(456),(789)
(abc),(def),(ghi)

(123)
(456)
(789)
(abc)
(def)
(ghi)


Re^3: Regex stored in a scalar

by james28909 (Deacon) on Aug 21, 2015 at 21:21 UTC

print $file $_ if s/$LeftSide/$RightSide/eegi;

~~And then left side is just a comma ',' without the quotes and right side is "\n" WITH the quotes~~

EDIT: Using the above little snippet, use left side as \),\( and right side as "\)\n\(". And from what I understand, the above snippet is the same as doing:

print $_ if s/$LeftSide/eval $RightSide/egi;

So I also suggest reading PerlDoc: eval.

Here's another, more hackish way to get the job done, haha:

Create your main script like so, with keywords in the s///

# main.pl

# all this is just a template that creates "run.pl"
use strict;
use warnings;

while(<DATA>){    
    print $_ if s/search_here/replace_here/gi;
}

__DATA__
(123),(456),(789)
(abc),(def),(ghi)

Then this following script will search and replace the keywords "search_here" and "replace_here" in the script above, with whatever you input and put it in "run.pl"!

# prepare_run.pl

open my $file, '<', 'main.pl' or die "Cannot open main.pl: $!";  # template script whose keywords we replace
open my $run, '>', 'run.pl' or die "Cannot create run.pl: $!";   # newly created script that we execute below

print "Enter left side of s///: ";
chomp(my $LeftSide = <STDIN>);

print "Enter right side of s///: ";
chomp(my $RightSide = <STDIN>);

while (my $line = <$file>) {
    # Fill in both placeholders; lines without them are copied unchanged.
    $line =~ s/search_here/$LeftSide/;
    $line =~ s/replace_here/$RightSide/;
    print $run $line;
}

close($file);
close($run);

system("run.pl"); # or whatever the equivalent is on your OS.

Here is the script that the above will create and run:

# run.pl, will be created after running "prepare_run.pl" while using "main.pl" as a template.

use strict;
use warnings;

while(<DATA>){    
    print $_ if s/\),\(/\)\n\(/gi;
}

__DATA__
(123),(456),(789)
(abc),(def),(ghi)

Download all the above and then just run "prepare_run.pl", and it will copy lines from main.pl while replacing keywords with your regex from STDIN and put it all in run.pl for execution. You can use \),\( and \)\n\( for STDIN per normal without using any quotes.

Here is the output:

(123)
(456)
(789)
(abc)
(def)
(ghi)


Re^4: Regex stored in a scalar

by poj (Abbot) on Aug 22, 2015 at 19:22 UTC

Re^5: Regex stored in a scalar

by james28909 (Deacon) on Aug 22, 2015 at 20:03 UTC

Re: Regex stored in a scalar
by anonymized user 468275 (Curate) on Aug 24, 2015 at 14:08 UTC

getopt

One world, one people


Re: Regex stored in a scalar
by OtakuGenX (Initiate) on Aug 25, 2015 at 17:08 UTC

OK, I ended up going with s/$search/$replace/g. I guess I had hoped to do it the other way simply for ease. Thank you ALL, your comments are AWESOME and help a TON!!!
