thanos1983 has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow Monks,

I have a problem that I am trying to solve. I am not very good with regexes, and I am facing a case that I can solve only with a regex.

I have a string, e.g. 'Thanos1983+|Thanos1983+', that I want to split into three pieces: group 1 'Thanos1983+', group 2 '|', and group 3 'Thanos1983+'.

This can be easily done with the following code:

#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

# Test string matches the example above and the output shown below.
my $test = 'Thanos1983+|Thanos1983+';

say "I found:\t\$1: '$1'\t\$2: '$2'\t\$3: '$3'"
    if $test =~ /(^[\w+\+]+)(\|)([\w+\+]+$)/;

__END__

$ perl test.pl
I found: $1: 'Thanos1983+' $2: '|' $3: 'Thanos1983+'

The problem arises when the string ends right after the pipe, e.g. 'Thanos1983+|'. How can I still capture the last group, as an empty string? What I would like to see is:

$ perl test.pl
I found: $1: 'Thanos1983+' $2: '|' $3: ''

I tried the regex /(^[\w+\ +]+)(\|)([\w+\+ ]+$)/, but it only works if the string contains white space after the pipe, e.g. 'Thanos1983+| ', because the final + quantifier still requires at least one character in the third group.

Unfortunately I need to solve this using only regex, and the reason is that this regex is called from another script that I am not able to change and will only accept a regex.

Any suggestions are greatly appreciated.

Thanks in advance for everyone's time and effort, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

Re: Split string in groups with non white space using regex
by tybalt89 (Monsignor) on Jan 04, 2018 at 11:46 UTC
    #!/usr/bin/perl -l

    # http://perlmonks.org/?node_id=1206677

    use strict;
    use warnings;

    for ( 'Thanos1983+|Thanos1983', 'Thanos1983+|' ) {
        print;
        /([^|]*)(\|)([^|]*)/
            and print "I found:\t\$1: '$1'\t\$2: '$2'\t\$3: '$3'";
    }

    Outputs:

    Thanos1983+|Thanos1983
    I found: $1: 'Thanos1983+' $2: '|' $3: 'Thanos1983'
    Thanos1983+|
    I found: $1: 'Thanos1983+' $2: '|' $3: ''

      Hello tybalt89,

      Works perfectly, thanks for your time and effort.

      BR / Thanos

        Note that the pattern '([^|]*)(\|)([^|]*)' also successfully matches '|', '||' and '|||':

        c:\@Work\Perl\monks>perl -wMstrict -le
        "use Data::Dump qw(pp);
         ;;
         my $rx_string = '([^|]*)(\|)([^|]*)' ;
         ;;
         for my $s (
           'Thanos1983+|', 'Thanos1983+| ', 'Thanos1983+| ', 'Thanos1983+|Thanos1983+',
           '|Thanos1983+', ' |Thanos1983+', ' |Thanos1983+',
           '+++|+++',
           '|', '||', '|||',
           ) {
           my $parsed = my @captured = $s =~ $rx_string;
           if ($parsed) {
             print qq{'$s' -> }, pp \@captured;
             }
           else {
             print qq{failed to parse '$s'};
             }
           }
        "
        'Thanos1983+|' -> ["Thanos1983+", "|", ""]
        'Thanos1983+| ' -> ["Thanos1983+", "|", " "]
        'Thanos1983+| ' -> ["Thanos1983+", "|", " "]
        'Thanos1983+|Thanos1983+' -> ["Thanos1983+", "|", "Thanos1983+"]
        '|Thanos1983+' -> ["", "|", "Thanos1983+"]
        ' |Thanos1983+' -> [" ", "|", "Thanos1983+"]
        ' |Thanos1983+' -> [" ", "|", "Thanos1983+"]
        '+++|+++' -> ["+++", "|", "+++"]
        '|' -> ["", "|", ""]
        '||' -> ["", "|", ""]
        '|||' -> ["", "|", ""]
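
        If strings with more than one pipe should be rejected rather than partially matched, one option is to anchor the pattern at both ends. This is only a minimal sketch, assuming the calling script applies the regex unmodified; the helper name parse_pair is hypothetical and exists only for illustration:

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: returns the three captures in list context,
        # or an empty list when the string does not contain exactly one pipe.
        sub parse_pair {
            my ($s) = @_;
            return $s =~ /^([^|]*)(\|)([^|]*)$/;
        }

        for my $s ('Thanos1983+|', '|', '||', '|||') {
            my @captured = parse_pair($s);
            my $msg = @captured
                ? "'$s' -> [" . join(', ', map { "'$_'" } @captured) . "]"
                : "failed to parse '$s'";
            print "$msg\n";
        }
        ```

        With the anchors in place, 'Thanos1983+|' and '|' still parse, while '||' and '|||' fail to match.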


        Give a man a fish:  <%-{-{-{-<

Re: Split string in groups with non white space using regex
by karlgoethebier (Abbot) on Jan 04, 2018 at 12:00 UTC
    "...I am not really good with regex..."

    #MeToo - but didn't you miss a group:

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use feature qw(say);

    my $string = q'Thanos1983+|Thanos1983+|Thanos1983+';
    $string =~ m/(.+)\|(.+)\|(.+)/;
    say for ($1,$2,$3);

    __END__

    Untested. And I hope I didn't miss the point.

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

      Hello karlgoethebier,

      To be honest I was just about to post another possible solution:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use feature 'say';

      my @tests = ('Thanos1983+|Thanos1983+', 'Thanos1983+| ', 'Thanos1983+|');

      for (@tests) {
          say "I found:\t\$1: '$1'\t\$2: '$2'\t\$3: '$3'" if /(.*)(\|)(.*)/;
      }

      __END__

      $ perl test.pl
      I found: $1: 'Thanos1983+' $2: '|' $3: 'Thanos1983+'
      I found: $1: 'Thanos1983+' $2: '|' $3: ' '
      I found: $1: 'Thanos1983+' $2: '|' $3: ''

      Thanks for the tip about using m/(.+)(\|)(.+)/. It also works for the first two cases, but not for the third, because (.+) needs at least one character after the pipe.

      Sample of code:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use feature 'say';

      my @tests = ('Thanos1983+|Thanos1983+', 'Thanos1983+| ', 'Thanos1983+|');

      for (@tests) {
          say "I found:\t\$1: '$1'\t\$2: '$2'\t\$3: '$3'" if /(.+)(\|)(.+)/;
      }

      __END__

      $ perl test.pl
      I found: $1: 'Thanos1983+' $2: '|' $3: 'Thanos1983+'
      I found: $1: 'Thanos1983+' $2: '|' $3: ' '
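
      One difference between the dot-based patterns and a negated character class shows up when the input contains more than one pipe: a greedy (.*) lets the first group run up to the last pipe. Here is a small sketch of that behaviour; the sample string 'a|b|c' and the helper name captures are made up for illustration:

      ```perl
      use strict;
      use warnings;
      use feature 'say';

      # Hypothetical helper: apply a compiled pattern and return its captures
      # (an empty list if the pattern does not match).
      sub captures {
          my ($re, $s) = @_;
          return $s =~ $re;
      }

      my $s = 'a|b|c';    # made-up sample with two pipes

      # Greedy dot: the first group runs up to the last pipe.
      say join ' / ', map { "'$_'" } captures(qr/(.*)(\|)(.*)/, $s);

      # Negated class: the first group stops at the first pipe, and the
      # unanchored match ends just before the second one.
      say join ' / ', map { "'$_'" } captures(qr/([^|]*)(\|)([^|]*)/, $s);
      ```

      The first line prints 'a|b' / '|' / 'c' and the second prints 'a' / '|' / 'b', so the choice matters only if the input can contain several pipes.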

      Thank you for your time and effort.

      BR / Thanos

        (.+)\|? AKA one or none?
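
        One way to read this hint is to make the trailing part optional. The sketch below works under that assumption, and its pattern is an illustration rather than necessarily what was meant. An optional group that takes no part in the match leaves $3 undefined, whereas (.*) leaves it as an empty string, which matters when $3 is interpolated under 'use warnings'. The helper name third_group is hypothetical:

        ```perl
        use strict;
        use warnings;
        use feature 'say';

        # Hypothetical helper: describe the third capture for a given pattern.
        sub third_group {
            my ($re, $s) = @_;
            return 'no match' unless $s =~ $re;
            return defined $3 ? "'$3'" : 'undef';
        }

        my $s = 'Thanos1983+|';

        say third_group(qr/^(.+)(\|)(.+)?$/, $s);    # optional third group
        say third_group(qr/^(.+)(\|)(.*)$/,  $s);    # zero-or-more third group
        ```

        The first call reports undef and the second reports '', so (.*) is the closer fit when an empty string is wanted for the last group.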

Re: Split string in groups with non white space using regex
by lapazzo (Novice) on Jan 04, 2018 at 13:08 UTC
    Brother thanos, in accordance with Lao Tzu ch. 2, I would do `perldoc -f split` and then:

    print join( '-', split( /\|/, $test, -1) ), "\n";
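
    For readers who are free to use split, the -1 limit is what preserves the trailing empty field. Below is a short sketch comparing the two calls; the helper name fields is hypothetical:

    ```perl
    use strict;
    use warnings;
    use feature 'say';

    # Hypothetical helper: split on a literal pipe with the given limit.
    sub fields {
        my ($s, $limit) = @_;
        return split /\|/, $s, $limit;
    }

    my $test = 'Thanos1983+|';

    # Limit 0 behaves like the default: trailing empty fields are dropped.
    my @default = fields($test, 0);

    # A negative limit keeps trailing empty fields.
    my @all = fields($test, -1);

    say scalar @default;
    say scalar @all;
    ```

    This prints 1 and then 2: only the negative limit returns the empty string after the pipe.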

      Hello lapazzo,

      Thank you for the time and effort. The script works perfectly, but in my case I can only supply a regex; I cannot use split.

      Thanks again for your time and effort.

      BR / Thanos
