in reply to regex match unicode characters in ascii string

Alternatively, just select what you want rather than deleting what you don't.

my $string = "Group: Group Name▼▼Role: Role Name";
while ( $string =~ /(Group|Role)\:\s+([\x00-\x7f]*)/g ){ print "$1 = $2\n"; };

Re^2: regex match unicode characters in ascii string
by 3dbc (Monk) on Jan 27, 2017 at 20:17 UTC
    Thanks for posting, but I used that regex and it returned:

    Group:= Group Name▼▼Role: Role Name

    I'm trying to get all the group and role names into group / role keys within a DBI $hashref that I already have. The values I add should not include the Group: / Role: identifier or any of the extended ASCII characters. I'm basically cleaning up the values and organizing the data so I can work on it elsewhere.
    - 3dbc

      OK, try this:

      #!perl
      use strict;
      use HTML::Entities;
      use Data::Dump 'pp';

      my $string = "Group: Group Name▼▼Role: Role Name";
      $string = decode_entities($string);

      my @f = ();
      while ( $string =~ /(Group|Role)\:\s+([\x00-\x7f]*)/g ) {
          push @f, $2;
      }
      pp \@f;
      poj
        Thanks, that works well, but it is only catching the last occurrence: it matches the role name but not the group name. I need it to match both and ignore anything else so that I can update the DBI hashref with this info.
        - 3dbc
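
One possible way to keep both names and get them under group / role keys is to capture the label together with the value and assign the list-context result of the match to a hash. This is only a sketch under assumptions: `$hashref` here is a stand-in for the row already fetched through DBI, the `id` key is made up for illustration, and the separators are assumed to be literal non-ASCII characters in the string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # the sample string below contains literal non-ASCII characters

# Stand-in for the existing DBI row (e.g. from fetchrow_hashref);
# the 'id' key and its value are hypothetical.
my $hashref = { id => 42 };

my $string = "Group: Group Name▼▼Role: Role Name";

# In list context, a /g match returns every capture from every match as one
# flat list, so the label/value captures can be assigned straight to a hash.
my %found = $string =~ /(Group|Role):\s+([\x00-\x7f]*)/g;

# Copy each pair into the existing hashref under a lowercased key.
$hashref->{ lc $_ } = $found{$_} for keys %found;

print "$_ = $hashref->{$_}\n" for sort keys %$hashref;
```

Note that `[\x00-\x7f]*` matches any run of ASCII, so if the separators in the real data are still undecoded HTML entities (which are plain ASCII text), the first match will run straight through them to the end of the string. In that case the string would need `decode_entities` applied before matching, as in poj's version.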