in reply to Re^7: regex match unicode characters in ascii string
in thread regex match unicode characters in ascii string

Getting closer... Thank you! Sorry if I wasn't clear enough. There are a lot of strings I'm matching against, but what I want remains the same: the group and role names in the hash. For instance, I used a string like this:

$string = "Role: Role Name▼▼Profile: Unnecessary_Extra_Stuff";

but it returned:

This is the Role Name:Role Name▼▼Profile Name: Unnecessary_Extra_Stuff

FYI, here's my code snippet:
while ( ${$ref}{$k}{'DIRTYDATA'} =~ /(Group|Role)\:\s+([\x00-\x7f]*)/g ){
    ${$ref}{$k}{$1} = $2;
}
print "\n\nThis is the Group Name: " . ${$ref}{$k}{Group};
print "\nThis is the Role Name: " . ${$ref}{$k}{Role};


I only want it to return Group or Role names, not Profile names, and no extended ASCII characters.

I did mention an array before, but the data is storing fine in my HoH, which I get from a DBI fetchall_hashref, so storage really isn't the problem. The problem is only matching the role and group names in these dirty little strings.

Thanks.
- 3dbc

Re^9: regex match unicode characters in ascii string
by poj (Abbot) on Jan 27, 2017 at 22:17 UTC

    What do you get if you print your string with Data::Dump like this?

    #!perl
    use strict;
    use HTML::Entities;
    use Data::Dump 'pp';

    my $string = decode_entities(
        "Role: Role Name▼▼Profile: Unnecessary_Extra_Stuff");
    pp $string;
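If Data::Dump is not installed, a core-only check can answer the same question. The sketch below is only an illustration: `non_ascii_report` is a made-up helper, and the two sample strings are assumptions about what the data might hold (real U+25BC characters versus the literal entity text `&#9660;`). It also shows how the character class `[\x00-\x7f]*` from the earlier snippet behaves on each form: it stops at a real non-ASCII character, but an entity is itself plain ASCII, so the match runs straight through it.

```perl
use strict;
use warnings;

# Hypothetical helper: report every character outside printable ASCII
# together with its code point and offset in the string.
sub non_ascii_report {
    my ($s) = @_;
    my @hits;
    while ( $s =~ /([^\x20-\x7E])/g ) {
        push @hits, sprintf( "U+%04X at offset %d", ord($1), pos($s) - 1 );
    }
    return @hits;
}

# Assumed sample data: the same text with real characters and with entities.
my $decoded = "Role: Role Name\x{25BC}\x{25BC}Profile: Unnecessary_Extra_Stuff";
my $encoded = "Role: Role Name&#9660;&#9660;Profile: Unnecessary_Extra_Stuff";

print "$_\n" for non_ascii_report($decoded);
print scalar( non_ascii_report($encoded) ),
    " non-ASCII characters in the entity form\n";

# The ASCII-only class stops at U+25BC but not at the entity text.
for my $s ( $decoded, $encoded ) {
    print "$1 => $2\n" if $s =~ /(Group|Role):\s+([\x00-\x7f]*)/;
}
```

If the report comes back empty for a string that still displays the triangles, the string most likely holds entities or some other ASCII escape rather than the characters themselves.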
      Thanks for all the help Monks.

      This worked for me... (taken out of context, it's part of a much larger script...)

      ${$ref}{$k}{'DIRTYDATA'} =~ tr/\x09\x0A\x0D\x20-\x7E/ /c;
      while ( ${$ref}{$k}{'DIRTYDATA'} =~ /(Group|Role[\s2]*)\:\s(\w+\s\w+)\s*/g ){
          ${$ref}{$k}{$1} = $2;
      }
      print "\n\nThis is the group: " . ${$ref}{$k}{Group};
      print "\nThis is the Role: " . ${$ref}{$k}{Role} . "\n\n";
      - 3dbc
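For readers who want to try the idea outside the larger script, here is a self-contained variation, offered as a sketch and not as 3dbc's exact code. It keeps the same `tr///c` clean-up but swaps in a lazy capture that stops before the next `Label:`, so names of one or three words would also be picked up. The helper name `extract_names` and the sample row (including the `Group:` part) are made up for illustration, and labels such as "Role 2" are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: pull Group/Role names out of one dirty string.
sub extract_names {
    my ($dirty) = @_;
    my %found;

    # Same clean-up as in the snippet above: anything that is not
    # tab, LF, CR or printable ASCII becomes a space.
    ( my $clean = $dirty ) =~ tr/\x09\x0A\x0D\x20-\x7E/ /c;

    # Capture lazily, stopping before the next "Label:" or the end of the string.
    while ( $clean =~ /\b(Group|Role):\s+(.+?)\s*(?=\s\w+:|$)/g ) {
        $found{$1} = $2;
    }
    return \%found;
}

# Made-up sample row; "\x{25BC}" is the black down-pointing triangle.
my $names = extract_names(
    "Group: Admin Team\x{25BC}\x{25BC}Role: Role Name\x{25BC}\x{25BC}Profile: Unnecessary_Extra_Stuff"
);
print "Group: $names->{Group}\n";
print "Role: $names->{Role}\n";
```

Working on a copy (`$clean`) leaves the original column value untouched, which may or may not matter for the real script; the in-place `tr` above is equally valid if the dirty data is not needed afterwards.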