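A single sample input is usually not enough to reliably design a regex. <update> Please use <code> tags when posting sample input. Also, can't you get this data in a more parseable format? </update> I have made the following assumptions, which are also noted in the regex comments: the string begins with a bare key that has no value; every other key is a run of word characters preceded by whitespace and followed by =; and a value may contain whitespace but no =, and ends where the next key= begins.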
use warnings;
use strict;

my $str = q{eab12345 id=00000 pgrp=abcdefgh groups=abcdefgh home=/home/eab12345 shell=/usr/bin/ksh gecos=AB/C/Y0000/ABC/XYZ RTYUI, LMNOP *CONTRACTOR* (AS 00000) auditclasses=general,files,TCPIP login=true su=true rlogin=true daemon=true admin=false sugroups=ALL admgroups= tpath=nosak ttys=ALL expires=0 auth1=SYSTEM auth2=NONE umask=00 registry=AD SYSTEM=AD logintimes= loginretries=5 pwdwarntime=5 account_locked=false minage=0 maxage=13 maxexpired=0 minalpha=1 minother=1 mindiff=1 maxrepeats=2 minlen=8 histexpire=13 histsize=8 pwdchecks= dictionlist=/abc/def/ghi/jkl default_roles= fsize=-1 cpu=-1 data=-1 stack=65536 core=000000 rss=65536 nofiles=2000 time_last_login=1512632113 time_last_unsuccessful_login=1505304923 tty_last_login=ssh tty_last_unsuccessful_login=ssh host_last_login=0.000.000.000 host_last_unsuccessful_login=0.000.000.000 unsuccessful_login_count=0 roles=};

my $REGEX = qr{
    (?|
        # treat beginning of string as a key only
        \A \s* (?<key> \w+ ) \s*
    |
        # otherwise, a normal key=value pair
        (?<= \s )   # key must be preceded by a space
        (?<key> \w+ )
        \s* = \s*
        (?<value>
            # a value may not look like another key=value
            (?: (?! \s* \w+ = ) [^=] )*
        )
        \s*
    )
}msx;

pos($str)=undef;
while ( $str =~ /\G$REGEX/gc ) {
    print "<", $+{key}, "> = <", $+{value}//'undef', ">\n";
}
die "failed to parse at pos ".pos($str)
    unless pos($str)==length($str);
<eab12345> = <undef>
<id> = <00000>
<pgrp> = <abcdefgh>
<groups> = <abcdefgh>
<home> = </home/eab12345>
<shell> = </usr/bin/ksh>
<gecos> = <AB/C/Y0000/ABC/XYZ RTYUI, LMNOP *CONTRACTOR* (AS 00000)>
<auditclasses> = <general,files,TCPIP>
<login> = <true>
<su> = <true>
<rlogin> = <true>
<daemon> = <true>
<admin> = <false>
<sugroups> = <ALL>
<admgroups> = <>
<tpath> = <nosak>
<ttys> = <ALL>
<expires> = <0>
<auth1> = <SYSTEM>
<auth2> = <NONE>
<umask> = <00>
<registry> = <AD>
<SYSTEM> = <AD>
<logintimes> = <>
<loginretries> = <5>
<pwdwarntime> = <5>
<account_locked> = <false>
<minage> = <0>
<maxage> = <13>
<maxexpired> = <0>
<minalpha> = <1>
<minother> = <1>
<mindiff> = <1>
<maxrepeats> = <2>
<minlen> = <8>
<histexpire> = <13>
<histsize> = <8>
<pwdchecks> = <>
<dictionlist> = </abc/def/ghi/jkl>
<default_roles> = <>
<fsize> = <-1>
<cpu> = <-1>
<data> = <-1>
<stack> = <65536>
<core> = <000000>
<rss> = <65536>
<nofiles> = <2000>
<time_last_login> = <1512632113>
<time_last_unsuccessful_login> = <1505304923>
<tty_last_login> = <ssh>
<tty_last_unsuccessful_login> = <ssh>
<host_last_login> = <0.000.000.000>
<host_last_unsuccessful_login> = <0.000.000.000>
<unsuccessful_login_count> = <0>
<roles> = <>
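If you want to work with the fields rather than just print them, the same regex can feed a hash. What follows is only a minimal sketch under the same assumptions as above: the sub name parse_pairs is hypothetical (it is not from the original code), the sample string is a shortened subset of the posted input, and duplicate keys would silently overwrite each other.

```perl
use warnings;
use strict;

# Hypothetical helper (an illustration, not part of the original code):
# returns the bare leading key and a hashref of the key=value pairs.
sub parse_pairs {
    my ($str) = @_;
    my $re = qr{
        (?|
            # bare key at the very beginning of the string
            \A \s* (?<key> \w+ ) \s*
        |
            # key=value; the value stops before the next key=
            (?<= \s ) (?<key> \w+ ) \s* = \s*
            (?<value> (?: (?! \s* \w+ = ) [^=] )* )
            \s*
        )
    }msx;
    my ( $first, %pairs );
    pos($str) = undef;
    while ( $str =~ /\G$re/gc ) {
        if ( defined $+{value} ) { $pairs{ $+{key} } = $+{value} }
        else                     { $first = $+{key} }
    }
    # pos() is undef if nothing matched at all, hence the //0
    die "failed to parse at pos " . ( pos($str) // 0 )
        unless ( pos($str) // 0 ) == length($str);
    return ( $first, \%pairs );
}

my ( $user, $attr ) = parse_pairs(
    q{eab12345 id=00000 home=/home/eab12345 admgroups= tpath=nosak} );
print "$user: home=$attr->{home}, admgroups=<$attr->{admgroups}>\n";
```

A hash loses the original field order; if the order matters, pushing the pairs onto an array instead would be the obvious variation.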
In reply to Re: Splitting a long row with multiple delimiters.
by haukex
in thread Splitting a long row with multiple delimiters.
by dipit