in reply to regex anchoring issue
G'day penguin-attack,
Welcome to the monastery.
Firstly, your data description seems a little ambiguous: you say "end character" but then describe <SOH> (5 chars), ^A (2 chars) and Ctrl-A (1 char). If, by <SOH>, you mean the ASCII Start of Heading control character, then that is the same character as Ctrl-A (i.e. the character with the ASCII value of 1), not a five-character string.
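To illustrate the distinction, here is a minimal sketch (the variable name @spellings is mine, purely for illustration) comparing the ways of writing the single control character against the literal multi-character strings:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Three spellings of the same single character (ASCII value 1, SOH).
my @spellings = ( chr(1), "\cA", "\x01" );

for my $char (@spellings) {
    printf "length=%d ord=%d\n", length $char, ord $char;
}

# The literal texts '<SOH>' and '^A' are ordinary multi-character strings.
for my $text ( '<SOH>', '^A' ) {
    printf "%s has %d chars\n", $text, length $text;
}
```

Each of the three spellings reports a length of 1 and an ordinal of 1, whereas the literal strings are 5 and 2 characters long respectively.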
The main problem in your regexp is the use of a character class (i.e. [...]), which matches a single character from the listed set rather than the listed characters in sequence - see Character Classes and other Special Escapes under perlre - Regular Expressions for details. You also don't need the 'g' modifier in either the match (m/.../) or the split function.
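As a rough sketch of the difference (the sample strings and variable names here are hypothetical, not taken from your data), compare a character class with an alternation:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample: a field ending in the literal 5-character marker.
my $line = 'field<SOH>';

# Character class: matches ONE character out of the set <, S, O, H, >
my ($class_match) = $line =~ /([<SOH>])/;

# Alternation: matches the whole literal '<SOH>', a caret followed by 'A',
# or the single control character chr(1).
my ($alt_match) = $line =~ /(<SOH>|\^A|\cA)/;

printf "class:       %s\n", $class_match;
printf "alternation: %s\n", $alt_match;

# split already splits at every match, so no 'g' modifier is needed.
my @fields = split /<SOH>|\^A|\cA/, "a<SOH>b^Ac\cAd";
print join( ',', @fields ), "\n";
```

The character class captures only the leading '<', while the alternation captures the full '<SOH>' marker; the split returns the four fields a, b, c and d.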
The following script does what I think you want (in terms of identifying the line endings). If not, please provide some sample data with expected output to remove the ambiguity I mentioned at the start.
    #!/usr/bin/env perl

    use 5.010;
    use strict;
    use warnings;

    my $soh_string     = 'soh_string<SOH>';
    my $caret_a_string = 'caret_a_string^A';
    my $ctrl_a_string  = 'ctrl_a_string' . chr(1);

    my $test_string = join('',
        $soh_string, $caret_a_string, $ctrl_a_string,
        $caret_a_string, $ctrl_a_string, $soh_string,
        $ctrl_a_string, $soh_string, $caret_a_string
    );

    my $string_re = qr{(?><SOH>|\^A|\cA)};

    say for split $string_re => $test_string;
Output:
    $ pm_soh_split.pl
    soh_string
    caret_a_string
    ctrl_a_string
    caret_a_string
    ctrl_a_string
    soh_string
    ctrl_a_string
    soh_string
    caret_a_string
-- Ken