Re^2: Improved regexp sought

Point taken, merlyn.

I have a file containing multiple lines. Each line consists of a variable number of variable-length fields separated by a + character. Each line is terminated by a ' character. Sometimes a field has a ' character in it; if so, the ' is preceded by a question mark. Here are some example lines:

0010+2+O?'Reilly'
023++++234+35+White+++17+'
g?'day mate+++'

I want to break each line up into its constituent fields. I can do it with brute force, but would prefer elegance.

Thanks
Myomancer


Re^3: Improved regexp sought by duff (Parson) on Oct 27, 2004 at 14:39 UTC
Mayhap you want to take a multi-step approach. `$string =~ s/'$//; $string =~ s/\?'/'/g; @fields = split /\+/, $string;`
"I want to break each line up into its constituent fields. I can do it with brute force, but would prefer elegance."
I usually choose "working" over "not working" :-)
duff
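For anyone wanting to try the multi-step approach on the sample lines, here is a minimal sketch. The sub name `split_fields` is hypothetical, and the `-1` limit passed to `split` is an addition that is not in the reply: without it, `split` drops trailing empty fields, which may or may not be what is wanted for lines such as `g?'day mate+++'`.

```perl
use strict;
use warnings;

# Hypothetical helper; the first two statements are the ones from the reply.
sub split_fields {
    my ($string) = @_;
    $string =~ s/'$//;       # drop the terminating apostrophe
    $string =~ s/\?'/'/g;    # turn each escaped ?' back into a plain '
    # Assumption: a limit of -1 keeps trailing empty fields.
    return split /\+/, $string, -1;
}

my @fields = split_fields("g?'day mate+++'");
print scalar(@fields), " fields: ", join('|', @fields), "\n";
```

With the limit removed, the same line would yield a single field, so the choice depends on whether empty trailing fields carry meaning in the file.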
Re^3: Improved regexp sought by diotalevi (Canon) on Oct 27, 2004 at 14:54 UTC
Use Text::CSV_XS. I guessed that your lines were terminated with apostrophe + newline. Alter the code to fit. `use Text::CSV_XS; my $parser = Text::CSV_XS->new( { eol => "'\n", escape_char => "'", sep_char => "+" } ); while ( my $line = <$fh> ) { $parser->parse( $line ); print join( ", ", $parser->fields ) . "\n"; }`
Re^3: Improved regexp sought by Limbic~Region (Chancellor) on Oct 27, 2004 at 14:43 UTC
myomancer,
"I can do it with brute force, but would prefer elegance"
I hope you aren't confusing conciseness with elegance. They are not always related. See the following: `my $str = "0010+2+O?'Reilly'"; my @field = map {s/\?'/'/g; $_ } split /\+/ , substr($str,0, (length $str) - 1); print "[$_]$field[$_]\n" for 0 .. $#field;`
IMO, the code would be more elegant broken out into multiple lines, perhaps with comments.
Cheers - L~R
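One possible reading of "broken out into multiple lines" is sketched below. It performs the same three operations as the one-liner, one per statement; the intermediate variable `$body` is an illustrative name that does not appear in the reply.

```perl
use strict;
use warnings;

my $str = "0010+2+O?'Reilly'";

# Step 1: drop the terminating apostrophe (the last character of the line).
my $body = substr($str, 0, length($str) - 1);

# Step 2: split on the literal + separator.
my @field = split /\+/, $body;

# Step 3: unescape ?' to ' in place within each field.
s/\?'/'/g for @field;

print "[$_]$field[$_]\n" for 0 .. $#field;
```

Like the one-liner, this assumes the line has already been chomped, so that the apostrophe really is the last character.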