A regex question with 2 pieces of data

bilbozilla has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: A regex question with 2 pieces of data by Abigail-II (Bishop) on Apr 08, 2003 at 00:02 UTC
`use strict;
use warnings;
use Regexp::Common;

$_ = '<span class="style">28.56</span><span class="style">-1.22</span>'
   . '<span class="style">-4.1</span><span class="style">04/02</span>';

my ($f, $s) = /($RE{num}{decimal}).*?($RE{num}{decimal})/;
print "$f  $s\n";
__END__
28.56  -1.22`

Abigail
Re: Re: A regex question with 2 pieces of data by bilbozilla (Initiate) on Apr 08, 2003 at 00:24 UTC
The first 2 sets of numbers will vary. How do I pull those? I should have been more clear. Could you please help?
Re: Re: Re: A regex question with 2 pieces of data by The Mad Hatter (Priest) on Apr 08, 2003 at 00:45 UTC
Just use Abigail-II's regex (courtesy of Regexp::Common) to pull them out of your data. If `$data` contains the data you want to get the numbers from, then something like `my ($one, $two) = ($data =~ /($RE{num}{decimal}).*?($RE{num}{decimal})/);` will put the first two numbers, regardless of what they are, into `$one` and `$two`. (Make sure to `use Regexp::Common;` in your script before using the regex above!)
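One thing the one-liner above does not show is what happens when the data holds fewer than two numbers: both variables are then left undefined. A minimal sketch of a guard follows, assuming Regexp::Common is installed. The helper name `first_two_numbers` is hypothetical and not part of the thread.

```perl
use strict;
use warnings;
use Regexp::Common;

# Hypothetical helper (not from the thread): returns the first two decimal
# numbers found in $data, or an empty list when fewer than two are present.
sub first_two_numbers {
    my ($data) = @_;

    # In list context the match returns its two captures; on failure it
    # returns an empty list, so the assignment counts as false.
    my ($one, $two) = $data =~ /($RE{num}{decimal}).*?($RE{num}{decimal})/
        or return;

    return ($one, $two);
}

my @pair = first_two_numbers(
    '<span class="style">28.56</span><span class="style">-1.22</span>'
);
print @pair ? "@pair\n" : "fewer than two numbers found\n";
```

Note that `.*?` does not cross newlines unless the `/s` modifier is added, so this sketch assumes both numbers sit on the same line, as in the sample data.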
Re: Re: Re: Re: A regex question with 2 pieces of data by diotalevi (Canon) on Apr 08, 2003 at 00:54 UTC
Re: Re: Re: Re: A regex question with 2 pieces of data by Anonymous Monk on Apr 08, 2003 at 01:05 UTC
Re: Re: Re: Re: Re: A regex question with 2 pieces of data by The Mad Hatter (Priest) on Apr 08, 2003 at 01:12 UTC
Re: A regex question with 2 pieces of data by graff (Chancellor) on Apr 08, 2003 at 02:16 UTC
While the module suggested initially is really cool and useful, the alternative, without using a special module, would be:

`my $string = <<EOT;
<span class="style">28.56</span><span class="style">-1.22</span><span class="style">-4.1</span><span class="style">04/02</span>
EOT
my $decimal = qr/[-+]?(?:\d+\.?\d*|\.\d+)/;
my ( $frst, $scnd ) = ( $string =~ /($decimal).*?($decimal)/ );
print "$frst, $scnd\n";`

Here, the "qr" operator is used to save a regex pattern in the scalar "$decimal" (see the perlop man page for "qr"); then, that regex variable is used twice on the target string, and the match is done in a list context (assigning to two scalars), so the two parenthesized matches are assigned to "$frst" and "$scnd".

As for the $decimal regex itself, it's looking for a pattern where there may or may not be an initial hyphen or plus sign, then either one or more digits (with optional period and zero or more digits) or else a period with one or more digits (see the perlre man page regarding the "(?:...)" syntax and related tricks).

Note that this will not handle variants like "2.4e7" and other possible "rare" forms, although it would certainly be possible to add the necessary conditions. But that is why we like to use modules for this sort of thing, because the module will normally cover all that without requiring us to make our own coding more complicated.
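As an illustration of "adding the necessary conditions", here is a rough sketch that extends the hand-written pattern with an optional exponent so that forms like "2.4e7" are captured whole. The exponent group, the sample string and the helper name `first_two` are assumptions made for this example. It still ignores other rare forms, which is the case for the module.

```perl
use strict;
use warnings;

# Mantissa as described above: optional sign, then digits with an optional
# fractional part, or a bare period followed by digits. The trailing
# optional exponent group is the addition made for this sketch.
my $number = qr/[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?/;

# Hypothetical helper: first two numbers in $string, or an empty list.
sub first_two {
    my ($string) = @_;
    my ($frst, $scnd) = $string =~ /($number).*?($number)/
        or return;
    return ($frst, $scnd);
}

my @pair = first_two(
    '<span class="style">2.4e7</span><span class="style">-.5</span>'
);
print "$pair[0], $pair[1]\n" if @pair;    # prints "2.4e7, -.5"
```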
Re: A regex question with 2 pieces of data by pg (Canon) on Apr 08, 2003 at 01:51 UTC
As this is XML, it also makes sense to use some sort of XML parser, which is more flexible. For example (code is tested):

`use XML::Simple;
use Data::Dumper;
use strict;

my $parser = XML::Simple->new();
my $ref = $parser->XMLin("<xml>".'<span class="style">28.56</span><span class="style">-1.22</span><span class="style">-4.1</span><span class="style">04/02</span>'."</xml>");
print $ref->{span}[0]{content}, "\n";
print $ref->{span}[1]{content};`

If your string could be long, then use some parser that does not slurp.
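The post does not name a non-slurping parser. One possible reading, sketched here under the assumption that XML::Parser is installed, is an event-driven parse that collects span text as it goes without building a tree of the whole document. The helper name `span_texts` and its `$limit` argument are hypothetical.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: text of the first $limit <span> elements, gathered
# from parser events. Assumes spans are not nested.
sub span_texts {
    my ($xml, $limit) = @_;
    my (@texts, $in_span);

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($expat, $element) = @_;
                if ($element eq 'span' && @texts < $limit) {
                    $in_span = 1;
                    push @texts, '';    # start a new text buffer
                }
            },
            Char => sub {
                my ($expat, $text) = @_;
                # Character data may arrive in several chunks; append each.
                $texts[-1] .= $text if $in_span;
            },
            End => sub {
                my ($expat, $element) = @_;
                $in_span = 0 if $element eq 'span';
            },
        },
    );

    $parser->parse($xml);
    return @texts;
}

my @first_two = span_texts(
    '<xml><span class="style">28.56</span><span class="style">-1.22</span>'
  . '<span class="style">-4.1</span><span class="style">04/02</span></xml>',
    2,
);
print "$_\n" for @first_two;
```

For a large file, `parse` can be given an open filehandle instead of a string, or `parsefile` can be used, so the input never has to be held in one scalar.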