Re: How to extract a pattern in Perl regex?

Re^2: How to extract a pattern in Perl regex? by SergioQ (Scribe) on May 01, 2020 at 03:08 UTC
Yes, I'm looking at the recommended methods, and that "^" was a typo. However, part of my question was how to extract, in one statement, what's between the title tags. The way I worked around it was: `$result =~ /(<title>.*<\/title>)/mgi; my $newresult = $1; $newresult =~ s/<title>//i; $newresult =~ s/<\/title>//i;` Surely there's a simpler way?
Re^3: How to extract a pattern in Perl regex? by marto (Cardinal) on May 01, 2020 at 09:27 UTC
Using Mojo::DOM (pulling live data with Mojo::UserAgent): `#!/usr/bin/perl use strict; use warnings; use feature 'say'; use Mojo::Util 'trim'; use Mojo::UserAgent; # get perlmonks my $ua = Mojo::UserAgent->new; my $dom = $ua->get('https://perlmonks.org')->res->dom; say 'Title: ' . trim( $dom->at('title')->text ); say 'Image src: ' . trim( $dom->at('img')->attr->{'src'} ); say 'Image alt: ' . trim( $dom->at('img')->attr->{'alt'} );` Output: `Title: PerlMonks - The Monastery Gates Image src: //promote.pair.com/i/pair-banner-current.gif Image alt: Beefy Boxes and Bandwidth Generously Provided by pair Networks` Mojo::DOM makes parsing fun and simple.
Re^4: How to extract a pattern in Perl regex? by haukex (Archbishop) on May 02, 2020 at 09:23 UTC
"Mojo::DOM makes parsing fun and simple." Agreed, and ojo makes it even more fun `;-)` `$ perl -Mojo -e 'say g("https://perlmonks.org")->dom->at("title")->all_text=~s/^\s+|\s+$//gr' PerlMonks - The Monastery Gates`
Re^5: How to extract a pattern in Perl regex? by marto (Cardinal) on May 02, 2020 at 09:26 UTC
Re^6: How to extract a pattern in Perl regex? by haukex (Archbishop) on May 02, 2020 at 09:31 UTC
Re^3: How to extract a pattern in Perl regex? by hippo (Archbishop) on May 01, 2020 at 09:10 UTC
"Surely there's a simpler way?" Just capture what you want. Let's change the task to remove the elephant in the room: parsing HTML with a regex, which you now know you shouldn't do. Instead, suppose you want to extract everything between 'foo' and 'bar' and ignore all the rest. Here's the simple approach: `use strict; use warnings; use Test::More tests => 1; my $in = 'abcfooHellobarxyz'; my $want = 'Hello'; my ($have) = ($in =~ /foo(.*)bar/); is $have, $want, "Extracted $want";` The only real caveat is to remember to use the `/s` modifier if the text you are extracting might contain `\n`.
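The `/s` caveat above can be seen in a minimal sketch. The two-line input string is a hypothetical example, not from the thread; the pattern is the same `foo(.*)bar` capture, run once without and once with `/s`.

```perl
use strict;
use warnings;

# Hypothetical input: the text between 'foo' and 'bar' spans two lines.
my $in = "abcfooHello\nWorldbarxyz";

# Without /s, '.' does not match "\n", so the match fails and the
# list assignment leaves $without undefined.
my ($without) = ($in =~ /foo(.*)bar/);

# With /s, '.' matches any character, including "\n".
my ($with) = ($in =~ /foo(.*)bar/s);

print defined $without ? "without /s: '$without'\n" : "without /s: no match\n";
print "with /s: '$with'\n";
```

Because a failed match in list context returns an empty list, checking `defined` on the captured variable is one way to tell "no match" apart from an empty capture.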
Re^4: How to extract a pattern in Perl regex? by SergioQ (Scribe) on May 01, 2020 at 22:24 UTC
Thank you! Yes, this was the main part of what I was looking for. I remember going through a rather large Perl handbook, and it ended the Regex chapter (or started it) by saying that "there is so much to Regex that whole books are written on it." I really see why now.
Re^4: How to extract a pattern in Perl regex? by AnomalousMonk (Archbishop) on May 01, 2020 at 10:33 UTC
"... caveat ... is to remember to use the `/s` modifier if the text you are extracting might contain `\n`." Simpler still is to always use `/s` (along with `/x` and `/m` in a consistent `/xms` modifier tail) on every `qr// m// s///` you write. Then the rule is simply "Dot matches all." Period. Give a man a fish: `<%-{-{-{-<`
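As a rough illustration of what each letter in an `/xms` tail does, here is a small sketch. The input text and the variable names `$word` and `$span` are assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical two-line input for illustration.
my $text = "first line\nsecond line\n";

# /x: whitespace inside the pattern is ignored, so it can be spaced out.
# /m: ^ and $ match at the start and end of each line, not just the string.
# /s: '.' matches "\n" as well as every other character.
my ($word) = $text =~ m{ ^ second \s+ (\w+) $ }xms;

# Under /s the dot crosses the newline between the two lines.
my ($span) = $text =~ m{ first (.*) second }xms;

print "word: '$word'\n";
print "span: '$span'\n";
```

Note that under `/x` a literal space in the pattern has to be written as `\s`, `\ `, or `[ ]`, which is the main adjustment when adopting this style everywhere.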
Re^5: How to extract a pattern in Perl regex? by hippo (Archbishop) on May 01, 2020 at 10:50 UTC
Re^6: How to extract a pattern in Perl regex? by AnomalousMonk (Archbishop) on May 01, 2020 at 11:52 UTC
Re^3: How to extract a pattern in Perl regex? (updated) by AnomalousMonk (Archbishop) on May 01, 2020 at 03:58 UTC
`c:\@Work\Perl\monks>perl -wMstrict -le "my $result = '<title>The Rain in Spain</tItLe>'; my ($newresult) = $result =~ m{ <title> (.*?) </title> }xmsi; print qq{'$newresult'}; " 'The Rain in Spain'` Update: Or, going a step further: `c:\@Work\Perl\monks>perl -wMstrict -le "use Data::Dump qw(dd); ;; my $result = 'yada <title>The Rain in Spain</tItLe> blah <TITLE>How Now Brown Cow</TitlE> foo'; my @titles = $result =~ m{ (?i) <title> (.*?) </title> }xmsg; dd \@titles; " ["The Rain in Spain", "How Now Brown Cow"]` Give a man a fish: `<%-{-{-{-<`
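The non-greedy `.*?` matters as soon as the input holds more than one closing tag. A minimal sketch, assuming a hypothetical string with two title elements (the names `$greedy` and `$lazy` are illustrative only):

```perl
use strict;
use warnings;

# Hypothetical input with two title elements.
my $result = '<title>One</title> blah <title>Two</title>';

# Greedy: .* runs to the last </title>, swallowing everything in between.
my ($greedy) = $result =~ m{ <title> (.*) </title> }xmsi;

# Non-greedy: .*? stops at the first </title> it can reach.
my ($lazy) = $result =~ m{ <title> (.*?) </title> }xmsi;

print "greedy: '$greedy'\n";
print "lazy:   '$lazy'\n";
```

This is also why the `/g` version in list context can return each title separately: every match ends at the nearest closing tag rather than the final one.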