Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, thank you for your advice and assistance with my last post. I have decided to write a smaller script to do the HTML parsing I need. I'm close, but I have hit a couple more places where I am stuck. Here are the questions:

1) How do I deal with a class that has spaces in it?

2) How can I pull out all text from an h3 tag?

3) How can I pull out all text from a span following that h3 tag?

4) The following is what I have so far, and it is not working for me:

$r2 = $dom2->find('.LC20lb.MBeuO.DKV0Md')->each(sub {
    push @columns, join '|', map { $_->all_text } $_->find('span')->each;
});

Thanks in advance for any assistance you can provide.

Replies are listed 'Best First'.
Re: Another Mojo Dom question
by marto (Cardinal) on Jan 30, 2024 at 09:14 UTC

    Again, it'd help if you posted example data. See How do I post a question effectively?

    1. Classes don't have spaces; elements can have multiple classes. For example, class="header blue" applies both the "header" and "blue" classes to the element, and the selector .header.blue matches it.

    2.

    use Mojo::DOM;
    use feature 'say';

    my $html = '<h3>Heading 3.1</h3> <div>Heading 3.1 content</div> <h3>Heading 3.2</h3> <div>Heading 3.2 content</div>';
    my $dom = Mojo::DOM->new( $html );

    # Print the text of every h3 element
    for my $entry ( $dom->find('h3')->each ){
        say $entry->all_text;
    }

    output:

    Heading 3.1
    Heading 3.2

    3.

    use Mojo::DOM;
    use feature 'say';

    my $html = '<h3>Heading 3.1</h3> <span>Heading 3.1 content</span> <h3>Heading 3.2</h3> <span>Heading 3.2 content</span>';
    my $dom = Mojo::DOM->new( $html );

    # 'h3 + span' selects each span that immediately follows an h3
    for my $entry ( $dom->find('h3 + span')->each ){
        say $entry->all_text;
    }

    output:

    Heading 3.1 content
    Heading 3.2 content
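
For point 1, a minimal sketch of selecting an element that carries several classes. The class names are taken from the question; the markup and its text are made up for illustration, since no sample data was posted.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Hypothetical markup: only the class names come from the question.
my $html = <<'HTML';
<div>
  <h3 class="LC20lb MBeuO DKV0Md">First title</h3>
  <h3 class="LC20lb">Other title</h3>
</div>
HTML

my $dom = Mojo::DOM->new($html);

# Chained class selectors match elements that carry ALL of the listed classes,
# so the second h3 (which only has LC20lb) is skipped.
my @titles = $dom->find('h3.LC20lb.MBeuO.DKV0Md')->map('all_text')->each;

say for @titles;
```

The space in the class attribute separates class names, so each name gets its own leading dot in the selector and no spaces go between them.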

    Again, posting some sample data is the quickest way to get the help you need. See also Mojo::DOM::CSS.
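
If the goal is one pipe-joined row per heading, as the join '|' in the question suggests, one possible approach is to walk the h3 elements and look at each one's next sibling. This is only a sketch: it reuses the sample HTML from point 3 and assumes the span is a sibling directly after the h3, which may not hold for the real page.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

my $html = '<h3>Heading 3.1</h3> <span>Heading 3.1 content</span> '
         . '<h3>Heading 3.2</h3> <span>Heading 3.2 content</span>';
my $dom = Mojo::DOM->new($html);

my @columns;
for my $h3 ( $dom->find('h3')->each ) {
    # next() returns the next sibling element, or undef if there is none.
    my $span = $h3->next;
    next unless $span && $span->tag eq 'span';

    # Keep the heading and its span together as one row.
    push @columns, join '|', $h3->all_text, $span->all_text;
}

say for @columns;
```

Pairing inside the loop keeps each heading attached to its own span, which two separate find calls would not guarantee when some headings have no span.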

    Update: Previously... Registering an account will make it easier to manage threads/questions.