in reply to Re: Quick 'n dirty extraction of JSON from an HTML page
in thread Quick 'n dirty extraction of JSON from an HTML page
Some of the JavaScript seems to use key/value specifications that aren't valid JSON because the keys aren't quoted strings, e.g.

    var renderer = new US.Opportunity.OpportunityRenderViewModel({ opportunity: opportunity, currentJobBoardId: "6162c253-9d81-da08-c252-d43d2fcb8345", isViewingInternal: false });

... so I changed the regular expression to be

    m/\((\{".*?\})\)/gms

(throwing in a leading quotation mark, in order to find only JSON that has a quoted initial key).
I also played with the possibility that the HTML page would contain more than one block of JSON, and changed your code to be

    my ( $json, $ref );
    # scalar-context //g steps through one match at a time, so $1 is the current block
    while ( $scrape =~ m/\((\{".*?\})\)/gms ) {
        $json = $1;
        $ref  = decode_json $json;
        print Dumper $ref;
    }

...so as to find and print for me each of multiple JSON blocks (not shown here). Love it!
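For anyone who wants to try the multi-block idea in isolation, here is a minimal, self-contained sketch. It is not the code from the thread: the page fragment, the `Example.*` constructor names, and the `extract_json_blocks` helper are all made up for illustration, and it assumes the core `JSON::PP` module as the decoder. It steps through the matches with `while` so that `$1` always refers to the current block, and it skips anything that fails to parse.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);    # core module since Perl 5.14

# Hypothetical page fragment: one JavaScript object literal with bare keys
# (not valid JSON) and two JSON blocks wrapped in parentheses.
my $scrape = <<'HTML';
<script>
var renderer = new Example.ViewModel({ opportunity: opportunity, isViewingInternal: false });
var first = new Example.Model({"Id": 1, "Title": "Editor"});
var second = new Example.Model({"Id": 2, "Title": "Writer"});
</script>
HTML

# Collect every parenthesised block whose first key is a quoted string.
sub extract_json_blocks {
    my ($text) = @_;
    my @blocks;
    # Scalar-context //g advances through the text one match at a time,
    # so $1 holds the capture for the current match.
    while ( $text =~ m/\((\{".*?\})\)/gms ) {
        my $json = $1;
        # decode_json dies on invalid input; skip blocks that fail to parse.
        my $ref = eval { decode_json($json) } or next;
        push @blocks, $ref;
    }
    return @blocks;
}

my @blocks = extract_json_blocks($scrape);
print scalar(@blocks), " JSON block(s) found\n";
print "$_->{Id}: $_->{Title}\n" for @blocks;
```

The bare-key object on the first script line is skipped because its opening brace is not followed by a quotation mark. Note that the non-greedy `.*?` stops at the first `})`, so a JSON string value that itself contains `})` would be cut short; for a quick 'n dirty scrape that may be an acceptable trade-off.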
Re^3: Quick 'n dirty extraction of JSON from an HTML page
by tobyink (Canon) on Mar 09, 2021 at 14:29 UTC