in reply to Re^3: JSON::XS Wide Character Problem
in thread JSON::XS Wide Character Problem
The suggested use of JSON::XS#INCREMENTAL-PARSING looks like a much better solution for this kind of parsing problem. graff++
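For reference, here is a minimal sketch of what the incremental approach could look like. The chunk contents, the `@chunks` and `@objects` names and the fallback to JSON::PP are my own assumptions for illustration; only `incr_parse` itself comes from the JSON::XS docs. It relies on the objects being separated by nothing but whitespace.

```perl
use strict;
use warnings;

# Prefer JSON::XS; fall back to the core module JSON::PP, which
# offers the same incr_parse interface (assumption: either is fine here).
my $class = eval { require JSON::XS; 1 }
    ? 'JSON::XS'
    : do { require JSON::PP; 'JSON::PP' };
my $json = $class->new->utf8;

# Hypothetical input: two objects arriving in arbitrary pieces,
# as they might when reading a file or socket block by block.
my @chunks = (
    '{"created_at":"Sat Mar 02 18:45:26 +0000 2013",',
    '"id_str":"1","etc":"first"}' . "\n",
    '{"created_at":"Sat Mar 02 18:45:27 +0000 2013","id_str":"2","etc":"second"}' . "\n",
);

my @objects;
for my $chunk (@chunks) {
    # In list context incr_parse returns every object completed so far;
    # unfinished text stays in the parser's internal buffer.
    push @objects, $json->incr_parse($chunk);
}

printf "%s => %s\n", $_->{id_str}, $_->{etc} for @objects;
```

The nice part is that no guess about how an object starts is needed: the parser itself decides where one object ends and the next begins.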
> If there is a good separator like blank line(s), you can set the input record separator $/ accordingly.
You didn't show us more than one object and your link is broken, so I had to guess that every object starts with '{"created_at":' at the beginning of a line.
I also shortened your objects to make the demonstration clearer.
    use v5.12.0;
    use warnings;
    use Data::Dump;
    use JSON::XS;

    my $ident ='{"created_at":';

    local $/ = "\n$ident";

    my $prefix="";

    while (<DATA>) {
        chomp;    # removes $/ from the end
        my $obj = "$prefix$_";

        ddx $obj;
        say "-" x 30;
        dd decode_json($obj);
        #say "-" x 72;

        $prefix = $ident;
    }

    __DATA__
    {"created_at":"Sat Mar 02 18:45:26 +0000 2013","id":307924626426695681,"id_str":"307924626426695681","etc":"***YADDA YADDA ***","id_str":"2621098741943851970"}
    {"created_at":"Sat Mar 02 18:45:26 +0000 2013","id":307924626426695681,"id_str":"307924626426695681","etc":"***YADDA YADDA ***","id_str":"2621098741943851970"}
    {"created_at":"Sat Mar 02 18:45:26 +0000 2013","id":307924626426695681,"id_str":"307924626426695681","etc":"***YADDA YADDA ***","id_str":"2621098741943851970"}
    # raw_json.pl:16: "{\"created_at\":\"Sat Mar 02 18:45:26 +0000 2013\",\"id\":307924626426695681,\"id_str\":\"307924626426695681\",\"etc\":\"***YADDA YADDA ***\",\"id_str\":\"2621098741943851970\"}"
    ------------------------------
    {
      created_at => "Sat Mar 02 18:45:26 +0000 2013",
      etc => "***YADDA YADDA ***",
      id => 307924626426695681,
      id_str => 2621098741943851970,
    }
    # raw_json.pl:16: "{\"created_at\":\"Sat Mar 02 18:45:26 +0000 2013\",\"id\":307924626426695681,\"id_str\":\"307924626426695681\",\"etc\":\"***YADDA YADDA ***\",\"id_str\":\"2621098741943851970\"}"
    ------------------------------
    {
      created_at => "Sat Mar 02 18:45:26 +0000 2013",
      etc => "***YADDA YADDA ***",
      id => 307924626426695681,
      id_str => 2621098741943851970,
    }
    # raw_json.pl:16: "{\"created_at\":\"Sat Mar 02 18:45:26 +0000 2013\",\"id\":307924626426695681,\"id_str\":\"307924626426695681\",\"etc\":\"***YADDA YADDA ***\",\"id_str\":\"2621098741943851970\"}\n"
    ------------------------------
    {
      created_at => "Sat Mar 02 18:45:26 +0000 2013",
      etc => "***YADDA YADDA ***",
      id => 307924626426695681,
      id_str => 2621098741943851970,
    }
Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery