in reply to Using DBIx Class with json data types
PostgreSQL can index columns of the json or jsonb data type. On a 3.5 GB table I see a speedup from indexing, from ~4 seconds (seqscan) to 4 ms (GIN index).
(this is using Postgres 9.5 but 9.4 should work similar, and 9.6 (beta coming out this week) will just add parallel search to the seqscan.)
I did that test with synthetic data of roughly the form that you gave:
{ 'order_code' => '0165-02675-4565', 'email' => 'foo@test.com', 'pric +e' => '15', 'title' => 'Mt Thing XYZ' };
where I only varied the email adress.
With the CREATE INDEX statement:
create index junk7_gin_idx using gin (data);
and the search statement:
select * from junk7 where data @> '{ "email": "abcd@text.com" }';
FWIW, below is a bash test I was playing with; perhaps there is something useful in there for you.
(careful: it deletes a table named 'junk7')
#!/bin/sh for power in 7; do table=junk${power} echo "-- table $table"; echo "drop index if exists ${table}_gin_idx" | psql -qtAX if [[ 1 -eq 1 ]]; then time perl -E' sub ss{ join"",@_[ map{rand @_} 1 .. shift ] }; say( q({{order_code, "0165-02675-4565"}, {email, ).q(").ss(4,"a".. +"z").q(@test.com"}, {price, "15"}, {title, "Mt Thing XYZ"}}) ) for 1 .. 1e'"${power};" \ | psql -aqXc " drop table if exists $table; create table $table(id serial, data jsonb, txt text); copy $table(txt) from stdin; update $table set data = cast(json_object(cast(txt as text[][])) + as jsonb); alter table $table drop column txt; " fi echo "table $table limit 3" | psql -qX # show a few rows echo "select count(*) from $table " | psql -qX # show rowcount echo " select * from $table where data @> '{\"email\": \"ab +cd@test.com\"}'; -- cold cache (possibly) select * from $table where data @> '{\"email\": \"ab +cd@test.com\"}'; -- warm cache explain analyze select * from $table where data @> '{\"email\": \"ab +cd@test.com\"}'; -- explain " | psql echo "create index ${table}_gin_idx on $table using gin (data); analyze $table; " | psql echo " select * from $table where data @> '{\"email\": \"ab +cd@test.com\"}'; -- cold cache (possibly) select * from $table where data @> '{\"email\": \"ab +cd@test.com\"}'; -- warm cache explain analyze select * from $table where data @> '{\"email\": \"ab +cd@test.com\"}'; -- explain " | psql echo "\dti+ ${table}*" | psql -qX done
output:
-- table junk7 id | data ----+----------------------------------------------------------------- +------------------------------------ 1 | {"email": "syya@test.com", "price": "15", "title": "Mt Thing XYZ +", "order_code": "0165-02675-4565"} 2 | {"email": "lzda@test.com", "price": "15", "title": "Mt Thing XYZ +", "order_code": "0165-02675-4565"} 3 | {"email": "ygiu@test.com", "price": "15", "title": "Mt Thing XYZ +", "order_code": "0165-02675-4565"} (3 rows) count ---------- 10000000 (1 row) id | data ---------+------------------------------------------------------------ +----------------------------------------- 408723 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} 1008165 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} [...] 9380475 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} 9824614 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} (26 rows) Time: 4071.469 ms id | data ---------+------------------------------------------------------------ +----------------------------------------- 408723 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} 1008165 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} [...] 9380475 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} 9824614 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} (26 rows) Time: 3879.432 ms QUERY PLAN ---------------------------------------------------------------------- +------------------------------------------ Seq Scan on junk7 (cost=0.00..583103.10 rows=10004 width=108) (actua +l time=144.428..3894.673 rows=26 loops=1) Filter: (data @> '{"email": "abcd@test.com"}'::jsonb) Rows Removed by Filter: 9999974 Planning time: 0.034 ms Execution time: 3894.709 ms (5 rows) CREATE INDEX Time: 82600.671 ms [...] 9380475 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} 9824614 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} (26 rows) Time: 23.768 ms id | data ---------+------------------------------------------------------------ +----------------------------------------- 408723 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} 1008165 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} [...] 9380475 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} 9824614 | {"email": "abcd@test.com", "price": "15", "title": "Mt Thin +g XYZ", "order_code": "0165-02675-4565"} (26 rows) Time: 4.792 ms QUERY PLAN ---------------------------------------------------------------------- +------------------------------------------------------ Bitmap Heap Scan on junk7 (cost=125.49..35457.45 rows=9999 width=108 +) (actual time=3.412..3.490 rows=26 loops=1) Recheck Cond: (data @> '{"email": "abcd@test.com"}'::jsonb) Heap Blocks: exact=26 -> Bitmap Index Scan on junk7_gin_idx (cost=0.00..122.99 rows=999 +9 width=0) (actual time=3.395..3.395 rows=26 loops=1) Index Cond: (data @> '{"email": "abcd@test.com"}'::jsonb) Planning time: 0.832 ms Execution time: 3.544 ms (7 rows) Time: 4.895 ms List of relations Schema | Name | Type | Owner | Table | Size | Descrip +tion --------+---------------+-------+----------+-------+---------+-------- +----- public | junk7 | table | aardvark | | 3579 MB | public | junk7_gin_idx | index | aardvark | junk7 | 177 MB | (2 rows)
|
|---|