Since a unique record is based on a combination of most fields the current indexes do not work well for this operation. Would adding another field, that is a concatenation of the other fields, and indexing that be a good approach?
| [reply] |
No, you need to create an index that matches your query terms. See: http://www.postgresql.org/docs/9.1/static/indexes.html
You can try this :
delete from agent_log t1 where WHERE timestamp < now() - interval '7 days') and timespan < (select max(timestamp) from agent_log t2 where t2.class=t1.class and t2.hostname=t1.hostname and t2.id_address=t1.ip_address)
I suppose you have an index on timestamp and the other fields? A surrogate key might help, or even using timespan as primary key, if possible.
If you can create a history table, then you could:
SELECT
class,
max(timestamp) as timestamp,
hostname,
ip_address,
promise_handle,
promiser,
promisee,
policy_server,
promise_outcome
FROM agent_log
WHERE timestamp < now() - interval '7 days'
into history_table ;
delete from agent_log
WHERE timestamp < now() - interval '7 days'
That would be a good start towards partitioning your data, which you might need to consider.
libremen.com : legal case, contract and insurance claim management software
| [reply] [d/l] |
| [reply] [d/l] |
I did some testing with a small sampling of data. Explain and analyze below.
Seq Scan on agent_log (cost=48184.35..52265496.54 rows=101705 width=122) (actual time=5288.903..175884.135 rows=203428 loops=1)
Filter: (("timestamp" < (now() - '7 days'::interval)) AND (NOT (SubPlan 1)))
SubPlan 1
-> Materialize (cost=48184.35..48646.77 rows=20342 width=77) (actual time=0.022..0.317 rows=2604 loops=203428)
-> GroupAggregate (cost=44040.21..47905.01 rows=20342 width=77) (actual time=4417.518..5286.975 rows=2604 loops=1)
-> Sort (cost=44040.21..44548.73 rows=203410 width=77) (actual time=4417.496..5223.285 rows=203428 loops=1)
Sort Key: (date_trunc('day'::text, public.agent_log."timestamp")), public.agent_log.class, public.agent_log.promiser, public.agent_log.hostname, public.agent_log.ip_address
Sort Method: external merge Disk: 19648kB
-> Seq Scan on agent_log (cost=0.00..8024.51 rows=203410 width=77) (actual time=0.011..175.827 rows=203428 loops=1)
Filter: ("timestamp" < (now() - '7 days'::interval))
Total runtime: 175924.695 ms
| [reply] |