Re^2: Memory utilization and hashes

Re^3: Memory utilization and hashes by poj (Abbot) on Jan 19, 2018 at 13:34 UTC
What does this sample of data you provided look like after the *nix sort?

    Query;1;host;www.example.com
    Answer;1;ip;1.2.3.4
    Query;2;host;www.cnn.com
    Query;3;host;www.google.com
    Answer;2;ip;2.3.4.5
    Answer;2;ip;2.3.4.5
    Query;4;host;www.google.com
    Answer;4;ip;3.4.5.6
    Answer;3;ip;3.4.5.6
    Query;2;host;www.example2.com
    Answer;4;ip;1.2.4.5
    Answer;2;ip;2.3.4.5

poj
Re^4: Memory utilization and hashes by bfdi533 (Friar) on Jan 25, 2018 at 20:30 UTC
The sample data is actually missing a field: in the real data file, each entry also includes its date and time. Once the file is sorted by date and ID, I can be sure that if the date changes and the ID changes as well, there are no more answers to be had for the previous ID, so I can dump the data, empty the hash and move on. The real file looks more like this once sorted:

    2018-01-25 01:01:01;Query;1;host;www.example.com
    2018-01-25 01:01:01;Answer;1;ip;1.2.3.4
    2018-01-25 01:01:05;Query;2;host;www.cnn.com
    2018-01-25 01:01:05;Answer;2;ip;2.3.4.5
    2018-01-25 01:01:05;Answer;2;ip;2.3.4.5
    2018-01-25 01:01:06;Query;3;host;www.google.com
    2018-01-25 01:01:06;Answer;3;ip;3.4.5.6
    2018-01-25 01:01:08;Query;4;host;www.google.com
    2018-01-25 01:01:08;Answer;4;ip;3.4.5.6
    2018-01-25 01:01:08;Answer;4;ip;1.2.4.5
    2018-01-25 01:01:11;Query;2;host;www.example2.com
    2018-01-25 01:01:11;Answer;2;ip;2.3.4.5
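The dump-and-empty idea described above could be sketched roughly as follows. This is only an illustration, not the actual script: the name `process_sorted`, the callback interface and the per-group layout (`ts`, `id`, `host`, `ips`) are all made up for the example, and it assumes each line has exactly the five `;`-separated fields shown in the sorted sample. It also takes the rule literally, so two different IDs sharing the same timestamp would be merged into one group.

```perl
use strict;
use warnings;

# Hypothetical helper: walk lines already sorted by date and ID and hand each
# finished group to $emit as soon as both the timestamp and the ID change.
sub process_sorted {
    my ($lines, $emit) = @_;
    my %current;                 # holds only the group being assembled
    my ($last_ts, $last_id);

    for my $line (@$lines) {
        chomp(my $copy = $line);
        my ($ts, $type, $id, $kind, $value) = split /;/, $copy;
        next unless defined $value;              # skip malformed lines

        # Date AND ID both changed: no more answers for the previous group.
        if (defined $last_id && $ts ne $last_ts && $id ne $last_id) {
            $emit->({%current});                 # dump the data
            %current = ();                       # empty the hash
        }

        if ($type eq 'Query') {
            @current{qw(ts id host)} = ($ts, $id, $value);
        }
        elsif ($type eq 'Answer') {
            $current{ips}{$value} = 1;           # hash keys collapse duplicate answers
        }
        ($last_ts, $last_id) = ($ts, $id);
    }
    $emit->({%current}) if %current;             # flush the final group
    return;
}

# Usage with the sorted sample from this post.
my @sample = split /\n/, <<'END';
2018-01-25 01:01:01;Query;1;host;www.example.com
2018-01-25 01:01:01;Answer;1;ip;1.2.3.4
2018-01-25 01:01:05;Query;2;host;www.cnn.com
2018-01-25 01:01:05;Answer;2;ip;2.3.4.5
2018-01-25 01:01:05;Answer;2;ip;2.3.4.5
2018-01-25 01:01:06;Query;3;host;www.google.com
2018-01-25 01:01:06;Answer;3;ip;3.4.5.6
2018-01-25 01:01:08;Query;4;host;www.google.com
2018-01-25 01:01:08;Answer;4;ip;3.4.5.6
2018-01-25 01:01:08;Answer;4;ip;1.2.4.5
2018-01-25 01:01:11;Query;2;host;www.example2.com
2018-01-25 01:01:11;Answer;2;ip;2.3.4.5
END

my @groups;
process_sorted(\@sample, sub { push @groups, $_[0] });

for my $g (@groups) {
    printf "%s %s -> %s\n", $g->{ts}, $g->{host},
        join(',', sort keys %{ $g->{ips} || {} });
}
```

Because the hash never holds more than one group at a time, memory use stays flat regardless of file size, and the reuse of ID 2 at 01:01:11 lands in its own group instead of being mixed into the earlier www.cnn.com entry.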
Re^3: Memory utilization and hashes by bfdi533 (Friar) on Jan 18, 2018 at 23:46 UTC
For what it is worth, and if anyone is interested, here are some stats from the processing after I introduced the *nix sort before my perl script.

    elapsed time    | type | rows after | rows before | pct     | rows/second
                    |      | processing | processing  | smaller |
    00:03:05.98667  | dns  |    1791555 |     4614653 |   38.82 | 24811.7405403301
    00:03:50.106203 | dns  |    2262736 |     5822777 |   38.86 | 25304.737221708
    00:04:51.91195  | dns  |    2733705 |     7039758 |   38.83 | 24116.0322487654
    00:05:36.348691 | dns  |    3208365 |     8266995 |   38.81 | 24578.6447850335
    00:06:33.947878 | dns  |    3683419 |     9490938 |   38.81 | 24091.8622234589
    00:07:35.58667  | dns  |    4155971 |    10705249 |   38.82 | 23497.7221787459
    00:08:25.086565 | dns  |    4633553 |    11946401 |   38.79 | 23652.1852447214
    00:09:07.952743 | dns  |    5109618 |    13183845 |   38.76 | 24060.1861536808
    00:10:16.250404 | dns  |    5596902 |    14441405 |   38.76 | 23434.3132373833
    00:10:54.578348 | dns  |    6070888 |    15662586 |   38.76 | 23927.7483709253
    00:11:39.012952 | dns  |    6547181 |    16896184 |   38.75 | 24171.4891714911
    00:12:43.13814  | dns  |    7019314 |    18113219 |   38.75 | 23735.1772249255
    00:13:34.23578  | dns  |    7499659 |    19365386 |   38.73 | 23783.5114541392
    00:14:35.939246 | dns  |    7973633 |    20591767 |   38.72 | 23508.2137191967
    00:15:12.223167 | dns  |    8448494 |    21815382 |   38.73 | 23914.5231004641
    00:15:52.951662 | dns  |    8923786 |    23043433 |   38.73 | 24181.1142357817
    00:17:45.637116 | dns  |    9402613 |    24278649 |   38.73 | 22783.2238906363
    00:17:52.402055 | dns  |    9880079 |    25516948 |   38.72 | 23794.1990888856
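For anyone wanting to check the arithmetic, the two derived columns appear to be consistent with pct = 100 * rows after / rows before, and rows/second = rows before / elapsed seconds. That is an inference from the numbers rather than something stated in the query that produced them, and the helper names `elapsed_seconds` and `run_stats` below are invented for this sketch.

```perl
use strict;
use warnings;

# Convert an "HH:MM:SS.fraction" string to seconds.
sub elapsed_seconds {
    my ($hms) = @_;
    my ($h, $m, $s) = split /:/, $hms;
    return $h * 3600 + $m * 60 + $s;
}

# Assumed formulas for the two derived columns (inferred from the figures).
sub run_stats {
    my ($elapsed, $rows_after, $rows_before) = @_;
    my $secs = elapsed_seconds($elapsed);
    return (
        pct          => 100 * $rows_after / $rows_before,
        rows_per_sec => $rows_before / $secs,
    );
}

# First row of the stats: 00:03:05.98667, 1791555 rows after, 4614653 before.
my %first = run_stats('00:03:05.98667', 1791555, 4614653);
printf "pct=%.2f rows/second=%.2f\n", $first{pct}, $first{rows_per_sec};
```

Read this way, the pct column is the size of the output relative to the input (roughly 39% of the rows remain), and throughput is measured against the input rows.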