
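Unless you can guarantee that the CSV you are dealing with is absolutely free of fields with embedded newlines, it cannot be parsed in threads or otherwise in parallel: a worker handed an arbitrary chunk of the file has no way to tell whether a given newline ends a record or sits inside a quoted field.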
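
To illustrate the point, here is a minimal sketch (in Python rather than Perl, purely for illustration) comparing a naive split on newlines with a quote-aware parser. The three-field sample data is made up and is not taken from the OP's firewall logs.

```python
import csv
import io

# Hypothetical sample: the second record has a quoted field that
# contains an embedded newline.
data = 'id,msg,action\n1,"line one\nline two",deny\n2,plain,allow\n'

# Naive approach: cut the text at every newline, which is what a worker
# does when it is handed an arbitrary chunk of the file.
naive_rows = [line.split(",") for line in data.splitlines()]

# Quote-aware approach: let a real CSV parser decide where records end.
# newline="" stops StringIO from translating line endings.
parsed_rows = list(csv.reader(io.StringIO(data, newline="")))

print(len(naive_rows))   # 4 "records": the quoted field was cut in two
print(len(parsed_rows))  # 3 records: header plus two data rows
```

The naive split sees one record too many because it treats the newline inside the quoted field as a record boundary. A parallel reader that seeks to an arbitrary offset and resynchronises on the next newline makes the same mistake.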

The OP only deals with fields 7 and 31, which are unlikely to contain newlines, but we have no idea what the other fields may hold.


Enjoy, Have FUN! H.Merijn

Re^3: Hash Search is VERY slow
by dsheroh (Monsignor) on Sep 29, 2021 at 12:13 UTC
    We do have some idea of what the other fields may hold, in that the original post states the CSV files are firewall logs. I'm not aware of any common firewall log format which includes data that might contain embedded newlines, so it's probably not an issue in this case. But OP would know the specifics of the format they're dealing with better than I do, of course.
Re^3: Hash Search is VERY slow
by karlgoethebier (Abbot) on Sep 29, 2021 at 15:36 UTC

    Yes sure. Anyway:

    And yes, I'm aware that I shouldn't split etc.

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

Re^3: Hash Search is VERY slow
by rtjensen (Novice) on Sep 29, 2021 at 14:16 UTC
    Right, there are no newlines in there and it's pretty uniform data overall.