in reply to Re^5: Multi-CPU when reading STDIN and small tasks
in thread Multi-CPU when reading STDIN and small tasks

You are correct, that is the data being processed. However, the example presented there is only one sample, and there are variations even within the event types. I hadn't realized some of this until now, and it shows that the samples I gave were perhaps not the best.

Their SYSCALL example: 3 records (SYSCALL, CWD and PATH)
The two samples I gave: 3 records (SYSCALL, CWD and PATH)
Other SYSCALL data: 5 records (SYSCALL, EXECVE, CWD, PATH and PATH)
Other Types (LOGIN): 1 record (LOGIN)

There is a list of record types, but it does not explain which records relate to which events. I believe, though, that the records always appear in the same order within an event.

If there were consistency in the events and records (if that is required), how could the processing be sped up?

Re^7: Multi-CPU when reading STDIN and small tasks
by BrowserUk (Patriarch) on Jan 29, 2017 at 18:13 UTC
    how could the processing be sped up?

    At the moment you have a triple nested loop over every entry in your hash/subhash/subhash every few seconds. That's a disaster.

    If you know that a change in the timestamp of a record from a node marks the start of a new event, you simply accumulate event data from each node until the timestamp changes, then write out the current accumulation and replace it with the new record.

    Multi-line events are rolled up as their parts arrive, and are output as soon as the first part of a new event is received. You don't even have to check for EOE types, though you could continue to if you preferred.
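
    The roll-up described above might be sketched as follows. This is only an illustration: the record layout (node=NAME type=TYPE msg=audit(TIMESTAMP:SERIAL): ...), the sample lines, and the names %pending, accumulate and format_event are all assumptions, not taken from the actual data or code in this thread.

```perl
use strict;
use warnings;

# Assumed record layout (hypothetical, adjust to the real data):
#   node=NAME type=TYPE msg=audit(TIMESTAMP:SERIAL): rest...
my %pending;    # node => { node => NAME, id => "TIMESTAMP:SERIAL", records => [...] }

# Feed one line. Returns the previous, now complete, event for that node
# when the line starts a new event; otherwise returns nothing useful.
sub accumulate {
    my ($line) = @_;
    my ($node, $type, $id) =
        $line =~ /^node=(\S+)\s+type=(\S+)\s+msg=audit\(([^)]+)\)/
        or return;    # skip lines that do not match the assumed layout

    my $cur = $pending{$node};
    if ($cur && $cur->{id} eq $id) {
        push @{ $cur->{records} }, $type;    # same event: roll it up
        return;
    }

    # The id changed for this node: start a new accumulation and hand
    # back the old one (undef the first time a node is seen).
    $pending{$node} = { node => $node, id => $id, records => [$type] };
    return $cur;
}

# Render a completed event as one output line.
sub format_event {
    my ($ev) = @_;
    return join ' ', $ev->{node}, $ev->{id}, join(',', @{ $ev->{records} });
}

# Made-up sample input; in real use the lines would come from STDIN.
my @sample = (
    'node=a type=SYSCALL msg=audit(1485713580.001:10): ...',
    'node=a type=CWD msg=audit(1485713580.001:10): ...',
    'node=b type=LOGIN msg=audit(1485713580.002:77): ...',
    'node=a type=PATH msg=audit(1485713580.001:10): ...',
    'node=a type=SYSCALL msg=audit(1485713581.500:11): ...',
);

my @out;
for my $line (@sample) {
    my $done = accumulate($line) or next;
    push @out, format_event($done);
}
print "$_\n" for @out;
```

    Each input line costs one hash lookup and at most one push, so there is no periodic scan of the whole structure. Note that the LOGIN event from node b is still sitting in %pending when the input ends.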

    The possible caveat of this approach is that each event from a given node is only written when the (first line of the) next event is received; thus if a node goes down, its last event may never get written.

    If that is unacceptable, then you would need to re-institute a timeout mechanism.
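
    One possible shape for such a timeout, as a rough sketch: it assumes each pending event carries a hypothetical 'seen' field holding the time its last record arrived, and that flush_stale (also a made-up name) is called occasionally, say once a second, rather than on every record. It is a single pass over the nodes, not a nested loop.

```perl
use strict;
use warnings;

# Remove and return every pending event that has had no new record
# for more than $max_idle seconds. $pending is a hash ref of the form
#   node => { node => NAME, id => ..., records => [...], seen => EPOCH }
# where 'seen' is an assumed field updated whenever a record arrives.
sub flush_stale {
    my ($pending, $now, $max_idle) = @_;
    my @stale;
    for my $node (sort keys %$pending) {
        next if $now - $pending->{$node}{seen} <= $max_idle;
        push @stale, delete $pending->{$node};
    }
    return @stale;
}

# Made-up state: node a has been quiet for 10 seconds, node b for 2.
my %pending = (
    a => { node => 'a', id => '1485713580.001:10', records => ['SYSCALL'], seen => 100 },
    b => { node => 'b', id => '1485713580.002:77', records => ['LOGIN'],   seen => 108 },
);

my @flushed = flush_stale(\%pending, 110, 5);
print "flushed $_->{node} $_->{id}\n" for @flushed;
```

    In a live loop $now would come from time(), and the flushed events would be written out the same way as events completed by a timestamp change.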


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice.

      The benefits of having another set of eyes look at things. Thank you. I'm also adding threading, as well as a way of moving work out of the main loop so that it can spend its time handling the data flow while the processing queues up in threads (Thread::Queue), but that is only effective if the main loop can keep the threads busy.