All those events whereby control is passed from one function to another indirectly ... I see why BrowserUk is always advocating real threads, and why tye advocates Coro.
Each of your "threads" of execution (each represented, it seems, by a POE::Session) seems completely independent, which means you could theoretically get away with forking off each proxy instead of using event-based programming. With proper COW support in the kernel (which I suspect has been there for over a decade), this should be relatively cheap. The only downside is that if later stages of processing turn out to need additional modules, every child will load them again each time it needs them, unless you load (use) them before the fork.
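To make the forking idea concrete, here is a minimal sketch of a fork-per-connection accept loop using only core modules. The listen address, the port and the handle_connection sub are placeholders I made up for illustration (the handler just echoes lines back instead of proxying), so treat it as an outline rather than a drop-in proxy. Note that POSIX is loaded before the first fork, so every child inherits it through COW instead of loading it again.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use POSIX qw(strftime);   # loaded before any fork, so children share it via COW

# Hypothetical listen address and port; adjust for the real proxy.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 8080,
    Listen    => 128,
    ReuseAddr => 1,
) or die "listen failed: $!";

$SIG{CHLD} = 'IGNORE';    # let the kernel reap finished children (no zombies)

while (my $client = $listener->accept) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid) {           # parent: drop its copy of the socket, keep accepting
        close $client;
        next;
    }

    close $listener;      # child: owns exactly one connection
    handle_connection($client);
    exit 0;
}

# Placeholder handler: echoes each line back with a timestamp.
# A real proxy would connect upstream and shuffle bytes both ways.
sub handle_connection {
    my ($client) = @_;
    while (defined(my $line = <$client>)) {
        my $stamp = strftime('%H:%M:%S', localtime);
        print {$client} "[$stamp] $line";
    }
    close $client;
}
```

Each child looks like ordinary blocking code because it only ever deals with one connection.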
However, if you use threads (whether that's the threads module or Coro), you can load modules whenever you want, just as in your initial foray. That shifts the downsides onto threads. First, as with forking, any logging will need to lock your logfile to keep lines from being interleaved. That's minor, because I believe Log::Log4perl can already do that.
Second, if you create too many OS threads, you can overwhelm the system. I don't mean that the system will crash because you have too many threads; I expect both Linux and Windows to handle thousands of threads. What I mean is that, with every thread weighted equally, your proxy may take up so much of the CPU that other processes get starved.
There are two mitigating factors in my mind, though. The first is that your threads should be fairly inactive - your CPU can probably handle all the proxying of your maxed-out uplink while leaving plenty of room for other activities (though compiling kde may be impacted :-> ). The second is that current Linux kernels (2.6.38+, I believe) have a new scheduling method which switches not just between threads, but between sessions, so non-proxy threads get scheduled between proxy threads more often, reducing that starvation (that could theoretically impact your proxy, if it weren't for the first mitigating factor :-) ).
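For the logfile locking mentioned above, this is roughly what a hand-rolled version looks like with flock, in case you are not using a logging module that handles it for you. The log_line sub and its arguments are my own invention for illustration, and this is not how Log::Log4perl does it internally.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# Append one complete line to $path while holding an exclusive lock,
# so lines from different processes never interleave.
sub log_line {
    my ($path, $message) = @_;

    open my $fh, '>>', $path or die "cannot open $path: $!";
    flock $fh, LOCK_EX       or die "cannot lock $path: $!";
    seek $fh, 0, SEEK_END;   # someone may have appended while we waited

    print {$fh} "[$$] $message\n";

    # close flushes the buffer and releases the lock
    close $fh or die "cannot close $path: $!";
}
```

Opening and closing the file on every call is slow but simple. Note that flock only coordinates separate processes; threads inside one process would need their own mutex.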
Coro is just a single kernel thread, much like your POE solution, but ever so slightly less taxing on the kernel (fewer states need to be saved off) than pure threads, and, IME, much less taxing on the programmer.
With my CB proxy (née fetcher), I use Coro to schedule everything, and AnyEvent::Socket::tcp_server to start the servers (one socket-based, the other tcp-based). When a connection comes in, I spawn a Coro thread (well, actually, three - one for input on the socket, one for listening to the CB fetcher that is in the same process, and one for serialising the output back across the socket). Each thread otherwise looks like "normal" non-event-based perl programming. For example, the code to listen for input on the socket looks like this: while (defined $fh and defined (my $cmd = $fh->readline())) { ... } - the same as you would do normally. I don't need to use IO::Select, or set up multiple functions with events. I don't have to store intermediate data on a heap - my lexical variables already work fine for that, so perl (Coro) takes care of it.
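As a rough sketch of the pattern described above (not my actual CB proxy code), this is about the smallest Coro + AnyEvent::Socket server I can think of. The address, the port and the "echo" behaviour are placeholders, and it spawns a single thread per connection rather than three.

```perl
use strict;
use warnings;
use Coro;
use Coro::AnyEvent;      # makes AnyEvent and Coro cooperate
use Coro::Handle;        # exports unblock()
use AnyEvent;
use AnyEvent::Socket;    # exports tcp_server()

# Hypothetical listen address and port.
my $guard = tcp_server '127.0.0.1', 7777, sub {
    my ($sock, $peer_host, $peer_port) = @_;

    # One Coro thread per connection; the body reads like plain blocking code.
    async {
        my $fh = unblock $sock;   # wrap the socket in a Coro-aware handle

        # readline() only blocks this thread; others keep running.
        while (defined $fh and defined(my $cmd = $fh->readline())) {
            $fh->print("echo: $cmd");   # placeholder for real command handling
        }
    };
};

# Run the event loop forever; keep $guard alive or the server shuts down.
AnyEvent->condvar->recv;
```

All per-connection state ($fh, $cmd, the peer address) lives in ordinary lexicals inside the async block, which is the part that replaces the POE heap.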
What I don't know with Coro is how many threads it will handle without going nuts, i.e., slower than other solutions. I recall reading somewhere that it can be much much faster than using threads, even on a multi-core system, and I think this is one of those areas where it does better, so I expect that more threads will work fine here.
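If you want to find out on your own hardware, something like the following would give a first impression. It is only a sketch: run_threads is a name I made up, and the threads do nothing but yield to each other with cede, so it measures scheduling overhead, not real I/O.

```perl
use strict;
use warnings;
use Coro;
use Time::HiRes qw(time);

# Spawn $count Coro threads; each adds its id to a private sum $rounds
# times, yielding to the other threads after every step.
sub run_threads {
    my ($count, $rounds) = @_;

    my @threads = map {
        my $id = $_;
        async {
            my $sum = 0;
            for (1 .. $rounds) {
                $sum += $id;
                cede;            # hand control to the next ready thread
            }
            return $sum;
        };
    } 1 .. $count;

    # join blocks until the thread finishes and hands back its return value
    my $total = 0;
    for my $thread (@threads) {
        my ($sum) = $thread->join;
        $total += $sum;
    }
    return $total;
}

for my $count (100, 1_000, 10_000) {
    my $start = time;
    my $total = run_threads($count, 10);
    printf "%6d threads: %.3fs (checksum %d)\n", $count, time - $start, $total;
}
```

The checksum is only there to show that every thread really ran to completion.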
In reply to Re: A Perl-based Transparent TCP Proxy (TPROXY and POE)
by Tanktalus
in thread A Perl-based Transparent TCP Proxy (TPROXY and POE)
by charlesboyo