in reply to Re^8: Why this code is so slow if run in thread?
in thread Why this code is so slow if run in thread?
One last thought. If you are still hankering for more speed, compile your own PDL modules, and are comfortable with C, then you could look at making a custom version of the ccNcompt() routine in PDL::Image2D that accumulates the component bounds in the same pass as it discovers them.
On the basis of a quick look at the source, it wouldn't be too hard to modify the routine to do the accumulation, though there are a few complications.
The way their algorithm works, different parts of a single component can be labeled with different values as the scan progresses, and these aliases are then resolved at the end.
You would need to merge the bounds of the aliased parts at that same time.
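To make the idea concrete, here is a minimal sketch in plain C of labeling with bounds accumulated in the same pass. It is not the PDL code: the names (`Bounds`, `label_with_bounds`, `MAX_LABELS`) are made up for illustration, I've used a small union-find table to stand in for whatever alias bookkeeping the real routine does, and it only checks 4-connected neighbours to keep it short. The point is the `unite()` step, where two aliased labels are merged and their bounds are merged with them.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_LABELS 256   /* hypothetical cap on provisional labels */

/* Bounding box of one label (illustrative layout, not PDL's). */
typedef struct { int minx, miny, maxx, maxy; } Bounds;

static int    parent[MAX_LABELS];  /* union-find forest over provisional labels */
static Bounds box[MAX_LABELS];     /* bounds, only meaningful at root labels */

static int find_root(int l) {
    while (parent[l] != l) {
        parent[l] = parent[parent[l]];  /* path halving */
        l = parent[l];
    }
    return l;
}

static void grow(Bounds *b, int x, int y) {
    if (x < b->minx) b->minx = x;
    if (x > b->maxx) b->maxx = x;
    if (y < b->miny) b->miny = y;
    if (y > b->maxy) b->maxy = y;
}

/* Record that a and b alias the same component, merging their bounds.
   The smaller label always becomes the root. */
static int unite(int a, int b) {
    a = find_root(a);
    b = find_root(b);
    if (a == b) return a;
    if (b < a) { int t = a; a = b; b = t; }
    parent[b] = a;
    grow(&box[a], box[b].minx, box[b].miny);
    grow(&box[a], box[b].maxx, box[b].maxy);
    return a;
}

/* Label the 4-connected non-zero pixels of a w x h image.
   labels receives w*h component numbers (1..n, 0 for background);
   out receives one Bounds per component. Returns n. */
int label_with_bounds(const unsigned char *img, int w, int h,
                      int *labels, Bounds *out, int max_out) {
    int next = 1, n = 0;
    int compact[MAX_LABELS];

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int i = y * w + x, l;
            labels[i] = 0;
            if (!img[i]) continue;
            int left = (x > 0) ? labels[i - 1] : 0;
            int up   = (y > 0) ? labels[i - w] : 0;
            if (left && up)  l = unite(left, up);   /* alias discovered here */
            else if (left)   l = find_root(left);
            else if (up)     l = find_root(up);
            else {
                assert(next < MAX_LABELS);
                l = next++;
                parent[l] = l;
                box[l] = (Bounds){ x, y, x, y };
            }
            labels[i] = l;
            grow(&box[l], x, y);
        }
    }

    /* Resolve aliases: number the roots 1..n and hand back their bounds. */
    for (int l = 1; l < next; l++) {
        if (find_root(l) == l) {
            assert(n < max_out);
            out[n] = box[l];
            compact[l] = ++n;
        }
    }
    for (int i = 0; i < w * h; i++)
        if (labels[i]) labels[i] = compact[find_root(labels[i])];
    return n;
}
```

A U-shaped blob is the classic case: its two arms get different provisional labels and are only joined when the scan reaches the bottom row, at which point the bounds of both arms are folded into one box. The caller gets the labeled image and the bounds array back through two separate output pointers.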
Another complication is that the C source code appears to be embedded as strings within Perl source code and subjected to some kind of templating mechanism; those Perl sources are then executed to generate the C sources, which are then compiled.
I've worked on a similar source generation mechanism in the past and a) it can be very difficult to understand how the different phases work together; b) it can be a nightmare to debug.
It was not obvious to me how (or even whether it is possible) to return two separate data structures -- the existing 'colored' image and the required AoA of bounds data -- from the routine.
If your need is such that this idea is attractive, then you would definitely need the help of the PDL devs.
Thanks for posting such an interesting problem. It has given me much mental stimulation over the last week or so.
One final, final thought. Many years ago, I wrote some OCR routines in 6502 assembler for a BBC micro. The first part of that process was to isolate the individual characters, essentially the same task as this. However, I was lucky in as much as the stuff I was dealing with was handwritten text and crosses filled into predefined boxes on a form -- multiple choice question papers with boxes for choices, names, id numbers etc. and the positions of those boxes were known a priori to a high degree of accuracy. That made my life simple -- for the first stage at least.
It is clear from your sample image that you're not dealing with simple text, but there also appear to be registration marks on the image. If that is true for all your samples, and the marks are consistent, it might be possible to predefine the areas of interest rather than needing to discover them anew for each image, which would greatly speed up your task.
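For what it's worth, the predefined-areas approach can be as simple as the sketch below. It assumes the only variation between scans is a translation (no rotation or scaling), that you already have some way of locating one registration mark, and that you have measured the boxes once on a reference image. `Rect` and `place_roi` are names I've invented for the illustration.

```c
#include <assert.h>

/* A box on the page: top-left corner plus size, in pixels (illustrative). */
typedef struct { int x, y, w, h; } Rect;

static int clampi(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Shift a template box by the offset between where a registration mark
   was expected and where it was actually found, then clip the result to
   the image. Translation only: rotation and scale are ignored. */
Rect place_roi(Rect tmpl, int exp_x, int exp_y,
               int found_x, int found_y, int img_w, int img_h) {
    assert(img_w > 0 && img_h > 0);
    int x0 = clampi(tmpl.x + (found_x - exp_x), 0, img_w);
    int y0 = clampi(tmpl.y + (found_y - exp_y), 0, img_h);
    int x1 = clampi(tmpl.x + tmpl.w + (found_x - exp_x), 0, img_w);
    int y1 = clampi(tmpl.y + tmpl.h + (found_y - exp_y), 0, img_h);
    Rect r = { x0, y0, x1 - x0, y1 - y0 };
    return r;
}
```

Each placed box can then be cut out and processed directly, with no component discovery needed for that stage.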