Find a number in a list closest to a target

gnu@perl has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Find a number in a list closest to a target by eric256 (Parson) on Jul 25, 2003 at 14:08 UTC
You could just loop threw comparing the times. No need for hashes and sorting and keys then. `my $BestTime; my $BestOffBy; my $ctime = (stat($key))[10]; print "IN MULTIPLES \$ctime is $ctime\n"; for (@returns) { print "Working Multiple match: $_\n"; my $temp = abs($ctime - $_); if ($temp < $BestOffBy) { $BestOffBy = $temp; $BestTime = $_ } } print "$BestTime is only $BestOffBy away from $ctime.";` [download] I'm not sure this is any better or faster but eliminating the reverse, sort, and keys of the hash could be important if you are checking alot of times. ___________ Eric Hodges	[reply] [d/l]
Re: Re: Find a number in a list closest to a target by gnu@perl (Pilgrim) on Jul 25, 2003 at 14:33 UTC
Yeah, much better. I will be doing this A LOT! Good catch.	[reply]
Re: Re: Re: Find a number in a list closest to a target by BrowserUk (Patriarch) on Jul 25, 2003 at 15:02 UTC
If you will only look up one value against the list, or if the list will vary between lookups, then the linear search will be the best solution. But if your list remains static, and you need to compare more than one value against it, it wouldn't take many linear searches before the cost of sorting the static list and performing a binary chop to locate the nearest two values and then determining the minimum offset between the two would be amortised and begin to pay dividends. If you need code for this, speak up:) Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller	[reply]
Re: Re: Re: Re: Find a number in a list closest to a target by eric256 (Parson) on Jul 25, 2003 at 19:50 UTC
Re: Re: Re: Re: Re: Find a number in a list closest to a target by BrowserUk (Patriarch) on Jul 25, 2003 at 22:51 UTC
Re: Re: Find a number in a list closest to a target by gnu@perl (Pilgrim) on Jul 25, 2003 at 14:53 UTC
One note though, I had to initialize $BestOffBy = $ctime or the 'if' statement was never entered.	[reply]
Re: Find a number in a list closest to a target by broquaint (Abbot) on Jul 25, 2003 at 14:14 UTC
Keeping track of the difference and the closest time might be easier e.g `my $ctime = (stat($key))[10]; my($diff, $closest) = $ctime; for(@returns) { $closest = $_ and $diff = abs($ctime - $_) if abs($ctime - $_) < $diff; } print "closest match is - $closest\n";` [download] The code is untested, but it should give you something to work with, and saves having to keep a hash around of all the differences and times. HTH `_________ broquaint`	[reply] [d/l]
Re: Find a number in a list closest to a target by TomDLux (Vicar) on Jul 25, 2003 at 14:17 UTC
You say you only want the one which is closest. So instead of using spacec saving data you don't need, and undergoing the expense of a sort, why not keep track of the closest as you go along. It's much the same as determining the minimum of a list. `$delta = time(); // has to be less than now - 1970 $minval = $min; for ( @returns ) { my $val = abs( $ctime - $_ ); if ( $val < $delta ) { $delta = $val; $minval = $_; } } print "MULTIPLE MATCH best match = $minval";` [download] update: I see I'll have to learn to type faster! -- `TTTATCGGTCGTTATATAGATGTTTGCA`	[reply] [d/l]
Re: Find a number in a list closest to a target by Limbic~Region (Chancellor) on Jul 25, 2003 at 14:15 UTC
gnu@perl, Unfortunately I don't have the time to go find links to all the references I want, but what you want to do is: Sort the epoch times Perform a variation on a binary search The variation on the binary search is to also keep track of the offset and not just is it higher or lower. This is proven to be the fastest algorithm for this specific type of search. The only trick will be if two items in the list are equidistant from the target number and you would prefer one over the other. In that case you will have to do more work. If you would like additional information and can wait until tonight (EST) - drop me a note. Cheers - L~R	[reply]
Re: Re: Find a number in a list closest to a target by skyknight (Hermit) on Jul 25, 2003 at 15:02 UTC
This may or may not be a wholly useful thing to do, depending on some stuff that he didn't state in his problem. It wasn't clear from his description whether or not he would be repeatedly doing searches on the same list, or if he'd be getting a fresh list every time. If he uses the naive method of traversing the whole n element array, this will yield him O(n) for one search, and O(mn) were he to perform this search m times. With your strategy, in the case of a single search he'd get O(nlg(n)) (unnecessary overhead because we're doing pre-processing then throwing it away), but in the case of m searches we'd have O(mlg(n)) + O(nlg(n)) = O((m+n)*lg(n)) which we can simplify further if we know the relative sizes of m and n. Thus, your idea is fantastic if he meant he'd be repeatedly searching on the same list, but he wasn't clear whether that was the case, so it might turn out to be an investment of pre-computation in a problem that doesn't in fact exist.	[reply]
Re: Find a number in a list closest to a target by dragonchild (Archbishop) on Jul 25, 2003 at 15:34 UTC
All of the above suggestions are good. There are a few optimizations you can do if you can add a constraint or two on your list of epoch times. If you can require that the list be sorted, you can do something like: `my $best = undef; for (@returns) { my $diff = abs($ctime - $_); unless (defined $best) { $best = [$_, $diff]; next; } if ($diff < $best) { $best = [$_, $diff]; next; } if ($diff > $best) { last; } } print "Best match is $best->[0] at $best->[1] away from $ctime\n";` [download] The presumption here is that $best->[0] will be getting smaller and smaller, then start getting larger. If your list is sorted, then you can stop looking once the smallest difference is found - you won't be beating it. Now, the above algorithm is a naive search, or linear search. You can look at doing a binary search, but it becomes somewhat complicated. If you have more than, say, 1000 items and that list is completely static and you look up into it more than, say, 100 times, then I'd look at doing a cached binary search. (The numbers are somewhat arbitrary. However, linear searches are actually faster than binary searches for extremely small sets, like under 30 elements. Also, it's harder to maintain a cached binary search than a early-terminating linear search, which may be a consideration.) There are several methods to cache. One of them could be noting where several marker points are. For example, if you were doing a search across the numbers 1 to 100, you'd note where 10, 20, 30, 40, 50, etc would be. This is especially important if your array is sparsely populated in some areas, but densely populated in others. This would allow you to find out which subset of the search area you need to focus in. Depending on how large your search area is, you might have second-level markers, possibly only in dense subsections. The goal here is that if you have a large number of elements to check against, you want to reduce the number of comparisons you make. This allows you to reduce your search time. But, remember that for smaller sets (under 100 elements), most search algorithms tend to run within acceptable times, regardless if they're O(n) or O(log n ** 0.1). (Big-O of the tenth root of the log of n ... very very fast!) At those sizes, your bottlenecks start being I/O and your comprehension, not the search algorithm being used. ------ We are the carpenters and bricklayers of the Information Age. Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement. Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.	[reply] [d/l]
Re: Find a number in a list closest to a target by ihb (Deacon) on Jul 25, 2003 at 23:25 UTC
Just because `&reduce` is pretty and TMTOWTDI: `use List::Util 'reduce'; my $closest = reduce { abs($a - $ctime) < abs($b - $ctime) ? $a : $b + } @epochs; print $closest;` [download] or even `my $closest = reduce { $a->[1] < $b->[1] ? $a : $b } map [ $_ => abs($ctime - $_) ], @epochs; print $closest->[0];` [download] Not that I'd expect the second to be more efficient in this particular case. `:-)` (In fact, I benchmarked the regular `for` loops in the previous replies with and without a temporary `$diff` variable and it seemed like it was more efficient to just redo the math (on ActivePerl 5.8.0 build 806). But please verify that yourself, as the benchmark was hasty and inexhaustive.) `ihb`	[reply] [d/l] [select]
Re: Find a number in a list closest to a target by gnu@perl (Pilgrim) on Jul 25, 2003 at 17:04 UTC
Note to all: Since so many people have mentioned it, the list of epoch times varies. The code example I posted is in a loop which receives a new reference time and list of epoch times.	[reply]
Re: Re: Find a number in a list closest to a target by dragonchild (Archbishop) on Jul 25, 2003 at 17:20 UTC
How large is your list of epoch times? Is it under 100 or over 100? If it's under 100, a linear search would be best, in my opinion, because it's easiest to code up and maintain. If this is speed-critical, where every millisecond counts, then you'll probably want to do something a little more complex. ------ We are the carpenters and bricklayers of the Information Age. Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement. Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.	[reply]
Re: Re: Re: Find a number in a list closest to a target by gnu@perl (Pilgrim) on Jul 25, 2003 at 17:28 UTC
Relatively small, probably under 10 items. That is why I was looking for another way to do it. The hash and sorting was too much overhead for the amount of data. Although I will be processing about 100,000 files daily the data per file (epoch times) will most likely never exceed 5 or 6 items.	[reply]
Re4: Find a number in a list closest to a target by dragonchild (Archbishop) on Jul 25, 2003 at 17:33 UTC