in reply to Bayesian not-for-spam

howdy Kickstart

merlyn's link is very useful for discrete data and a naive bayesian network. The bayesian approach is not limited to these: more advanced bayesian nets allow causality to be modelled in a statistically rigorous and useful way. Neural nets, mentioned above, and decision trees are alternatives here.

unfortunately, there is no software that i know of for the advanced bayesian nets in perl. if you are lucky enough to possess matlab, you will find BNT (the Bayes Net Toolbox) by Murphy, which is GPL'd, very useful.

if the naive bayesian approach is sufficient, and it should be your starting point (it is by no means naive), the question you will face is: how do you discretize your data?

day of the week and the like are easy: they are already discrete. the problems you will face will be in dealing with continuous variables like price. this discretization should not be linear (that is, not simply equal-width bands) if you are to make the most of the information that is there; you could also use an 'expert' to help you decide on the bands. this discretization is essential for typical bayesian net methods to work, so it is worth devoting attention to it.
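one common non-linear scheme is to cut at quantiles, so each band holds roughly the same number of records. a minimal sketch in python (stdlib only, easy to port to perl); the price list and the helper names quantile_edges and to_band are made up for illustration, and quantile bands are just one reasonable choice, not the only one:

```python
from bisect import bisect_right

def quantile_edges(values, n_bins):
    """Return n_bins - 1 cut points so each band holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

def to_band(value, edges):
    """Map a continuous value to a band index 0 .. len(edges)."""
    return bisect_right(edges, value)

# hypothetical, heavily skewed prices: equal-width bands would put
# almost everything into the lowest band
prices = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 8.0, 15.0, 40.0, 250.0, 900.0, 5000.0]

edges = quantile_edges(prices, 4)
bands = [to_band(p, edges) for p in prices]
print(edges)
print(bands)
```

with equal-width bands over this range, eleven of the twelve prices would share a single band; the quantile cut puts three in each. an 'expert' could then nudge the cut points to values that mean something in your domain.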

once you have done this, just loop through your db and, for each outcome, count how often each discretized value occurs to determine the conditional probabilities. feed these into your naive bayesian net, and robert is your uncle.
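the counting loop might look something like the sketch below, again in python. everything in it is an assumption for illustration: the rows stand in for your db, the column names (day, price_band, bought) are invented, and the add-one (laplace) smoothing is my addition so that a value never seen with an outcome does not zero out the whole product.

```python
import math
from collections import Counter, defaultdict

def train(rows, target):
    """Count outcomes and (feature value, outcome) pairs in one pass over the rows."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)  # (feature, outcome) -> Counter of values
    feature_values = defaultdict(set)      # feature -> set of values seen
    for row in rows:
        outcome = row[target]
        class_counts[outcome] += 1
        for feature, value in row.items():
            if feature == target:
                continue
            feature_counts[(feature, outcome)][value] += 1
            feature_values[feature].add(value)
    return class_counts, feature_counts, feature_values

def predict(model, row):
    """Return the most probable outcome and the log score of every outcome."""
    class_counts, feature_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for outcome, n in class_counts.items():
        log_p = math.log(n / total)  # prior P(outcome)
        for feature, value in row.items():
            k = len(feature_values[feature])
            seen = feature_counts[(feature, outcome)][value]
            # smoothed conditional P(value | outcome)
            log_p += math.log((seen + 1) / (n + k))
        scores[outcome] = log_p
    return max(scores, key=scores.get), scores

# hypothetical, already discretized records standing in for the db
rows = [
    {"day": "sat", "price_band": 0, "bought": "yes"},
    {"day": "sat", "price_band": 1, "bought": "yes"},
    {"day": "sun", "price_band": 0, "bought": "yes"},
    {"day": "mon", "price_band": 3, "bought": "no"},
    {"day": "tue", "price_band": 2, "bought": "no"},
    {"day": "sat", "price_band": 3, "bought": "no"},
]

model = train(rows, "bought")
label, scores = predict(model, {"day": "sat", "price_band": 0})
print(label)
```

the scores are summed logs rather than multiplied probabilities, which keeps things numerically sane once you have more than a handful of variables.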

if it works, remember me in your will

...wufnik

in the world of the mules there are no rules