in reply to Re^3: Supervised machine learning algo for text matching across two files

Since I have 15% already matched to use as ground truth, can't I somehow use the other 50 columns in the file (all of which hold various data fields) to train some sort of supervised approach that uses all the data to suggest a match statistically?

I want to say the answer has something to do with an Expectation-Maximization-type approach, but I'm way out of my depth here.
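To make the idea concrete, here is a rough Python sketch of what I think this could look like. It is a guess at the shape of the thing, not a tested solution. It scores a candidate pair by summing per-column log-likelihood ratios (the Fellegi-Sunter style of record linkage), with the per-column agreement probabilities counted directly from the labeled pairs. As far as I understand, EM is what you reach for to estimate those same probabilities when you have no labels at all. Everything named here (`agreement_vector`, `train`, `match_score`, the `fields` list) is made up for illustration, the comparison is a plain case-insensitive exact match, and the columns are assumed to be independent of each other, which real data will not fully respect.

```python
import math

def agreement_vector(rec_a, rec_b, fields):
    """Return 1 per field where the two records agree (trimmed, case-insensitive), else 0."""
    return [
        int(str(rec_a[f]).strip().lower() == str(rec_b[f]).strip().lower())
        for f in fields
    ]

def estimate_probs(vectors, n_fields, smoothing=1.0):
    """Per-field agreement probability, smoothed so it is never exactly 0 or 1."""
    total = len(vectors)
    return [
        (sum(v[i] for v in vectors) + smoothing) / (total + 2 * smoothing)
        for i in range(n_fields)
    ]

def train(matched_pairs, unmatched_pairs, fields):
    """Estimate m (agreement rate among true matches) and u (among non-matches)."""
    m = estimate_probs(
        [agreement_vector(a, b, fields) for a, b in matched_pairs], len(fields)
    )
    u = estimate_probs(
        [agreement_vector(a, b, fields) for a, b in unmatched_pairs], len(fields)
    )
    return m, u

def match_score(rec_a, rec_b, fields, m, u):
    """Sum of log-likelihood ratios; higher means more likely to be a match."""
    score = 0.0
    for agrees, m_i, u_i in zip(agreement_vector(rec_a, rec_b, fields), m, u):
        if agrees:
            score += math.log(m_i / u_i)
        else:
            score += math.log((1 - m_i) / (1 - u_i))
    return score
```

The non-matching training pairs would have to be sampled from known matched records paired with the wrong partner, since the ground truth only gives the positives. Each unmatched record would then be scored against its candidates in the other file and the top score kept if it clears some threshold chosen on held-out labeled pairs.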
