PerlMonks  

Re: Extracting structured data from unstructured text - just how difficult would this be?

by moklevat (Priest)
on Feb 21, 2008 at 16:18 UTC


in reply to Extracting structured data from unstructured text - just how difficult would this be?

In response to your question, I'm going to say "quite difficult", or at least very time-consuming. On the other hand, if the point is to get work done, then I think Amazon has already created the system you are looking for: Mechanical Turk.

Replies are listed 'Best First'.
Re^2: Extracting structured data from unstructured text - just how difficult would this be?
by clinton (Priest) on Feb 21, 2008 at 16:23 UTC
    That may just be a brilliant solution - good thinking, Batman!

    The only downside is that we have to verify their work, which may be almost as time-consuming as doing the work ourselves.

      Perhaps you could set your system up to have duplicate data entry, and then diff the duplicate entries to flag potential problems. Alternatively, you could set up a second Turk task to compare and verify entries.
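
The duplicate-entry idea above could be sketched roughly as follows. This is only a minimal illustration, not a tested workflow: it assumes each worker's submission arrives as a simple field-to-value mapping, and the function names and sample fields are hypothetical. It is written in Python purely for illustration; the same logic ports directly to Perl hashes.

```python
def normalize(value):
    """Collapse whitespace and ignore case so trivial differences are not flagged."""
    if value is None:
        return None
    return " ".join(str(value).split()).casefold()


def flag_disagreements(entry_a, entry_b):
    """Return {field: (value_a, value_b)} for every field where two entries differ.

    entry_a and entry_b are two independent transcriptions of the same source
    text (hypothetical structure: plain dicts of field name to value).
    A field missing from one entry counts as a disagreement.
    """
    problems = {}
    for field in sorted(set(entry_a) | set(entry_b)):
        a = entry_a.get(field)
        b = entry_b.get(field)
        if normalize(a) != normalize(b):
            problems[field] = (a, b)
    return problems


if __name__ == "__main__":
    # Two made-up submissions for the same record.
    worker_1 = {"name": "Acme Corp", "city": "Springfield", "phone": "555-0100"}
    worker_2 = {"name": "acme  corp", "city": "Springfeld", "phone": "555-0100"}

    for field, (a, b) in flag_disagreements(worker_1, worker_2).items():
        print(f"{field}: {a!r} vs {b!r} -> needs review")
```

Only the records with at least one flagged field would then need a human look (or a second verification task), which is where the time saving would come from.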
