Lots of good stuff. Thank you all for your inputs.

A couple of you asked for samples. There’s nothing unusual about the texts that will be used. I plan to test with texts from Wikipedia by introducing misspellings, deletions, additions and changes in punctuation. Of course that still raises the question: how much can you alter a sentence before it becomes something else? Maybe I should be asking a different kind of monk about that. :)
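
For what it's worth, here is a rough sketch of how such test inputs might be generated. It's in Python rather than Perl purely to keep it short, and everything in it is an assumption made for illustration: the function names, the `rate` parameter, and the choice of adjacent-letter swaps and duplicated words as stand-ins for "misspellings" and "additions".

```python
import random
import string


def swap_letters(word, rng):
    """Simulate a misspelling by transposing two adjacent characters."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def perturb(sentence, rng, rate=0.1):
    """Return a damaged copy of `sentence`.

    Each word independently has a `rate` chance of each of four edits:
    misspelling, deletion, addition (word typed twice), or loss of
    leading/trailing punctuation. `rate` is a hypothetical knob.
    """
    out = []
    for word in sentence.split():
        roll = rng.random()
        if roll < rate:
            out.append(swap_letters(word, rng))          # misspelling
        elif roll < 2 * rate:
            continue                                     # deletion
        elif roll < 3 * rate:
            out.extend([word, word])                     # addition
        elif roll < 4 * rate:
            # punctuation change; keep the word if it was all punctuation
            out.append(word.strip(string.punctuation) or word)
        else:
            out.append(word)                             # untouched
    return " ".join(out)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a test run can be repeated
    original = "In particular, the issue of universal rights was introduced."
    print(perturb(original, rng, rate=0.15))
```

Raising `rate` step by step and watching where a comparison method stops recognising the sentence would be one way to put a number on the "how much can you alter it" question.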

Nevertheless I’ve included some texts below just to give a broad sense of what I expect to see. These are all from: https://en.wikipedia.org/wiki/Human_rights.

There’s flexibility on the question of how far into the text the algorithm has to be able to make a determination. Probably sentence by sentence as a first approximation.

Certainly Levenshtein distance looks worthy of study and String::Approx looks very interesting as well, along with a few more suggestions made in the String::Approx description on CPAN. I’ll have to experiment with all of this and see where it gets me. And I have to beg your pardon - it could take a while to be able to comment further on these suggestions.
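
To make the Levenshtein idea concrete, here is a minimal sketch of a sentence-by-sentence comparison. It is written in Python for brevity and is not how String::Approx works internally. The helper names, the naive sentence splitter, and the idea of normalising by the longer sentence's length are all assumptions of this sketch, not anything suggested in the thread.

```python
import re


def levenshtein(a, b):
    """Edit distance counting single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the distance to 0.0 .. 1.0, where 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_scores(template, attempt):
    """Score sentences pairwise, in order; extra sentences are ignored."""
    pairs = zip(split_sentences(template), split_sentences(attempt))
    return [similarity(t, a) for t, a in pairs]


if __name__ == "__main__":
    template = "The earliest conceptualization of human rights. It came later."
    attempt = "The earliest conceptulization of human rights. It came later."
    print(sentence_scores(template, attempt))
```

Pairing sentences strictly in order falls apart as soon as the user drops or merges a sentence, so a real version would need some alignment step. Where to put the cut-off between "copy with errors" and "something else" is exactly the open question, and would have to come out of experiments.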

> And if you want to go hardcore on the problem: “wordnet”
It’s not so far-fetched. At the least, some effort to do grammatical parsing or look at sentence structure could be helpful. I’ve had good experiences with Lingua::LinkParser, and it can be a way to look at an abstraction of the sentence instead of the sentence itself, though it’s probably too much overhead for this application.

————————————————————————

TEMPLATE
The earliest conceptualization of human rights is credited to ideas about natural rights emanating from natural law. In particular, the issue of universal rights was introduced by the examination of extending rights to indigenous peoples by Spanish clerics, such as Francisco de Vitoria and Bartolomé de Las Casas. In the Valladolid debate, Juan Ginés de Sepúlveda, who maintained an Aristotelian view of humanity as divided into classes of different worth, argued with Las Casas, who argued in favour of equal rights to freedom from slavery for all humans regardless of race or religion.

USER ATTEMPTS TO COPY TEMPLATE, WITH ERRORS
The earliest conceptulization of human rights is credited to ideas about rights emanating from natural law. In particular the issue of universal rights was introduced by extending rights to indigenous peoples by Spanish clerics, such as Francisco de Vitoria and Bartolomé de Las Casas. In the Valladolid debate, Juan Ginés de Sepúlveda, who maintained an Aristotelian view of humanity as divided into clases of different worth argued with Las Casas, who argued in favour of equal rights to freedom from slavery for all humans regardless of race.

USER TYPES SOMETHING ELSE ENTIRELY
Although ideas of rights and liberty have existed in some form for much of human history, there is agreement that the earlier conceptions do not closely resemble the modern conceptions of human rights. According to Jack Donnelly, in the ancient world, "traditional societies typically have had elaborate systems of duties... conceptions of justice, political legitimacy, and human flourishing that sought to realize human dignity, flourishing, or well-being entirely independent of human rights. These institutions and practices are alternative to, rather than different formulations of, human rights". The history of human rights can be traced to past documents, particularly Constitution of Medina (622), Al-Risalah al-Huquq (659-713), Magna Carta (1215), the Twelve Articles of Memmingen (1525), the English Bill of Rights (1689), the French Declaration of the Rights of Man and of the Citizen (1789), and the Bill of Rights in the United States Constitution (1791)