#
# WARNING WARNING WARNING WARNING
#
# USE AT YOUR OWN RISK.
#
# THIS IS A MASSIVE KLUDGE.
#
# YOU HAVE BEEN WARNED.
#
my $in = <DATA>;

# ASSUME sentences end in a period and a space.
my @sentences = split '\. ', $in;

foreach( @sentences )
{
   # ASSUME these words are mostly useless
   # for our purposes...
   s/\b(with|a|of|the|in|just)\b//gi;

   # ASSUME phrases are comma-separated.
   my @phrases = split ',';

   my @subjects = ();
   my @descs    = ();

   foreach ( @phrases )
   {
      s/^\s*//;  # trim leading spaces.
      s/\n//g;   # remove newline.

      # Well, do we have a subject, or a descriptor?
      # ASSUME subjects are capitalized (!!)
      push @subjects, $_ if /^[A-Z]/;

      # ASSUME descriptions are not.
      push @descs, $_ unless /^[A-Z]/;
   }

   # Print 'em all out.
   foreach my $subj ( @subjects )
   {
      my @subsub = ($subj);

      # ASSUME 'and' separates multiple subjects (!!)
      @subsub = split ' and ', $subj if $subj =~ /\band\b/;

      foreach my $ss (@subsub)
      {
         print "$ss: $_\n" foreach @descs;
      }
   }
}

__DATA__
With a population of more than 10.2 million, Seoul, the capital of South Korea, is the world's largest city in terms of population. Sao Paulo(Brazil), the world's second-largest city, has a population of just over ten million. Three other cities, Bombay(India), Jakarta(Indonesia) and Karachi(Pakistan), have grown to more than nine million people.

The output:

Seoul: population more than 10.2 million
Seoul: capital South Korea
Seoul: is world's largest city terms population
Sao Paulo(Brazil): world's second-largest city
Sao Paulo(Brazil): has population over ten million
Three other cities: have grown to more than nine million people.
Bombay(India): have grown to more than nine million people.
Jakarta(Indonesia): have grown to more than nine million people.
Karachi(Pakistan): have grown to more than nine million people.
In reply to Re: Re: Re: The (futile?) quest for an automatic paraphrase engine
by rje
in thread The (futile?) quest for an automatic paraphrase engine
by dimar