jajaja has asked for the wisdom of the Perl Monks concerning the following question:
I have a file "cetnosti" that looks like this:

    a   14458708
    se  10848091
    v   10688846
    na  8721120
    je  5353514
    že  4991304

The left column is a word and the right column is the number of occurrences of that word in some other text in the same language. I have a file "input" that looks like this:

    Je mi urcite cti, avsak predstavit pomerne strucne, a navic bez moznos praktickych ukazek, nas hlavni a nosny produkt, muze zpusobit I male komplikace. Proto prijmete prosim tento clanek jako snahu, poskytnout

I need to replace the words in "input" with the matching words that have the highest number of occurrences in "cetnosti", and write the result to "output". The problem is that "cetnosti" is too big to read entirely into memory, so I read only the beginning of it, which holds the most used words:

    use Tree::Trie;

    $trie = new Tree::Trie;
    $filei = "cetnosti";
    $filer = "input";
    $filew = "output";

    open(INFO, $filei) || die "error: couldnt open file: $!";
    $lineno=1;
    while ((defined ($line = <INFO>)) && ($lineno < 100000)) {
        $line =~ s/\t.*//g;
        $line =~ s/\n//g;
        $trie->add($line);
        $lineno++;
    }
    close(INFO);

Now I am wondering how to replace the words from "input" and write them to "output":

    open(READ, $filer) || die "error: couldnt open file: $!";
    open(WRITE, "> $filew") || die "error: couldnt open file: $!";
    while (defined ($line = <READ>)) {
        @words = split(/ /, $line);
        # here I'd like to compare each word from a line of "input" with
        # "cetnosti" and write it to "output", but I have no idea how to do it
    }
    close(READ);
    close(WRITE);

Can anybody think of a good way? Thank you for your help.
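One possible way to approach the question above, as a minimal sketch rather than a definitive answer: index each word from "cetnosti" under its diacritic-free form in a plain hash, then look up every word of "input" in that hash. The sketch assumes that "cetnosti" is sorted by descending count (so the first form seen for a key is the most frequent one), that the data is UTF-8, and that a word starting with a capital letter should keep its capital. The sub names `strip_diacritics`, `build_lookup`, `restore_word` and `restore_line` are made up for illustration, and so is the sample line at the end.

```perl
use strict;
use warnings;
use utf8;                        # this source contains literal UTF-8 characters
use Unicode::Normalize qw(NFD);  # core module

# Remove combining marks, so that "že" becomes "ze".
sub strip_diacritics {
    my ($word) = @_;
    my $decomposed = NFD($word);
    $decomposed =~ s/\p{Mn}//g;
    return $decomposed;
}

# Build { diacritic-free form => accented form } from "word<TAB>count" lines.
# Assumption: lines are sorted by descending count, so the first form
# stored for a key is the most frequent one. At most $limit lines are read.
sub build_lookup {
    my ($lines, $limit) = @_;
    my %lookup;
    my $read = 0;
    for my $line (@$lines) {
        last if $read++ >= $limit;
        chomp(my $copy = $line);
        my ($word) = split /\t/, $copy;
        next unless defined $word && length $word;
        my $key = strip_diacritics(lc $word);
        $lookup{$key} = $word unless exists $lookup{$key};
    }
    return \%lookup;
}

# Return the accented form of one word, or the word itself if it is unknown.
sub restore_word {
    my ($lookup, $word) = @_;
    my $best = $lookup->{ strip_diacritics(lc $word) };
    return $word unless defined $best;
    return $word =~ /^\p{Lu}/ ? ucfirst($best) : $best;
}

# Replace every run of letters in a line; punctuation and spacing are kept.
sub restore_line {
    my ($lookup, $line) = @_;
    $line =~ s/(\p{L}+)/restore_word($lookup, $1)/ge;
    return $line;
}

# Demonstration with the six sample rows from the question.
my @cetnosti = ("a\t14458708", "se\t10848091", "v\t10688846",
                "na\t8721120", "je\t5353514", "že\t4991304");
my $lookup = build_lookup(\@cetnosti, 100_000);

binmode STDOUT, ':encoding(UTF-8)';
print restore_line($lookup, "Je to v textu, ze"), "\n";   # prints "Je to v textu, že"
```

In a real run the lines would come from the files instead of an array, for example by opening "cetnosti" and "input" with an `:encoding(UTF-8)` layer and calling `restore_line` on each line of "input". If the files use another encoding, such as ISO-8859-2, the layer would have to change accordingly. A hash is used here instead of Tree::Trie because only exact lookups are needed.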
Replies are listed 'Best First'.

Re: fill diacritic into text
by Fletch (Bishop) on May 30, 2007 at 16:26 UTC
by Grundle (Scribe) on May 30, 2007 at 17:34 UTC

Re: fill diacritic into text
by graff (Chancellor) on May 30, 2007 at 18:23 UTC
by jajaja (Initiate) on May 31, 2007 at 06:27 UTC
by graff (Chancellor) on May 31, 2007 at 12:56 UTC

Re: fill diacritic into text
by BrowserUk (Patriarch) on May 30, 2007 at 18:16 UTC

Re: fill diacritic into text
by ambrus (Abbot) on May 31, 2007 at 09:24 UTC
by jajaja (Initiate) on May 31, 2007 at 11:19 UTC
by ambrus (Abbot) on Jun 01, 2007 at 09:35 UTC