in reply to read whole file in a directory

choroba has pointed out bugs in your regular expression (see Metacharacters in perlre). To avoid this sort of mistake, rather than typing all that in, you should consider building the regular expression from a list of words, perhaps like this:

    # stopword removal
    my @words = qw(
        untuk dari di yang dan ini itu atau pada ke adalah setelah selalu
        daripada dengan dalam akan juga tidak karena tersebut ada bisa
        sebagai sudah saat oleh harus menjadi secara last modified lebih
        hanya para telah seperti sementara kepada namun sangat lalu belum
        bagi tak kalau bahwa tetapi dapat antara banyak kembali saja atas
        hingga melalui terjadi tapi sampai tentang sama agar memang lagi
        selama mencapai terus yakni the terhadap ketika merupakan sehingga
        sebuah jika bukan jadi sejumlah sejak perlu mulai jelas pun masih
        mengatakan menurut sekitar lain melakukan baru beberapa hal
    );
    my $regex = join '|', map qr/\b\Q$_\E\b/, @words;
    $kata =~ s/$regex//g;
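For anyone who wants to try the idea in isolation, here is a minimal, self-contained sketch. The shortened word list, the sample sentence, and the remove_stopwords helper are all assumptions for illustration, not part of the original script. It also joins the quoted words into one alternation inside a single pair of \b anchors, which is a variation on the snippet above rather than a correction of it:

```perl
use strict;
use warnings;

# Shortened, hypothetical stopword list; the real one is much longer.
my @words = qw( untuk dari di yang dan );

# Quote each word, then wrap the whole alternation in one pair of \b anchors.
my $alternation = join '|', map { quotemeta($_) } @words;
my $regex       = qr/\b(?:$alternation)\b/;

# Hypothetical helper: strip stopwords and tidy the whitespace left behind.
sub remove_stopwords {
    my ($text) = @_;
    $text =~ s/$regex//g;       # delete every whole-word match
    $text =~ s/\s+/ /g;         # collapse the gaps left by deleted words
    $text =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    return $text;
}

print remove_stopwords('buku dari toko yang baru'), "\n";
```

Because of the \b anchors, a short stopword such as "di" is only removed when it stands alone, so a word like "dia" is left untouched.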

Other changes you might consider include:

  1. strict and warnings are good. See Use strict warnings and diagnostics or die.
  2. A more natural way of expressing $#ARGV + 1 != 1 might be @ARGV != 1
  3. Your second $kata =~ tr/[A-Z]/[a-z]/; is unnecessary, since you already lower-cased everything when building %freq.
  4. You have a whole bunch of substitutions for removing characters. Looking at them, I wonder whether you really mean what you have written. For example, do you really want to remove the three-character sequence "`”, or do you mean to remove any occurrence of these three characters? (The escape before " is unnecessary.) I think you would probably get the result you actually want by replacing $kata =~ s/\d+//g;, $kata =~ s/[!.,()*]|\"`”//g; and $kata =~ s/-+//g; with $kata =~ s/[\d!.,()*"`”\-+]//g;
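Points 2 and 4 can be tried out with a small sketch like the one below. The sample string and the has_one_argument helper are made up for illustration and do not come from the original script:

```perl
use strict;
use warnings;
use utf8;    # the pattern below contains a literal ” character

# Point 2: an array in numeric context yields its element count.
# has_one_argument is a hypothetical helper standing in for the @ARGV check.
sub has_one_argument {
    my @args = @_;
    return @args == 1;
}

# Point 4: hypothetical sample input.
my $sample = 'Harga: 1.500 (diskon) -- "murah"!';

# Alternation: single characters from the class, OR the literal
# three-character sequence "`” appearing together.
(my $sequence = $sample) =~ s/[!.,()*]|"`”//g;

# One character class: any single listed character is removed.
(my $class = $sample) =~ s/[\d!.,()*"`”\-+]//g;

print "$sequence\n";
print "$class\n";
```

In the first substitution the double quotes around "murah" survive, because they never appear as part of the full three-character sequence. In the second they are removed along with the digits and hyphens.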

Update: Corrected oversight in replacement RE in 4. Thanks choroba.

#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re^2: how to create word list from input text file
by ask91 (Initiate) on Mar 28, 2012 at 17:11 UTC
    Thanks to choroba & kennethk

    Great :D It works in a blink.
    I've got an output.dat with the right words written in it, exactly the same as in the input text.

    Well, I'm ashamed of my ability to write a program, because my code is still messy,
    but thanks for helping me :)
    By the way, the comment in my code is written in Indonesian, the language of my country.

    I realize how cool programming languages are:
    even people who speak different languages can think together through program code :D
    First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

    I try to. It just seems I cannot get the computer to understand my messy code, hehe.