Hello Monks

I wanted a simple keyword generator for my CMS, which is written in ANSI C. I came up with the script below, but I'm not sure of the best way to integrate it, or whether something similar already exists (I was not able to find a module or anything that does this 'type' of thing).

I know this is very rudimentary and quite possibly not very efficient, but I'm fairly green with Perl.

The first thing I was wondering: is it possible to actually compile this snippet into an ANSI C program, or is that foolish thinking?

I can curl to the program and pass it $line and $stopwords if that is easiest (using the CGI module, I would assume; I have toyed with that in the past).

I can use a system call as well, but I always get told that's not a good method.
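For the system-call route, here is a minimal sketch of how the script could be turned into a filter that the C program feeds through a pipe. It assumes the text arrives on STDIN and the '|'-separated stopword list is the first command-line argument; the `count_words` sub and the file name `keywords.pl` are hypothetical names chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same counting steps as the script in this post, wrapped in a sub.
# $stopwords is a '|'-separated list such as "and|the|was".
sub count_words {
    my ($text, $stopwords) = @_;

    $text =~ s/[[:punct:]]|[0-9]/ /g;    # punctuation and digits become spaces
    $text = lc $text;

    # Escape each stopword so caller-supplied input cannot inject regex syntax.
    my @stop = grep { $_ ne '' } split /\|/, $stopwords;
    my $alternation = join '|', map { quotemeta($_) } @stop;
    $text =~ s/\b(?:$alternation)\b/ /gi if @stop;

    my %count_of;
    for my $word (split ' ', $text) {
        $count_of{$word}++ if length($word) > 2;
    }
    return \%count_of;
}

# Assumption: stopwords in $ARGV[0], text on STDIN.
my $stopwords = @ARGV ? $ARGV[0] : '';
my $text = do { local $/; <STDIN> };     # slurp all of STDIN
$text = '' unless defined $text;

my $count_of = count_words($text, $stopwords);
print "$_\t$count_of->{$_}\n" for sort keys %$count_of;
```

From a shell this could be run as `perl keywords.pl 'and|the|was' < body.txt`, and the C side would read the tab-separated `word count` lines back from the pipe. Because the text goes over STDIN rather than onto the command line, no shell quoting of the body is needed.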

Lastly, I can connect to the MySQL database and get the body content as well, but that's a little more work, as I have not used MySQL from Perl before.

I am open to any other suggestions as well! Thank you in advance for any and all wisdom imparted!

#!/usr/bin/perl
use strict;
use warnings;

my $line = <<TEXT;
Moby-Dick was published in 1851 during a productive time in American literature, which also saw the appearance of Nathaniel Hawthorne's The Scarlet Letter and Harriet Beecher Stowe's Uncle Tom's Cabin. Two actual events served as the genesis for Melville's tale. One was the sinking of the Nantucket ship Essex in 1820, after it was rammed by a large sperm whale 2,000 miles (3,200 km) from the western coast of South America.[4][5][6] First mate Owen Chase, one of eight survivors, recorded the events in his 1821 Narrative of the Most Extraordinary and Distressing Shipwreck of the Whale-Ship Essex.
The other event was the alleged killing in the late 1830s of the albino sperm whale Mocha Dick, in the waters off the Chilean island of Mocha. Mocha Dick was rumored to have twenty or so harpoons in his back from other whalers, and appeared to attack ships with premeditated ferocity. One of his battles with a whaler served as subject for an article by explorer Jeremiah N. Reynolds[7] in the May 1839 issue of The Knickerbocker or New-York Monthly Magazine. Melville was familiar with the article, which described:
TEXT

my $stopwords = "and|that|they|very|you|your|want|are|able|aren|are|but|doesn|the|see|not|most|many|need|needs|look|just|get|from|for|all|this|have|who|with|was|went|when|has|him|his|what|which|while|two";

$line =~ s/[[:punct:]]|[0-9]/ /g;
$line = lc ($line);
$line =~ s/\b(?:$stopwords)\b/ /gi;

my %count_of;
foreach my $word (split /\s+/, $line) {
    length($word) > 2 and $count_of{$word}++;
}

print "All words and their counts: \n";
for my $word (sort keys %count_of) {
    $count_of{$word} > 1 and print "'$word': $count_of{$word}\n";
}
__END__
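One possible variation, offered as a sketch rather than a definitive version: look stopwords up in a hash instead of building a regex alternation, and return the most frequent words first so the top few can serve as keywords. The sub name `top_keywords` and its `$limit` parameter are assumptions introduced here, not part of the script above. It also splits on anything that is not a letter, which for plain ASCII text matches the punctuation-and-digit stripping above.

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $limit [word, count] pairs,
# most frequent first, ties broken alphabetically.
sub top_keywords {
    my ($text, $stopwords, $limit) = @_;

    # Hash lookup replaces the regex alternation over stopwords.
    my %is_stop;
    $is_stop{lc $_} = 1 for @$stopwords;

    my %count_of;
    for my $word (split /[^[:alpha:]]+/, lc $text) {
        next if length($word) <= 2 or $is_stop{$word};
        $count_of{$word}++;
    }

    my @ranked = sort { $count_of{$b} <=> $count_of{$a} or $a cmp $b }
                 keys %count_of;
    splice @ranked, $limit if @ranked > $limit;   # keep only the first $limit
    return map { [ $_, $count_of{$_} ] } @ranked;
}

# Usage: print the two most frequent non-stopwords of a short sample.
my @top = top_keywords(
    "Mocha Dick and the whale; the whale and the ship.",
    [qw(and the)],
    2,
);
print "$_->[0]: $_->[1]\n" for @top;
```

Keeping the stopwords as a list also makes it easy to load them from a file or a database column later, without worrying about regex metacharacters in the words.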

In reply to Simple Keyword Generator by itsscott
