I just had occasion to find a list of the 1,000 most commonly used words in English.

I pasted them into BBEdit, set to Perl as the default language, and about forty of them came up coloured as Perl keywords.

I don't know what conclusions one can draw from this but I thought the monks might be interested. Would this be true of other languages? Would there be a higher or lower proportion?

The top 1,000 are below.

the, of, to, and, a, in, is, it, you, that, he, was, for, on, are, with, as, I, his, they, be, at, one, have, this, from, or, had, by, hot, word, but, what, some, we, can, out, other, were, all, there, when, up, use, your, how, said, an, each, she, which, do, their, time, if, will, way, about, many, then, them, write, would, like, so, these, her, long, make, thing, see, him, two, has, look, more, day, could, go, come, did, number, sound, no, most, people, my, over, know, water, than, call, first, who, may, down, side, been, now, find, any, new, work, part, take, get, place, made, live, where, after, back, little, only, round, man, year, came, show, every, good, me, give, our, under, name, very, through, just, form, sentence, great, think, say, help, low, line, differ, turn, cause, much, mean, before, move, right, boy, old, too, same, tell, does, set, three, want, air, well, also, play, small, end, put, home, read, hand, port, large, spell, add, even, land, here, must, big, high, such, follow, act, why, ask, men, change, went, light, kind, off, need, house, picture, try, us, again, animal, point, mother, world, near, build, self, earth, father, head, stand, own, page, should, country, found, answer, school, grow, study, still, learn, plant, cover, food, sun, four, between, state, keep, eye, never, last, let, thought, city, tree, cross, farm, hard, start, might, story, saw, far, sea, draw, left, late, run, don't, while, press, close, night, real, life, few, north, open, seem, together, next, white, children, begin, got, walk, example, ease, paper, group, always, music, those, both, mark, often, letter, until, mile, river, car, feet, care, second, book, carry, took, science, eat, room, friend, began, idea, fish, mountain, stop, once, base, hear, horse, cut, sure, watch, color, face, wood, main, enough, plain, girl, usual, young, ready, above, ever, red, list, though, feel, talk, bird, soon, body, dog, family, direct, pose, leave, song, measure, door, product, black, 
short, numeral, class, wind, question, happen, complete, ship, area, half, rock, order, fire, south, problem, piece, told, knew, pass, since, top, whole, king, space, heard, best, hour, better, true, during, hundred, five, remember, step, early, hold, west, ground, interest, reach, fast, verb, sing, listen, six, table, travel, less, morning, ten, simple, several, vowel, toward, war, lay, against, pattern, slow, center, love, person, money, serve, appear, road, map, rain, rule, govern, pull, cold, notice, voice, unit, power, town, fine, certain, fly, fall, lead, cry, dark, machine, note, wait, plan, figure, star, box, noun, field, rest, correct, able, pound, done, beauty, drive, stood, contain, front, teach, week, final, gave, green, oh, quick, develop, ocean, warm, free, minute, strong, special, mind, behind, clear, tail, produce, fact, street, inch, multiply, nothing, course, stay, wheel, full, force, blue, object, decide, surface, deep, moon, island, foot, system, busy, test, record, boat, common, gold, possible, plane, stead, dry, wonder, laugh, thousand, ago, ran, check, game, shape, equate, hot, miss, brought, heat, snow, tire, bring, yes, distant, fill, east, paint, language, among, grand, ball, yet, wave, drop, heart, am, present, heavy, dance, engine, position, arm, wide, sail, material, size, vary, settle, speak, weight, general, ice, matter, circle, pair, include, divide, syllable, felt, perhaps, pick, sudden, count, square, reason, length, represent, art, subject, region, energy, hunt, probable, bed, brother, egg, ride, cell, believe, fraction, forest, sit, race, window, store, summer, train, sleep, prove, lone, leg, exercise, wall, catch, mount, wish, sky, board, joy, winter, sat, written, wild, instrument, kept, glass, grass, cow, job, edge, sign, visit, past, soft, fun, bright, gas, weather, month, million, bear, finish, happy, hope, flower, clothe, strange, gone, jump, baby, eight, village, meet, root, buy, raise, solve, metal, whether, push, 
seven, paragraph, third, shall, held, hair, describe, cook, floor, either, result, burn, hill, safe, cat, century, consider, type, law, bit, coast, copy, phrase, silent, tall, sand, soil, roll, temperature, finger, industry, value, fight, lie, beat, excite, natural, view, sense, ear, else, quite, broke, case, middle, kill, son, lake, moment, scale, loud, spring, observe, child, straight, consonant, nation, dictionary, milk, speed, method, organ, pay, age, section, dress, cloud, surprise, quiet, stone, tiny, climb, cool, design, poor, lot, experiment, bottom, key, iron, single, stick, flat, twenty, skin, smile, crease, hole, trade, melody, trip, office, receive, row, mouth, exact, symbol, die, least, trouble, shout, except, wrote, seed, tone, join, suggest, clean, break, lady, yard, rise, bad, blow, oil, blood, touch, grew, cent, mix, team, wire, cost, lost, brown, wear, garden, equal, sent, choose, fell, fit, flow, fair, bank, collect, save, control, decimal, gentle, woman, captain, practice, separate, difficult, doctor, please, protect, noon, whose, locate, ring, character, insect, caught, period, indicate, radio, spoke, atom, human, history, effect, electric, expect, crop, modern, element, hit, student, corner, party, supply, bone, rail, imagine, provide, agree, thus, capital, won't, chair, danger, fruit, rich, thick, soldier, process, operate, guess, necessary, sharp, wing, create, neighbor, wash, bat, rather, crowd, corn, compare, poem, string, bell, depend, meat, rub, tube, famous, dollar, stream, fear, sight, thin, triangle, planet, hurry, chief, colony, clock, mine, tie, enter, major, fresh, search, send, yellow, gun, allow, print, dead, spot, desert, suit, current, lift, rose, continue, block, chart, hat, sell, success, company, subtract, event, particular, deal, swim, term, opposite, wife, shoe, shoulder, spread, arrange, camp, invent, cotton, born, determine, quart, nine, truck, noise, level, chance, gather, shop, stretch, throw, shine, property, column, 
molecule, select, wrong, gray, repeat, require, broad, prepare, salt, nose, plural, anger, claim, continent, oxygen, sugar, death, pretty, skill, women, season, solution, magnet, silver, thank, branch, match, suffix, especially, fig, afraid, huge, sister, steel, discuss, forward, similar, guide, experience, score, apple, bought, led, pitch, coat, mass, card, band, rope, slip, win, dream, evening, condition, feed, tool, total, basic, smell, valley, nor, double, seat, arrive, master, track, parent, shore, division, sheet, substance, favor, connect, post, spend, chord, fat, glad, original, share, station, dad, bread, charge, proper, bar, offer, segment, slave, duck, instant, market, degree, populate, chick, dear, enemy, reply, drink, occur, support, speech, nature, range, steam, motion, path, liquid, log, meant, quotient, teeth, shell, neck


($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss') =~y~b-v~a-z~s; print

Re: Common Words, Perl Keywords
by antirice (Priest) on Nov 25, 2003 at 22:31 UTC

    Yes! Finally, someone may agree with me to change our keywords with much better words. I've always been quite fond of crabcakes, periwinkle, and of course uncopyrightable. I think kill should be replaced with bore_to_death. tie should be replaced with tye, log with timber and our with not_their. and and or should be replaced with dynasty and days_of_our_lives, push should be changed to shove to show that snobby little array that we mean business, no should be replaced with damn, use should be replaced with periwinkle, tell should be replaced with can_you_hear_me_now, while should become as_the_world_turns and study should become wtf.

    Of course, I've always regretted that perl doesn't include a bend_to_my_will_computer built-in. That's why I'm looking forward to Perl 6! :)

    antirice    
    The first rule of Perl club is - use Perl
    The ith rule of Perl club is - follow rule i - 1 for i > 1

      I know of at least one OS with a system library routine "_make_me_unpreemptable".
Re: Common Words, Perl Keywords
by hardburn (Abbot) on Nov 25, 2003 at 21:30 UTC

    Perl tends to follow principles of natural languages closer than most computer languages, so I'd expect the count to be relatively high. However, any language designed by native English speakers will almost certainly have a lot of English-based keywords (unless you're talking about BF).

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    : () { :|:& };:

    Note: All code is untested, unless otherwise stated

      Perl tends to follow principles of natural languages closer than most computer languages

      Closer than what? SML? Perl's constantly held up as being "closer to natural languages" but compared to what? C? When you make a fair comparison, like Ruby or Python (or even Java) that simply does not hold up. You can write code in Python that reads as concise sentences and doesn't look like \$@%<>{}; are all now letters of the alphabet.

      As for the number of words, 25 of those are Python keywords. That is a much higher percentage of the language's keywords than Perl manages. It is also completely irrelevant, as being one of those words does not make the language easier to use. def, del and exec are all good keywords but won't be found on that list. If you're going to look at how close a programming language is to a human language, you'd be better off focusing on the actual instruction statements rather than the number of keywords on some list.

        comparison, like Ruby or Python (or even Java) that simply does not hold up. You can write code in Python that reads as concise sentences and doesn't look like \$@%<>{}; are all now letters of the alphabet.

        If the 1000 words list had contained the common swearing and cursing in spoken English, regexes and perlvar might have given Perl a better percentage ;-)

        The movement towards making programming languages as close to English as possible was a big failure (see also: COBOL, FORTRAN). However, following the principles of natural languages doesn't necessarily mean you have a lot of English-like words in your syntax (or French-like, or German-like, or whatever). Rather, it means code is structured in a similar way.

        For instance, you might say "Release the sheep, unless it's raining". Whereas in Perl you could say $sheep->release unless is_raining();. Another example is the use of a pronoun: $_. How many times do you say 'it' in normal English speech? A bloody lot of times. $_ is used similarly.

        This isn't to say that other programming languages don't have some aspects of natural language in them. The difference is that natural language is all but an explicit design goal in Perl.
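For anyone who wants to try the two idioms mentioned above, here is a minimal sketch. The sheep names and the is_raining() stub are made up for illustration; only the unless modifier and $_ come from the post.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real weather check.
sub is_raining { return 0 }

my @sheep = qw(Dolly Shaun Bo);   # made-up names
my @released;

# Statement modifier: the action comes first and the condition follows,
# as in "release the sheep, unless it's raining".
for (@sheep) {
    # $_ plays the role of "it": the sheep currently being looked at.
    push @released, $_ unless is_raining();
}

print "Released: @released\n";
```

Changing is_raining() to return 1 leaves @released empty, which is the English sentence read literally.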

        ----
        I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
        -- Schemer

        : () { :|:& };:

        Note: All code is untested, unless otherwise stated

      Love Perl, but don't sit in the well.

Re: Common Words, Perl Keywords
by diotalevi (Canon) on Nov 25, 2003 at 21:32 UTC
    In running that list against B::Keywords (an extract of perl's recognized barewords) I come up with the following forty intersections: and, close, connect, continue, die, do, each, else, for, if, join, kill, last, length, listen, log, map, my, next, no, open, or, our, print, push, read, require, select, send, sleep, study, system, tell, tie, time, until, use, wait, while, write.
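A rough sketch of how such an intersection might be computed. This is not necessarily the code that was actually run: it assumes the @Functions and @Barewords lists from B::Keywords (a CPAN module, not core) are the relevant ones, and it falls back to a tiny hand-picked sample if the module is not installed. The short @words list is a stand-in for the full 1,000.

```perl
use strict;
use warnings;

# Return the entries of @$words that also appear in @$keywords,
# without duplicates, sorted alphabetically.
sub intersect_words {
    my ($words, $keywords) = @_;
    my %is_keyword = map { $_ => 1 } @$keywords;
    my %seen;
    my @common = grep { $is_keyword{$_} && !$seen{$_}++ } @$words;
    return sort @common;
}

# Fallback sample, for illustration only.
my @keywords = qw(and die each if print push while);

# Prefer the real lists when B::Keywords is available.
if (eval { require B::Keywords; 1 }) {
    no warnings 'once';
    @keywords = (@B::Keywords::Functions, @B::Keywords::Barewords);
}

# Stand-in for the pasted word list ("hot" is repeated on purpose,
# as it is in the original list).
my @words = split /\s*,\s*/, 'the, of, and, if, each, print, hot, hot, while';

print join(', ', intersect_words(\@words, \@keywords)), "\n";
```

Building a hash of the keywords first keeps the lookup cheap, and the %seen hash stops a word that appears twice in the input from being counted twice.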
Re: Common Words, Perl Keywords
by allolex (Curate) on Nov 25, 2003 at 23:57 UTC

    This is certainly an interesting exercise (at least for me) but before getting into the meat of my remarks, I'd like to question you about your methods and sources. First of all, where did you get these words from? How big was the text source? What were your criteria for defining what "word" means? (I see "won't" and "don't" in the list, which most people would say are two words :) )

    I've noticed that most of the keywords in the list are action verbs. That seems to follow since programming is about giving a machine instructions to do something. And considering that these instructions are supposed to be on a fairly basic level, it's also not surprising that the action verbs used as Perl keywords are highly frequent.

    Actually, this makes me recall a thread started by liz about the Natural Semantic Metalanguage (NSM), which proposes that word meaning can be decomposed into atomic units of meaning called primes. Research bears out that the primes tend to have a lot in common with very frequent words in most languages. The semantic primes, however, have a special meaning that may not exactly correspond to the meaning of the word in common use, even though it may have the same name for the convenience of the linguists who use NSM to describe meaning. This is a direct parallel of programming language designers' choices about which actions, relationships, or evaluations need to be expressed as a word, and which should be expressed as a sigil, or via syntax.

      I'd like to question you about your methods and sources. First of all, where did you get these words from? How big was the text source? What were your criteria for defining what "word" means?

      I got them from an about.com site for teachers of English.

      I have no idea of the provenance or the sample.



      ($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss') =~y~b-v~a-z~s; print

        The source is attributed to "Jerry Jones" (big help). Anyone interested in looking at the list can see it here. It's supposed to be North American English, if anyone cares.

        --
        Allolex

        The Moby Lexicon project, now concluded, has several different slices of the dictionary; it found the most common words in a couple of different samples, and ranked them by prevalence. This kind of data is very useful for certain search analysis: rank a match which hits a less-common word higher than a match on mundane words. I was doing some work on protocol compression and canonical word numbering as well. The Moby Lexicon can be found with Google, and has other goodies like parts-of-speech, hyphenation, common person names by gender, and a few studies of other languages.
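The ranking idea above could be sketched roughly like this. The ranks, the document names and the log weighting are all assumptions made up for illustration; nothing here is taken from the Moby data or from the search work described.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical frequency ranks (1 = most common); the numbers are invented.
my %rank = (the => 1, of => 2, and => 4, water => 90, molecule => 900);

# Score a match by summing log(1 + rank) over the words it hit, so one hit
# on a rare word can outweigh several hits on mundane ones. Words missing
# from the table are treated as rarer than anything listed.
sub match_score {
    my ($rank, @hits) = @_;
    my $unlisted = 1 + max(values %$rank);
    my $score = 0;
    $score += log(1 + ($rank->{lc($_)} // $unlisted)) for @hits;
    return $score;
}

# Two made-up matches: one hits three common words, one hits a single rare word.
my %matches = (
    'doc A' => [qw(the of and)],
    'doc B' => [qw(molecule)],
);

my %score = map { $_ => match_score(\%rank, @{ $matches{$_} }) } keys %matches;

# Highest score first.
for my $doc (sort { $score{$b} <=> $score{$a} } keys %score) {
    printf "%-6s %.2f\n", $doc, $score{$doc};
}
```

The logarithm is just one plausible weighting; a real system would more likely derive weights from actual corpus counts.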

        --
        [ e d @ h a l l e y . c c ]

Re: Common Words, Perl Keywords
by Thelonius (Priest) on Nov 25, 2003 at 21:33 UTC
    No. For example, if you looked at the top 1000 Russian words, very few would be Perl keywords. ;-)
Re: Common Words, Perl Keywords
by Wassercrats (Initiate) on Nov 29, 2003 at 08:28 UTC
    I think it would be a good rule to have none of a programming language's keywords be English words. It would make it easier for context highlighting and any find-keyword operation, and it would make selection of non-conflicting variable names easier. They could be based in English words, but with some twist, for example, prnt instead of print, and clos instead of close. Larry might approve, since he created elsif.