Dear monks
I am trying to filter out archaic entries from a dictionary. They are marked with "(arch)". This is my best regexp so far (sorry, I never quite got used to using /x):
s#(?<=;)(?:\(\S+\) )*\(\d+\) (?:\(\S+\) )*\(arch\).*?;($| \(\d+\))#$1#g
And here is a selection of lines I am trying to filter:
(n,vs) (1) look; glimpse; glance; (vs) (2) to glance; to glimpse; (3) (arch) first meeting; (adv) (4) apparently; seemingly;
(n-t,n-adv) (1) moment; a (short) time; a while; (2) former times; (3) (arch) two-hour period;
(v5s,vt) (1) to pass (time); to spend; (2) to overdo (esp. of one's alcohol consumption); to drink (alcohol); (3) (arch) to take care of; to support; (suf,v5s) (4) to overdo; to do too much; (5) to ... without acting on it;
(pn,adj-no) (1) we; us; (2) (arch) I; me; (3) (arch) you (referring to a group of one's equals or inferiors);
(n) (1) eye; eyeball; (2) (arch) pupil and (dark) iris of the eye; (3) (arch) insight; perceptivity; power of observation; (4) (arch) look; field of vision; (5) (arch) core; center; centre; essence;
(v5m,vt) (1) to step on; to tread on; (2) to experience; to undergo; (3) to estimate; to value; to appraise; (4) to rhyme; (5) (arch) to inherit (the throne, etc.); (6) to follow (rules, morals, principles, etc.);
(v5s,vt) (1) to build up; to establish; (2) to form; to become (a state); (3) to accomplish; to achieve; to succeed in; (4) to change into; (5) to do; to perform; (aux-v) (6) (arch) to intend to; to attempt; to try; (7) (arch) to have a child;
(adv) (1) (uk) that is to say; that is; in other words; I mean; (2) (uk) in short; in brief; to sum up; ultimately; in the end; in the long run; when all is said and done; what it all comes down to; when you get right down to it; (n) (3) (uk) clogging; obstruction; stuffing; (degree of) blockage; (4) (uk) shrinkage; (5) (uk) end; conclusion; (6) (uk) (arch) dead end; corner; (7) (uk) (arch) distress; being at the end of one's rope;
(n,adj-no) (1) inside; within; (2) while; (3) among; amongst; between; (pn,adj-no) (4) we (referring to one's in-group, i.e. company, etc.); our; (5) my spouse; (n) (6) (arch) imperial palace grounds;
(v5r,vi) (1) to rot; to go bad; to decay; to spoil; to fester; to decompose; to turn sour (e.g. milk); (2) to corrode; to weather; to crumble; (3) to become useless; to blunt; to weaken (from lack of practice); (4) to become depraved; to be degenerate; to be morally bankrupt; to be corrupt; (5) to be depressed; to be dispirited; to feel discouraged; to feel down; (suf,v5r) (6) (uk) (ksb:) indicates scorn or disdain for another's action; (v5r,vi) (7) (arch) to lose a bet; (8) (arch) to be drenched; to become sopping wet;
(v5s,vt) (1) to build up; to establish; (2) to form; to become (a state); (3) to accomplish; to achieve; to succeed in; (4) to change into; (5) to do; to perform; (aux-v) (6) (arch) to intend to; to attempt; to try; (7) (arch) to have a child;
The format seems to be (part-of-speech) (number) (tags) followed by semicolon-separated definitions, where tags may contain the arch tag, and the part-of-speech is given only where it changes rather than repeated for every numbered sense.
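One way to make that format concrete is to parse each entry into numbered senses and filter them, instead of deleting spans with a single substitution. The following is only a sketch under assumptions: the names strip_arch and $pos_re are made up for illustration, each string holds exactly one entry, every sense starts with an optional part-of-speech and then a number, and the arch tag sits among the tags right after the number. It keeps the original sense numbers rather than renumbering, and it carries the part-of-speech forward so that a label attached to a removed sense is not lost for the senses that follow.

```perl
use strict;
use warnings;

# Hypothetical helper: drop every numbered sense tagged "(arch)" from one entry.
sub strip_arch {
    my ($entry) = @_;

    # A parenthesised label such as (n,vs) or (uk), but not a number like (3).
    my $pos_re = qr/\((?!\d+\))[^()\s]+\)/;

    my (@kept, $pos);

    # Walk the entry sense by sense: optional part-of-speech, number, body.
    # The body runs lazily up to the start of the next sense or the end.
    while ($entry =~ /\G\s*(?:($pos_re) )?\((\d+)\) (.*?)(?=\s*(?:$pos_re )?\(\d+\) |\s*\z)/gcs) {
        $pos = $1 if defined $1;    # part-of-speech carries forward
        my ($num, $body) = ($2, $3);

        # Skip the sense if (arch) appears among its leading tags.
        next if $body =~ /^(?:\([^()\s]+\) )*\(arch\)/;

        push @kept, [ $pos, $num, $body ];
    }

    # Re-emit the kept senses, printing the part-of-speech only when it changes.
    my (@out, $shown);
    for my $sense (@kept) {
        my ($p, $n, $text) = @$sense;
        if (defined $p && (!defined $shown || $p ne $shown)) {
            push @out, $p;
            $shown = $p;
        }
        push @out, "($n) $text";
    }
    return join ' ', @out;
}

my $sample = "(n,vs) (1) look; glimpse; glance; (vs) (2) to glance; to glimpse; "
           . "(3) (arch) first meeting; (adv) (4) apparently; seemingly;";
print strip_arch($sample), "\n";
# prints: (n,vs) (1) look; glimpse; glance; (vs) (2) to glance; to glimpse; (adv) (4) apparently; seemingly;
```

The trade-off is more code than a one-liner, but each step (splitting, testing for the tag, re-emitting) can be checked on its own. An entry whose senses are all archaic comes back as an empty string, which the caller may want to treat specially.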
In reply to Removing with regexps by Anonymous Monk