[all variants] seeking various "keyword" commands [Archive]

SaintDanBert

August 23rd, 2011, 03:50 PM

I went looking for the command kwic to create a keyword in context index of some text files. No joy.

I do not find a package by that name.

I do not find anything using apropos keyword or apropos context.

A man page for the command might look like this: http://www.thinkage.ca/english/gcos/expl/kwic.html

Output from the command might look like this:

Thus the input line

cooking the wily cauliflower

would produce output lines of the form

                 cooking the wily cauliflower
     cooking the wily cauliflower
cooking the wily cauliflower

Variations on the output report not only these strings but also the line number where each string appears in the original document. Command-line options control the length of the string treated as context, appearance and sort order, excluded words, output format (groff, TeX, HTML, &c.), and other processing details.
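To make the idea concrete, here is a minimal Python sketch of a keyword-in-context index. It is not the kwic command itself: the function name kwic, the width parameter, and the small STOP_WORDS list are all assumptions made for illustration. Each keyword is lined up at a fixed column, with the line number carried alongside.

```python
import re

# Hypothetical stop list; a real tool would let the user supply excluded words.
STOP_WORDS = frozenset({"a", "an", "and", "of", "the", "to"})


def kwic(lines, width=30, stop_words=STOP_WORDS):
    """Return sorted (keyword, line_number, entry) tuples.

    In each entry the keyword starts at column `width`, preceded by up to
    `width` characters of left context and followed by up to `width`
    characters of right context.
    """
    entries = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        for match in re.finditer(r"[A-Za-z']+", line):
            word = match.group().lower()
            if word in stop_words:
                continue
            left = line[max(0, match.start() - width):match.start()]
            right = line[match.start():match.end() + width]
            # Right-justify the left context so every keyword lines up.
            entries.append((word, lineno, f"{left:>{width}}{right}"))
    return sorted(entries)


# Small demonstration on the sample line from the man page.
for keyword, lineno, entry in kwic(["cooking the wily cauliflower"]):
    print(f"{lineno:4d} {entry}")
```

Sorting the tuples orders the index by keyword and then by line number; other sort orders or output formats would be layered on top of the same list.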

There is a second command -- I forget the name -- that will read a text file and generate a "histogram" of the keywords. The output consists of separate lines, each line holding a word and a count. The most frequently used word appears first. The least frequently used word appears last. Command line options alter the sort order and other details of the presentation.

Thanks in advance,
~~~ 0;-Dan

SaintDanBert

August 25th, 2011, 03:29 PM

Bump!

Someone out there must know about kwic indexing for Linux documents.

~~~ 0;-Dan

jfb3

August 25th, 2011, 04:51 PM

Have you checked http://stsdas.stsci.edu/cgi-bin/gethelp.cgi?histogram
???

SaintDanBert

August 25th, 2011, 06:38 PM

Yes, I've used that tool often ... given the data.

I'm looking for a tool that will scan a text or similar document file
and count words:

prompt$ someCommand prose.txt

the 5287
a 3123
uh 1456
quick 234
brown 156
fox 82
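For what it's worth, output of that shape can be approximated with a short Python sketch. This is only an illustration, not the command being sought: the function name word_counts and the rule for what counts as a word (runs of letters and apostrophes, lowercased) are assumptions.

```python
import re
from collections import Counter


def word_counts(text):
    """Return (word, count) pairs, most frequent word first."""
    # Assumption: a "word" is a run of letters or apostrophes, case-folded.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


# Small demonstration; for a real file, pass open("prose.txt").read() instead.
for word, count in word_counts("The quick brown fox and the lazy dog and the cat"):
    print(word, count)
```

Words with equal counts come out in the order they first appear in the text; a different tie-break or an ascending sort would need an explicit sorted() call on the result.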

The other tool I'm seeking does the keyword index.

Cheers,
~~~ 0;-Dan