Free Software for DOS
Text Utilities – 2
Spellers, Dictionaries, Text Analysis, ASCII Charts

9 Dec 2005

This page:
ASCII TEXT SPELLCHECKERS

WORD LISTS, DICTIONARIES, ENCYCLOPEDIAS

WORD COUNT & TEXT ANALYSIS

ASCII CHARTS

ASCII TEXT SPELLCHECKERS

International Ispell (1) — Interactive text and HTML spell checker.

unrated

[added 1998-07-03, updated 2005-12-08]

Ispell, an interactive spell checker developed for Unix platforms, can be used as a standalone program or as an external checker for many power editors. This version includes English dictionaries (UK & US) and runs in text or HTML mode. This 32-bit DJGPP build requires an 80386 or better and a DOS Protected Mode Interface host (CWSDPMI or other).

From the program help:
Whenever a word is found that is not in the dictionary,
it is printed on the first line of the screen. If the dictionary
contains any similar words, they are listed with a number
next to each one. You have the option of replacing the word
completely, or choosing one of the suggested words.

Commands are:

 R       Replace the misspelled word completely.
 Space   Accept the word this time only.
 A       Accept the word for the rest of this session.
 I       Accept the word, and put it in your private dictionary.
 U       Accept and add lowercase version to private dictionary.
 0-n     Replace with one of the suggested words.
 L       Look up words in system dictionary.
 X       Write the rest of this file, ignoring misspellings,
         and start next file.
 Q       Quit immediately. Asks for confirmation.
         Leaves file unchanged.
 !       Shell escape.
 ^L      Redraw screen.
 ^Z      Suspend program.
 ?       Show this help screen.

Authors: Geoff Kuenning et al. Port by Eli Zaretskii, Israel (2001).

2005-05-14: v3.3.01.

Downloads
Binaries, manual
isp3301b.zip
(978K)
Source
isp3301s.zip
(721K)

Geoff Kuenning's International Ispell Home Page.


International Ispell (2) — Interactive spell checker, supports 8-bit characters.

unrated

[added 1998-04-06, updated 2005-04-16]

This EMX/gcc-compiled DOS & OS/2 port minimally requires a 386 PC, but I'd recommend a fast 486 or Pentium with at least 8MB RAM and a disk cache. The package is a very large download, containing executables, source, and multiple language dictionaries (Dutch, English, French and German). The compiled English dictionary requires about 4.7MB disk space (contains at least 210,000 unique words including many technical and scientific terms). Supports 8-bit characters. Supports maintenance of a user ("private") dictionary, which by default is stored in the root directory with the filename _english. All in all, I like the comprehensiveness and "intelligence" of this ISPELL. The program itself loads slowly on a Pentium 60 (w/ 8MB RAM), and is much too slow on a 386/20 (8MB). Requires ANSI.SYS or equivalent, and DOS extender (included). I wouldn't waste time downloading this package unless you're willing to invest a _little_ time with setup. Package includes C source code.

Core commands are same as for v3.3, above.

Authors: Geoff Kuenning et al. (1983-1997). Port by Piet Tutelaers, Netherlands (1997).

1997-08-15: v3.1.20.

Download ispellw32.zip (2.5MB).

Geoff Kuenning's International Ispell Home Page.


GNU ispell — Interactive spell checker, runs well on older PCs.

unrated

[added 1998-04-06, updated 2005-04-16]

This old, but widely distributed 16-bit ispell includes only an English dictionary (38,000 words / 156K on disk). Run the program without parameters to check a single word, or pass it a filespec and it will enter a line-by-line interactive check / correction mode. It can check multiple files in sequence if you pass it a wildcarded filespec. The package lacks usage documentation (but see Downloads, below) and unless you're familiar with ispell, you could end up frustrated. Just hit the "?" key when inside the program (started with ispell ?) to get the list of navigation commands. Easy to use. I'm sure there are additional hidden features, but I haven't used it much. Runs briskly enough on a 386/20.

Commands are:

 R       Replace the misspelled word completely.
 Space   Accept the word this time only
 A       Accept the word for the rest of this file.
 I       Accept the word, and put it in your private dictionary.
 0-9     Replace with one of the suggested words.
 <NL>    Recompute near misses.  Use this if you interrupted
         the near miss generator, and you want it to
         return to this word.
 Q       Write the rest of this file, ignoring misspellings,
         and start next file.
 X       Exit immediately.  Asks for confirmation y/n.
         Leaves file unchanged.
 !       Shell escape.
 ^L      Redraw screen.

Other: To exit single-word mode, type ^C. Package includes the Look utility.

Capabilities that are absent in GNU ispell vs. International Ispell: GNU's is not case sensitive, its suffix handling is more primitive, and it won't allow non-alphabetical characters into the dictionary.

Authors: Pace Willisson (1988). Port by Pavel Ganelin (1993).

1993-10-26: v4.0 (despite the version number, it is older than the Unix-based versions 3.x).

Downloads
Binaries
ispel40x.zip
(260K)
Source, full docs
ispell-4.0.tar.gz
(379K)

JSPELL — Excellent interactive spell checker (English dictionary).

* * * * *

[added 1998-09-17, updated 1998-10-25]

When considering both ease-of-use and versatility, you won't find a better choice than JSPELL. Note: JSPELL may not run on some faster Pentiums (divide overflow error) – use SLOWDOWN to avoid the error.

Author: Joohee Jeong (1998). Suggested by Robert Bull, Scott Nesbitt.

1998-10-21: v2.1x (10-98); "Added a feature that can omit lines starting with > or any other string specified by the user in the configuration file jspell.cfg. This feature is useful in spell-checking a reply to an email message. This version is freeware (No need for the registration code file)."

Download jspel211.zip (209K).


SpellTest — Spell checker for plain or html text; interactive mode or file report (English dictionary).

unrated

[added 1999-04-18, updated 2004-10-29]

This speller could be particularly useful to web authors because it ignores HTML codes in documents during a spell check. SpellTest can run in two modes: 1. A simple interactive mode allows manual replace of unknown terms – but has no features like "ignore all" or "add to custom dic"; 2. SpellTest probably functions best as a report-to-file speller. Reported terms are referenced by original document line numbers. No limit on text file sizes. Includes a large 2MB dictionary and user dictionaries are supported. Requires a fast 386+ PC and about 2MB RAM.

Usage : spelltst.exe <file> <options>
Options:
       -r:<report name> , by default report.txt
       -n Dont load addishional dictionaries.
       -o Online error fixing. (Ascii text files only).
       -nr Dont create a report file.
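
The report-to-file mode can be pictured with a short Python sketch. This is not SpellTest's code: the function name report_unknown, the crude tag-matching pattern, and the tiny in-memory dictionary are all assumptions made for illustration. It only shows the idea of skipping HTML codes and referencing unknown terms by their original line numbers.

```python
import re

TAG = re.compile(r"<[^>]*>")      # crude matcher for tags that open and close on one line (assumption)
WORD = re.compile(r"[A-Za-z']+")  # letters and apostrophes count as word characters

def report_unknown(lines, dictionary):
    """Return (line_number, word) pairs for words missing from the dictionary."""
    known = {w.lower() for w in dictionary}
    report = []
    for number, line in enumerate(lines, start=1):
        text = TAG.sub(" ", line)          # ignore HTML codes, keep the visible text
        for word in WORD.findall(text):
            core = word.strip("'")         # drop stray quote marks around a word
            if core and core.lower() not in known:
                report.append((number, core))
    return report

if __name__ == "__main__":
    sample = ["<p>This is a <b>smple</b> page.</p>", "No erors here?"]
    words = ["this", "is", "a", "page", "no", "here"]
    for number, word in report_unknown(sample, words):
        print(f"line {number}: {word}")
```

Because the line numbers come from the untouched input, the report can be used alongside any editor to fix the original document by hand.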

Author: Oleg Stepanyuk, Russia (1999).

Download spelltst.zip (972K).


GDSPELL — Interactive spell checker handles big files. (English dictionary)

* * *

[updated 2005-04-10]

GDSPELL is an easy to use standalone spell checker from the developers of the freeware NE editor (also included here). Both programs use the same dictionary, so you don't need to clutter your hard disk with different dictionaries. Unlike NE, GDSPELL can check big files, and create and use a custom dictionary. Spell checking dialog is similar to those found in popular word processors.

Limitations:

EXE size: 55K; Dictionary size: 370K

(Thanks to Yves Bellefeuille's freeware list for pointing me to this one).

Author: G.D. Davis (1995); distributed by GDSoft.

1995-06-01: v3.00b.

Download gdsp300b.zip (414K).


Tschek — Spell checker outputs list of all misspelled words to screen or file.

* * *

Most of us use word processors or stand-alone dialog spell checkers to perform "on-the-fly" spell checking and correction (e.g., GDSpell). But sometimes these spell checkers can be cumbersome and time consuming because they prompt word by word. If you are spell checking an HTML or technical document with a "dumb" spell checker, this can be tedious. Of course, you could add all those strange words or tags to a user dictionary, but that's no fun either. Or, you could use a spell checker that simply outputs a list of unrecognized words to a file without any prompting or correction. You can browse the output file, quickly locate words that are obvious typos, and manually correct the original document (e.g., using a search / replace tool).

Author: Timo Salmi, Finland (1996).

1996-03-02: v1.5.

Download tschek15.zip (68K).

More in these pages from Timo Salmi.


Look — Look up words (from a word list) to verify spelling.

unrated

[added 2001-10-21, updated 2005-04-16]

Look is not a spell checker but rather lists words from a word list file that most closely match a string (i.e., useful for looking up an uncertain spelling). Look is included in some ISPELL distributions but here it is listed separately to bring more attention to it.

Look.exe appears to work a lot like grep (in fact, it requires grep/egrep/fgrep for the -r option). However, it has certain conveniences for looking up words in a spelling list. With no options, look searches a word list file for all words that start with the first characters of the string you give it. Options allow it to ignore caps or small letters, use bona fide regular expression wildcards, and use dictionary order. Look is meant to be used within editors like vi that allow you to run external programs. It can also be used on the command line.

Look appears to be happy using any ASCII spelling list, such as the SIL Word List, or Moby Words (users can add, remove, or modify words in such lists with any text editor). By default, look uses a word list named ISPELL.WOR (included), but you can supply a different file as an option.

usage: look [-dfr] string [file]
 -d  dictionary order: consider only letters, digits, and spaces
 -f  fold upper case to lower
 -r  string is a regular expression

Note: Use of regular expression switch -r requires the programs grep/egrep/fgrep (not included) in path.
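
The default prefix lookup and the -f switch can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Look's source: the function name look, the fold parameter, and the use of binary search over an already sorted list are assumptions.

```python
from bisect import bisect_left

def look(prefix, words, fold=False):
    """Return every word in `words` that starts with `prefix`.

    `words` must already be sorted. With fold=True it must be sorted
    case-insensitively, which loosely mirrors the -f switch.
    """
    key = str.lower if fold else (lambda s: s)
    target = key(prefix)
    keys = [key(w) for w in words]
    start = bisect_left(keys, target)   # first entry that is >= the prefix
    matches = []
    for word, k in zip(words[start:], keys[start:]):
        if not k.startswith(target):
            break                       # sorted input: no later entry can match
        matches.append(word)
    return matches

if __name__ == "__main__":
    word_list = sorted(["Apple", "append", "apply", "apt", "banana"], key=str.lower)
    print(look("app", word_list, fold=True))
```

Binary search is a natural fit for a sorted spelling list because the matching words sit next to each other, so the scan can stop at the first non-match.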

Suggested by Howard Schwartz.

Look.exe is part of the GNU ispell binaries package, above.


WORD LISTS, DICTIONARIES, ENCYCLOPEDIAS

Moby Words — English word, name, and phrase lists; 610,000+ entries (ASCII).

* * * * *

[added 2001-10-21, updated 2004-06-29]

Moby Words is part of the Moby Project, a large collection of lists of words and phrases, and works of literature (contents are now in the public domain).

Partial contents of Moby Words:

Author: Grady Ward (1996).

Downloads
Moby Words
mwords.tar.z
(4MB)
Moby Project
moby.tar.z
(26MB)

Get more info at the Moby Words page.

If you don't want a 26MB download, go to the Moby Project page for smaller pieces.


SIL Word List — ASCII English, 110,000 words, can function as dictionary for some spellers.

unrated

[updated 2004-07-02]

Four text files contain approximately 110,000 English words total. The set can be used as a large dictionary for spellers that can use ASCII-only dictionaries. See Tschek for an application.

From the doc:
This word list includes inflected forms, such as plural nouns and the -s, -ed and -ing forms of verbs. Thus the number of lexical stems represented in the list is considerably smaller than the total number of words.

Author: Evan Antworth / SIL International (1991).

Downloads
A–D
words1.zip
(95K)
E–K
words2.zip
(75K)
L–R
words3.zip
(99K)
S–Z
words4.zip
(85K)

Jorj — English dictionary for DOS.

* * *

Jorj is a stand-alone dictionary program for DOS. Two executable versions are packaged together (compiled for different memory usage). Jorj can be run in memory resident (pop-up) or non-memory resident modes. One of the provided executables ("Omega") will use XMS memory when the program is run as a TSR.

One unique feature of Jorj is its ability to search for entries even when your spelling is incorrect. Jorj also has a "word scan" feature that will list all entries containing a given search string. The lexicon has some significant drawbacks. The word list is small but adequate (larger in registered version) and definitions are brief – and not authoritative. Words are syllabified, but parts of speech are lacking. Even with these shortcomings, JORJ still serves as a handy reference.
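
For readers curious how a lookup can tolerate a misspelling, here is a minimal Python sketch. Jorj's own matching method is not documented here, so the similarity ratio from Python's standard difflib module is swapped in as a stand-in. The function names fuzzy_lookup and word_scan and the sample entries are hypothetical.

```python
from difflib import get_close_matches

def fuzzy_lookup(query, entries, limit=3):
    """Return up to `limit` entries that resemble a possibly misspelled query."""
    return get_close_matches(query.lower(), entries, n=limit, cutoff=0.6)

def word_scan(fragment, entries):
    """Return all entries that contain the given search string."""
    needle = fragment.lower()
    return [entry for entry in entries if needle in entry.lower()]

if __name__ == "__main__":
    lexicon = ["receive", "recipe", "deceive", "relieve"]
    print(fuzzy_lookup("recieve", lexicon))   # candidates for a misspelled word
    print(word_scan("cei", lexicon))          # entries containing "cei"
```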

EXE size: 35K (alpha) or 64K (omega). Dictionary size: 1.2MB.

Author: George Fredal / Jorj Software (1997).

1997-01-01: Unnumbered release.

Download jorj97.zip (652K).


Probert E-Text Encyclopaedia (PENC) — Text-only encyclopedia.

* *

[updated 1998-11-10]

This text-only encyclopedia (not a program) is comprehensive enough to be useful. An encyclopedia this small can't add much detail to entries – and it doesn't. It should not be viewed as "authoritative" since there is no reference to author's source information. The text version is obviously difficult to navigate (hint: use LIST viewer and its find function). There are online encyclopedias on the Net (including the updated, but commercial / copyrighted online version of this one) that are easier to use and won't gobble up your disk space. Beyond these limitations, this may be a useful reference for you. It is divided into major topical sections, within which entries are sorted alphabetically. There are some areas covered which some "popular" CD-ROM dictionaries often neglect.

This encyclopedia has become a shareware / commercial product – the final freeware versions listed here are still available, but are no longer being updated.

Author: Matthew Probert, UK (1998).

1998-08-02: Edition 16.0.

Download either format
Plain text
penc1g.zip
(1.7M)
HTML
penc1ih.zip
(3.6M)

Browse the current edition of The Probert Encyclopaedia online.


WORD COUNT & TEXT ANALYSIS

Also see UXUTL or the GNU Textutils for UNIXish WC.


WCNT — Count and analyze word frequency in text and HTML documents.

* * * *

One of the more comprehensive "word count" programs I've encountered. It includes a host of options: Can analyze HTML documents (ignores tags in word counts). Count of lines, characters, non-whitespace characters, words, distinct words and unique words. Average length of words, distinct words and unique words. Sorted word lists with frequencies. Word length distribution histograms. Configurable word sets. DOS code page awareness. Multiple filespecs with wildcards: Outputs combined statistics of all files when passed a filespec with wildcards. Donationware.
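
The kinds of figures listed above can be approximated with a short Python sketch. It is not WCNT's algorithm. In particular, I am assuming that "distinct words" means the number of different words and "unique words" means words that occur exactly once; the function name word_stats and the letters-only word pattern are also assumptions.

```python
import re
from collections import Counter

def word_stats(text):
    """Count words, distinct words, unique words, and word lengths."""
    words = re.findall(r"[a-z]+", text.lower())   # letters only (assumption)
    freq = Counter(words)
    return {
        "words": len(words),
        "distinct": len(freq),                                   # different words
        "unique": sum(1 for n in freq.values() if n == 1),       # used exactly once
        "average_length": sum(map(len, words)) / len(words) if words else 0.0,
        "frequencies": freq.most_common(),                       # sorted by count
        "length_histogram": dict(Counter(len(w) for w in words)),
    }

if __name__ == "__main__":
    stats = word_stats("The cat and the hat and the bat")
    for name, value in stats.items():
        print(name, value)
```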

Author: Branko Radovanovic, Croatia (1997).

1997-04-23: v1.20.

Download wcnt120.zip (20K).


wc — Simple word count program.

* * *

A DOS clone of the Unix wc utility with some added features. Unlike WCNT, wc (1) lists individual file stats when passed a filespec with wildcards, and (2) can read from standard input as well as from files. wc also generates error level values for use in batch files.

Author: Roman Nurilov (1997).

1997-08-06: v1.1

Download wc_11.zip (9.4K).


wc24 — Word counter also counts sentences, calculates readability index.

* * *

[updated 2005-04-16]

Another word count program that can optionally count sentences and generate a rough and ready "readability" index based on a combination of word length and sentence length. Can read from standard input as well as from files.
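
The formula wc24 uses is not given here. As one example of an index built from word length and sentence length, the sketch below computes the Automated Readability Index, 4.71 x (characters per word) + 0.5 x (words per sentence) - 21.43. The function name readability and the simple rules for splitting words and sentences are assumptions.

```python
import re

def readability(text):
    """Return (words, sentences, ARI score), or None for empty input."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    # A sentence is taken to end at a run of '.', '!' or '?' (a rough rule).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return None
    chars = sum(len(w) for w in words)
    score = 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43
    return len(words), len(sentences), score

if __name__ == "__main__":
    print(readability("The cat sat. The dog ran!"))
```

Higher scores indicate longer words and longer sentences; very simple text can score below zero.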

Author: Bob Ferguson, Netherlands (1996).

Download wc24.zip (15K).

More in these pages from Bob Ferguson.


TI (Text Information) — Comprehensive text file statistics generator.

unrated

[added 1999-08-19]

This program generates a wealth of statistics about a text file including size of file, whitespace, lines, blank lines, shortest line, longest line, average line length, average with blanks, number of pages, lines per page. Expects a single filename.

Options:
 -A#       Display how many times each letter is used.
	   the '#' represents an optional character to be counted specifically.
 -C#       Same as '-A' switch, except # is the numerical ASCII value of the
	   character to be counted specifically (A = 65 or 97, @ = 64).
 -F#       Treat words > # characters as long when calculating fog index *
	   # must be included in -F switch (default 9 if no -F switch)
 -L#       Display length of each line in the file. # is an optional number
	   specifying how long a 'long' line is, in characters
 -O:name   Prints output to both the screen and an output file, the name of the
	   output file is optional - the input name with .ti extension will be
	   used if no name is given. 'ti input.txt > output.txt' will not work.
 -P#       The number of lines per page for calculating number of pages.
 -T        Assume non-text file, changes handling of ascii values 128-255.
 -W        Display a list of word lengths used in the file.
 -?        This help screen.
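
A few of the line statistics listed above can be reproduced with the Python sketch below. It is an illustration only: I am assuming that "average line length" skips blank lines while "average with blanks" includes them, and the default of 60 lines per page is a placeholder rather than TI's documented default. The function name line_stats is hypothetical.

```python
import math

def line_stats(text, lines_per_page=60):
    """Summarize line counts and lengths for a block of text."""
    lines = text.splitlines()
    nonblank = [ln for ln in lines if ln.strip()]
    lengths = [len(ln) for ln in nonblank]
    return {
        "lines": len(lines),
        "blank_lines": len(lines) - len(nonblank),
        "shortest": min(lengths, default=0),     # shortest non-blank line
        "longest": max(lengths, default=0),
        "average": sum(lengths) / len(nonblank) if nonblank else 0.0,
        "average_with_blanks": (
            sum(len(ln) for ln in lines) / len(lines) if lines else 0.0
        ),
        "pages": math.ceil(len(lines) / lines_per_page),
    }

if __name__ == "__main__":
    for name, value in line_stats("abcd\n\nab\nabcdef\n", lines_per_page=3).items():
        print(name, value)
```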

Author: Quentin J. Christensen, Australia (1999).

Download ti12.zip (30K).


ASCII CHARTS

CTRLALT — TSR pops up ASCII charts, Hex table, ANSI codes, mark/paste, more.

* * * *

[updated 2005-07-01]

This may be the oldest DOS program I still use. CtrlAlt was released in 1986, but still serves a useful purpose if you're a DOS lover. Using a variety of mnemonic key combinations, you can pop up all sorts of charts from which you can paste special characters into a document. Includes ASCII, Hex, and Ansi code charts, key scan codes, line drawing characters. Can also mark and copy screen text. Includes more exotic stuff as well. Note: CTRLALT cannot unload itself from memory. This is one program that requires a thorough reading of the documentation – there are no help screens.

Authors: Barry Simon and Richard M. Wilson (1986).

1986-04-17: v1.0.

Download ctrlalt.zip (53K).


ASCIITab — TSR, mouse compatible ASCII chart.

unrated

[added 1998-04-28, updated 2005-07-01]

A nicely enhanced TSR ASCII chart symbol picker; especially useful in editors. Allows creation of a multiple character string on the chart's editing line (up to 255 symbols). Requires about 4.1K RAM with a default 128 character buffer. README.EXE displays in English and Russian.

Other features:

Also in download package: FontGrab v2.1b, font dumper utility, and Int9 v2.0b, reads keyboard scancodes.

Author: Dmitry B. Afanasiev, Russia (1996).

1996-05-06: v3.50.

Download ascii350.zip (34K).


T-CHAR — Non-TSR ASCII chart.

unrated

[added 1998-04-28, updated 2005-07-01]

1994-11-05:

A slick non-TSR ASCII chart which displays ASCII, HEX, and BIN values of selected characters. Returns all values for selected character. Originally part of the Terminate communications software package.
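
The three values such a chart reports for a character are easy to derive, as this small Python sketch shows. It is not T-CHAR's code; the function name char_values and the output layout are assumptions.

```python
def char_values(ch):
    """Return the decimal, hexadecimal, and binary codes of one character."""
    code = ord(ch)
    return {
        "char": ch,
        "dec": code,
        "hex": f"{code:02X}",   # two hex digits, upper case
        "bin": f"{code:08b}",   # eight binary digits
    }

if __name__ == "__main__":
    # Print a small slice of the chart, one character per row.
    for code in range(65, 71):
        row = char_values(chr(code))
        print(row["char"], row["dec"], row["hex"], row["bin"])
```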

Author: Bo Bendtsen, Denmark (1994). Suggested by Robert Bull.

Download t-char.zip (16K).




©1994-2004, Richard L. Green.
This Edition ©2004-2005, Richard L. Green and Short.Stop.