|Free Software for DOS|
Text Utilities 3
Sort, Compare / Difference, Convert, PDF
|9 Dec 2005|
|This page:||FILE SORTING|
|FILE COMPARE / DIFFERENCE|
|POSTSCRIPT AND PDF|
|CONVERT UNIX < > DOS FORMATS|
|CONVERT OTHER FORMATS|
Also see: 32-bit SORT included with the GNU Textutils.
RPSORT Sorts large files extremely fast.
* * * * *
[added 1998-03-21, updated 2004-06-28]
A super-fast sort program which handles large files. "RPSORT supports numerous sort key types including regular text keys, C language strings, Turbo Pascal strings, signed and unsigned binary integers of any length and several types of binary floating point numbers."
From a reader:
I tested many of the sort programs in the SimtelNet repository on text files. Most are limited somehow (like DOS sort), or choke, or take a long time to sort, or plainly produce a wrong output (missing or extra records, etc.). The final two survivors were msort and rpsort. I tested both on very long text files (tens of megabytes: the collated complete works of Shakespeare, Project Gutenberg). Msort took several tens of minutes, rpsort did the same in *seconds* (I thought it hadn't run at all.) Given that, there was nothing else to say about DOS sort programs, in my opinion.
Author: Robert Pirko (1992). Suggested by João Magalhaes.
Download rpsrt102.zip (88K).
PCSORT Full screen text sort program, supports block, word, and multi-line sorting.
PCSORT (9K) runs as a full screen, interactive program by default but can also function in the role of command line filter. Although source file size is limited by available conventional memory, PCSORT offers an easy-to-use interface and can sort multiline records (up to 9 lines) and blocks simultaneously. Results can be viewed before being written to disk.
/Sn                   n=size of record in lines (1-9)
/Pn                   n=sort priority (1-9)
/R                    Sort current priority in reverse order
/N                    Numeric sort current priority
/C                    Case sensitive sort
/L[n]                 Line sort: n=record sort line (1-9)
/[B][+] nn [xx [y]]   Block or column sort: nn=start column xx=width y=sort line (1-9)
/W [+|-] n            Word sort: n=word count; minus = count from end of record
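To make the multi-line record idea concrete, here is a minimal Python sketch of sorting fixed-size records by one chosen line, roughly what the /Sn, /L[n], /R and /C switches describe. It is an illustration only, not PCSORT's code; the function name `sort_records` and its parameters are hypothetical.

```python
def sort_records(lines, record_size=1, key_line=1, reverse=False, case_sensitive=False):
    """Sort fixed-size multi-line records by one of their lines.

    record_size and key_line loosely mirror /Sn and /L[n]; the names are
    invented for this sketch and are not part of PCSORT.
    """
    if not 1 <= key_line <= record_size:
        raise ValueError("key_line must be between 1 and record_size")

    # Group consecutive lines into records of record_size lines each.
    records = [lines[i:i + record_size] for i in range(0, len(lines), record_size)]

    def key(record):
        # A short trailing record may lack the key line; treat it as empty.
        text = record[key_line - 1] if len(record) >= key_line else ""
        return text if case_sensitive else text.lower()

    records.sort(key=key, reverse=reverse)
    # Flatten the sorted records back into a list of lines.
    return [line for record in records for line in record]
```

With two-line records (a name followed by an address), `sort_records(lines, record_size=2, key_line=1)` keeps each address attached to its name while ordering by name.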
Screen menu commands: F1 Displays all sort fields; Alt-F1 Resets all the sort variables to their defaults; F2 Save file; F3 New file; F4 Sort text; F5 Increase lines per record (1-9); Shift F5 Decrease lines per record; F6 Select next key priority (1-9); Shift F6 Select previous key priority; F7 Sort order (de/ascending); F8 Alphanumeric or Numeric sort; F9 Select next Field type: Line, block, word or none; Shift F9 Select previous Field type; F10 Mark the record line for line sort or mark block sort field or select sort word count; Shift F10 Reverse selection of word count.
The v. 1.1 update of PCSORT was originally published in 1991 but apparently is not widely distributed on the Net. The pcsort11.zip archive contains the asm source code, the doc file and the com program for PCSORT as updated 4/18/91 to fix a problem with form feeds at ends of data files. Also contains PCSORT article published in PC Mag: see the included *.xyw (XyWrite) docs.
Author: Michael J. Mefford, for PC Magazine (1991). Suggested by Robert Bull.
Download pcsort11.zip (40K).
RALPH Sort lines of text in reverse alphabetical order.
* * * *
RALPH sorts lines of text from right to left, i.e, lines are read backwards. If input has multi-word lines, then output will be sorted by line-final words, etc.
ear earache earaches eardrop eardrops eardrum eardrums eared earflap earflaps earful earfuls elephant elephantiases elephantiasis elephantine elephants imprecated imprecates imprecating imprecation imprecations raindrop raindrops raining > eared imprecated earache elephantine raining imprecating earful eardrum imprecation earflap raindrop eardrop ear earaches elephantiases imprecates elephantiasis earfuls eardrums imprecations earflaps raindrops eardrops elephants elephant
abajar abajo desganar desganchar desgano desgarbado desgarbilada desgarbilado desgarbo desgargantar desgargantarse desgargolar desgaritar desgarrada desgarradamente desgarrado > desgarbilada desgarrada desgargantarse desgarradamente desgarbo desgarbado desgarbilado desgarrado abajo desgano desganchar abajar desgargolar desganar desgaritar desgargantar
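The right-to-left ordering shown in the examples above can be sketched in a few lines of Python: each line is compared by its reversed text. This is an illustration of the idea only, not RALPH's implementation, and the function name `reverse_alpha_sort` is made up for the example.

```python
def reverse_alpha_sort(lines):
    """Sort lines as if each one were read backwards (right to left)."""
    # line[::-1] is the reversed string; it is used only as the sort key,
    # so the lines themselves come back unchanged.
    return sorted(lines, key=lambda line: line[::-1])


if __name__ == "__main__":
    for word in reverse_alpha_sort(["ear", "eared", "earache", "raining", "earful"]):
        print(word)
```

Words sharing an ending (a suffix or rhyme) end up next to each other, which is why this kind of sort is handy for linguists.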
Syntax: ralph [-a] [-p padding] [infile] > [outfile] -a Extract analysis failures from an AMPLE log file. -l linesize Set the maximum line length (default is no limit). -p padding Specify the minimum padding for each line (default is 0). If no infile is specified, ralph reads from the standard input. If no outfile is specified, ralph writes to the standard output.
Author: SIL International (1998).
1989-01-24: v1.1 for DOS16. Runs on DOS 2.0+. Handles files up to ~128K. Bug: removes top bit from upper ASCII characters (fixed in v1.1b). Package also contains scripts with similar function, for awk and other Unix programs.
1998-09-01: v1.1b for DOS32. DJGPP build, requires 80386+ and a DOS Protected Mode Interface (CWSDPMI or other).
1998-09-01: v1.1b for Win32 console.
Get more programs for linguists from the SIL Software Catalog.
|FILE COMPARE / DIFFERENCE|
Text file compare programs are frequently used by programmers for version maintenance but they can also be used by us common folk to compare two versions of an ascii document (e.g., different versions of those autoexec.* files that accumulate in your root directory, file lists, etc.). The programs below may use different "difference" algorithms each with unique strengths / limitations. Not usually a major issue for simple uses, but you may wish to try them all to determine which suits your needs and style best. The programs listed below may not be the best picks for programming needs.
Double Lister Dual window text comparer.
[added 1998-12-15, updated 2004-08-20]
From the docs:
...displays two files simultaneously in separate windows. These can be scrolled individually, or locked together, useful to locate differences between similar files. Options: Search for text string, split windows horizontally or vertically, change window size and tab spacing, display line ends, 7-bit mode, hex mode with offsets and alignment.
Author: Steven S. Bates (1989). Suggested by Robert Bull.
Visual Compare Feature-rich file comparer.
* * * * *
A favorite for general use because of the flexible display options. Interactive and command line modes; possesses an internal viewer with scrolling capability; by default it colorizes new/old/changed text which makes for easy comprehension of differences. Split window (horiz. or vertical), dual file display option. Flexible output options. Understands UNIX formatted text.
"The maximum allowed line length in file one and file two is 2048 characters. The maximum number of lines that file one and file two each can contain is 16368. The maximum number of lines that the composite file can contain is 16368."
Difference algorithm used: "linear space refinement of the basic O(ND) difference algorithm."
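For readers curious about what an "O(ND) difference algorithm" does, the sketch below computes the length of the shortest edit script (insertions plus deletions) between two sequences using the basic greedy form of that approach. It is not VCOMP's code and it omits the linear-space refinement the docs mention; the function name `edit_distance` is hypothetical.

```python
def edit_distance(a, b):
    """Length of the shortest insert/delete script turning a into b.

    Basic greedy O(ND) search: for each edit count d, track the furthest
    x position reached on every diagonal k = x - y.
    """
    n, m = len(a), len(b)
    furthest = {1: 0}  # furthest[k] = largest x reached on diagonal k
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            # Step down (insertion) or right (deletion), whichever
            # starts from the further-reaching neighbouring diagonal.
            if k == -d or (k != d and furthest[k - 1] < furthest[k + 1]):
                x = furthest[k + 1]
            else:
                x = furthest[k - 1] + 1
            y = x - k
            # Follow matching elements for free.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest[k] = x
            if x >= n and y >= m:
                return d
    return n + m
```

Passing two lists of lines (for example from `str.splitlines()`) gives a line-based distance, which is how file comparers usually work; the cost grows with the number of differences rather than with file size alone.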
Command line usage: VCOMP fileone filetwo [options]
Options:
/B      Monochrome display.
/Tn     Tab width. Range is 2-64. Default is 8.
/25     Display 25 lines if you have either an EGA or a VGA.
/43     Display 43 lines if you have an EGA.
/50     Display 50 lines if you have a VGA.
/S[-]   Write edit script to standard output.
/C      Write composite file to standard output.
/D      Write difference file to standard output.
/En     Maximum edit distance. Range is 0-32736. Default is 32736.
/I      Ignore leading space and tab characters.
/K      Consider upper-case and lower-case letters equivalent.
/Z      Consider all characters significant.
Author: John R. Whitney (1993).
Download vc154.zip (38.4K).
@COMPARE Text file comparer for very large files.
* * *
(aka "ATCOMPARE", "ACOMPARE"). Comprehensible output to screen is color coded, but you can't scroll back through output as in VCOMP. Easily digestible report-to-text file output with side-by-side comparisons (unfortunately, broken word fragments can result from the program's wrapping of text when generating side-by-side comparisons). Limitations:
Usage: @Compare [options] [<filename1> [<filename2>]]
where [options] begins with / or -, and is a combination of the following:
P -directs output to the printer
F -directs output to a file
M -suppresses colors for monochrome monitors
T -suppresses the title header
H -suppresses highlighting in unequal lines
A -replaces graphics characters with standard Ascii codes
R -prints a report of discrepancies by field to a file
C -disables breaks after every screenful of output
L -allows for long and E -extra long searches; not usually necessary
B -suppress direct video writes; use BIOS instead
Q -quits
Author: Brian C. Madsen (1994-98). Suggested by Marianna Van Erp.
1999-05-13: v1.8. Bug fix for fast Pentiums.
Download atcomp18.zip (23K).
jDif Fast file difference utility.
Don't have much experience with this one. Fast, color-coded output to screen, or send results to report file.
Limitations:
Syntax: jdif oldfile newfile [options]
/a   370 Assembler (columns 1 to 72)
/c   COBOL (columns 7 to 72)
/f   DOS FC style output
/h   Help (this is it)
/r   output Report
/v   do not buffer output
"Private persons are hereby licensed to use the software at home for non-commercial purposes at no charge."
Author: Jonathan Rosenne / QSM Programming Ltd., Israel (1996). Suggested by Marianna Van Erp.
Download jdif01.zip (36K).
FINTRSCT Compare 2 files; outputs shared / unique lines to 3 report files.
[added 1998-07-05, updated 2001-07-18]
File Intersection takes a different approach to the task of file comparing. FINTRSCT compares two (smaller) files and outputs three files: one file listing lines unique to file 1; a second file containing lines unique to file 2; and a third file containing paired shared lines. Lines are numbered to allow easy location in original files. The order of lines in the input files is not relevant, and comparisons are case insensitive. Useful for comparing different versions of win.ini, autoexec.bat, etc. Also useful for comparing updated file lists (e.g., easily determine "what's new"). Two versions included: 16-bit DOS and 32-bit Windows.
Remarks: This tool acts similarly to line uniqifiers but, unlike the latter, doesn't require the manual merge of the two text files and post-merge sorting. The DOS version handles smaller files. I tested two 200K files (about 20,000 one-word lines each) and the program locked my machine. I then tested two 100K files (about 10,000 one-word lines, with one shared line) and it worked, but took about two minutes to process on a P-60. Win32 version untested.
USAGE: fintrsct file1 file2
Creates:
unique1 - lines unique to file1
unique2 - lines unique to file2
common  - lines common to file1 and file2
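The three-way split described above can be sketched in Python as follows. This is only an illustration of the behavior as described (order-independent, case-insensitive, numbered lines), not FINTRSCT's own logic; the function name `intersect_lines` and the tuple layout of the results are assumptions made for the example.

```python
def intersect_lines(lines1, lines2):
    """Split two line lists into unique-to-1, unique-to-2 and shared lines.

    Comparison ignores case and line order. Line numbers are 1-based so
    entries can be located in the original files.
    """
    keys1 = {line.lower() for line in lines1}
    keys2 = {line.lower() for line in lines2}

    unique1 = [(n, line) for n, line in enumerate(lines1, 1) if line.lower() not in keys2]
    unique2 = [(n, line) for n, line in enumerate(lines2, 1) if line.lower() not in keys1]

    # Remember where each line first appears in the second list so that
    # shared lines can be reported as (number in 1, number in 2, text).
    first_in_2 = {}
    for n, line in enumerate(lines2, 1):
        first_in_2.setdefault(line.lower(), n)

    common = [
        (n, first_in_2[line.lower()], line)
        for n, line in enumerate(lines1, 1)
        if line.lower() in keys2
    ]
    return unique1, unique2, common
```

Because the lookups use sets and dictionaries, this approach stays fast even on lists of tens of thousands of lines.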
Author: Paul Trout (1996). Suggested by Marianna Van Erp.
1996-12-07: Unnumbered release.
Download fintrsct.zip (25K).
|POSTSCRIPT AND PDF|
Also see AntiWord, below.
Here is a list of links to PostScript and PDF documents by Adobe and others.
a2ps (Any to PostScript) Generates PostScript from ASCII, dvi, and other file formats.
* * * * *
a2ps builds PostScript documents by adding formatting codes to a source text. Output may be sent directly to a PostScript printer, or to a file which can be viewed (and more) with Ghostscript. This is a large, complex program, but setting it up and learning it will pay off: if you need PostScript docs, you will be very happy with a2ps. Ported from Unix, 32-bit DJGPP build, requires 80386+ and a DOS Protected Mode Interface (CWSDPMI or other).
From the docs:
The format used is nice and compact: normally two pages on each physical page, borders surrounding pages, headers with useful information (page number, printing date, file name or supplied header), line numbering, pretty-printing, symbol substitution etc. This is very useful for making archive listings of programs or just to check your code in the bus. Actually a2ps is kind of bootstrapped: its sources are frequently printed with a2ps.
While at the origin its name was derived from "ASCII to PostScript", today we like to think of it as "Any to PostScript". Indeed, a2ps supports delegations, i.e., you can safely use a2ps to print DVI, PostScript, LaTeX, JPEG etc., even compressed.
A short list of features of a2ps might look like this:
- Customizable through various configuration files
- Powerful escapes to define the headers, table of contents etc. the way you want
- Variables to push even further the customizability in a comfortable manner
- Open approach of encodings
- Excellent support of the Latin 2, 3, 4, 5 and 6 encodings, thanks to
- Fully customizable output style: fonts, background and foreground colors, line numbering style etc.
- Possibility to delegate the processing of some files to other filters
- Many contributions, e.g., pretty-print diffs, print reference cards of programs, sanitize broken PostScript files, print Duplex on Simplex printers etc.
- And finally, the ability to pretty-print sources written in quite a few various languages
Usage: a2ps [OPTION]... [FILE]...
Convert FILE(s) or standard input to PostScript.
Mandatory arguments to long options are mandatory for short options too. Long options marked with * require a yes/no argument, corresponding short options stand for 'yes'.
Tasks:
--version                     display version
--help                        display this help
--guess                       report guessed types of FILES
--which                       report the full path of library files named FILES
--glob                        report the full path of library files matching FILES
--list=defaults               display default settings and parameters
--list=TOPIC                  detailed list on TOPIC (delegations, encodings, features, variables, media, ppd, printers, prologues, style-sheets, user-options)
After having performed the task, exit successfully. Detailed lists may provide additional help on specific features.
Global:
-q, --quiet, --silent         be really quiet
-v, --verbose[=LEVEL]         set verbosity on, or to LEVEL
-=, --user-option=OPTION      use the user defined shortcut OPTION
--debug                       enable debugging features
-D, --define=KEY[:VALUE]      unset variable KEY or set to VALUE
-M, --medium=NAME             use output medium NAME
-r, --landscape               print in landscape mode
-R, --portrait                print in portrait mode
--columns=NUM                 number of columns per sheet
--rows=NUM                    number of rows per sheet
--major=DIRECTION             first fill (DIRECTION=) rows, or columns
-1, -2, ..., -9               predefined font sizes and layouts for 1.. 9 virtuals
-A, --file-align=MODE         align separate files according to MODE (fill, rank page, sheet, or a number)
-j, --borders*                print borders around columns
--margin[=NUM]                define an interior margin of size NUM
The options -1.. -9 affect several primitive parameters to set up predefined layouts with 80 columns. Therefore the order matters: '-R -f40 -2' is equivalent to '-2'. To modify the layout, use '-2Rf40', or compose primitive options ('--columns', '--font-size' etc.).
--line-numbers=NUM            precede each NUM lines with its line number
-C                            alias for --line-numbers=5
-f, --font-size=SIZE          use font SIZE (float) for the body text
-L, --lines-per-page=NUM      scale the font to print NUM lines per virtual
-l, --chars-per-line=NUM      scale the font to print NUM columns per virtual
-m, --catman                  process FILE as a man page (same as -L66)
-T, --tabsize=NUM             set tabulator size to NUM
--non-printable-format=FMT    specify how non-printable chars are printed
Headings:
-B, --no-header               no page headers at all
-b, --header[=TEXT]           set page header
-u, --underlay[=TEXT]         print TEXT under every page
--center-title[=TEXT]         set page title to TITLE
--left-title[=TEXT]           set left and right page title to TEXT
--right-title[=TEXT]
--left-footer[=TEXT]          set sheet footers to TEXT
--footer[=TEXT]
--right-footer[=TEXT]
The TEXTs may use special escapes.
-a, --pages[=RANGE]           select the pages to print
-c, --truncate-lines*         cut long lines
-i, --interpret*              interpret tab, bs and ff chars
--end-of-line=TYPE            specify the eol char (TYPE: r, n, nr, rn, any)
-X, --encoding=NAME           use input encoding NAME
-t, --title=NAME              set the name of the job
--stdin=NAME                  set the name of the input file stdin
--print-anyway*               force binary printing
-Z, --delegate*               delegate files to another application
--toc[=TEXT]                  generate a table of content
When delegations are enabled, a2ps may use other applications to handle the processing of files that should not be printed as raw information, e.g., HTML PostScript, PDF etc.
-E, --pretty-print[=LANG]     enable pretty-printing (set style to LANG)
--highlight-level=LEVEL       set pretty printing highlight LEVEL; LEVEL can be none, normal or heavy
-g                            alias for --highlight-level=heavy
--strip-level=NUM             level of comments stripping
-o, --output=FILE             leave output to file FILE. If FILE is '-', leave output to stdout.
--version-control=WORD        override the usual version control
--suffix=SUFFIX               override the usual backup suffix
-P, --printer=NAME            send output to printer NAME
-d                            send output to the default printer
--prologue=FILE               include FILE.pro as PostScript prologue
--ppd[=KEY]                   automatic PPD selection or set to KEY
-n, --copies=NUM              print NUM copies of each page
-s, --sides=MODE              set the duplex MODE ('1' or 'simplex', '2' or 'duplex', 'tumble')
-S, --setpagedevice=K[:V]     pass a page device definition to output
--statusdict=K[:[:]V]         pass a statusdict definition to the output
-k, --page-prefeed            enable page prefeed
-K, --no-page-prefeed         disable page prefeed
Authors: Miguel Santana & Akim Demaille, France (2001)
2001-01-16: v4.13. Free under GNU General Public License.
Get more info and versions for other OSes at La GNU a2ps home page. Note that the page's DOS version info and download link are old; use our link.
Adobe's PostScript pages.
PSUtils (PostScript utilities) Manipulate PostScript documents.
* * * * *
This is a collection of programs and scripts that adjust formatting of PostScript documents or prepare other formats for further processing. Some of the tasks can be performed in a2ps, but for small jobs, these are faster. Also, the PSUtils will do a few things that a2ps does not do at all. Final output can be viewed in Ghostscript or sent to a PostScript printer. EXEs are 32-bit DJGPP compilations, require 80386+ and a DOS Protected Mode Interface (CWSDPMI or other). Scripts require installation of their languages; click the links in the script names to see what they are.
Program executables
psbook               Rearranges pages into signatures
psselect             Selects pages and page ranges
pstops               Performs general page rearrangement and selection
psnup                Put multiple pages per physical sheet of paper
psresize             Alter document paper size
epsffit              Fits an EPSF file to a given bounding box
Scripts
getafm (sh)          Outputs PostScript to retrieve AFM file from printer
showchar (sh)        Outputs PostScript to draw a character with metric info
fixdlsrps (perl)     Filter to fix DviLaser/PS output so that PSUtils works
fixfmps (perl)       Filter to fix framemaker documents so that psselect etc. work
fixmacps (perl)      Filter to fix Macintosh documents with saner version of md
fixpsditps (perl)    Filter to fix Transcript psdit documents to work with PSUtils
fixpspps (perl)      Filter to fix PSPrint PostScript so that psselect etc. work
fixscribeps (perl)   Filter to fix Scribe PostScript so that psselect etc. work
fixtpps (perl)       Filter to fix Troff Tpscript documents
fixwfwps (perl)      Filter to fix Word for Windows documents for PSUtils
fixwpps (perl)       Filter to fix WordPerfect documents for PSUtils
fixwwps (perl)       Filter to fix Windows Write documents for PSUtils
extractres (perl)    Filter to extract resources from PostScript files
includeres (perl)    Filter to include resources into PostScript files
psmerge (perl)       Hack script to merge multiple PostScript files
Author: Angus J. C. Duggan, Scotland (1995).
2000-12-05: v1.17 for DOS.
Get detailed info on all components at the PSUtils page.
Ghostscript (AFPL Ghostscript) Views and prints PostScript and PDF files.
* * * * *
[added 2005-12-08]
Ghostscript reads PostScript and PDF files, processes them, and sends formatted output to the screen, to a file, or to a non-PostScript printer. 32-bit program, requires DOS extender (4GW, in binaries package). Distributed under Aladdin Free Public License (AFPL).
From the docs: Ghostscript works by providing:
Originally published by Aladdin Enterprises. Now maintained by artofcode LLC and Artifex Software.
1997-11-23: v5.10, last for DOS.
Usage: gs [switches] [file1.ps file2.ps ...]
Most frequently used switches: (you can use # in place of =)
-dNOPAUSE no pause after page
-q 'quiet', fewer messages
-g<width>x<height> page size in pixels
-r<res> pixels/inch resolution
-sDEVICE=<devname> select device
-dBATCH exit after last file
-sOutputFile=<file> select output file: - for stdout,
|command for pipe, embed %d or %ld for page #
Input formats: PostScript PostScriptLevel1 PostScriptLevel2 PDF
Available devices:
vga ega svga16 atiw tseng tvga deskjet djet500 laserjet ljetplus ljet2p
ljet3 ljet4 cdeskjet cdjcolor cdjmono cdj550 pj pjxl pjxl300 uniprint
epson eps9high ibmpro bj10e bj200 bjc600 bjc800 pcxmono pcxgray pcx16
pcx256 pcx24b pcxcmyk tiffcrle tiffg3 tiffg32d tiffg4 tifflzw tiffpack
bmpmono bmp16 bmp256 bmp16m tiff12nc tiff24nc psmono psgray bit bitrgb
bitcmyk jpeg jpeggray pdfwrite nullpage
Search path:
. ; . ; c:/gs ; c:/gs/fonts
For more information, see use.txt.
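As a hedged illustration of how the switches listed above combine for unattended conversion, the Python sketch below only assembles an argument list; it does not run anything. The switches and the pdfwrite device come from the usage text above, while the function name `gs_convert_command` and the executable name "gs" are assumptions (the DOS binary may be named differently on your system).

```python
def gs_convert_command(input_files, output_file, device="pdfwrite",
                       resolution=None, executable="gs"):
    """Build a Ghostscript command line for batch conversion.

    "gs" as the executable name is an assumption for this sketch; adjust
    it to match the installed program.
    """
    args = [
        executable,
        "-q",                 # 'quiet', fewer messages
        "-dNOPAUSE",          # no pause after each page
        "-dBATCH",            # exit after the last file
        f"-sDEVICE={device}",
    ]
    if resolution is not None:
        args.append(f"-r{resolution}")  # pixels/inch resolution
    args.append(f"-sOutputFile={output_file}")
    args.extend(input_files)
    return args


if __name__ == "__main__":
    print(" ".join(gs_convert_command(["report.ps"], "report.pdf")))
```

For multi-page raster output, the usage text notes that %d can be embedded in the output file name to get one file per page.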
Versions for Windows and other OSes, as well as support utils, docs, etc., are available. Go to the Ghostscript, Ghostview and GSview Home Page for info, and to one of the Mirror Sites for Ghostscript for downloads.
Binaries and source code for current, some older, and developer versions are also available at the File List page at SourceForge.
PSX Converts PostScript documents to plain text.
PSX is a small (16K) and simple command line PostScript document-to-text converter that I found somewhere on a BBS. It does a very inconsistent job of translation (sometimes good, sometimes very poor) but if you just want to browse the contents of a PostScript text file you downloaded off the Net, this program may suffice as a disk-saving alternative to Ghostscript. The main eye-sores resulting from conversion are loss of paragraph formatting and some split words. PSX is donationware. I suspect you won't find the latest version anywhere on the Net except here.
syntax: PSX [PostScriptfile] [textfile] [/option]
Both the input (PostScript) and output (text) file names may be optionally entered at the command line. If no text file name is specified PSX creates an ASCII file using the same name as the PostScript file, but with ".TXT" as the DOS filename extension. If no PostScript filename is specified, PSX will ask for one.
options:
/HELP (or /?)   displays this text
/WIDTH=n        (n is a number between 40 and 132 controlling output)
Author: Frank Brown (1992-95).
Download psx102e.zip (15K).
Text2PDF Converts text files to PDF.
[added 1998-12-08, updated 2005-03-02]
Text2PDF is a small (20K), versatile utility that converts a plain ASCII text file to a 7-bit clean Adobe PDF file (version 1.1). It reads from standard input or a named file, and writes the PDF file to standard output.
Limitations: You cannot produce hypertext links either to bookmarks, within the file, or to external content. You cannot add styles to headings or body elements, nor does the program reformat bullets and numbered lists. Text is formatted as is. You will probably have to tweak your text files to ensure that the word wrapping is correct.
text2pdf [options] [filename]
Options:
-h            show this message
-f<font>      use PostScript <font> (must be in standard 14, default: Courier)
-I            use ISOLatin1Encoding
-s<size>      use font at given pointsize (default 10)
-v<dist>      use given line spacing (default 12 points)
-l<lines>     lines per page (default 60, determined automatically if unspecified)
-c<chars>     maximum characters per line (default 80)
-t<spaces>    spaces per tab character (default 8)
-F            ignore formfeed characters (^L)
-A4           use A4 paper (default Letter)
-A3           use A3 paper (default Letter)
-x<width>     independent paper width in points
-y<height>    independent paper height in points
-2            format in 2 columns
-L            landscape mode
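Since text is formatted as is, one way to "tweak" a file beforehand is to re-wrap its paragraphs to the line width you plan to use (80 characters by default, per the -c option). The Python sketch below shows one possible pre-wrap step; it is not part of Text2PDF, the function name `prewrap` is made up, and it assumes paragraphs are separated by blank lines.

```python
import textwrap


def prewrap(text, width=80):
    """Re-wrap blank-line-separated paragraphs to at most `width` columns."""
    paragraphs = text.split("\n\n")
    wrapped = [
        # Collapse existing line breaks and runs of spaces, then re-fill.
        textwrap.fill(" ".join(paragraph.split()), width=width)
        for paragraph in paragraphs
    ]
    return "\n\n".join(wrapped)


if __name__ == "__main__":
    sample = "This is a long line of plain text that would overflow. " * 4
    print(prewrap(sample, width=60))
```

Note that this simple approach also joins the lines of bullet lists and tables within a paragraph, so such blocks are better left out or handled by hand.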
Author: Phil Smith (1996). Suggestion & notes by Scott Nesbitt.
Download text2pdf.zip (12K).
Go to the text2pdf page, and to the PDF Corner, for versions for Windows and Unixes, and other related materials.
Xpdf Toolkit for extracting text / information / images from Adobe PDF files.
[added 2000-02-09, updated 2005-12-08]
A suite of command line tools for extracting data from Adobe PDF files.
Pdftotext   Converts PDF files to plain text. If text file is '-', the text is sent to stdout.
Pdfinfo     Prints the contents of the 'Info' dictionary (plus some other useful information).
Pdftops     Reads a PDF file and writes a printable PostScript file. If PS file is '-', the content is sent to stdout.
Pdfimages   Saves images from a PDF file as Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files.
Remarks: Programs are 32-bit DJGPP compilations, require 80386+, DOS Protected Mode Interface (CWSDPMI or other), and FPU (80387 or 80486+). File names and zip archive directory names do not all conform to DOS 8+3 conventions. Possible special requirements: gzip in path (latest versions may not need it). These programs may not be well-suited to low resource hardware. Also available for Win32, Linux, OS/2, and other OSes. Source available. Free under GNU General Public License.
Author: Derek B. Noonburg / Foo Labs (2005). Added on tip by Bob Williams (Surv-PC forum).
Download xpdf-3.01-dos6.zip (1.6MB).
Xpdf pages at Foo Labs.
Get latest version info and files from the Download page.
Acrobat Reader Adobe's PDF file reader.
* * *
Why use an old DOS version of Acrobat? Good question. You probably shouldn't. It seems to do fine with old PDF files, or simple ones such as tax forms, but it does not support many of the enhancements introduced over the past couple of years. Hint: The "bitmap" printing option is useful if you lack the fonts required by a given document.
Requirements: DOS 3.30+, 80386 (80486 better), 2MB RAM (4MB better), 5MB disk space, VGA, and maybe some disk "acrobatics" if you're short on disk space: the 2.5MB zip below contains a 2.5MB self-extracting EXE, which must be run to unpack the install files (2.5MB total), and then you must run the installer EXE.
Author: Adobe Systems (1993).
Download Acrodos.zip (2.5MB).
Or get these two files and unzip them to diskettes:
AdobeAcrobatDos1.zip and AdobeAcrobatDos2.zip (1.4MB each).
Also see Ghostscript.
Adobe's Acrobat pages.
|CONVERT UNIX < > DOS FORMATS|
Advanced, broad function text processing programs like LM, SED, or AWK can perform most of the specialized tasks described in this section, but those listed below may be better suited to the casual user or may include special options not available in other tools.
If you're looking for a converter that also handles MAC text, see NLX in the Penta Text Tools, or REMOVE, or FIXTEXT.
RUM Converts a file between UNIX and DOS text formats.
Simple, reliable, and user-friendly. Batch or interactive mode operation possible. No wildcard support. Can output to same or different filename.
Author: Jack Lee (1993). Suggested by Marianna Van Erp.
Download rum10.zip (10K).
FLIP Converts file(s) between UNIX and DOS text formats.
FLIP accepts wildcards and offers some specialized options (e.g., convert binaries, no time stamp modification of output files). Outputs to same filename.
Author: Rahul Dhesi (1989). Originally featured on Yves Bellefeuille's freeware list.
Download flip1exe.zip (15K).
ux2dos & dos2ux Convert Unix format text < > DOS format (mixed formats handled).
[added 2004-10-03] These are DOS counterparts of Unix utilities. From the docs:
dos2ux replaces carriage-return/newline pairs by newlines in DOS format text files to conform to UNIX requirements. Existing isolated newlines are left intact, so that no changes are made to a file which is already in UNIX format.
ux2dos adds carriage returns to isolated newlines (linefeeds) in UNIX format text files to conform to DOS requirements. Existing carriage-return/newline pairs are left intact, so that no changes are made to a file which is already in DOS format.
Access and modification time stamps of the files are preserved.
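The two conversions described above are simple enough to sketch in Python on raw bytes. This is an illustration of the stated behavior (isolated newlines and existing CR/LF pairs are left alone), not the utilities' source code; the function names simply reuse the program names, and the sketch does not attempt the time stamp preservation the docs mention.

```python
import re


def dos2ux(data: bytes) -> bytes:
    """Replace CR/LF pairs with LF; isolated LFs are already fine."""
    return data.replace(b"\r\n", b"\n")


def ux2dos(data: bytes) -> bytes:
    """Add a CR before every LF that is not already preceded by one."""
    # The negative lookbehind skips LFs that already belong to a CR/LF pair.
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


if __name__ == "__main__":
    mixed = b"dos line\r\nunix line\n"
    print(dos2ux(mixed))
    print(ux2dos(mixed))
```

Files should be opened in binary mode ("rb"/"wb") when using these functions, so that Python does not translate line endings on its own. Running either function twice gives the same result as running it once, which matches the claim that files already in the target format are unchanged.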
Author: Nelson H. F. Beebe (1989).
Download ux2dos.zip (27K).
|CONVERT OTHER FILE FORMATS|
Notes: HTML converters are listed on the HTML page. For a good beginner's intro to the desktop publishing package TeX, see Scott Nesbitt's article TeX: The DTP Alternative.
AntiWord Displays MS Word files, and converts to plain text, PostScript, PDF.
[added 2001-10-14, updated 2005-12-08]
AntiWord displays documents created by Microsoft Word v2, or v6 and later. It also converts from Word format to plain text, PostScript (see Ghostscript) or PDF. "A Word document can now be saved as 'formatted' text. That means with things like *bold* to show bold text, /italics/ to show italics and _undeline_ to show underlined text are added to the plain text". Use as a filter. 16- and 32-bit DOS versions available; the 32-bit version is a DJGPP build, requires 80386+ and a DOS Protected Mode Interface (CWSDPMI or other). Also available for RISC OS, Linux, Unix (with sources), BeOS, OS/2, Mac OS/X, Amiga. Freeware under GNU General Public License.
Usage: antiword [switches] wordfile1 [wordfile2 ...]
Switches: [-f|-t|-a papersize|-p papersize|-x dtd] [-m mapping][-w #][-i #][-Ls]
-f                     formatted text output
-t                     text output (default)
-a <paper size name>   Adobe PDF output
-p <paper size name>   PostScript output
                       paper size like: a4, letter or legal
-x <dtd>               XML output
                       like: db (DocBook)
-m <mapping>           character mapping file
-w <width>             in characters of text output
-i <level>             image level (PostScript only)
-L                     use landscape mode (PostScript only)
-s                     Show hidden (by Word) text
Limitations: "Many images are not shown yet. Some of the images that are shown, are shown in the wrong place. PostScript output is only available in ISO 8859-1 and ISO 8859-2."
Notes: The DOS version looks for its mapping files in a location that depends on whether HOME is set. The mapping files are distributed in the RESOUR~1\ directory of the zip; place them in C:\ANTIWORD after unzipping.
Author: Adri van Os, Netherlands (2005). Suggested by Robert Bull.
Antiword page "...best viewed with your monitor switched on."
catdoc Converts / extracts text from Word, Excel or PowerPoint files.
[added 1999-08-06, updated 2005-12-08] This is a set of three utils:
catdoc    Converts MS Word files to plain text or other formats
xls2csv   Converts Excel spreadsheets to comma-separated value (CSV) text
catppt    Extracts readable text from PowerPoint files
From the docs:
catdoc behaves much like [Unix] cat but it reads MS-Word file and produces human-readable text on standard output. Optionally it can use LaTeX escape sequences for characters which have special meaning for LaTeX. It also makes some effort to recognize MS-Word tables...additional output formats, such as HTML can be easily defined...uses internal Unicode representation of text, so it is able to convert texts when charset in source document doesn't match charset on target system.
xls2csv reads MS-Excel spreadsheet and dumps its content as comma-separated values to stdout. Numbers are printed without delimiters, strings are enclosed in the double quotes. Double-quotes inside string are doubled.
catppt reads MS-PowerPoint presentations and dumps content to stdout.
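The quoting rule that the docs give for xls2csv output (numbers bare, strings in double quotes, embedded double quotes doubled) can be sketched in a few lines of Python. This is only an illustration of the rule as described, not xls2csv's own code; the function names and the sample row are made up for the example.

```python
def csv_field(value):
    """Format one cell: numbers are left bare, everything else is quoted."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    text = str(value)
    # A double quote inside a string is written as two double quotes.
    return '"' + text.replace('"', '""') + '"'

def csv_row(cells):
    """Join formatted cells with commas to form one output line."""
    return ",".join(csv_field(cell) for cell in cells)

if __name__ == "__main__":
    print(csv_row([1999, "catdoc", 'the "docs" directory', 0.94]))
    # prints: 1999,"catdoc","the ""docs"" directory",0.94
```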
Notes: 16-bit DOS operation only; no support for 32-bit Windows Long File Names. Docs in plain text, Unix man and HTML formats. Source included. Source-only distribution is also available (Unixes and DOS). Released under GNU General Public License.
Author: Victor Wagner, Russia (2005).
Download catdoc-0.94.zip (346K). Unzip with "create directories" option.
Get more info, history & updates at the catdoc & xls2csv page
Also see the author's DOS utilities page.
HelpDeco Converts Win 3.x/95 HLP files to RTF format.
* * * *
This DOS program is a Windows 3.x/95 *.HLP file decompiler. It's useful to the non-programmer because it has an option /r that converts HLP files to RTF format, which can be further converted to plain text (by word processors) or to HTML (e.g., see Martha). Also included: SPLITMRB and ZAPRES, image processors. A large program (237K), but fast. Documentation bilingual (German/English); may be difficult to follow.
From the docs:
HelpDeco...will recreate all source files (RTF, HPJ, MVP, BMP, WMF, SHG, MRB,...) from all Windows 3.x/'95 .HLP help files and most .MVB multi media viewer titles. Load the resulting RTF file into WinWord to view and print, or modify the topics of the help file and rebuild it using the appropriate help compiler (HC30, HC31, HCP, HCW, HCRTF, WMVC, MMVC, MVC, not included, available at Microsoft). The rebuilt helpfile will not be identical, but should behave like the original, even in respect to inter-HLP-file links. All text, formatting, hypertext links, pictures, macros etc. will be conserved...It will run as a 16 bit application from MS-DOS command line and as a 32 bit application from Windows 95/NT command line.
Author: Manfred Winterhoff, Germany (1997).
|helpdc21.zip||(218K)||EXEs, source, HLP file format description|
Get more info and external support utils at the HelpDeco page.
WP2LaTeX Converts WordPerfect 3.x-8.x, HTML, RTF & other document files to LaTeX.
[updated 2005-12-08]
From the docs:
WP2LaTeX is a program, which is designed to translate WordPerfect documents into LaTeX 2.09 and LaTeX 2.0e. The current version is able to cope with Macintosh WordPerfect 3.x, WP 4.x; WP 5.x and WP 6.x documents. (WP 7.x & 8.x have same binary file format as WP6 so no additional conversion module is necessary.) WP2LaTeX is NOT a text processor and converted documents will require a LaTeX document processor.
It is possible to convert a lot of features in the current version for example: Headers, Tables, Equations, Centered+Right+Left text, a lot of extended characters (greek, math, cyrilic, accented) and of course a normal text.
Supported input formats are WP3.x, WP4.x, WP5.x, WP6.x and even several non-WP formats such as abiword, Accent, MTEFF, OLE Stream, HTML, RTF, T602, UNICODE and WORD.
Other: Accepts Unicode input. TIPA font support. Program messages in choice of English / Czech / German. Download package contains executables for DOS16, DOS32 (DJGPP), OS/2, Linux, Win32. Source code also included.
Maintainer: Jaroslav Fojtik, Czech Republic. Suggested by Scott Nesbitt.
Download wp2latex-3.23.zip (3.1MB).
Online User's Guide.
Visit the WP2LaTeX Homepage for latest version and support utilities.
Xray Extracts plain text present in binary files.
* * *
Xray extracts plain text from binary files. One use: can show text contained in executables, dll's, etc. It can also be used as a crude means of getting plain text from any word processor file, although formatting is lost in the process.
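The general idea of pulling readable text out of a binary file can be sketched in Python by scanning for runs of printable ASCII characters. This is a generic approach (similar in spirit to the Unix strings utility) and is not taken from Xray itself; the function name, the minimum run length of 4 and the sample bytes are all assumptions made for illustration.

```python
import re

def extract_text_runs(data, min_length=4):
    """Return runs of printable ASCII (space through tilde) found in raw bytes.

    min_length is an arbitrary cutoff that filters out short accidental matches.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

if __name__ == "__main__":
    # Made-up sample resembling the start of a DOS executable.
    sample = b"\x00\x01MZ\x90\x00This program cannot be run in DOS mode.\r\n\x00\xffabc\x00hello\x00"
    for run in extract_text_runs(sample):
        print(run)
```

As the description notes for word processor files, a scan like this recovers only the characters; any formatting stored in the surrounding binary data is lost.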
Download xray105.zip (7.5K).
Also see the similar program ReadText, rt101.zip (6K).
©1994-2004, Richard L. Green.
This Edition ©2004-2005, Richard L. Green and Short.Stop.