You can download the program (binaries) for Windows (606 kb): textstat_win32.zip
You can download the sources (85 kb): textstat_sources.zip
This program will give you several pieces of information about a text file.
It will count the number of characters, words and sentences.
It will make an estimate of the number of syllables and calculate the Flesch score (see below).
It will give you a list of all the words in the text with the number of times they appear.
It will also detect and list repetitions of words in a short range.
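The counts and the word list described above can be sketched in a few lines of Python. This is only an illustration, not TextStat's own code: the separator lists and the name basic_stats are assumptions made for the example.

```python
import re
from collections import Counter

# Assumed separator lists; TextStat lets the user edit its own.
WORD_SEPARATORS = " \t\r\n.,;:!?\"()[]"
SENTENCE_SEPARATORS = ".!?"


def basic_stats(text):
    """Return character, word and sentence counts plus word frequencies."""
    # Split on any run of word separators and drop the empty pieces.
    word_pattern = "[" + re.escape(WORD_SEPARATORS) + "]+"
    words = [w for w in re.split(word_pattern, text) if w]

    # Split on runs of sentence separators; keep pieces that contain text.
    sentence_pattern = "[" + re.escape(SENTENCE_SEPARATORS) + "]+"
    sentences = [s for s in re.split(sentence_pattern, text) if s.strip()]

    return {
        "characters": len(text),
        "words": len(words),
        "sentences": len(sentences),
        "frequencies": Counter(words),
    }


stats = basic_stats("The cat sat. The dog ran!")
print(stats["words"], stats["sentences"], stats["frequencies"].most_common(1))
```

Counting every character of the input, separators included, is a choice made for this sketch; the real program may count differently.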
Analysis of a text:
This program reads text files and HTML files. If your document is in another format, you must first export it to plain text (text only) or HTML.
Launch the TextStat program. In the "File to stat" zone, enter the name of your text file. The button above lets you choose a file. You can specify that the text is in HTML, in which case the HTML tags are ignored.
In the "File for results" zone, enter the name of the file that will contain the complete statistics. The button above lets you choose a file. The results file can be in text or HTML format.
You can then launch the statistics by clicking the "TS" button. Processing can take a while for a large file. Once it has finished, the results appear on the right-hand side. They are also saved in the "File for results" so that you can open them in another program.
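To illustrate what ignoring HTML tags means for the statistics, here is a minimal Python sketch based on the standard html.parser module. It is not how TextStat itself is implemented; the names TextExtractor and strip_tags are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text between tags and discard the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text found outside of tags.
        # Note: this simple version also keeps script and style contents.
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


print(strip_tags("<p>Hello <b>world</b>.</p>"))
```

The extracted text can then be counted exactly like the contents of a plain text file.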
Several options let you configure these statistics:
You can modify the lists of word separators and sentence separators.
You can ask the program to ignore the difference between lowercase and capital letters, and between accented and unaccented characters, by selecting the corresponding boxes.
You can also indicate that the file uses the DOS character table (ASCII) instead of ANSI.
You can also ask for a search for word repetitions. This helps you avoid using the same word within a short interval of text. To do so, select the box and define the number of words in the interval. You can also define a list of words to be ignored in this search; without one, the result is likely to be unusable.
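The repetition search, combined with the case and accent options, might look like the following Python sketch. It is an assumption about one reasonable approach, not TextStat's actual algorithm: the names fold and find_repetitions are hypothetical, and ignored words still count toward the interval here.

```python
import unicodedata


def fold(word, ignore_case=True, ignore_accents=True):
    """Normalize a word so that case and accents can be ignored."""
    if ignore_case:
        word = word.lower()
    if ignore_accents:
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", word)
        word = "".join(c for c in decomposed if not unicodedata.combining(c))
    return word


def find_repetitions(words, interval, ignored=()):
    """Return (word, previous_index, index) for repeats within `interval` words."""
    ignored_set = {fold(w) for w in ignored}
    last_seen = {}  # folded word -> index of its latest occurrence
    hits = []
    for index, raw in enumerate(words):
        key = fold(raw)
        if key in ignored_set:
            continue
        previous = last_seen.get(key)
        if previous is not None and index - previous <= interval:
            hits.append((key, previous, index))
        last_seen[key] = index
    return hits


words = "The house is big and the House is old".split()
print(find_repetitions(words, interval=5, ignored=["the", "is", "and"]))
```

Without the ignored list, common words such as "the" and "is" would be reported as repetitions too, which is why the list matters in practice.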
Score of Rudolf Flesch
For an explanation, see http://www.mang.canterbury.ac.nz/co...
A high score means the text is easy to read.
A low or negative score means the text is difficult to read.
This method uses the following formula:
206.835 - (1.015 x average words per sentence) - (84.6 x average syllables per word)
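As a worked example, the formula above can be written as a small Python function, using decimal points for the constants 206.835, 1.015 and 84.6. The function name flesch_score and the sample counts are made up for the illustration.

```python
def flesch_score(words, sentences, syllables):
    """Compute the Flesch score from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("the text must contain at least one word and one sentence")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 206.835 - (1.015 * words_per_sentence) - (84.6 * syllables_per_word)


# Hypothetical text: 100 words, 5 sentences, 150 syllables.
# 206.835 - (1.015 x 20) - (84.6 x 1.5) = 206.835 - 20.3 - 126.9
print(round(flesch_score(100, 5, 150), 3))  # 59.635
```

Because the syllable count is only an estimate, the resulting score should be read as approximate.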
Table for this method:
||8th and 9th grade
||10th to 12th grade (high school)
The installation program was made with a great tool: Inno Setup 1.3
Currently, the program works in English or French. If you want to translate it into another language, please send me an email: email@example.com
All comments are welcome.
Version 3.0 : Rewritten with the wxWindows library. Reads and displays HTML files. Released under the GNU GPL license.
Version 2.0 : Added the detection of repetitions.
Version 1.0 : First public release as freeware.
This program is released under the GNU GPL license.
This project is present on:
Please report any comment or bug to: Lionel Allorge