Lune Rouge

TextStat 3.0

Free software by Lionel Allorge



Create statistics on a text file:

[Screenshot: ts3_e.jpg]

You can download the program (binaries) for Windows (606 KB): textstat_win32.zip

You can download the source code (85 KB): textstat_sources.zip

 

This program gives you several pieces of information about a text file.

It will count the number of characters, words and sentences.

It will make an estimate of the number of syllables and calculate the Flesch score (see below).

It will give you a list of all the words in the text with the number of times they appear.

It will also detect and list repetitions of words in a short range.
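The counts and the word list described above can be illustrated with a minimal Python sketch. This is not the TextStat source code: the function name `basic_stats` and the two separator strings are assumptions made for the example, standing in for the separator lists that the program lets you edit.

```python
import re
from collections import Counter

# Assumed separator sets (hypothetical defaults, not taken from TextStat).
WORD_SEPARATORS = " \t\r\n.,;:!?\"()[]"
SENTENCE_SEPARATORS = ".!?"


def basic_stats(text):
    """Return character, word and sentence counts plus word frequencies."""
    # Split on any run of word separators and drop empty pieces.
    word_pattern = "[" + re.escape(WORD_SEPARATORS) + "]+"
    words = [w for w in re.split(word_pattern, text) if w]
    # Split on any run of sentence separators; keep pieces with visible text.
    sentence_pattern = "[" + re.escape(SENTENCE_SEPARATORS) + "]+"
    sentences = [s for s in re.split(sentence_pattern, text) if s.strip()]
    return {
        "characters": len(text),
        "words": len(words),
        "sentences": len(sentences),
        "frequencies": Counter(words),
    }


stats = basic_stats("The cat sat. The cat ran!")
print(stats["characters"], stats["words"], stats["sentences"])  # 25 6 2
print(stats["frequencies"].most_common(2))
```

In this sketch "The" and "the" would be counted as different words, since no case folding is applied.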

 

Analysis of a text:

This program reads plain-text files and HTML files. If your document is in another format, you must first export it to plain text (text only) or HTML.

Launch the TextStat program. In the "File to stat" field, enter the name of your text file. The button above lets you choose a file. You can specify that the text is HTML, in which case HTML tags are ignored.

In the "File for results" field, enter the name of the file that will contain the full statistics. The button above lets you choose a file. You can specify a file in text or HTML format.

You can then launch the statistics by clicking the "TS" button. Processing can take a long time for a large file. Once the process is finished, the result of the statistics appears on the right-hand side. This result is also saved in the "File for results" so that you can consult it in another program.
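To show what ignoring HTML tags can look like, here is a minimal Python sketch built on the standard library's `html.parser` module. It is only an illustration under assumptions: the names `TagStripper` and `strip_html` are invented for the example, and TextStat itself (written with wxWindows) may handle HTML differently. This simple version also keeps the contents of script and style elements as text.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect only the text nodes of an HTML document."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are skipped.
        self.parts.append(data)


def strip_html(html_text):
    """Return html_text with its tags removed and entities decoded."""
    parser = TagStripper()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


plain = strip_html("<p>Hello <b>world</b>!</p>")
print(plain)  # Hello world!
```

The resulting plain text can then be counted in the same way as an ordinary text file.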

Several options let you configure these statistics:

You can modify the lists of word separators and sentence separators.

You can ask the program to ignore the difference between lowercase and capital letters, and between accented and unaccented characters, by selecting the corresponding boxes.

You can also indicate that the file uses the DOS character table (ASCII) instead of ANSI.

You can also ask for a search for word repetitions. This helps you avoid using the same word twice within a short interval of text. To do so, select the box and define the number of words in the interval. You can also define a list of words to be ignored in this search; without one, the result is likely to become unusable.
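The options above (separators, case and accent folding, repetition search with an ignore list) can be sketched together in Python. This is a hedged illustration, not the program's actual algorithm: the functions `fold` and `find_repetitions`, their parameters and the default separator string are all assumptions made for the example.

```python
import re
import unicodedata


def fold(word, ignore_case=True, ignore_accents=True):
    """Normalize a word so that 'Été' and 'ete' compare equal."""
    if ignore_case:
        word = word.lower()
    if ignore_accents:
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", word)
        word = "".join(c for c in decomposed if not unicodedata.combining(c))
    return word


def find_repetitions(text, interval=10, ignored=(), separators=" \t\r\n.,;:!?"):
    """Return (word, previous_position, position) for each word that
    reappears within `interval` words of its previous occurrence."""
    pattern = "[" + re.escape(separators) + "]+"
    words = [fold(w) for w in re.split(pattern, text) if w]
    ignored = {fold(w) for w in ignored}
    last_seen = {}
    repetitions = []
    for position, word in enumerate(words):
        if word in ignored:
            continue
        previous = last_seen.get(word)
        if previous is not None and position - previous <= interval:
            repetitions.append((word, previous, position))
        last_seen[word] = position
    return repetitions


sample = "The house is big. The garden of the house is small."
print(find_repetitions(sample, interval=10, ignored=["the", "is", "of"]))
# [('house', 1, 8)]
print(len(find_repetitions(sample, interval=10)))  # 4
```

The second call shows why an ignore list matters: without it, common words such as "the" and "is" account for most of the reported repetitions.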

 

Rudolf Flesch score:

For an explanation, see http://www.mang.canterbury.ac.nz/co...

A high score means the text is easy to read.

A low or negative score means the text is difficult to read.

This method uses the following formula:

206.835 - (1.015 x average words per sentence) - (84.6 x average syllables per word)
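As an illustration of the formula, here is a short Python sketch. The syllable estimate (counting groups of consecutive vowels) is a rough heuristic chosen for the example and only suits English text; it is an assumption, not necessarily the method TextStat uses. The function names are also hypothetical.

```python
import re


def estimate_syllables(word):
    """Rough English heuristic: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_score(text):
    """Apply the Flesch formula to a plain-text string."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(estimate_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# 9 words, 2 sentences, 9 estimated syllables:
# 206.835 - 1.015 * 4.5 - 84.6 * 1.0 = 117.6675
score = flesch_score("The cat sat on the mat. It was warm.")
print(round(score, 1))  # 117.7
```

Very simple text made of short, one-syllable words can score above 100, as this sample does.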

Table for this method:

Score     School level
90-100    5th grade
80-90     6th grade
70-80     7th grade
60-70     8th and 9th grade
50-60     10th to 12th grade (high school)
30-50     college
0-30      college graduate
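The table can be turned into a small lookup, sketched here in Python. Two choices are assumptions made for the example, since the table does not settle them: a score exactly on a boundary (for example 80) is placed in the easier band, and scores above 100 or below 0 are assigned to the nearest band. The name `school_level` is hypothetical.

```python
# (lower bound, label) pairs taken from the table, highest band first.
LEVELS = [
    (90, "5th grade"),
    (80, "6th grade"),
    (70, "7th grade"),
    (60, "8th and 9th grade"),
    (50, "10th to 12th grade (high school)"),
    (30, "college"),
    (0, "college graduate"),
]


def school_level(score):
    """Return the school level for a Flesch score."""
    for lower_bound, label in LEVELS:
        if score >= lower_bound:
            return label
    # Negative scores fall below the table; assume the hardest band.
    return LEVELS[-1][1]


print(school_level(65))  # 8th and 9th grade
print(school_level(-5))  # college graduate
```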

 

The installation program was made with a great tool: Inno Setup 1.3

Currently, the program works in English or French. If you want to translate it into another language, please send me an email: lionel.allorge@lunerouge.org

All comments are welcome.

History:

Version 3.0: Rewritten with the wxWindows library. Reads and displays HTML files. Released under the GNU GPL license.

Version 2.0: Added detection of repetitions.

Version 1.0: First public release as freeware.

License:

This program is released under the GNU GPL license.

This project is hosted on SourceForge.

Please report any comments or bugs to: Lionel Allorge


