One of the nice things about LaTeX is that the definitive version of the document is stored as plain text. This simplifies version control and avoids the need to use a complicated editor with an opaque binary file format. It also means that writing scripts to check for writing mistakes is relatively easy. Of course, detecting grammatical errors in general is very challenging, but certain classes of mistakes are easy to check for. Matt Might has posted a few scripts which I’ve found quite useful.
I had cause to write another script to check for consistent use of hyphens. For example, both “non-deterministic” and “nondeterministic” are acceptable, but a single document should pick one variant and use it consistently. You can find the script here; suggestions for improvement (rather, patches) are welcome.
What other common mistakes are amenable to a simple script? Checking for consistent use of the Oxford comma should be straightforward, for example. There are also scripts like chktex that look for mistakes in LaTeX usage (although I’ve personally found chktex to be too noisy to be useful).
Naturally, there are limits on what you can do without parsing natural language. After The Deadline claims to provide some NLP-based grammar analysis, but I haven’t used it personally.
- atdtool, a command-line interface to After The Deadline
- diction, a GNU tool to check for common writing mistakes
- LanguageTool, another open source style and grammar checker
- TextLint, a writing style checker