2025-03-15 1.6: New installation method

- The script is now called textenhance instead of textEnhance

2022-06-29 1.5:

- If --left-bracket is set to 'NONE', no comment processing takes place
- If --whitespace-replacement is set to 'NONE', no replacement takes place

(We need these changes because sometimes we have to pass an argument but
cannot pass an empty string because of shell command line problems; this
enables services like 'subtitle' to call textEnhance without any masking
actions.)

2020-08-25 1.4: Refactoring and robustification

Minor changes:
- Arguments are passed to the helper script in a more robust and
  backwards-compatible manner, removing the requirement for
  bash version >= 4.4
- The --logfile option, which was mistakenly missing from the Readme, is now
  documented there
- The trailing newline is only removed from the output of pdftotext and
  catdoc if it is an unnecessary second one
- The availability of helper programs is checked with the check_available
  function before they are called, which ensures meaningful error messages
- A .gitattributes file has been added to the repository to ensure that
  example input files have their encodings, line terminators etc. preserved

2020-07-21 1.3: Eliminate Markdown artifacts

Previously, when provided with input files in docx or odt format, the program
would sometimes produce output containing certain Markdown-like formatting
elements, namely:
- Underscores for emphasis
- All-caps for headlines and strong text
- Indentation for code blocks

These were inserted by pandoc, the program which textEnhance uses for
converting from docx and odt to plain text. A custom filter has now been added
to remove any such formatting information.

2020-07-10 1.2: Document format support

Major changes:
- The program can now handle input files in the most common document formats:
  pdf, docx, odt, doc and rtf. It will convert them to plain text and then
  proceed as it does with plain text input files.

Minor changes:
- Debugging information can now be saved to a file specified with the new
  --logfile option instead of being written to stdout
- The program now filters out any ASCII control characters as well as
  UTF-8-encoded soft hyphens before writing to output

Fixes:
- Ensure CR (old Mac style) line breaks are converted

2020-06-05 1.1.1: Temporary bug fix regarding '#'

At the moment, the BAS WebServices cannot handle an option whose value begins
with a hash ('#'); the likely cause is that this gets turned into a shell
comment somewhere down the line. As a temporary workaround, textEnhance sets
the value of the --left-bracket option to '#', unless the user specifies their
own character.

2020-06-03 1.1: API change

The input file is now a required option instead of a positional argument,
passed as -i/--infile.

2020-05-18 1.0: First stable release

Major changes:
- Restrictions on input file names: files with whitespace or non-ASCII
  characters in the name are refused, as well as files with any of the
  following document format extensions: doc, docx, xls, xlsx, odt, ods, pdf
- Restrictions on input file content: empty files are refused
- The user is informed of any changes to the input file (that were not
  expressly requested with options such as --brackets) with warnings instead
  of debug messages

Minor changes:
- Temporary files are only created in /tmp
- Handle unknown encodings gracefully

Fixes:
- Fix bug which resulted in comments being enclosed in '<>' twice
- Fix iconv causing a core dump with large input files

2020-05-13 0.1: First alpha version

Note: this has accidentally been tagged as 'v0.0' instead of 'v0.1' in git.