text.alignment - Text Alignment with Smith-Waterman
Find similarities between texts using the Smith-Waterman algorithm. The algorithm performs local sequence alignment and identifies similar regions between two strings. It is described in the paper "Identification of common molecular subsequences" by T. F. Smith and M. S. Waterman (1981), available at <doi:10.1016/0022-2836(81)90087-5>. This package implements the same logic for sequences of words and letters instead of molecular sequences.
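To make the idea of local alignment concrete, here is a minimal Python sketch of Smith-Waterman over sequences of words or letters. It is an illustration only, not the package's own code or API: the function name `smith_waterman` and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    `a` and `b` are sequences (lists of words or strings of letters).
    The scoring values are illustrative assumptions with a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, out_a[::-1], out_b[::-1]


score, left, right = smith_waterman(
    "the quick brown fox".split(), "a quick brown dog".split()
)
```

With word tokens as input, the shared region "quick brown" is recovered as the local alignment; passing plain strings aligns letters instead.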
Last updated
cpp
4.89 score 11 stars 14 scripts 188 downloads

image.binarization - Binarize Images for Enhancing Optical Character Recognition
Improve optical character recognition by binarizing images. The package focuses primarily on local adaptive thresholding algorithms. In plain terms, it turns a color or grayscale image into a black-and-white image. This is particularly useful as a preprocessing step for optical character recognition or handwritten text recognition.
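As a rough illustration of what local adaptive thresholding means, the Python sketch below compares each pixel with the mean of its neighbourhood rather than with one global cutoff. This is a simple local-mean rule chosen for the example; it is not taken from the package, and the function name `adaptive_threshold` and the `window` and `offset` parameters are hypothetical.

```python
def adaptive_threshold(img, window=3, offset=10):
    """Binarize a grayscale image (list of rows, values 0-255).

    A pixel becomes white (255) if it is brighter than the mean of its
    window x window neighbourhood minus `offset`, otherwise black (0).
    Simple local-mean rule, used here only as an illustration.
    """
    h, w = len(img), len(img[0])
    r = window // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Collect the neighbourhood, clipped at the image borders.
            vals = [
                img[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            ]
            local_mean = sum(vals) / len(vals)
            row.append(255 if img[y][x] > local_mean - offset else 0)
        out.append(row)
    return out


# A bright page with one dark "ink" pixel in the middle.
page = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = adaptive_threshold(page)
```

Because the threshold follows the local brightness, a rule of this kind can separate ink from paper even when lighting varies across the page, which is where a single global threshold tends to fail.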
Last updated
cpp
4.24 score 23 stars 15 scripts 187 downloads

recogito - Interactive Annotation of Text and Images
Annotate text with entities and the relations between them. Annotate areas of interest in images with your own labels. The package provides 'htmlwidgets' bindings to the 'recogito' <https://github.com/recogito/recogito-js> and 'annotorious' <https://github.com/recogito/annotorious> libraries.
Last updated
4.12 score 22 stars 12 scripts 228 downloads