RPatternJoin: String Similarity Joins for Hamming and Levenshtein Distances

This project is a tool for edit similarity joins of words (a.k.a. all-pairs similarity search) under small (< 3) edit distance thresholds. It supports Levenshtein and Hamming distances and words over any alphabet. The software was originally developed for joining amino-acid/nucleotide sequences from Adaptive Immune Repertoires, where the number of words is relatively large (10^5-10^6) and the average word length is relatively small (10-100).
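
To make the task concrete, here is a minimal brute-force sketch in Python of what an edit similarity join computes: every pair of words whose distance is at most a cutoff. It is not the package's R interface or its algorithm; the names similarity_join, cutoff and metric are hypothetical, and the quadratic loop is only practical for small inputs. Whether the package also reports self-pairs or both orderings of a pair is not stated here, so the sketch returns each unordered pair of distinct words once.

```python
from itertools import combinations


def hamming(a, b):
    """Hamming distance, or None when the words differ in length."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def similarity_join(words, cutoff=1, metric="levenshtein"):
    """Hypothetical helper: index pairs (i, j), i < j, within the cutoff."""
    dist = hamming if metric == "hamming" else levenshtein
    pairs = []
    for i, j in combinations(range(len(words)), 2):
        d = dist(words[i], words[j])
        if d is not None and d <= cutoff:
            pairs.append((i, j))
    return pairs


# Toy input chosen for illustration only.
words = ["CASSL", "CASSF", "CASS", "CATSL"]
lev_pairs = similarity_join(words, cutoff=1, metric="levenshtein")
ham_pairs = similarity_join(words, cutoff=1, metric="hamming")
```

At the input sizes mentioned above (10^5-10^6 words), comparing all pairs this way means roughly 10^10-10^12 distance computations, which is why a dedicated join tool is useful for such data.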

Version: 1.0.0
Imports: Rcpp (≥ 1.0.13), stats
LinkingTo: Rcpp, RcppArmadillo
Suggests: Matrix, testthat, stringdist
Published: 2024-10-25
DOI: 10.32614/CRAN.package.RPatternJoin
Author: Daniil Matveev [aut, cre], Martin Leitner-Ankerl [ctb, cph], Gene Harvey [ctb, cph]
Maintainer: Daniil Matveev <dmatveev at sfsu.edu>
License: MIT + file LICENSE
NeedsCompilation: yes
Language: en-US
Materials: NEWS
CRAN checks: RPatternJoin results

Documentation:

Reference manual: RPatternJoin.pdf

Downloads:

Package source: RPatternJoin_1.0.0.tar.gz
Windows binaries: r-devel: RPatternJoin_1.0.0.zip, r-release: RPatternJoin_1.0.0.zip, r-oldrel: RPatternJoin_1.0.0.zip
macOS binaries: r-release (arm64): RPatternJoin_1.0.0.tgz, r-oldrel (arm64): RPatternJoin_1.0.0.tgz, r-release (x86_64): RPatternJoin_1.0.0.tgz, r-oldrel (x86_64): RPatternJoin_1.0.0.tgz

Linking:

Please use the canonical form https://CRAN.R-project.org/package=RPatternJoin to link to this page.