Locality-preserving representations of k-mer sets
13
Super-k-mers maps
PhD Thesis
13 Super-k-mers maps
See (Smith et al., 2024; Martayan et al., 2025).
Martayan, I., Robidou, L., Shibuya, Y., & Limasset, A. (2025). Hyper-k-mers: Efficient streaming k-mers representation. Research in computational molecular biology (RECOMB 2025). Springer Nature Switzerland.
Smith, C., Martayan, I., Limasset, A., & Dufresne, Y. (2024). Brisk: Exact resource-efficient dictionary for k-mers. bioRxiv, DOI: 10.1101/2024.11.26.625346.