tandemRepeat static method
List<List<int>> tandemRepeat(
  NucleotideSequence seq,
  int unitLen,
  int minRepeat,
  bool compareOnlyBase, {
  bool fuzzyComp = false,
})
Searches for tandem repeats (short sequence repeats).
seq
: Sequence data.
unitLen
: The length of the repeat unit, that is, the subsequence that is repeated.
minRepeat
: The minimum number of consecutive copies of the unit required for a match.
compareOnlyBase
: If true, compares only the nucleotide base. If false, also compares replacement decoration and anotherName.
fuzzyComp
: If true, the sequences may contain the ambiguity codes m, r, w, s, y, k, v, h, d, b, and n, and t and u are treated as the same base.
Returns : [[repeatStartPosition, repeatEndPosition], ...]
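To make the roles of unitLen, minRepeat, and the returned position pairs concrete, here is a minimal sketch of the same search on a plain String. The function name tandemRepeatOnString is hypothetical and not part of the library. It uses exact string comparison in place of the library's comparison helpers and treats the end position as exclusive, which is an assumption about how the library's positions are meant to be read.

```dart
/// Hypothetical stand-in for tandemRepeat that works on a plain String
/// instead of a NucleotideSequence. Exact, case-sensitive comparison only.
List<List<int>> tandemRepeatOnString(String seq, int unitLen, int minRepeat) {
  final result = <List<int>>[];
  // Guard against degenerate arguments (an assumption; not in the original).
  if (unitLen <= 0 || minRepeat <= 0) return result;

  // Try every start position that leaves room for minRepeat units.
  for (var i = 0; i <= seq.length - unitLen * minRepeat; i++) {
    final unit = seq.substring(i, i + unitLen);
    var copies = 1; // the unit at position i counts as the first copy

    // Count further back-to-back copies of the unit.
    while (i + (copies + 1) * unitLen <= seq.length &&
        seq.substring(i + copies * unitLen, i + (copies + 1) * unitLen) ==
            unit) {
      copies++;
    }

    if (copies >= minRepeat) {
      result.add([i, i + unitLen * copies]); // end index is exclusive
    }
  }
  return result;
}
```

With this sketch, tandemRepeatOnString('GGATATATCC', 2, 3) yields [[2, 8]], the span covering ATATAT. Lowering minRepeat to 2 yields [[2, 8], [3, 7], [4, 8]], because every start position is tested independently and shifted windows inside a longer repeat are reported as well.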
Implementation
static List<List<int>> tandemRepeat(
    NucleotideSequence seq, int unitLen, int minRepeat, bool compareOnlyBase,
    {bool fuzzyComp = false}) {
  List<List<int>> r = [];
  late NucleotideSequence pattern;
  // Try every start position that leaves room for minRepeat units.
  for (int i = 0; i <= seq.length() - unitLen * minRepeat; i++) {
    // Number of units after the first one that match the pattern.
    int repeatCount = 0;
    for (int j = 1; j * unitLen + i <= seq.length() - unitLen; j++) {
      if (compareOnlyBase) {
        // Compare bases only.
        pattern = seq.subSeqNonInfo(i, i + unitLen);
        if (UtilCompareNucleotide.compareBase(
            seq.subSeqNonInfo(i + j * unitLen, i + (j + 1) * unitLen),
            pattern,
            fuzzyComp)) {
          repeatCount++;
        } else {
          break;
        }
      } else {
        // Compare bases together with their additional information.
        pattern = seq.subSeq(i, i + unitLen);
        if (UtilCompareNucleotide.compare(
            seq.subSeq(i + j * unitLen, i + (j + 1) * unitLen),
            pattern,
            fuzzyComp)) {
          repeatCount++;
        } else {
          break;
        }
      }
    }
    // The first unit plus repeatCount matches must reach minRepeat copies.
    if (repeatCount >= minRepeat - 1) {
      r.add([i, i + unitLen * (repeatCount + 1)]);
    }
  }
  return r;
}
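The implementation delegates unit comparison to UtilCompareNucleotide, whose internals are not shown here. As a rough illustration of what a fuzzy comparison over IUPAC ambiguity codes can look like, the sketch below treats two symbols as matching when the sets of bases they stand for overlap, and maps u to t so the two compare as equal. The names iupacBases, fuzzySymbolMatch, and fuzzyUnitMatch are hypothetical, and the overlap rule is an assumption; the library may apply a different rule.

```dart
/// IUPAC nucleotide codes mapped to the bases they can stand for.
/// 'u' is stored as 't' so that T and U compare as the same base.
const Map<String, String> iupacBases = {
  'a': 'a', 'c': 'c', 'g': 'g', 't': 't', 'u': 't',
  'm': 'ac', 'r': 'ag', 'w': 'at', 's': 'cg', 'y': 'ct', 'k': 'gt',
  'v': 'acg', 'h': 'act', 'd': 'agt', 'b': 'cgt', 'n': 'acgt',
};

/// Hypothetical helper: true if the two one-letter codes share at least
/// one possible base. Unknown symbols never match.
bool fuzzySymbolMatch(String x, String y) {
  final xs = iupacBases[x.toLowerCase()];
  final ys = iupacBases[y.toLowerCase()];
  if (xs == null || ys == null) return false;
  return xs.split('').any((base) => ys.contains(base));
}

/// Hypothetical helper: true if two units of equal length match
/// symbol by symbol under fuzzySymbolMatch.
bool fuzzyUnitMatch(String a, String b) {
  if (a.length != b.length) return false;
  for (var k = 0; k < a.length; k++) {
    if (!fuzzySymbolMatch(a[k], b[k])) return false;
  }
  return true;
}
```

Under this rule, fuzzyUnitMatch('ATG', 'AUG') and fuzzyUnitMatch('ARG', 'AAG') are both true, since r stands for a or g, while fuzzyUnitMatch('AYG', 'AAG') is false, since y stands for c or t.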