tandemRepeat static method
List<List<int>> tandemRepeat(
  AminoAcidSequence seq,
  int unitLen,
  int minRepeat,
  bool compareOnlyBase, {
  bool fuzzyComp = false,
})
Searches for tandem repeats (short sequence repeats).
seq : Sequence data.
unitLen : The length of the unit sequence that makes up each repeat.
minRepeat : The minimum number of repeats required for a match.
compareOnlyBase : If true, compare only AminoAcid.type. If false, also compare AminoAcid.infoKey.
fuzzyComp : If true, sequences can contain the ambiguous codes B, Z, X, and J.
Returns : [[repeatStartPosition, repeatEndPosition], ...]. Each pair marks one repeat region; the end position is exclusive.
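To make the return format concrete, the following is a minimal sketch of the same kind of scan on a plain String. It is not the library's code: `tandemRepeatInString` is a hypothetical stand-in that compares units with exact string equality, so it has no counterpart to `compareOnlyBase` or `fuzzyComp`, and the argument check at the top is an addition of the sketch.

```dart
/// Hypothetical stand-in for tandemRepeat that works on a plain String.
/// Units are compared with exact equality (no fuzzy matching).
List<List<int>> tandemRepeatInString(String seq, int unitLen, int minRepeat) {
  // Guard added for the sketch: a unit length of 0 would never terminate.
  if (unitLen < 1 || minRepeat < 1) {
    throw ArgumentError('unitLen and minRepeat must be at least 1');
  }
  final result = <List<int>>[];
  for (var i = 0; i <= seq.length - unitLen * minRepeat; i++) {
    final pattern = seq.substring(i, i + unitLen);
    var extra = 0; // copies of the unit found after the first one
    for (var j = 1; i + (j + 1) * unitLen <= seq.length; j++) {
      if (seq.substring(i + j * unitLen, i + (j + 1) * unitLen) != pattern) {
        break;
      }
      extra++;
    }
    if (extra >= minRepeat - 1) {
      // Start is inclusive, end is exclusive.
      result.add([i, i + unitLen * (extra + 1)]);
    }
  }
  return result;
}

void main() {
  // "QQQ" occupies indices 2, 3 and 4: three copies of the unit "Q".
  print(tandemRepeatInString('MKQQQAL', 1, 3)); // [[2, 5]]
}
```

With `unitLen` 1 and `minRepeat` 3, only the run starting at index 2 qualifies, and the reported end (5) is the index just past the last repeated residue.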
Implementation
static List<List<int>> tandemRepeat(
    AminoAcidSequence seq, int unitLen, int minRepeat, bool compareOnlyBase,
    {bool fuzzyComp = false}) {
  List<List<int>> r = [];
  late AminoAcidSequence pattern;
  // Try every start position that leaves room for minRepeat units.
  for (int i = 0; i <= seq.length() - unitLen * minRepeat; i++) {
    // Number of additional copies found after the first unit at i.
    int repeatCount = 0;
    // Compare the j-th following unit against the unit at i.
    for (int j = 1; j * unitLen + i <= seq.length() - unitLen; j++) {
      if (compareOnlyBase) {
        // Compare AminoAcid.type only.
        pattern = seq.subSeqNonInfo(i, i + unitLen);
        if (UtilCompareAminoAcid.compareType(
            seq.subSeqNonInfo(i + j * unitLen, i + (j + 1) * unitLen),
            pattern,
            fuzzyComp)) {
          repeatCount++;
        } else {
          break;
        }
      } else {
        // Compare AminoAcid.infoKey as well.
        pattern = seq.subSeq(i, i + unitLen);
        if (UtilCompareAminoAcid.compare(
            seq.subSeq(i + j * unitLen, i + (j + 1) * unitLen),
            pattern,
            fuzzyComp)) {
          repeatCount++;
        } else {
          break;
        }
      }
    }
    // repeatCount excludes the first unit, hence minRepeat - 1.
    if (repeatCount >= minRepeat - 1) {
      // The end position is exclusive.
      r.add([i, i + unitLen * (repeatCount + 1)]);
    }
  }
  return r;
}
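Because the outer loop tests every start position independently, one long repeat region is reported several times. For a sequence spelling ABABABAB with `unitLen` 2 and `minRepeat` 2, the loop above would add ranges starting at 0, 1, 2, 3 and 4. If a caller wants a single range per region, the results can be merged afterwards. The sketch below assumes end-exclusive ranges sorted by start position; `mergeRanges` is a hypothetical helper, not part of the library.

```dart
/// Hypothetical helper (not part of the library): merges overlapping
/// [start, end) ranges. Assumes the input is sorted by start position,
/// which is the order in which the loop above adds its results.
List<List<int>> mergeRanges(List<List<int>> ranges) {
  final merged = <List<int>>[];
  for (final range in ranges) {
    if (merged.isNotEmpty && range[0] < merged.last[1]) {
      // Overlaps the previous region: extend its end if needed.
      if (range[1] > merged.last[1]) {
        merged.last[1] = range[1];
      }
    } else {
      // Start a new region; copy so the caller's lists are not modified.
      merged.add([range[0], range[1]]);
    }
  }
  return merged;
}

void main() {
  // Ranges as reported for "ABABABAB" with unitLen 2 and minRepeat 2.
  final hits = [
    [0, 8], [1, 7], [2, 8], [3, 7], [4, 8],
  ];
  print(mergeRanges(hits)); // [[0, 8]]
}
```

Ranges that merely touch (one ends where the next starts) are kept separate here; changing `<` to `<=` in the overlap test would join them as well.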