On expected score of cellwise alignments

Riho Klement, Jüri Lember

Abstract


We consider certain suboptimal alignments of two independent i.i.d. random sequences from a finite alphabet A = {1;...,K}, both sequences having length n. In particular, we focus on so-called cellwise alignments, where in the first step so many 1-s as possible are aligned. These aligned 1-s define cells and the rest of the alignment is defined so that the already existing alignment of 1-s remains unchanged. We show that as n grows, for any cellwise alignment, the average score of a cell tends to the expected score of a random cell, a.s. Moreover, we show that a large deviation inequality holds. The second part of the paper is devoted to calculating the expected score of certain cellwise alignment referred to as priority letter alignment. In this alignment, inside every cell first all 2-s are aligned. Then all 3-s are aligned, but in such way that the already existing alignment of 2-s remains unchanged. Then we continue with 4-s and so on. Although easy to describe, for K bigger than 3 the exact formula for expected score is not that straightforward to find. We present a recursive formula for calculating the expected score.

Keywords


longest common subsequence; optimal alignment; large deviation inequality

Full Text:

PDF


DOI: http://dx.doi.org/10.12697/ACUTM.2017.21.10

Refbacks

  • There are currently no refbacks.




ISSN 1406–2283 (print)
ISSN 2228–4699 (online)