Acta et Commentationes Universitatis Tartuensis de Mathematica
http://acutm.math.ut.ee/index.php/acutm
<p><em>Acta et Commentationes Universitatis Tartuensis de Mathematica</em> (ACUTM) is an international journal of pure and applied mathematics.</p>
University of Tartu Press
en-US
ISSN 1406-2283
Tests based on characterizations, and their efficiencies: a survey
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.01
A survey of goodness-of-fit and symmetry tests based on the characterization properties of distributions is presented. This approach has become popular in recent years. In most cases the test statistics are functionals of <em>U</em>-empirical processes. The limiting distributions and large deviations of the new statistics under the null hypothesis are described. Their local Bahadur efficiencies for various parametric alternatives are calculated and compared with each other as well as with those of diverse previously known tests. We also describe new directions of possible research in this domain.
Ya. Nikitin
2017-07-03
Volume 21, Issue 1
Pattern recognition using hidden Markov models in financial time series
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.02
Our aim is to develop software that can recognize <em>M</em> trading patterns in real time using hidden Markov models (HMMs). A trading pattern is a predefined figure indicating a specific behavior of prices. We trained <em>M</em> + 1 HMMs using the Baum-Welch algorithm combined with a genetic algorithm. In particular, <em>M</em> of the HMMs describe the <em>M</em> trading patterns, while the remaining one, called the threshold model, recognizes all patterns that are not predefined. The classification algorithm correctly recognizes 93% of the provided patterns. Based on an analysis of the false positives, we designed additional filters to reduce them.
Sara Rebagliati, Emanuela Sasso
2017-07-03
Volume 21, Issue 1
Markov-modulated multivariate linear regression
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.03
The article concerns parameter estimation for the Markov-modulated multivariate linear regression model. It is supposed that the parameters of the linear regression depend on the states of a random environment. The latter is described as a continuous-time homogeneous irreducible Markov chain with known parameters. A procedure for estimating the regression parameters is established.
Alexander Andronov
2017-07-03
Volume 21, Issue 1
On estimation of insurance risk parameters by combining local regression and distribution fitting ideas
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.04
The problem of premium estimation is an essential part of insurance mathematics. Often the problem is divided into two parts: estimation of the number of claims (or frequency) and estimation of individual claim amounts (severities). In this paper, we focus on the former. More precisely, we seek a semiparametric dynamic regression-type model that avoids the "price shock" issue of static classification. We apply the regression method locally, use local maximum likelihood estimation for the model parameters, and use cross-validation to determine the optimal neighborhood size. A case study with a real vehicle casco insurance dataset is included; the results obtained by the proposed method are compared with those obtained by global regression and the classification and regression trees (C&RT) approach.
Meelis Käärik, Raul Kangro, Liina Muru
2017-07-03
Volume 21, Issue 1
Using k-anonymization for registry data: pitfalls and alternatives
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.05
We describe an applied study of ICT students' employment in Estonia based on data from two national registries. The study offered an opportunity to compare results from <em>k</em>-anonymized data with those from the novel Sharemind platform for privacy-preserving statistical computing, which offers a way to use confidential data for research without loss of information. Comparison of results using <em>k</em>-anonymized and lossless data indicates substantial differences in estimates of students' employment rates. The results illustrate, on the basis of a real-world study, how the effects of <em>k</em>-anonymization can lead to considerable bias in estimates. While privacy-preserving computing does entail inconveniences because the original microdata is not revealed to the statistician, this can be offset by greater confidence in the results.
Sten Anspal, Mart Kaska, Indrek Seppo
2017-07-03
Volume 21, Issue 1
Statistical analysis of high-order Markov dependencies
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.06
The paper deals with parsimonious models of integer-valued time series. Such models are special cases of high-order Markov chains with a small number of parameters. Two new parsimonious models are presented. The first is the Markov chain of order <em>s</em> with <em>r</em> partial connections, and the second is the Markov chain of conditional order. Theoretical results on probabilistic properties and statistical inference for these models are given.
Yu. S. Kharin, M. V. Maltsew
2017-07-03
Volume 21, Issue 1
Downward calibration property of estimated response propensities
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.07
We consider four methods for estimating response propensities: three traditional ones (linear, logistic, probit) and one more recent, a decision tree method. We show that some but not all of the methods produce estimates that calibrate sample totals of auxiliary variables down to the response set totals. The downward calibration property reveals interesting relationships between estimated propensities, auxiliary variables, and true response probabilities. However, the property itself does not guarantee more accurate propensity estimation. Our simulation study shows that the accuracy of the estimation method depends primarily on the nature of the relationship between true response probabilities and auxiliary variables.
Natalja Lepik, Imbi Traat
2017-07-03
Volume 21, Issue 1
Effect of auxiliary information in data collection and estimation stage
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.08
Responsive design is a newly emerged view that focuses on reducing the effects of non-response by monitoring and intervening in the data collection process. Informative measures that use auxiliary information guide the data collection process. A well-representative set of respondents is currently pursued through balancing: the means of auxiliary variables have to be equal in the sample and in the set of respondents. Auxiliary variables are later used in the estimation stage to improve the estimates; we assume that more auxiliary variables are available in the estimation stage. The auxiliary vector is split into variables (a) used in both monitoring and estimation, and (b) used only in the estimation stage. Explicit expressions for calibration weights and response propensities are developed, and useful properties of these expressions are proved. Theoretical results and two emerging strategies are tested in simulations.
Kaur Lumiste
2017-07-03
Volume 21, Issue 1
Residency index – a tool for measuring the population size
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.09
After the 2011 Estonian census, the census team found some under-coverage in the census data. To determine the number of non-enumerated people, the following procedure was used. People recorded as residents in the Estonian population register but not enumerated in the 2011 census were regarded as potential residents. All existing administrative registers were used to define signs of life for these people: activity in a register during 2011 gave a person a sign of life. The signs of life were used as binary variables to discriminate between residents and non-residents. The next task was to apply the methodology to subsequent years and to the whole population. Hence we decided to define, for each person in the population, a residency index between 0 and 1 that is recalculated yearly using the signs of life.
Ethel Maasing, Ene-Margit Tiit, Mare Vähi
2017-07-03
Volume 21, Issue 1
On expected score of cellwise alignments
http://acutm.math.ut.ee/index.php/acutm/article/view/ACUTM.2017.21.10
We consider certain suboptimal alignments of two independent i.i.d. random sequences from a finite alphabet <em>A</em> = {1,...,<em>K</em>}, both sequences having length <em>n</em>. In particular, we focus on so-called cellwise alignments, where in the first step as many 1-s as possible are aligned. These aligned 1-s define <em>cells</em>, and the rest of the alignment is defined so that the already existing alignment of 1-s remains unchanged. We show that as <em>n</em> grows, for any cellwise alignment, the average score of a cell tends to the expected score of a random cell, a.s. Moreover, we show that a large deviation inequality holds. The second part of the paper is devoted to calculating the expected score of a certain cellwise alignment referred to as <em>priority letter alignment</em>. In this alignment, inside every cell first all 2-s are aligned. Then all 3-s are aligned, but in such a way that the already existing alignment of 2-s remains unchanged. Then we continue with 4-s and so on. Although easy to describe, for <em>K</em> greater than 3 the exact formula for the expected score is not that straightforward to find. We present a recursive formula for calculating the expected score.
Riho Klement, Jüri Lember
2017-07-03
Volume 21, Issue 1
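The priority letter alignment described in the last abstract can be illustrated with a minimal Python sketch. It is not the authors' construction: the function name `priority_letter_score` is hypothetical, the score is assumed to be the number of aligned letter pairs, and when the two sequences contain different numbers of a letter, the sketch assumes that the first occurrences are the ones paired (the abstract does not say which occurrences are chosen).

```python
def priority_letter_score(x, y, letter=1, K=None):
    """Score of a priority-letter-style alignment of two sequences.

    Assumptions (not taken from the paper): the score counts aligned
    pairs, and the first min(#x, #y) occurrences of each letter are paired.
    """
    x, y = list(x), list(y)
    if K is None:
        K = max(x + y, default=0)
    if letter > K or not x or not y:
        return 0

    # Positions of the current priority letter in both sequences.
    pos_x = [i for i, a in enumerate(x) if a == letter]
    pos_y = [j for j, b in enumerate(y) if b == letter]
    m = min(len(pos_x), len(pos_y))  # as many pairs as possible

    score = m
    prev_x = prev_y = 0
    # The aligned pairs split both sequences into cells; lower-priority
    # letters are aligned only inside a cell, so earlier pairs stay fixed.
    for i, j in zip(pos_x[:m], pos_y[:m]):
        score += priority_letter_score(x[prev_x:i], y[prev_y:j], letter + 1, K)
        prev_x, prev_y = i + 1, j + 1
    # Last cell, after the final aligned pair.
    score += priority_letter_score(x[prev_x:], y[prev_y:], letter + 1, K)
    return score


if __name__ == "__main__":
    print(priority_letter_score([1, 2, 3, 1], [1, 3, 2, 1]))
```

In the example call, both 1-s are aligned first, forming one inner cell with contents (2, 3) and (3, 2); inside it the 2-s are aligned, after which the 3-s can no longer be aligned without crossing the existing pair.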