The problem of query selectivity estimation for database queries is critical for efficientquery execution by database management systems. A query execution method strongly depends on earlyestimated size of a query result. This estimation determines a data access method used later during thequery execution. The selectivity parameter is a fraction of table rows that satisfy a single-table querycondition. For a selection condition of a range query where an attribute has a continuous domain, theselectivity is equivalent to a definite integral form probability density function (PDF) of attribute valuesdistribution. For a compound selection condition based on many attributes we need a multidimensionalspace-efficient non-parametric estimator of multivariate PDF of attribute values distribution. A knownapproach based on Discrete Cosine Transform (DCT) spectrum as an representation of multidimensionalPDF is considered. The energy compaction property of DCT lets omit a region of spectrum coefficientswith small absolute values without significant losing an accuracy of selectivity estimation. An area ofrelevant spectrum coefficients is called a sampling zone. Results of experiments from previous worksshows that applying the reciprocal shape of the sampling zone gives the least selectivity estimation errorsubject to a predetermined size of the zone. The main result of this work is a theoretical confirmation of onlyexperimental results from previous works. The paper presents the proof of the theorem that the reciprocalshape of the sampling zone is asymptotically error-optimal. The proof is based on calculus of variationsand the isoperimetric problem.

JO - Theoretical and Applied Informatics L1 - http://sd.czasopisma.pan.pl/Content/93680/mainfile.pdf L2 - http://sd.czasopisma.pan.pl/Content/93680 PY - 2012 IS - No 1 EP - 22 KW - query selectivity estimation KW - probability density function KW - discrete cosine transform KW - calculus of variations KW - isoperimetric problem A1 - Augustyn, Dariusz R. PB - Committee of Informatics of Polish Academy of Science PB - Institute of Theoretical and Applied Informatics of Polish Academy of Science VL - vol. 24 JF - Theoretical and Applied Informatics SP - 3 T1 - Asymptotically error-optimal shape of sampling zone for query selectivity estimation method based on discrete cosine transform DA - 2012 UR - http://sd.czasopisma.pan.pl/dlibra/docmetadata?id=93680 ER -