Finding the mean in a partition distribution
Glassen, Thomas J., Oertzen, Timo von, and Konovalov, Dmitry A. (2018) Finding the mean in a partition distribution. BMC Bioinformatics, 19. 375.
PDF (Published Version), 796kB. Available under License Creative Commons Attribution.
Abstract
Bayesian clustering algorithms, in particular those utilizing Dirichlet Processes (DP), return a sample of the posterior distribution of partitions of a set. In many applied cases, however, a single clustering solution is desired, so a 'best' partition must be derived from the posterior sample. Which solution should be recommended in which situation is an open research question. One candidate is the sample mean, defined as the clustering with minimal squared distance to all partitions in the posterior sample, weighted by their probability. In this article, we review an algorithm that approximates this sample mean by using the Hungarian Method to compute the distance between partitions. This algorithm leaves room for further processing acceleration.
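The two ingredients named in the abstract, a distance between partitions and a mean defined by weighted squared distance, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the article: partitions are given as label lists, the distance is taken to be the number of elements that must be moved to make two partitions identical (n minus the best cluster-to-cluster overlap), and the search for the mean is restricted to the partitions already in the sample. The assignment step is solved here by brute force over permutations purely to keep the example dependency-free; for realistic cluster counts the Hungarian Method would take its place. The function names `partition_distance` and `sample_mean_candidate` are hypothetical.

```python
from itertools import permutations


def partition_distance(a, b):
    """Number of elements to move so that label lists a and b describe the same partition.

    Assumption: distance = n - (maximum total overlap over one-to-one cluster matchings).
    The matching is found by brute force, which is only feasible for a few clusters;
    the Hungarian Method solves the same assignment problem in polynomial time.
    """
    if len(a) != len(b):
        raise ValueError("partitions must label the same set of elements")
    index_a = {c: i for i, c in enumerate(sorted(set(a)))}
    index_b = {c: i for i, c in enumerate(sorted(set(b)))}
    k = max(len(index_a), len(index_b))  # pad to a square overlap matrix
    overlap = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        overlap[index_a[x]][index_b[y]] += 1
    best = max(
        sum(overlap[i][p[i]] for i in range(k))
        for p in permutations(range(k))
    )
    return len(a) - best


def sample_mean_candidate(partitions, weights):
    """Return the sampled partition with minimal weighted sum of squared distances.

    Assumption: candidates are limited to the sample itself, so this is a
    medoid-style stand-in, not the approximation algorithm reviewed in the article.
    """
    def cost(candidate):
        return sum(
            w * partition_distance(candidate, p) ** 2
            for p, w in zip(partitions, weights)
        )
    return min(partitions, key=cost)


if __name__ == "__main__":
    sample = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
    print(sample_mean_candidate(sample, [1 / 3] * 3))
```

Because the distance ignores label names, `[0, 0, 1, 1]` and `[1, 1, 0, 0]` are treated as the same partition. Restricting the candidates to the sample keeps the sketch short but may miss a better partition that was never sampled.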
Field | Value |
---|---|
Item ID: | 55852 |
Item Type: | Article (Research - C1) |
ISSN: | 1471-2105 |
Keywords: | mean partition; partition distance; Bayesian clustering; Dirichlet Process |
Copyright Information: | Copyright © The Author(s). 2018 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
Funders: | Max Planck Society (MPS) |
Date Deposited: | 14 Oct 2018 22:54 |
FoR Codes: | 46 INFORMATION AND COMPUTING SCIENCES > 4613 Theory of computation > 461399 Theory of computation not elsewhere classified @ 100% |
SEO Codes: | 89 INFORMATION AND COMMUNICATION SERVICES > 8999 Other Information and Communication Services > 899999 Information and Communication Services not elsewhere classified @ 100% |