Finding the mean in a partition distribution

Glassen, Thomas J., Oertzen, Timo von, and Konovalov, Dmitry A. (2018) Finding the mean in a partition distribution. BMC Bioinformatics, 19. 375.

Available under License Creative Commons Attribution.

View at Publisher Website: https://doi.org/10.1186/s12859-018-2359-...
 


Abstract

Bayesian clustering algorithms, in particular those utilizing Dirichlet Processes (DP), return a sample of the posterior distribution of partitions of a set. However, in many applied cases a single clustering solution is desired, requiring a 'best' partition to be created from the posterior sample. It is an open research question which solution should be recommended in which situation. One candidate is the sample mean, defined as the clustering with minimal squared distance to all partitions in the posterior sample, weighted by their probability. In this article, we review an algorithm that approximates this sample mean by using the Hungarian Method to compute the distance between partitions. This algorithm leaves room for further processing acceleration.
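
The two ideas in the abstract, a partition distance computed with the Hungarian Method and a probability-weighted mean partition, can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: the distance is taken to be the minimum number of elements that must be reassigned to make two partitions identical, and the search for the mean is restricted to the sampled partitions themselves, which is a simplification and not the algorithm reviewed in the article. The function names (min_cost_assignment, partition_distance, sample_mean) are hypothetical.

```python
from collections import Counter


def min_cost_assignment(cost):
    """Hungarian Method (O(n^3) potentials variant) on a square cost matrix.

    Returns the minimal total cost of a one-to-one row/column matching.
    """
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials
    v = [0] * (n + 1)    # column potentials
    p = [0] * (n + 1)    # p[j] = row matched to column j (1-based, 0 = none)
    way = [0] * (n + 1)  # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sum(cost[p[j] - 1][j - 1] for j in range(1, n + 1))


def partition_distance(a, b):
    """Assumed definition: elements to reassign so that a and b coincide.

    a and b are equal-length lists of cluster labels, one per element.
    """
    rows = {label: i for i, label in enumerate(set(a))}
    cols = {label: j for j, label in enumerate(set(b))}
    k = max(len(rows), len(cols))
    # Negated overlap counts, zero-padded to a square matrix.
    cost = [[0] * k for _ in range(k)]
    for (x, y), count in Counter(zip(a, b)).items():
        cost[rows[x]][cols[y]] = -count
    return len(a) + min_cost_assignment(cost)


def sample_mean(partitions, weights):
    """Sampled partition with minimal weighted sum of squared distances.

    Simplification: only the sampled partitions are considered as candidates.
    """
    def score(candidate):
        return sum(w * partition_distance(candidate, q) ** 2
                   for q, w in zip(partitions, weights))
    return min(partitions, key=score)


sample = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
probabilities = [0.4, 0.4, 0.2]
mean = sample_mean(sample, probabilities)
```

In this toy sample the first two partitions describe the same clustering under different labels, so their distance is zero and the selected mean is equivalent to that clustering. Matching clusters by maximal overlap is what makes the distance independent of label names.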

Item ID: 55852
Item Type: Article (Research - C1)
ISSN: 1471-2105
Keywords: mean partition; partition distance; Bayesian clustering; Dirichlet Process
Copyright Information: Copyright © The Author(s). 2018 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Funders: Max Planck Society (MPS)
Date Deposited: 14 Oct 2018 22:54
FoR Codes: 46 INFORMATION AND COMPUTING SCIENCES > 4613 Theory of computation > 461399 Theory of computation not elsewhere classified @ 100%
SEO Codes: 89 INFORMATION AND COMMUNICATION SERVICES > 8999 Other Information and Communication Services > 899999 Information and Communication Services not elsewhere classified @ 100%
Downloads: Total: 916
Last 12 Months: 84