A Convergent Differentially Private k-Means Clustering Algorithm

Lu, Zhigang, and Shen, Hong (2019) A Convergent Differentially Private k-Means Clustering Algorithm. In: Lecture Notes in Computer Science (11439) pp. 612-624. From: PAKDD 2019: 23rd Pacific-Asia Conference on Knowledge Discovery and Data Mining, 14-17 April 2019, Macau, China.

[img] PDF (Published Version) - Published Version
Restricted to Repository staff only

View at Publisher Website: http://doi.org/10.1007/978-3-030-16148-4...


Preserving differential privacy (DP) for the iterative clustering algorithms has been extensively studied in the interactive and the non-interactive settings. However, existing interactive differentially private clustering algorithms suffer from a non-convergence problem, i.e., these algorithms may not terminate without a predefined number of iterations. This problem severely impacts the clustering quality and the efficiency of the algorithm. To resolve this problem, we propose a novel iterative approach in the interactive settings which controls the orientation of the centroids movement over the iterations to ensure the convergence by injecting DP noise in a selected area. We prove that, in the expected case, our approach converges to the same centroids as Lloyd’s algorithm in at most twice the iterations of Lloyd’s algorithm. We perform experimental evaluations on real-world datasets to show that our algorithm outperforms the state-of-the-art of the interactive differentially private clustering algorithms with a guaranteed convergence and better clustering quality to meet the same DP requirement.

Item ID: 77411
Item Type: Conference Item (Research - E1)
ISBN: 978-3-030-16148-4
Copyright Information: © Springer Nature Switzerland AG 2019.
Date Deposited: 06 Mar 2023 23:48
FoR Codes: 46 INFORMATION AND COMPUTING SCIENCES > 4604 Cybersecurity and privacy > 460402 Data and information privacy @ 50%
46 INFORMATION AND COMPUTING SCIENCES > 4611 Machine learning > 461106 Semi- and unsupervised learning @ 50%
SEO Codes: 22 INFORMATION AND COMMUNICATION SERVICES > 2204 Information systems, technologies and services > 220405 Cybersecurity @ 100%
More Statistics

Actions (Repository Staff Only)

Item Control Page Item Control Page