Abstract: Affinity Propagation (AP) clustering is an algorithm based on message passing between data points, which forms clusters from the pairwise similarities between the data. Unlike traditional clustering methods, AP does not require the number of clusters to be specified in advance, and it is fast and efficient. However, on high-dimensional, complex datasets, this efficiency comes at the cost of accuracy. To improve both the efficiency and the accuracy of AP clustering, a coarse-grained parallel AP clustering algorithm based on intra-class and inter-class distances, named IOCAP, is proposed. First, the idea of granularity is introduced to divide the initial dataset into multiple subsets. Second, for each subset, the similarity matrix is improved by combining intra-class and inter-class distances. Finally, the improved parallel AP clustering is implemented on the MapReduce model. Experiments on real datasets show that IOCAP adapts better to large datasets and effectively improves accuracy while preserving the clustering quality of AP.
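To make the pipeline concrete, the following is a minimal Python sketch of the overall flow: split the data into subsets, run AP on each subset, and collect the exemplars. It is an illustration under stated assumptions, not the IOCAP implementation. The abstract does not specify the granularity rule or the improved similarity, so the sketch assumes a contiguous split (the hypothetical `split_into_subsets`), uses the standard negative squared Euclidean similarity in place of the intra-class and inter-class distance measure, and lets Python's built-in `map` stand in for the MapReduce map phase. All function names, the shared `preference`, and the toy data are illustrative.

```python
from statistics import median


def similarity(p, q):
    """Negative squared Euclidean distance, the usual AP similarity.
    Assumption: IOCAP's improved similarity is not given here, so plain AP is used."""
    return -sum((a - b) ** 2 for a, b in zip(p, q))


def median_preference(points):
    """Median of all pairwise similarities, a common default preference."""
    n = len(points)
    return median(similarity(points[i], points[j])
                  for i in range(n) for j in range(n) if i != j)


def affinity_propagation(points, preference, damping=0.5, iterations=200):
    """Plain AP message passing; returns each point's exemplar index."""
    n = len(points)
    if n == 1:
        return [0]
    S = [[preference if i == k else similarity(points[i], points[k])
          for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i, k) = min(0, r(k, k) + positive support from points other than i, k)
        # a(k, k) = positive support from all points other than k
        for k in range(n):
            support = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(support) - support[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - support[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


def split_into_subsets(points, num_subsets):
    """Hypothetical coarse-grained split: contiguous, near-equal chunks.
    Returns (start_index, subset) pairs."""
    size = -(-len(points) // num_subsets)  # ceiling division
    return [(start, points[start:start + size])
            for start in range(0, len(points), size)]


def cluster_subset(task):
    """Map step: run AP on one subset and report global exemplar indices."""
    start, subset, preference = task
    return [start + e for e in affinity_propagation(subset, preference)]


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# Assumption: one preference shared by all subsets, computed on the full toy set.
preference = median_preference(points)
tasks = [(start, subset, preference)
         for start, subset in split_into_subsets(points, 2)]
# Built-in map stands in for the MapReduce map phase; the flattening is the "reduce".
labels = [label for part in map(cluster_subset, tasks) for label in part]
print(labels)
```

Each entry of `labels` is the global index of the exemplar chosen for that point, so points with the same entry belong to the same cluster. In a real deployment the shared preference would have to be estimated without a full pairwise pass, for example from a sample, since computing it on the whole dataset defeats the purpose of partitioning.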