Local equivalences of distances between clusterings: a geometric perspective

    • Presentation date: 1392/07/24 (Solar Hijri calendar)
    • Publication date on TPBin: 1392/07/24 (Solar Hijri calendar)
    In comparing clusterings, several different distances and indices are in use. We prove that the misclassification error distance, the Hamming distance (equivalent to the unadjusted Rand index), and the χ² distance between partitions are equivalent in the neighborhood of 0. In other words, if two partitions are very similar, then each distance defines upper and lower bounds on the others, and vice versa. The proofs are geometric and rely on the concavity of the distances. The geometric intuitions themselves advance the understanding of the space of all clusterings. To our knowledge, this is the first result of its kind.
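To make the three distances concrete, the following is a minimal sketch that computes them for two label vectors over the same observations. The abstract gives no formulas, so the definitions here are assumptions: the misclassification error is taken as one minus the best one-to-one cluster matching, the Hamming distance as the fraction of point pairs on which the two partitions disagree (one minus the unadjusted Rand index), and the χ² distance as min(K, K') minus the sum of p(k,k')² / (p(k) p'(k')). The function names are illustrative, and the paper may normalize these quantities differently.

```python
from collections import Counter
from itertools import combinations, permutations


def confusion(a, b):
    """Joint counts: number of points labelled k in a and k' in b."""
    if len(a) != len(b):
        raise ValueError("label lists must have the same length")
    return Counter(zip(a, b))


def misclassification_error(a, b):
    """1 - (largest fraction of points covered by a one-to-one cluster matching).

    Brute force over matchings, so only suitable for a small number of clusters.
    """
    joint = confusion(a, b)
    rows, cols = list(set(a)), list(set(b))
    if len(rows) > len(cols):
        # Match from the side with fewer clusters; transpose the counts.
        rows, cols = cols, rows
        joint = Counter({(kb, ka): c for (ka, kb), c in joint.items()})
    best = max(
        sum(joint[(r, c)] for r, c in zip(rows, perm))
        for perm in permutations(cols, len(rows))
    )
    return 1 - best / len(a)


def hamming_distance(a, b):
    """Fraction of point pairs grouped together in one partition but not the other."""
    n = len(a)
    if n < 2:
        raise ValueError("need at least two points")
    disagree = sum(
        (a[i] == a[j]) != (b[i] == b[j]) for i, j in combinations(range(n), 2)
    )
    return disagree / (n * (n - 1) / 2)


def chi2_distance(a, b):
    """min(K, K') - sum over cells of p(k,k')^2 / (p(k) * p'(k')) (assumed form)."""
    joint = confusion(a, b)
    na, nb = Counter(a), Counter(b)
    chi2 = sum(c * c / (na[ka] * nb[kb]) for (ka, kb), c in joint.items())
    return min(len(na), len(nb)) - chi2


if __name__ == "__main__":
    a = [0, 0, 0, 1, 1, 1]
    b = [0, 0, 1, 1, 1, 1]  # one point moved to the other cluster
    print(misclassification_error(a, b), hamming_distance(a, b), chi2_distance(a, b))
```

Under these assumed definitions, all three functions return 0 when the two partitions are identical up to a renaming of the labels, which is the neighborhood in which the abstract states the distances bound one another.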

    In practice, distances are frequently used to compare two clusterings of a set of observations. The motivation for this work, however, lies in the theoretical study of data clustering. Distances between partitions are involved in constructing new methods for cluster validation, determining the number of clusters, and analyzing clustering algorithms. From a probability theory point of view, the present results apply to any pair of finite-valued random variables and provide simple yet tight upper and lower bounds on the χ² measure of (in)dependence, valid when the two variables are strongly dependent.
