Algorithmic Considerations
- Partitioning criteria
  - Single-level vs. hierarchical partitioning (multi-level hierarchical partitioning is often desirable)
 
- Separation of clusters
  - Exclusive (e.g., one customer belongs to only one region) vs. non-exclusive (e.g., one document may belong to more than one class)
 
- Similarity measure
  - Distance-based (e.g., Euclidean, road network, vector) vs. connectivity-based (e.g., density or contiguity)
 
- Clustering space
  - Full space (common when the data is low-dimensional) vs. subspaces (common in high-dimensional clustering)
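
The difference between exclusive and non-exclusive separation under a distance-based similarity measure can be sketched as follows. This is a minimal illustration, not a full clustering algorithm: the function names, the fixed cluster centers, and the `radius` threshold are all assumptions made for the example.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length numeric points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def exclusive_assign(point, centers):
    """Exclusive separation: return the index of the single nearest center."""
    distances = [euclidean(point, c) for c in centers]
    return distances.index(min(distances))

def non_exclusive_assign(point, centers, radius):
    """Non-exclusive separation: return every center index within `radius`.

    `radius` is a hypothetical membership threshold chosen for illustration.
    """
    return [i for i, c in enumerate(centers) if euclidean(point, c) <= radius]

# Two assumed cluster centers and one point lying between them.
centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.5, 0.0)

print(exclusive_assign(point, centers))           # index of the nearest center only
print(non_exclusive_assign(point, centers, 3.0))  # indices of all centers within 3.0
```

With exclusive assignment the point belongs to exactly one cluster; with the radius rule it may belong to several clusters, or to none if the radius is too small.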
 
Requirements and Challenges
- Scalability
  - Cluster all the data rather than only a sample
 
- Ability to deal with different types of attributes
  - Numerical, binary, categorical, ordinal, linked, and mixtures of these
 
- Constraint-based clustering
  - The user may supply constraints as input
  - Use domain knowledge to determine input parameters
 
- Interpretability and usability
- Others
  - Discovery of clusters with arbitrary shape
  - Ability to deal with noisy data
  - Incremental clustering and insensitivity to input order
  - High dimensionality
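
One way to meet the requirement of handling mixed attribute types is to average a per-attribute dissimilarity, in the style of Gower's coefficient. The sketch below is an assumption-laden illustration: the function name, the `kinds` and `ranges` parameters, and the sample records are hypothetical. It covers only numerical and categorical attributes, with binary attributes treated as categorical; ordinal and linked attributes are not handled.

```python
def mixed_dissimilarity(a, b, kinds, ranges):
    """Average per-attribute dissimilarity in [0, 1] for mixed-type records.

    kinds[i] is "num" or "cat". ranges[i] is the value range (max - min) of a
    numerical attribute and is ignored for categorical attributes.
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "num":
            # Range-normalised absolute difference; a zero range contributes 0.
            total += abs(x - y) / rng if rng else 0.0
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if x == y else 1.0
    return total / len(kinds)

# Hypothetical records: (age, area type, subscriber flag).
kinds = ["num", "cat", "cat"]
ranges = [50.0, None, None]  # assumed age range across the data set
alice = (30, "urban", "yes")
bob = (40, "rural", "yes")

print(mixed_dissimilarity(alice, bob, kinds, ranges))
```

Normalising each numerical attribute by its range keeps every attribute's contribution between 0 and 1, so no single attribute dominates simply because of its scale.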