Establishing Robust Clusters
The Holy Grail for some would be a statistical measure with which to assess the "validity" of our clusters. A number of such measures exist, but whether they are useful across a wide variety of data, and for the types of questions humanists typically ask of their data, remains an open question. Lexos offers one measure, the Silhouette Score, which attempts to quantify our confidence that individual documents have been assigned to the "correct" cluster. However, we recommend that you also integrate non-statistical approaches into your workflow. Probably the most reliable way to establish confidence in your clusters is to run a number of cluster analyses with slightly different settings and see how well the clusters hold up to these "tweaks". Drout et al. have outlined a variety of procedures in Beowulf Unlocked: New Evidence from Lexomic Analysis (2016). [Extracts or summaries should be added here.]
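The two ideas above, scoring a clustering and re-running it with "tweaked" settings, can be sketched outside Lexos. The example below is only an illustration and is not how Lexos itself computes the Silhouette Score: it assumes SciPy and scikit-learn are installed, uses a small synthetic matrix in place of a real document-term matrix, and the names `dtm`, `cluster_labels`, `settings`, and `results` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Synthetic stand-in for a document-term matrix (assumption: 10 documents,
# 20 term features, drawn from two clearly separated groups).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(5, 20))
group_b = rng.normal(loc=5.0, scale=0.5, size=(5, 20))
dtm = np.vstack([group_a, group_b])


def cluster_labels(matrix, method, metric, k):
    """Build a hierarchical tree and cut it into k flat clusters."""
    tree = linkage(matrix, method=method, metric=metric)
    return fcluster(tree, t=k, criterion="maxclust")


# The "tweaks": different linkage methods and distance metrics.
# (Ward linkage is only defined for Euclidean distance.)
settings = [
    ("average", "euclidean"),
    ("complete", "euclidean"),
    ("ward", "euclidean"),
    ("average", "cityblock"),
]

results = {}
for method, metric in settings:
    labels = cluster_labels(dtm, method, metric, k=2)
    # Mean silhouette over all documents; values range from -1 to 1,
    # with higher values indicating tighter, better separated clusters.
    score = silhouette_score(dtm, labels, metric=metric)
    results[(method, metric)] = (labels, score)
    print(f"{method:>8} / {metric:<9} silhouette = {score:.3f}")

# Compare every run against the first: an adjusted Rand index of 1.0
# means the two runs grouped the documents identically.
baseline_labels = results[settings[0]][0]
for key, (labels, _) in results.items():
    print(key, "agreement with baseline:", adjusted_rand_score(baseline_labels, labels))
```

If the same documents stay together across most settings, that stability is itself evidence for the grouping, whatever the individual scores are. With real texts the scores will usually be far lower than with this artificial data, so they are better read comparatively than against a fixed threshold.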