Establishing Robust Clusters
The Holy Grail for some would be a statistical measure with which to assess the "validity" of our clusters. A number of such measures exist, but whether they are useful across a wide variety of data, and for the types of questions humanists typically ask of their data, remains an open question. At present (2018), we use Lexos to prepare texts and then move to Eder's bootstrap consensus tree (BCT) tool in the Stylo package for R.
We recommend that you integrate non-statistical approaches into your workflow. Creating a number of different cluster analyses with slightly different settings to see how well the clusters hold up to these "tweaks" is probably the most reliable way to establish confidence in your clusters. Drout et al. have outlined a variety of procedures in Beowulf Unlocked: New Evidence from Lexomic Analysis (2016).
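The "tweaking" approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lexos's own implementation: the function names (agglomerate, co_clustering_rates), the toy word-frequency vectors, and the choice of two distance metrics and three linkage rules are all invented for the example. The sketch re-clusters the same documents under each combination of settings and reports how often each pair of documents lands in the same cluster.

```python
import math
from collections import Counter
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def average(distances):
    return sum(distances) / len(distances)


def agglomerate(vectors, k, metric, linkage):
    """Naive agglomerative clustering: merge the two closest clusters until k remain.

    `linkage` reduces the list of pairwise document distances between two
    clusters to one number (min = single, max = complete, average = average).
    """
    clusters = [frozenset([label]) for label in vectors]

    def cluster_distance(c1, c2):
        return linkage([metric(vectors[a], vectors[b]) for a in c1 for b in c2])

    while len(clusters) > k:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


def co_clustering_rates(vectors, k, settings):
    """For every pair of documents, the share of settings in which they co-cluster."""
    counts = Counter()
    for metric, linkage in settings:
        for cluster in agglomerate(vectors, k, metric, linkage):
            for pair in combinations(sorted(cluster), 2):
                counts[pair] += 1
    all_pairs = combinations(sorted(vectors), 2)
    return {pair: counts[pair] / len(settings) for pair in all_pairs}


# Invented toy data: relative frequencies of three words in four documents.
vectors = {
    "A1": [0.10, 0.02, 0.05],
    "A2": [0.11, 0.02, 0.04],
    "B1": [0.03, 0.09, 0.01],
    "B2": [0.02, 0.10, 0.02],
}

# Each "tweak" is one (distance metric, linkage rule) combination.
settings = [(m, l) for m in (euclidean, manhattan) for l in (min, max, average)]

rates = co_clustering_rates(vectors, k=2, settings=settings)
for pair, rate in sorted(rates.items()):
    print(pair, rate)
```

Pairs whose rate stays at or near 1.0 across the settings belong to clusters that hold up to the tweaks; pairs whose rate varies from one setting to the next mark groupings that deserve closer scrutiny. The same idea extends to other settings you might vary, such as the number of most frequent words used or the normalization method.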