by bobmcpop
Last Updated October 10, 2018 19:19 PM

I'm very new to cluster analysis. In papers such as Richette et al.^{1} (which tries to see which concomitant diseases cluster together), the authors first cluster the *variables* and then the *observations* (i.e., patients). (Bevis et al.^{2} did the same thing.) They used SAS's `PROC VARCLUS` and factor analysis (others have used PCA) for clustering the *variables*, and cluster analysis for the patients. I don't understand why they would (need to) do both. In the first paper, all of the discussion centered on the latter.

1. Richette P, Clerson P, Périssin L, et al. Revisiting comorbidities in gout: a cluster analysis. *Annals of the Rheumatic Diseases* 2015;74:142-147.
2. Bevis, et al. (2018). Comorbidity clusters in people with gout: an observational cohort study with linked medical record review. *Rheumatology (Oxford)*. 57(8): 1358-1363.

From a mathematical point of view, a standard dataset is just a matrix of numbers organized into rows and columns. We attach meanings to these, and think of the rows as pertaining to patients and the columns as representing variables, but they're just numbers and you can perform mathematical operations on them. The question is whether any given operation is meaningful.

Variables can be understood as manifestations of some underlying truth that we don't have direct access to. Those unobserved underlying quantities are called latent variables, and people often seek to combine the observed variables to get a better picture of them. The standard way to estimate them is factor analysis, but PCA will typically yield almost the same results, and clustering algorithms can be applied to the columns (variables) to do the same thing. Clustering the variables guarantees that the result will have simple structure (each variable belongs to exactly one group), at the cost of a worse empirical fit. That's presumably what they were after. This is done first because there's no point in clustering patients on the wrong variables; that would bias the results.
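To make the two-step idea concrete, here is a minimal sketch in Python. It is not what `PROC VARCLUS` does internally, and it is not the procedure from either paper. It only shows the general pattern: cluster the columns, form one composite score per variable cluster, then cluster the rows on those scores. Everything in it is an assumption made for illustration: the synthetic data (two hypothetical latent traits measured by three noisy variables each), the helper names `pearson` and `agglomerate`, the choice of average linkage, and the use of `1 - |r|` as the distance between variables. It uses only the standard library so that it runs as is; in practice you would likely reach for a statistics package instead.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def agglomerate(items, dist, k):
    """Average-linkage agglomerative clustering.

    Starts with every item in its own cluster and repeatedly merges the two
    closest clusters until only k remain. Returns lists of item indices.
    """
    clusters = [[i] for i in range(len(items))]

    def cluster_dist(a, b):
        # Average of all pairwise distances between members of a and b.
        total = sum(dist(items[i], items[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: cluster_dist(clusters[pq[0]],
                                                      clusters[pq[1]]))
        merged = clusters.pop(q)  # q > p, so index p is unaffected
        clusters[p].extend(merged)
    return [sorted(c) for c in clusters]


# Synthetic data (an assumption, not real patient data): 60 "patients" in two
# groups that differ on latent trait a; latent trait b is unrelated noise.
# Columns 0-2 are noisy measurements of a, columns 3-5 of b.
rng = random.Random(0)
rows = []
for group_mean in (3.0, -3.0):
    for _ in range(30):
        a = group_mean + rng.gauss(0, 0.3)
        b = rng.gauss(0, 1.0)
        rows.append([a + rng.gauss(0, 0.3) for _ in range(3)]
                    + [b + rng.gauss(0, 0.3) for _ in range(3)])

# Step 1: cluster the variables, i.e. the columns of the data matrix.
columns = [list(col) for col in zip(*rows)]
var_clusters = agglomerate(columns,
                           lambda x, y: 1 - abs(pearson(x, y)), k=2)

# Step 2: one composite score per variable cluster (a plain mean, which
# assumes the variables in a cluster are positively correlated), then
# cluster the patients, i.e. the rows, on those scores.
scores = [[sum(row[j] for j in vc) / len(vc) for vc in var_clusters]
          for row in rows]
patient_clusters = agglomerate(scores, math.dist, k=2)

print("variable clusters:", var_clusters)
print("patient cluster sizes:", [len(c) for c in patient_clusters])
```

Note that the same `agglomerate` function is applied to the columns and then to the rows. Only the meaning we attach to the result changes: the first call groups variables into proxies for latent traits, and the second groups patients on those proxies.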
