Following Forbes et al. (2017), we use two basic methods, clustering and factor analysis, to examine how the preference items relate to each other and, in particular, how they can be grouped. We apply each method separately within each sex.

The clustering approach is conceptually simple and requires no modeling or scale assumptions beyond an ordinal scale for each item and independent sampling of subjects. Clustering is usually used to group cases (subjects), but we use it to group variables (items). We use agglomerative mean-linkage hierarchical clustering (function hclust in the R package stats) with the distance metric d(X, Y) = 1 − |k(X, Y)|, where k(X, Y) is the Kendall correlation of X with Y. We use Kendall correlation, a rank-based measure, to avoid having to assign meaning to the distances between points on each item's 7-point rating scale (to account for ties, R uses the τ_B coefficient of Kendall, 1945, which corrects the denominator of τ for tied pairs). Mean-linkage hierarchical clustering proceeds iteratively: first, every object is assigned to its own cluster, so there are q clusters for q objects. Then, for every pair of clusters A and B, the mean of the pairwise distances between objects in A and objects in B is computed, and the two clusters with the smallest such mean are merged into one, leaving q − 1 clusters. Clusters continue to be merged two at a time until all q objects are in a single cluster. Given our distance metric, each of our clusters is characterized by a mean absolute Kendall correlation. We present the results backwards, starting from the case of 1 cluster and describing the results for 2, 3, 4, 5, 6, and 7 clusters. The effect is that, for each k, the analysis with k clusters has all the same clusters as the analysis with k − 1 clusters, except that one of the old clusters has had items removed from it to form the new cluster.
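The procedure above can be sketched in Python as a stand-in for the R workflow (hclust with a precomputed distance matrix): scipy's kendalltau computes τ_B by default, and linkage with method="average" is mean linkage. The data here are simulated 7-point ratings, invented purely for illustration, with two groups of three items each driven by separate latent traits.

```python
# Sketch of variable clustering with d(X, Y) = 1 - |tau_B(X, Y)|.
# Simulated data; scipy stands in for R's hclust on a distance matrix.
import numpy as np
from scipy.stats import kendalltau
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)

# 200 subjects rating 6 items on a 7-point scale:
# items 0-2 share one latent trait, items 3-5 another.
n = 200
t1, t2 = rng.normal(size=(2, n))
latent = np.column_stack([t1, t1, t1, t2, t2, t2]) + rng.normal(size=(n, 6))
items = np.clip(np.round(latent * 1.5 + 4), 1, 7)  # ordinal 1..7, with ties

q = items.shape[1]
# Pairwise distance between items: 1 minus absolute Kendall correlation
dist = np.zeros((q, q))
for i in range(q):
    for j in range(i + 1, q):
        tau, _ = kendalltau(items[:, i], items[:, j])  # tau-b handles ties
        dist[i, j] = dist[j, i] = 1 - abs(tau)

# Agglomerative mean-linkage clustering on the condensed distance matrix
Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree at 2 clusters
print(labels)
```

With this simulated structure, cutting the tree at two clusters should separate items 0-2 from items 3-5; cutting at successive values of k reproduces the nested sequence of solutions described above.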

Exploratory factor analysis is more familiar to social scientists but relies on concrete distributional assumptions about the data. We take a bass-ackwards approach (Goldberg, 2006), fitting a 2-factor solution, then a 3-factor solution, and so on up to 7 factors, with each solution as a separate model. We then save factor scores and compare them between levels. Each factor analysis (function fa in the R package psych) is fit with maximum-likelihood estimation on a polychoric correlation matrix and uses orthogonal varimax rotation.
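The bass-ackwards sequence of separate models can be sketched in Python, with two caveats: scikit-learn's FactorAnalysis fits a maximum-likelihood factor model with varimax rotation, but it works on the raw (Pearson-scale) data rather than a polychoric correlation matrix, so it is only a rough analogue of psych::fa; and the ratings below are simulated for illustration.

```python
# Sketch of the bass-ackwards approach: fit 2- through 7-factor models
# as separate solutions, save factor scores, and compare across levels.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)

# Simulated 7-point ratings for 8 items driven by two latent traits
n = 300
f1, f2 = rng.normal(size=(2, n))
load = np.array([[.8, 0], [.8, 0], [.8, 0], [.8, 0],
                 [0, .8], [0, .8], [0, .8], [0, .8]])
items = np.clip(np.round(np.column_stack([f1, f2]) @ load.T
                         + rng.normal(scale=.6, size=(n, 8)) + 4), 1, 7)

# Each k-factor solution is a separate model; save per-subject scores
scores = {}
for k in range(2, 8):
    fa = FactorAnalysis(n_components=k, rotation="varimax")
    scores[k] = fa.fit_transform(items)

# Compare scores between adjacent levels, e.g. correlate the 2-factor
# scores with the 3-factor scores to see which factors persist
r = np.corrcoef(scores[2].T, scores[3].T)[:2, 2:]
print(np.round(np.abs(r), 2))
```

Correlating scores between the k- and (k+1)-factor solutions shows which factors carry over intact and which split as more factors are extracted, which is the comparison the bass-ackwards approach is built around.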

Clustering and factor analysis are not usually used to address the same problems. But by clustering variables rather than subjects, we make clustering, like factor analysis, an unsupervised way to see how variables can be grouped on the basis of their interrelations. The advantage of using both methods is that they trade off simplicity against richness. Clustering is simple: the interpretation of clusters does not depend on modeling assumptions such as the shape of items' distributions, but clustering can only speak to relationships among items. Factor analysis relies on the strong assumptions of the common-factor model, but in exchange it makes stronger claims, including a loading of each item on each factor and a score of each subject on each factor.