[Epistemic status: silly statistical experiments. Might eventually turn into something useful but for now everything should be taken with a grain of salt.]

[Apology: this is a badly-organized post. The explanation of what gendermetricity and gendermetric correlations are comes in the middle of the post, rather than in the beginning. I find the results in the end really interesting and promising, but it takes a while to get there.]

I love behavioral genetics, because I find the way that it allows you to summarize complex and opaque information into simple variance components interesting and enlightening. For this reason, I got excited when I saw Gwern post a tweet with a link to a study that generalized this approach from behavioral genetics to neuroanatomy. Does this mean we can use this for other domains too?

For some background: typically, behavioral genetics has used the known similarities between monozygotic and dizygotic twins to infer the degree to which variance in various traits is due to heredity, shared environment, or nonshared environment. If more-genetically-similar twins are more phenotypically similar than less-genetically-similar twins, the trait in question is heritable. More recently, however, it has become possible to genotype extremely large numbers of unrelated individuals, which makes it possible to compare similarity without the individuals being related family-wise. This allows the technique of comparing degree of genetic similarity with degree of phenotypic similarity to work with non-twin samples, as long as they are big enough. This statistical tool is called GCTA (genome-wide complex trait analysis).

However, there’s nothing restricting you to genetic similarity. In principle, you can use any similarity metric you want, as long as it satisfies the conditions assumed by the GCTA statistics. This was what they did in the paper Gwern linked, replacing genetic similarity with neuroanatomic similarity, allowing them to study highly interesting questions of how strongly phenotypes can in principle be predicted from neuroanatomy, even though they haven’t yet discovered how to predict these neurotypes. They called this statistic morphometricity.

But if this works with genetics, and it works with neuroanatomy, then surely it works for just about anything! Gwern suggested gut microbiomes and leaf spectral imaging, but given my interests, my attention immediately shifts to personality, life experiences, or generally any sort of data that is sufficiently multidimensional that regressing directly with it becomes difficult.

As masculinity/femininity appears relatively high-dimensional, and as I get more and more interested in exploring massively high-dimensional data, I’m interested in this sort of tool for my surveys. However, the immediate question that comes to mind is, do I have the sample size needed? GCTAs are usually run with thousands of participants, whereas I typically have a few hundred (though I have a project in the works that might yield me thousands…), so it’s not looking promising. On the other hand, it seems that I have way fewer dimensions to work with, so perhaps this helps; after all, this is supposed to be less data-intensive than just plain linear regression…

After trying and failing for a while to translate their MATLAB code to Python, I decided to just follow Gwern’s advice and abuse the GCTA program to directly give me the results. I loaded it up with data from my survey on Gender, Sexuality and Other Things and gave it some test runs. Here are some example results:

| Demo | Trait | g^2 | SE |
|---|---|---|---|
| all | gender | 56% | 6 pp |
| women | aap | 53% | 12 pp |
| women | narcissism | 47% | 11 pp |
| men | feminism | 46% | 9 pp |
| women | gender issues | 40% | 11 pp |
| women | self-mf | 29% | 11 pp |
| women | age | 27% | 10 pp |
| all | age | 24% | 6 pp |
| men | age | 24% | 8 pp |
| all | sexual orientation | 21% | 5 pp |
| men | sexual orientation | 21% | 7 pp |
| men | narcissism | 20% | 9 pp |
| men | self-mf | 19% | 7 pp |
| women | sexual orientation | 13% | 10 pp |
| all | quality of life | 12% | 5 pp |
| women | feminism | 12% | 10 pp |
| men | gender issues | 8% | 6 pp |
| men | agp | 2% | 4 pp |

In the above, I used the GCTA program to look at demographics and traits and compute their “””gendermetricity””” (“””g^2”””) – i.e. its estimate for how much variance in the trait can in theory be predicted linearly using the masculinity/femininity items I included in the survey. SE denotes the standard error that GCTA estimated. Self-mf refers to self-assessed masculinity/femininity.
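To give a feel for what a similarity-based estimate like this is doing, here is a rough sketch in Python. It is not what the GCTA program does internally (GCTA fits a linear mixed model with REML); instead it uses Haseman-Elston regression, a simpler method that targets the same quantity by asking whether pairs of people who are more similar on the items also have more similar trait values. The function names and the simulated data are my own illustration, not the survey data.

```python
import numpy as np

def similarity_matrix(items):
    """GRM-style similarity: standardize each item, then average the
    cross-products over items. `items` has shape (n_people, n_items)."""
    z = (items - items.mean(axis=0)) / items.std(axis=0)
    return z @ z.T / z.shape[1]

def he_metricity(trait, items):
    """Haseman-Elston regression: regress the product of standardized trait
    values for each pair of people on the similarity of that pair.
    The slope estimates the fraction of trait variance tied to the items."""
    y = (trait - trait.mean()) / trait.std()
    k = similarity_matrix(items)
    i, j = np.triu_indices(len(y), k=1)  # each unordered pair exactly once
    sim = k[i, j]
    prod = y[i] * y[j]
    return np.cov(sim, prod)[0, 1] / np.var(sim, ddof=1)

# Simulated example (hypothetical data): 20 items explain half the variance.
rng = np.random.default_rng(0)
n, p, true_g2 = 1500, 20, 0.5
items = rng.normal(size=(n, p))
beta = rng.normal(size=p)
beta *= np.sqrt(true_g2) / np.linalg.norm(beta)  # explained variance ~ true_g2
trait = items @ beta + np.sqrt(1 - true_g2) * rng.normal(size=n)

est = he_metricity(trait, items)
print(f"true g^2 = {true_g2}, Haseman-Elston estimate = {est:.2f}")
```

Note that the simulation uses 1500 people; with a few hundred, the estimate gets a lot noisier, which fits the wide standard errors in the table.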

The above table is… not very promising for the usability of this tool. The confidence intervals are very wide (though that’s to be expected with my sort of sample size). There’s relatively little connection between how strongly something appears to be related to masculinity/femininity and how high its gendermetricity is (though this is not what the tool promises either; in principle, it’s supposed to detect any variance that can be predicted from combinations of the items, even if these combinations are completely orthogonal to masculinity/femininity). And the numbers are kinda opaque if just considered directly. It did have some ups, though, e.g. placing gender as the most gendermetric trait and AGP as one of the least gendermetric traits, but given the other problems, I wouldn’t trust gendermetricity in these domains either.

GCTA is supposed to have a “genetic correlation” function, which should be usable for figuring out the degree to which the gendermetric variance in two variables is correlated. However, I couldn’t get it to work, and the problems I mentioned before made me a bit uninterested in spending too much effort on making it work.

However… gendermetricity is basically an estimate of how well linear regression can in principle predict the traits in question. If we just ignore the “in principle” part, we can explore gendermetricity-like concepts by performing the relevant linear regressions directly!

Let z be a random vector containing the masculinity/femininity-related variables that we seek to define gendermetricity with, and let x (and y) be a random variable containing the trait whose gendermetricity we seek to estimate. Let x // z denote residualizing x for z. The gendermetricity of x is simply the fraction of the variance of x that is explained by z, which can be computed as var_z(x) = (var(x)-var(x//z))/var(x). Similarly, the gendermetric covariance of x and y must then be cov_z(x, y) = cov(x, y)-cov(x//z, y//z), and so their gendermetric correlation is cov_z(x, y)/√(var_z(x)var_z(y)).
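The definitions above can be sketched in a few lines of NumPy. This is a minimal illustration rather than my actual analysis code: it assumes plain in-sample least squares for the residualization, and it standardizes x and y before computing the correlation so that var_z works as both a fraction and a variance. All function names and the simulated data are made up for the example.

```python
import numpy as np

def fit_predict(x, z):
    """Ordinary least squares prediction of x from z, with an intercept."""
    design = np.column_stack([np.ones(len(x)), z])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return design @ coef

def residualize(x, z):
    """x // z in the notation above."""
    return x - fit_predict(x, z)

def gendermetricity(x, z):
    """var_z(x): fraction of the variance of x explained by z."""
    return (np.var(x) - np.var(residualize(x, z))) / np.var(x)

def gendermetric_cov(x, y, z):
    """cov_z(x, y) = cov(x, y) - cov(x // z, y // z)."""
    rx, ry = residualize(x, z), residualize(y, z)
    return np.cov(x, y, ddof=0)[0, 1] - np.cov(rx, ry, ddof=0)[0, 1]

def gendermetric_corr(x, y, z):
    # Assumption: standardize first, so the units of cov_z and var_z match.
    xs = (x - x.mean()) / x.std()
    ys = (y - y.mean()) / y.std()
    return gendermetric_cov(xs, ys, z) / np.sqrt(
        gendermetricity(xs, z) * gendermetricity(ys, z))

# Hypothetical data: x and y share one gendermetric source, y has a second.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 3))
x = z[:, 0] + rng.normal(size=500)
y = z[:, 0] + 0.5 * z[:, 1] + rng.normal(size=500)

print("gendermetricity of x:", gendermetricity(x, z))
print("gendermetric correlation:", gendermetric_corr(x, y, z))
```

With in-sample least squares, the gendermetric correlation computed this way is the same thing as the ordinary correlation between the two fitted predictions, which is one way to build intuition for it.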

To help with dealing with the amount of data I have, I use PCA to reduce the dimensionality of the masculinity/femininity test from 22 to 7. In addition, I residualize the variables in a “leave-one-out” manner, which is to say, I predict each individual with a model that has been fitted to all other individuals. To reduce noise variance, I try giving the regression different numbers of principal components as input, ranging from 1 to 7, and report the result for the number that yields the highest gendermetricity. This yielded the following gendermetricities:
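A sketch of this procedure, under some assumptions of mine: the PCA is fitted once on the full sample (not refitted inside the leave-one-out loop), the regression is ordinary least squares, and the leave-one-out residuals are obtained with the standard shortcut e_i / (1 - h_ii) instead of literally refitting the model once per person. The function names and the simulated data are hypothetical.

```python
import numpy as np

def pca_scores(items, n_components):
    """Principal component scores via SVD of the centered item matrix."""
    centered = items - items.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def loo_residuals(x, z):
    """Leave-one-out OLS residuals.

    Uses the identity e_i / (1 - h_ii), which matches refitting the
    regression n times with person i held out each time."""
    design = np.column_stack([np.ones(len(x)), z])
    q, _ = np.linalg.qr(design)        # orthonormal basis of the design
    fitted = q @ (q.T @ x)
    leverage = (q ** 2).sum(axis=1)    # diagonal of the hat matrix
    return (x - fitted) / (1.0 - leverage)

def loo_gendermetricity(x, z):
    # Not clipped: this can come out negative when z has no predictive value.
    return (np.var(x) - np.var(loo_residuals(x, z))) / np.var(x)

def best_gendermetricity(x, items, max_components=7):
    """Try 1..max_components principal components and keep the best."""
    scores = pca_scores(items, max_components)
    results = {k: loo_gendermetricity(x, scores[:, :k])
               for k in range(1, max_components + 1)}
    best_k = max(results, key=results.get)
    return best_k, results[best_k]

# Hypothetical data: 22 items driven by one latent factor plus noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=300)
items = latent[:, None] + rng.normal(size=(300, 22))
x = latent + rng.normal(size=300)

k, g2 = best_gendermetricity(x, items)
print(f"best number of components: {k}, gendermetricity: {g2:.3f}")
```

Picking the best of seven tries biases the reported gendermetricity upward a bit, which is one more reason not to take the exact numbers too seriously.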

| Demo | Trait | g^2 |
|---|---|---|
| all | gender | 42.3% |
| all | sexual orientation | 20% |
| men | feminism | 13.5% |
| men | sexual orientation | 11.6% |
| women | gender issues | 9.6% |
| women | self-mf | 9.5% |
| men | self-mf | 8.2% |
| men | age | 8.2% |
| all | age | 4.8% |
| women | aap | 3.7% |
| women | sexual orientation | 3.6% |
| women | narcissism | 3.3% |
| men | gender issues | 2.5% |
| men | narcissism | 2.1% |
| all | quality of life | 1.0% |
| men | agp | 0% |
| women | age | 0% |
| women | feminism | 0% |

This doesn’t look too bad, but more importantly, we can now compute gendermetric correlations! But first, what actually is a gendermetric correlation? The best way I can explain a gendermetric correlation between two variables X and Y is the following: Suppose there’s some stuff that makes X correlate with the masculinity/femininity test (i.e. X is somewhat gendermetric). And suppose there’s some stuff that makes Y correlate with the masculinity/femininity test. The gendermetric correlation is then a measure of how much these two “stuffs” are the same stuff. Now let’s take a look at some examples!

So, how do we interpret the above? There’s a number of things that could be said. First, note that the gendermetric correlation between sexual orientation and quality of life exceeds the [-1, 1] bounds that are typically expected of correlations. This is not because gendermetric correlations are somehow able to correlate more strongly than ordinary correlations; rather, it is because my math sucks. (I could have removed these effects, e.g. by just clamping them to the relevant range, or by not doing the leave-one-out thing in my regression, but I think they serve as a useful reminder not to take the statistics in this post too seriously.)
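For what it's worth, here is one way to see why the leave-one-out step can push the estimate out of bounds. This is my own reading, assuming x and y are standardized and writing \hat{x}, \hat{y} for the regression predictions and r_x, r_y for the residuals:

```latex
\begin{align*}
x &= \hat{x} + r_x, \qquad y = \hat{y} + r_y,
  \qquad r_x = x /\!/ z, \quad r_y = y /\!/ z \\
\operatorname{cov}(x, y)
  &= \operatorname{cov}(\hat{x}, \hat{y})
   + \operatorname{cov}(\hat{x}, r_y)
   + \operatorname{cov}(r_x, \hat{y})
   + \operatorname{cov}(r_x, r_y) \\
\operatorname{cov}_z(x, y)
  &= \operatorname{cov}(x, y) - \operatorname{cov}(r_x, r_y)
   = \operatorname{cov}(\hat{x}, \hat{y})
   + \underbrace{\operatorname{cov}(\hat{x}, r_y)
   + \operatorname{cov}(r_x, \hat{y})}_{=\,0 \text{ for in-sample least squares}}
\end{align*}
```

With an ordinary in-sample regression, the residuals are uncorrelated with the predictions, so the cross terms vanish, var_z(x) equals var(\hat{x}), and the gendermetric correlation reduces to the plain correlation between \hat{x} and \hat{y}, which has to lie in [-1, 1]. With leave-one-out predictions, the cross terms are no longer exactly zero and var(x) - var(r_x) no longer equals var(\hat{x}), so nothing forces the ratio to stay within the bounds, especially when the gendermetricities in the denominator are small.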

Consider the gendermetric correlation between sexual orientation and gender. It is very close to one, which makes sense when you break it down: The variance in gender decomposes into the gendermetric variance, which boils down to the fact that men are more masculine than women, and the non-gendermetric variance, which boils down to the fact that some women are masculine and some men are feminine. Meanwhile, the variance in sexual orientation decomposes into the same gendermetric variance where men are more masculine and women are more feminine, plus a bit of extra gendermetric variance where gay people are more GNC than the baseline, plus a lot of non-gendermetric variance due to not all queer people being GNC, and not all straight people being gender-conforming.

The gendermetric correlation tells you how much the gendermetric variance in the two variables is shared. Since the main difference in the gendermetric variances is that sexual orientation also contains some GNC gay people, the bulk of the variance (namely that men tend to be more masculine than women) is shared, and so the gendermetric correlation is high. (It’s probably worth adding that I wouldn’t be surprised if the 0.98 number above is an overestimate.)

The residual correlation is much lower. This correlation tells you how much the variables are still correlated after taking the gendermetric variance into account. That is, it tells you the degree to which the non-gendermetric variance is shared. As you can see from the diagram, it is much lower than the gendermetric correlation, and I can also inform you that it is lower than the usual correlation, as in this sample, gender and sexual orientation are correlated at r~0.42.
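The residual correlation is just the ordinary correlation between the two residualized variables (what is usually called a partial correlation). A minimal sketch, assuming in-sample least squares and using made-up data in which two traits are related only through the masculinity/femininity variables:

```python
import numpy as np

def residualize(x, z):
    """Remove the part of x that is linearly predictable from z."""
    design = np.column_stack([np.ones(len(x)), z])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return x - design @ coef

def residual_corr(x, y, z):
    """Correlation between x // z and y // z."""
    return np.corrcoef(residualize(x, z), residualize(y, z))[0, 1]

# Hypothetical data: x and y only share variance that runs through z.
rng = np.random.default_rng(0)
z = rng.normal(size=(2000, 3))
shared = z[:, 0]
x = shared + rng.normal(size=2000)
y = shared + rng.normal(size=2000)

raw = np.corrcoef(x, y)[0, 1]
resid = residual_corr(x, y, z)
print(f"ordinary correlation: {raw:.2f}, residual correlation: {resid:.2f}")
```

In this toy setup the ordinary correlation is substantial while the residual correlation is close to zero, since all of the shared variance is gendermetric.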

In the text above, I assumed the gendermetric variance was related to masculinity/femininity. This is likely in the case of gender or sexual orientation, but it doesn’t need to be the case in general. Since I allowed up to 7 dimensions from the masculinity/femininity test to be included in the regression, it is possible for the linear regression to form predictions that are based not on masculinity/femininity as such, but on mixes, e.g. taking some “masculine” characteristics and some “feminine” characteristics and using them to form a new “trait”.

As an example, two traits that might be included in a masculinity/femininity test (but which weren’t included in mine) are Expressivity (caring about others) and Instrumentality (high agency and a strong sense of self). One might assume that gendermetricity computed using these traits would only use either the traits directly or their difference. However, gendermetricity might instead use their sum to predict things, which corresponds to Extraversion, a relatively ungendered trait.
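A toy simulation of this point, with entirely made-up numbers: Expressivity and Instrumentality are simulated as independent with equal variance, the gendered axis is taken to be their difference, and Extraversion is taken to be their sum plus noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
expressivity = rng.normal(size=n)
instrumentality = rng.normal(size=n)
z = np.column_stack([expressivity, instrumentality])

# Assumptions for the illustration, not findings from the survey:
mf_axis = instrumentality - expressivity  # the "gendered" contrast
extraversion = instrumentality + expressivity + rng.normal(size=n)

def r_squared(x, z):
    """Fraction of the variance of x explained by a linear regression on z."""
    design = np.column_stack([np.ones(len(x)), z])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return 1 - np.var(x - design @ coef) / np.var(x)

metricity = r_squared(extraversion, z)
axis_corr = np.corrcoef(extraversion, mf_axis)[0, 1]
print(f"'gendermetricity' of extraversion: {metricity:.2f}")
print(f"correlation with the gendered axis: {axis_corr:.2f}")
```

Here Extraversion comes out as highly “gendermetric” even though it is essentially uncorrelated with the gendered axis, because the regression is free to use the sum of the two traits.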

Now that we understand gendermetricity (hopefully), let’s look at some more examples.

The first thing to note is that including self-mf (i.e. self-assessed masculinity/femininity) in a gendermetric correlation is in some ways strange. Gendermetricity is meant to capture masculinity/femininity, so what exactly happens when this gets combined with self-mf? Well, basically, we’d expect the gendermetric variance in self-mf to be just that, masculinity/femininity. However, there is going to be some additional variance in self-mf, both because any self-report measure has some noise, and because our masculinity/femininity measure might not be complete. Thus, a gendermetric correlation with self-mf tells us something about whether the gendermetric variance in a trait is due to masculinity/femininity, or due to something else (such as the extraversion example earlier).

Thus, what the above diagram suggests to us is that the gendermetric variance in attraction to men, self-sexualization, and gender issues in women is due to masculinity/femininity, but that the gendermetric variance in autoandrophilia and narcissism is partly due to something else. I believe that like for ordinary correlations, the gendermetric correlations have to be squared in order to yield the shared variance; this means that 28% of the gendermetric variance of autoandrophilia is, according to this measure, due to masculinity, while the remaining 72% isn’t.

Despite this, autoandrophilia appears to gendermetrically correlate really strongly with gender issues, even though these should be mainly about masculinity/femininity. This is another example of how you should take the results here with a grain of salt; it is impossible for the “real” gendermetricities to work like this, but the estimates do.

One thing that’s worth noting is that autoandrophilia gendermetrically correlates with androphilia. Furthermore, while gynephilia and lesbianism aren’t statistically significant gendermetrically, if I force it to compute the gendermetric correlation between those and AAP, I also find those to be negatively correlated with autoandrophilia. Thus, autoandrophilia is gendermetrically correlated with heterosexuality; this is despite the fact that I usually find it to be negatively correlated with heterosexuality. I’m not yet sure how to interpret this finding, but I find it very intriguing that we finally have an AAP/heterosexuality “””correlation”””, as the lack of this is one of the arguments against AAP as a concept.

One odd thing is that narcissism, autoandrophilia, and gender issues are all gendermetrically correlated. I’m not sure what’s up with that, and it’s worth keeping an eye on whether this replicates. (Is this pattern predicted by the ROGD model? I don’t know.)

It is also interesting to observe that autoandrophilia is negatively gendermetrically correlated with self-sexualization, even though it is otherwise positively correlated with self-sexualization. This might also be worth keeping an eye on in the future.

If you think about it, the matrix for women appears to suggest a two-factor solution, with a “general gendermetric factor” that all the dimensions load positively on, and a “courtship-vs-GID factor” where self-sexualization/androphilia/narcissism load in the courtship direction, and AAP/gender-issues/self-mf load in the GID direction. This approach might be worth considering looking into (though that would require me to first figure out how to do “gendermetric factor analysis”, which appears to be easy enough but might be trickier than it looks).

I don’t know if it was a fluke, or what happened, but for some reason of the two exhibitionism items I had, only one, which I’ve labelled “exhibitionism 1”, was gendermetric. This exhibitionism item is related to flashing; its item text is “Exposing my genitals to an attractive stranger”. Meanwhile, the other exhibitionism item, “exhibitionism 2”, is about public sex, “Performing sex acts while stranger watch”.

On to men!

This time, there is a strong, obvious structure in the graph that just screams that it wants to get noticed: There’s a gender nonconformity factor that involves self-mf, sexual orientation, gender issues, feminism, and disliking one’s own appearance (with all but the disliking-appearance dimension being completely gendermetrically correlated), and a courtship factor that involves liking one’s appearance, self-sexualization, being older, narcissism, and having had more female partners.

I think the first of these two factors is very cute; there appears to be a single general factor of gender nonconformity, rather than there being different forms of GNC that are relevant for different traits. (Alternatively, my data analysis is bad enough that I’m not able to detect different forms of GNC.)

The courtship factor is surprising to me. The masculinity/femininity test I’m using doesn’t have any items that are “obviously” related to courtship for men; there’s no “going to the gym” items, or anything similar to this. Presumably there’s an explanation for this that will become clear if I perform some sort of gendermetric factor analysis, but until then my best explanation is that gendermetricity is magic.

And it’s not even that the courtship-related variance it’s capturing is tiny. Here’s the gendermetricities of men’s traits:

The number of female partners is the most gendermetric trait according to this analysis. It’s not that I’m complaining, because other than masculinity/femininity itself, courtship would probably be one of the most-relevant things for a masculinity/femininity test to capture. I just don’t understand how it does it.

One contrast between women’s and men’s correlation matrices is that for men, the residual correlations often appear to be smaller than for women. I’m not sure if that effect is real, but if it is, it indicates to me that this approach works better for men than for women.

I think there’s four obvious followups to this post:

1. Perform “gendermetric factor analysis”. It seems that this should allow us to extract highly-intuitive factors from the masculinity/femininity test, which might be useful for other things in the future. Plus, gendermetric factor analysis might help reduce some of the potential problems that can arise from overfitting in these cases. (When playing around with changing the number of principal components, it appears that the structure in the men’s gendermetricity matrix is just a result of the existence of the first two principal components. However, in the women’s gendermetricity matrix, the structure appears to require more principal components, despite appearing mostly 2D.)

2. Expand the study of gendermetricity with more traits and better masculinity/femininity tests. Maybe we can discover even more structure within the traits, and at least we can verify the structure that is already found. Attractiveness is an obvious thing that might be worth including, as would sociosexuality.

3. Apply these methods to other domains too; for instance, it would be interesting to see if the AGP/GAMP correlation is due to [attitudes to androgyny]metricity, or something similar for other correlations in sexuality. One complication is that in order to make this system work, the domain that is used must be multidimensional; otherwise the correlations will all be 1 or -1. At the same time, really it’s not the gendermetric correlation that needs to be used to see if some factor is a potential mediator, but instead the residual correlation.

4. Improve the calculations of gendermetricity, e.g. by fixing the cases where gendermetric correlations greater than 1 or smaller than -1 are computed, or by figuring out a way to use the linear mixed model approach that e.g. the GCTA program uses.