Participating in the #DHPoco OPEN THREAD: THE DIGITAL HUMANITIES AS A HISTORICAL “REFUGE” FROM RACE/CLASS/GENDER/SEXUALITY/DISABILITY was an enlightening experience for both of us. Adeline invited us to contribute a guest post based on analysis of the open thread, which we offer in the hopes of further productive dialogue. We are both particularly interested in gender-specific differences, as well as in other differences, such as sexuality and ethnicity, that were impossible to code based on our knowledge of the commenters; without self-identification we felt uncomfortable attempting that. We recognize that our attribution is reliant on binary gender. We conducted our analysis using AntConc, a corpus analysis software package, to approach the question “Do men and women in the DHPoco thread talk about DH and postcolonialism differently? If so, how?”
Michelle scraped the DHPoco thread off the internet and compiled it into files by author name, so that all words written by each individual who commented were grouped together. We marked each file based on our knowledge of commenters’ gender as either male or female. Of 38 individual commenters producing a total of 153 comments, we coded 26 commenters as male (68.4%) and 12 commenters as female (31.6%). 72% of all the comments were written by men compared to 28% written by women. We have not anonymized the corpus.
We wanted to know how members of the DHPoco thread talk about what we are discussing. Michelle and Heather looked at how participants on the thread wrote about postcolonialism and digital humanities using methods from corpus-driven critical discourse analysis (CDA). CDA, which Van Dijk defines as the study of “the way social power abuse, dominance, and inequality are enacted, reproduced, and resisted by text and talk in the social and political context,” offers insights into how we talk about what we discuss.
Because the male comments were a much larger set than the female comments, Heather used the male comments as a reference corpus and the female comments as an analysis corpus to ask “how do the women’s comments compare to the men’s comments on the DHPoco thread?” Heather did this by calculating keyness in the corpus. This is done by compiling one word list for the analysis corpus and another for the reference corpus, and comparing the two using a log-likelihood calculation. Words that are comparatively more frequent in the analysis corpus than in the reference corpus are deemed “key”.
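For readers who want to see the arithmetic behind keyness, here is a minimal sketch in Python of the standard two-corpus log-likelihood statistic. It is not AntConc's own code: the tokenizer, the function names, and the convention of giving a negative sign to words that are relatively rarer in the analysis corpus are all our assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out word tokens (a simplifying assumption;
    accented characters and digits are ignored)."""
    return re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())


def log_likelihood(a, b, c, d):
    """Log-likelihood for a word seen a times in a corpus of c tokens
    and b times in a corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected count in the first corpus
    e2 = d * (a + b) / (c + d)  # expected count in the second corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keyness(analysis_text, reference_text):
    """Rank words from most positively key to most negatively key.
    Both texts are assumed to be non-empty."""
    analysis = Counter(tokenize(analysis_text))
    reference = Counter(tokenize(reference_text))
    c = sum(analysis.values())
    d = sum(reference.values())
    scores = {}
    for word in set(analysis) | set(reference):
        a, b = analysis[word], reference[word]
        ll = log_likelihood(a, b, c, d)
        # Assumed convention: negative score when the word is relatively
        # less frequent in the analysis corpus than in the reference corpus.
        if a / c < b / d:
            ll = -ll
        scores[word] = ll
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling keyness(women_text, men_text), where both arguments are hypothetical strings holding each group's concatenated comments, would list positively key words first and negatively key words last.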
Let’s look then at who was talking about what. Here are the most-key terms in the female-authored comments when compared to the male-authored comments:
The most key terms include – among others – “whiteness”, “color”, “white”, “engaged”, “invisible”, “queer”, “woman”, and “intersecting”. This suggests that in the DHPoco thread, women are talking about those things more than men are. For example, women were more likely to write about race (lexical variants of race included race, racial, racializing, racing, racism, racist), as well as issues of gender, sexuality, and disability.
Men were more likely to write about colonialism (which includes colonial, colonialist) as well as variants of imperial and global. Men tended to talk about postcolonialism as theory, object of study, critique, or category, whereas women’s commentary was more about the constituent parts of postcolonialism and the limits of acceptance or inclusion of these terms within the “digital humanities”.
These two examples are intriguing for what they suggest about relationships to power: the men’s comments were also more likely to use “colonial”, “imperial”, and “global”. Women wrote more about the differential consequences of access to power in terms of identities, through a discussion of race, gender, sexuality, and disability, as evidenced by the KWIC view above. Postcolonialism appears to be a more abstracted discourse in the men’s comments when compared with the ways women related postcolonialism to enacted identities – as noted above in the positive keyword analysis for women’s comments.
Conducting a detailed discourse analysis on a corpus this small is difficult, but some other interesting gendered patterns emerged. For example, “poco” occurs only a small number of times (13 times in total in the corpus) and is starkly gendered in its use (5 men, 1 woman), as seen below:
Men use “poco” in different contexts, some of which seem to connote negativity or attempts to define it. Our one female commenter uses the term in five instances that were in general more reflective of the discourse on postcolonialism, linking postcolonialism to other discourse markers such as “digital”, “humanities”, “feminism”, “minority”.
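A keyword-in-context (KWIC) view like the ones we describe can be approximated in a few lines of Python. This is only a sketch of the general idea, not AntConc's implementation: the function name kwic, the tokenizer, and the choice to treat a trailing * as a prefix wildcard are our assumptions.

```python
import re


def kwic(text, pattern, width=5):
    """Return (left context, keyword, right context) for each matching token.
    A trailing * is treated as a prefix wildcard, e.g. "technolog*"."""
    tokens = re.findall(r"[\w'’-]+", text)
    target = pattern.lower()

    def matches(token):
        token = token.lower()
        if target.endswith("*"):
            return token.startswith(target[:-1])
        return token == target

    lines = []
    for i, token in enumerate(tokens):
        if matches(token):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines
```

Running kwic over each author's file separately, rather than over the whole thread, keeps the concordance lines attributable to the male or female sub-corpus.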
Heather also ran negative keywords – that is, keywords which were less likely to appear in the analysis corpus compared to the reference corpus (negative keyness is marked in blue):
What you see above are not the MOST negatively-key words. Those were largely function words (the, of, i, it, but, on, from, at, by, my, you, which, can, all, was, most, see, because, etc.), which are often interesting in combination with other words, but not particularly so on their own here. We instead highlight some of these other negatively-key words – including “discourse”, “technologies”, “technology”, “institutional”, “research”, “institutions” and “culture” – which are less likely to appear in the women’s comments compared to the men’s comments.
So let’s zoom in on three of these negatively-key words: research, institutional*, and technolog*. The keyword analysis suggests that these are less likely to appear in the women’s comments, and indeed we see that these words are being used in remarkably different contexts between the male and female commenters:
Research, when used by male commenters, is more about the act of doing research (“my research”, “our research”, “for research”), whereas the female commenter above is discussing an institutional issue at hand.
Institution* is interesting because the sheer quantity (52 instances) of male comments using the term nearly drowns out the two instances in women’s comments:
Overall, it seems that commenters are interested in the ways the institution is engaged with digital projects – but one female commenter mentions the issue of receiving institutional support, whereas the other comments seem to be largely more concerned with the ways an institution deals with digital things (is markup a form of institution? what structures are in place at different universities, research centers, etc.? and in what ways have we been institutionalized?). Immediate collocates for institut* include “comparative”, “different”, “history”, “structures”, and “support”, suggesting that there’s a lot of discussion about what’s going on in different places and that no consensus has been reached yet.
What about technolog*?
There’s a recurring collocation of “about technology”, “technologies of”, and “and technologies”/“technologies and”, so it also appears that we’re not so sure about which technologies are the issues at hand – the commenters are largely trying to define variant forms of technology for their arguments here.
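Immediate collocates of the kind we report for institut* and technolog* can be counted with a short Python sketch. This version simply tallies the word directly to the left and right of each match; it does not apply any of the association statistics a tool like AntConc can use for ranking, and the function name and tokenizer are our assumptions.

```python
import re
from collections import Counter


def immediate_collocates(text, prefix):
    """Count the words directly before and after each token that starts
    with the given prefix (e.g. "technolog" for technolog*)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    left, right = Counter(), Counter()
    for i, token in enumerate(tokens):
        if token.startswith(prefix):
            if i > 0:
                left[tokens[i - 1]] += 1
            if i + 1 < len(tokens):
                right[tokens[i + 1]] += 1
    return left, right
```

Calling left.most_common(5) and right.most_common(5) on the two counters gives the most frequent neighbours on each side; with a corpus this small, raw counts are arguably easier to interpret than a ranking statistic.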
The sheer quantity of male commentators pretty much completely drowns out the female commentators here, suggesting that some of these negative keywords may be a little misleading: female commenters on the thread are discussing these topics, but the sample size is insufficient to see if they’re discussing them all that differently. Overall, “digital humanities” discussion predominated over that of “postcolonialism” when measured by usage of those terms or variants of them. However, defining postcolonialism moves us into a complex debate, one that is in many ways equal to, and older than, the debate around the question of what constitutes digital humanities. The interplay between these two streams seems to have comprised the majority of our discussion thus far.
Michelle also analyzed the example of “thank*”, used by just over half the commenters, making it one of the more frequent words with a very positive connotation. The collocates for “thank*” include “for”, “you”, and “to”, as commenters in the thread expressed appreciation for other commenters’ posts, as shown below.
Commenters are using “thanks” here as a way of indicating a good faith effort to participate productively. Only a few took the “academic speak” route of “thanks, but…”, which suggests that politeness formulae (cf. Brown and Levinson 1987) are very much enacted in this thread. It is interesting that the uses of “thank*” are gendered: 58% of women use it, compared to 35% of men. This is in accordance with prior work from feminist online communication theory and linguistic politeness theory.
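Figures such as the share of commenters in each group who use “thank*” can be computed from the per-author files with a sketch like the following. The dictionary layout, the author keys, and the group labels are placeholders we made up for illustration, not our actual data.

```python
import re


def share_using(comments_by_author, groups, prefix):
    """Return, for each group, the fraction of its authors who use at least
    one token starting with the prefix.

    comments_by_author: {author: all text written by that author}
    groups: {author: group label}
    """
    users, totals = {}, {}
    for author, text in comments_by_author.items():
        group = groups[author]
        totals[group] = totals.get(group, 0) + 1
        tokens = re.findall(r"[a-z]+", text.lower())
        if any(token.startswith(prefix) for token in tokens):
            users[group] = users.get(group, 0) + 1
    return {group: users.get(group, 0) / totals[group] for group in totals}
```

Note that this counts commenters, not occurrences: an author who writes “thanks” ten times counts once, which is what a statement about the percentage of women or men using the term requires.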
The largest strand of thanks went, deservedly, to Adeline and Roopika for starting the open thread, while a smaller number referred to the remarks of an individual commenter (“thank you X”, or “thanks for that” in a reply to a comment, or a general expression of thanks to “everyone”). Overall, thanks on the thread were often directed at individuals: “thanks” occurs with a specific commenter’s name 57% of the time. Attentiveness to this sort of politeness, as well as to the interaction and good faith it represents, is critical if we wish to encourage more people to participate in these online conversations in the future.
We would like to suggest that a number of the patterns which emerge in a keyword-in-context analysis and keyness analysis require a close reading of context and usage to address the issues more fully. In our brief analysis here we found that men and women were using the open-thread space in noticeably divergent ways. We suggest that male commenters are talking more about the process of doing digital humanities, whereas female commenters are talking more about the actual topic at hand (“Are digital humanities a refuge from racism/classism/sexism/ability?”). It appears, based on our analysis here, that women are being more critical of the structural and institutional issues at hand than the male commenters are.
In order to TransformDH we need to attend closely to how we participate in these kinds of online discussions. An analysis based on multiple self-attributed identity factors of participants would be far more nuanced, but we offer this analysis as a starting point that could be done quickly, without making assumptions or taking on the potentially impossible-to-complete task of asking participants to self-identify for our purposes. We hope readers will be inspired to construct additional multifaceted readings of the thread using other tools, as we’d be very curious to see what other patterns emerge from such an important discussion space.
 This was done by intuition from names provided: we apologize to anyone we have imposed a (potentially false or inaccurate) gendered identity upon.
 We aim not to attach any names to any comments in our analysis, but rather to show some broad brush strokes of patterns in the corpus.
Using definitions of postcolonialism and digital humanities from the website combined with terms for the framing questions of the open thread, Michelle represented the postcolonial stream and the digital humanities stream with defining words. Digital humanities terms still appeared more frequently. “Postcolonialism” occurred 19 times, or 45 times with variants (postcolonial, postcolonialism, postcoloniality, post-colonial, poco, post-colonialist, post-coloniality); “digital humanities” appeared 37 times and “DH” 176 times, for a total of 213. Total occurrences of postcolonial terms: n=274; total occurrences of digital humanities terms: n=702. Spreadsheet of data here.
A useful comparison might be between the open thread and the Twitter commentary about it: see the #dhpoco tweets Michelle storified.