From preprocessing to text analysis: 80 tools for mining unstructured data
Text mining techniques have become critical for social scientists working with large-scale social data, be it Twitter collections to track polarization, party documents to understand opinions and ideology, or news corpora to study the spread of misinformation. In the infographic shown in this blog, we identify more than 80 different apps, software packages, and libraries for R, Python and MATLAB that are used by social science researchers at different stages in their text analysis projects. We focused almost entirely on statistical, quantitative and computational analysis of text, although some of these tools could be used to explore texts for qualitative purposes.
Announcing the Upworthy Research Archive
2014 was the year that the digital media company Upworthy “broke the internet” in the words of cofounder Peter Koechley. By publishing positive, progressive news stories and optimizing them with A/B testing, Upworthy came to dominate online attention.
No more tradeoffs: The era of big data content analysis has come
For centuries, being a scientist has meant learning to live with limited data. People only share so much on a survey form. Experiments don’t account for all the conditions of real world situations. Field research and interviews can only be generalized so far. Network analyses don’t tell us everything we want to know about the ties among people. And text/content/document analysis methods allow us to dive deep into a small set of documents, or they give us a shallow understanding of a larger archive. Never both. So far, the truly great scientists have had to apply many of these approaches to help us better see the world through their kaleidoscope of imperfect lenses.
2018 Concept Grant winners: An interview with MiniVan
Following the launch of the SAGE Ocean initiative in February 2018, the inaugural winners of the SAGE Concept Grant program were announced in March of the same year. As we build up to this year’s winner announcement we’ve caught up with the three winners from 2018 to see what they’ve been up to and how the seed funding has helped in the development of their tools.
In this post we chat to MiniVan, a project of the Public Data Lab.
2018 Concept Grant winners: An interview with Ken Benoit from Quanteda
We catch up with Ken Benoit, who developed Quanteda, a large R package originally designed for the quantitative analysis of textual data, from which the name is derived. In 2018, Quanteda received $35,000 of seed funding as an inaugural winner of the SAGE Concept Grants program. We find out what challenges Ken faced and how the funding helped in the development of the package.
Qualitative Data Analysis with ATLAS.ti: Author Interview
Susanne Friese discusses the new edition of Qualitative Data Analysis with ATLAS.ti.
Qualitative Data Analysis with NVivo: Author Interview
NVivo and ATLAS.ti are popular data analysis software options for qualitative research. Two important books that guide researchers who want to use these CAQDAS packages are newly updated. In this post we hear from Kristi Jackson, co-author with Pat Bazeley of Qualitative Data Analysis with NVivo. In the next post we will hear from Susanne Friese, author of Qualitative Data Analysis with ATLAS.ti.
Roundup: #text2data - new ways of reading
‘From text to data - new ways of reading’ was a two-day event organised by the National Library of Sweden, the National Archives and Swe-Clarin. The conference brought together librarians, digital collection curators, and scholars in digital humanities and computational social science to talk about the tools and challenges involved in large-scale text collection and analysis.
Training social scientists for the future
Calling all social scientists. How were you trained? How are you keeping up (or not) with new developments in this rapidly changing digital world? How are you training your students?
This was the subject of an event sponsored by SAGE Ocean as part of the ESRC’s 2018 Festival of Social Science. In case you are not aware, SAGE, who have been at the forefront of publishing qualitative work, have now launched SAGE Ocean, an initiative “to help social scientists to navigate vast datasets and work with new technologies”.
How to Construct a Stem-and-Leaf Display
“Statistical Analysis”, by Jerome Frieman, is an intermediate/advanced statistics text that uses real research on antisocial behaviors, such as cyber-bullying, stereotyping, prejudice, and discrimination, to help readers across the social and behavioral sciences understand the underlying theory behind statistical methods. By presenting examples and principles of statistics within the context of these timely issues, the text shows how the results of analyses can be used to answer research questions.
Tomorrow’s news today
Throughout history, humanity has had the urge to predict the future. The Greeks consulted the Delphi Oracle, the Romans inspected sheep entrails, and modern-day sages poke around tea leaves to get the skinny on the future. This desire to predict the future has found its way into finance, where modern-day haruspices pop up on television to make confident boasts about the future direction of the share du jour. All but the most fortunate of these modern-day prophets fail at their impossible task.
Automated text analysis: Who is the threatening minority?
News media serves as a window into the society its readership represents. A newspaper’s description of a social group both demonstrates and constructs perceptions of that group within its audience. Understanding long-term trends or spatial differences in the representation of minority groups in news media can contribute to ongoing theoretical debates about the role and perception of minority groups in society.
Social network analysis of the 2017 "Summer of Hate"
Fifty years after the "Summer of Love" transformed American youth culture, Andrew Anglin, the proprietor of the neo-Nazi website The Daily Stormer, announced to his followers that the summer of 2017 would be "The Summer of Hate."
Digital DNA: How to map our online behavior
Nowadays, issues related to the diffusion of fake news, rumours and hoaxes, as well as the diffusion of malware and viruses in online social networks, have become so important as to transcend the virtual ecosystem and interfere with our businesses and societies. Currently, we are unable to deal with these issues effectively.