TADA 2022
Conference: October 6-7, 2022 at Cornell Tech, Roosevelt Island, New York City
All events will take place in the Verizon Executive Education Center. From the Tram and F train stops, walk south along the river until you get to the Cornell Tech campus. The VEEC is on your left after the Graduate hotel.
Online registration is available for $25 (to cover AV support staff) until Wednesday, Oct 5.
For questions write to info@tada2022.org
The New Directions in Analyzing Text as Data (TADA) meeting is a leading forum for research on the study of politics, society, and culture through computational analysis of documents. Recent advances in NLP have the potential to revolutionize how we study human society. But using these tools effectively, reliably, and equitably requires continuous dialog between experts across computational methods, social science, and the humanities.
TADA 2022 invites applications for research presentations on new work related to text-as-data methods and applications. TADA is an interdisciplinary conference, drawing scholars from across the social sciences, computer and information science, and related fields. Our programs from past meetings (TADA 2018, TADA 2019, and TADA 2021) show the wide range of work presented at our conference.
Schedule
 | Thursday Oct 6 |
---|---|
8:00 | Breakfast |
9:00 | Opening remarks |
9:15 | Contributed talks 1 |
10:30 | Break |
11:00 | Keynote Speaker, Julia Silge |
12:00 | Lunch (provided) |
1:00 | Contributed talks 2 |
2:15 | Poster session A |
3:15 | Break |
3:30 | Contributed talks 3 |
 | Dinner on your own |
 | Friday Oct 7 |
---|---|
8:00 | Breakfast |
8:55 | Remarks |
9:00 | Cassandra Project @ TADA Roundtable |
10:00 | Poster session B |
11:00 | Contributed talks 4 |
 | Lunch (provided) |
Keynote Speaker: Julia Silge
Julia Silge is a data scientist and software engineer at RStudio PBC where she works on open source modeling tools. She is an author, an international keynote speaker, and a real-world practitioner focusing on text mining, data analysis, and machine learning. Julia loves making beautiful charts and communicating about technical topics with diverse audiences.
Key Dates
- Monday July 18, abstract submission
- Monday Aug 15, notification of selection
- Friday Sep 2, registration opens for participation
- Thursday Sep 22, full papers for discussants
- Thursday Oct 6 – Friday Oct 7, conference
This year’s conference will be held at Cornell Tech on Roosevelt Island and is sponsored by the Cornell Center for Social Science, the Cornell Center for Data Science for Enterprise and Society, and the National Science Foundation. Events will take place at the Verizon Executive Education Center on the Cornell Tech campus.
Accommodations
The Graduate Hotel Roosevelt Island is immediately adjacent to the conference location. Roosevelt Island is also accessible on the F train from Manhattan and Queens. Locations around Bryant Park are particularly convenient. Note that subway service may be limited after 9pm.
Proposals are due July 18, and consist of a brief, 300-word abstract in text format rather than a full paper. TADA 2022 is a non-archival conference; there are no formal proceedings, and papers presented at the conference will not be distributed publicly by the conference. Presenters are expected to provide a paper to their discussant two weeks before the conference. We welcome any work, so long as it hasn’t been previously presented at a TADA conference. We also welcome individuals to volunteer to serve as discussants.
In addition to oral presentations and posters, TADA 2022 will have a doctoral consortium. PhD students will be matched with experienced mentors from complementary fields, who will offer critiques of specific work and guidance on how to do effective interdisciplinary work.
Diversity leads to stronger science. We actively seek, welcome, and encourage people with diverse backgrounds, experiences, and identities to apply and attend. While many participants have attended TADA for years, we also eagerly welcome new researchers!
Talks
Talks should be 12-15 minutes, leaving time for discussant remarks and audience questions.
Title | Author |
---|---|
Where Did It Come From? Deep Learning for Event Extraction in Art Provenance | Fabio Mariani |
Immigration and Social Distance: Evidence from Newspapers during the Age of Mass Migration | Elliott Ash, Gloria Gennaro, Dominik Hangartner, Alessandra Stampi-Bombelli |
Do Journalists Overstate Science? Findings from Computational Modeling of Scientific (Un)certainty | Jiaxin Pei and David Jurgens |
 | Discussant: Stephen Downie |
Title | Author |
---|---|
How does rising inflation affect EV charging cost and consumer sentiment? | Sarthak Chaturvedi, Omar Isaac Asensio; Georgia Institute of Technology |
Conceptualization of ESG in corporate discourse: a computational text analytic approach | Ilya Akdemir |
A Graph-Augmented Generative Entity-to-Entity Stance Detection Framework | Xinliang Frederick Zhang, Nick Beauchamp, Lu Wang |
 | Discussant: Ken Benoit |
Title | Author |
---|---|
Challenges in Opinion Manipulation Detection: An Examination of Wartime Russian Media | Chan Young Park, Julia Mendelsohn, Anjalie Field, Yulia Tsvetkov |
Strengthening Propaganda and the Limits of Media Commercialization in China: Evidence from Millions of Newspaper Articles | Margaret Roberts, Brandon Stewart, Hannah Waight, and Yin Yuan |
Was It Political? Interpretations of the 1967 Detroit Rebellion by Detroit Residents Fifty Years Later | Tina Law |
 | Discussant: Sarah Dreier |
Title | Author |
---|---|
Towards measuring populism from text | Ines Rehbein, Christopher Klamm, Simone Ponzetto |
Sounding the Bullhorn: Surfacing and Analyzing Dogwhistles with Language Models | Julia Mendelsohn, Maarten Sap, Ronan Le Bras |
News Media Consolidation and Ideological Positioning | Pierre Bodéré, Nicolas Longuet Marx, Marguerite Obolensky |
 | Discussant: Laure Thompson |
Poster sessions
Posters may be in any shape up to A0 size.
Session A (Thursday)
Title | Authors |
---|---|
Computational Text Analysis of Binding Language in Administrative Guidance | Amit Haim |
A Versatile Data Annotation System | Yikai Liu, Mingye Chen, Naihao Deng, Yulong Chen |
“Tell China’s Story Well” on YouTube: How do pro-Beijing influencers (re)shape China’s global narratives | Ryan Wang |
OCR Correction of Historical Texts with Pre-Trained Language Models | Chris Buckley, Melissa M. Lee, Brandon M. Stewart |
Dictionary Enrichment with Word Embedding: Tracking Online Incivility in Hong Kong | Hai Liang, Yee Man Margaret Ng, & Nathan L.T. Tsang |
Quantifying the Causal Effect of Gender on Interruptions in Supreme Court Oral Arguments | Katherine Keith, Ankita Gupta, Erica Cai, Brendan O’Connor, Douglas Rice |
Medical Misinformation during a Pandemic: Text as Data during the Russian Influenza (1889-1890) | E. Thomas Ewing |
Scaling latent political positions from textual data using word embeddings | Patrick Schwabl |
How to stop ignoring automated classification errors: Differential measurement error and inter-coder reliability in measurement error models | Nathan TeBlunthuis, Valerie Hase, Chung-hong Chan |
How Questions Can Propagate Online Mis- and Dis-information | Kaitlyn Zhou and Dan Jurafsky |
Construction and Analysis of a Map-Based Data Corpus for Tracking Linguistic Variation and Demographic Characteristic Identification | Theodore Daniel Manning, Eugenia Lukin, Ross Klein, James Cooper Roberts, Eliana Mugar, Michael Fang, Harleigh Niyu, Alejandro Napolitano-Jawerbaum, Patrick Juola |
The impact of social media reaction design on political discourse: A quasi-experimental analysis of 155 million comments on Reddit | Orestis Papakyriakopoulos, Severin Engelmann, Amy Winecoff |
Removing the Heavy Burden of Corruption: Media, Movements, and Politics in the Grand Corruption Reform in South Korea, 2016-2017 | Hyunsik Chun, Ion Bogdan Vasi, Chanhum Yoon |
Measures and Interventions for improving workplace feedback | Michael Yeomans & Ariella Kristal |
Synthetic text for supervised text analysis | Andrew Halterman |
Causal Attributions in Textual Data | Paulina García Corral |
Seeing Like a Topic Model | Bolun Zhang, Yimang Zhou, Dai Li |
“Get Out and Vote,” Or “You Can’t Complain”: Non-Voters on Twitter During the 2016 and 2020 U.S. Elections | Chelsea Butkowski, Sam Wilson, Eric Wiemer |
Dictionary-Assisted Supervised Contrastive Learning | Patrick Y. Wu, Richard Bonneau, Joshua A. Tucker, Jonathan Nagler |
The Rise of and Demand for Identitarian Media Coverage | Daniel Hopkins, Yphtach Lelkes, Samuel Wolken |
Cambridge Law Corpus: A corpus for research on legal AI | Andreas Östling; Holli Sargeant; Ludwig Bull; Alex Terenin; Leif Jonsson; Måns Magnusson; Felix Steffek |
Filtering Technologies and the Fairness of Natural Language Systems | Eddie Yang, Chad Atalla, Su Lin Blodgett, Kate Cook, Kristen Laird, Emily Lawton, Michael Madaio, Samir Passi, Forough Poursabzi, Vyoma Raman, Bella Rideau, Emily Sheng, Dan Vann, Andy Zhao, Solon Barocas, Hanna Wallach |
Session B (Friday)
Title | Authors |
---|---|
Multilingual Word Embeddings for Social Scientists: Estimation, Inference and Validation Resources for 157 Languages | Pedro L. Rodriguez, Arthur Spirling, Brandon M. Stewart, Elisa M. Wirsching |
Bridging Topic Modeling and Framing Theory | Arya D. McCarthy and Giovanna Maria Dora Dore |
Do Politicians Collaborate? Measuring Coordination in Political Discourse | Katherine Atwell, Michael Datz, Max Goplerud, Tessa Provins, Malihe Alikhani |
COVID-19 Public Opinion on Turkish Twittersphere | Burak Ozturan, Yunus Emre Tapan |
Aligning Large Natural Language Documents | Tanzir Pial, Steven Skiena |
Exploring conflicting values in the founding of ARPANET and the Internet | Meera Desai |
LegisBERT: A Language Model for the Analysis of Legislative Text | Mitchell Bosley |
What now? - a Twitter textual analysis of the abortion debates with the shifting policies in the US | Jialin Shan, Tuan-he Lee, Hanfei Li |
Narrative Detection Across Political Domains | Maria Antoniak, Elliott Ash |
Spoken Identity: Titular Language Usage and the War in Ukraine | Erin Walk |
Signaled or Suppressed? How Gender Informs Women’s Undergraduate Applications in Biology and Engineering | Sonia Giebel, AJ Alvero, Ben Gebre-Medhin, anthony lising antonio |
Affective Idiosyncratic Responses to Music | Sky CH-Wang, Evan Li, Oliver Li, Smaranda Muresan, Zhou Yu |
Inferring Age from Linguistic and Verbal Cues in Celebrity Interviews | Yunting Yin, Steven Skiena |
A Step-by-Step Protocol for Curation of Topic Models by Subject Matter Experts | Philip Resnik, Pranav Goel, Alexander Hoyle, Rupak Sarkar, Josh Hagedorn, Maeve Gearing, and Carol Bruce |
Who gets a say in this? Speaking security on social media | Natalia Umansky |
Learning from Machines: Differentiating US Presidential Campaigns with Attribution and Annotation | Musashi Jacobs-Harukawa |
Finding the story: Leveraging expert knowledge in computational sensemaking of multi-platform text data | Hope Schroeder, Tobin South |
What’s a Parent to do? Measuring the Cultural Logics of Parenting with Biterm Topic Models | Orestes P. Hastings, Luca Maria Pesando |
Decoding matrimonial advertisements: Individual preferences entrenched in socio cultural biases | Pranathi Iyer |
Gendered Information in Resumes and Hiring Bias: A Predictive Modeling Approach | Prasanna Parasurama, Joao Sedoc, Anindya Ghose |
Roundtable Discussion
The Cassandra Project at Johns Hopkins has organized its third roundtable in the Learning How to Play with the Machines series on computational social science at TADA.
Title | Authors |
---|---|
Misinformation and dataset biases | Panelists: Kathy McKeown, David Mimno, Sarah Shugars, Arthur Spirling |
 | Chairs: Giovanna Maria Dora Dore, Eva Klaus, Arya D. McCarthy |