I am a computational social scientist. My substantive research focuses on comparative political communication. My methodological research focuses on text-as-data, machine learning, Bayesian statistics, and social network analysis. Combining the two lines of interest, my current work theorizes the dynamics of citizen-citizen interactions in social media political talk in authoritarian China and tests the theory with original social media text and network data. In addition, I am part of an interdisciplinary effort to understand political polarization in democracies using observational and experimental studies on social media.
Ph.D. in Political Science, September 2019
MS in Statistical Science, September 2019
Bachelor of Social Sciences, May 2013
The University of Hong Kong
Lies to Friends, Truths to Strangers: Anonymity, Preference Falsification, and Opinion Polarization in Authoritarian China
I argue that with new communication technologies, citizens in authoritarian regimes perform a new type of preference falsification in political talk: lying to friends and telling the truth to strangers. Extending the classic preference falsification theory to political communication in the social media era, I theorize that the behavior is prevalent because new communication technologies change the structure of social punishment for expressing different political opinions by connecting people to a large audience of “strangers.” I posit that the behavior favors dictators because it reveals information with a low risk of anti-regime collective action. I test the theory with an original dataset collected from a popular Chinese social networking site that offers users a distinct option to join political discussions anonymously. Using a combination of text-as-data methods, I find that content posted anonymously (i.e., expression to “strangers”) is more likely to be politically sensitive but less popular among its audience.
Embedding Concepts and Documents in One Space: A System for Valid and Replicable Text-as-data Measurement
I develop a system for generating valid and replicable measures from text data by integrating distributed semantics with unsupervised and semi-supervised clustering models and researchers’ prior information. The system has two design features: informative representation and stepwise guidance. With informative representation, it learns distributed semantics to represent words, concepts, and documents in a low-dimensional space. With stepwise guidance, it integrates distributed semantics with clustering algorithms, seed dictionaries, and researchers’ manual selection into a workflow that reliably links concepts of substantive interest with messy text data and produces codebooks for convenient replication and extension.
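As a rough illustration of what placing concepts and documents in one space can look like (a toy sketch, not the system itself), the Python snippet below represents each concept as the average vector of its seed words and each document as the average vector of its words, then links a document to its nearest concept by cosine similarity. The word vectors, the seed dictionaries, and the function names are all hypothetical; in practice the vectors would be learned from a corpus, and clustering and manual selection would refine the seeds.

```python
import math

# Hypothetical 3-dimensional word vectors, made up for illustration.
# Real vectors would be learned from a corpus.
WORD_VECTORS = {
    "protest": [0.9, 0.1, 0.0],
    "censorship": [0.8, 0.2, 0.1],
    "government": [0.7, 0.3, 0.0],
    "movie": [0.0, 0.9, 0.1],
    "music": [0.1, 0.8, 0.2],
    "concert": [0.0, 0.9, 0.2],
}

# Hypothetical seed dictionaries: a few words a researcher ties to each concept.
SEED_DICTIONARIES = {
    "politics": ["protest", "censorship"],
    "entertainment": ["movie", "music"],
}


def mean_vector(words):
    """Average the vectors of the known words; return None if none are known."""
    vectors = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_concept(document):
    """Return the concept whose seed-word centroid is closest to the document."""
    doc_vec = mean_vector(document.lower().split())
    if doc_vec is None:
        return None
    concept_vecs = {
        name: mean_vector(seeds) for name, seeds in SEED_DICTIONARIES.items()
    }
    return max(concept_vecs, key=lambda name: cosine(doc_vec, concept_vecs[name]))


print(nearest_concept("government censorship"))
print(nearest_concept("concert music"))
```

Because words, concepts, and documents share one vector space, a document containing no seed word at all (here, "government" or "concert") can still be linked to a concept through its neighbors.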
Bayesian Dynamic Network Modeling for Social Media Political Talk
I develop a Bayesian method for real-time monitoring of dynamic network data in social media streams. Building on recent developments in dynamic modeling, the method combines flexible count mixture models with gravity models using a decoupling-recoupling strategy. This extension tackles over-dispersion problems caused by eruptive and sporadic social media traffic. I apply it to network data from a Chinese social networking site to examine how political attention shifts and is manipulated in authoritarian regimes. With online learning capabilities, the model captures behavioral responses to political and non-political events among Chinese social media users and detects signs of government censorship and fabrication in real time.
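To give a sense of what online Bayesian learning on bursty count data involves, here is a deliberately simplified sketch: a discounted gamma-Poisson update for a single stream of counts. This is a stand-in technique chosen for brevity, not the count mixture and gravity model described above. The function name, the prior values, and the discount factor are assumptions made for illustration.

```python
def discounted_gamma_poisson(counts, a0=1.0, b0=1.0, delta=0.9):
    """Sequentially update a Gamma(a, b) belief about a Poisson rate.

    Before each observation the shape and rate are multiplied by the
    discount factor `delta`, which keeps the mean but inflates the
    variance, so the belief can adapt to sudden bursts. The one-step
    forecast is negative binomial, whose variance exceeds its mean.
    """
    a, b = a0, b0
    forecasts = []
    for y in counts:
        # Evolve: discount past information.
        a, b = delta * a, delta * b
        # One-step-ahead forecast (negative binomial mean and variance).
        mean = a / b
        variance = mean * (1.0 + 1.0 / b)
        forecasts.append((mean, variance))
        # Update: conjugate gamma-Poisson step with the new count.
        a, b = a + y, b + 1.0
    return forecasts, (a, b)


# A hypothetical count stream with one burst (the 15).
forecasts, posterior = discounted_gamma_poisson([2, 0, 15, 1])
for step, (mean, variance) in enumerate(forecasts, start=1):
    print(f"step {step}: forecast mean={mean:.2f}, variance={variance:.2f}")
```

The forecast variance is always larger than the forecast mean, which is the sense in which a model of this kind accommodates over-dispersed traffic, and the forecast mean jumps after the burst because older observations are discounted.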
Bail, Christopher A., Lisa P. Argyle, Taylor W. Brown, John P. Bumpus, Haohan Chen, M.B. Fallin Hunzaker, Jaemin Lee, Marcus Mann, Friedolin Merhout, and Alexander Volfovsky. 2018. “Exposure to Opposing Views on Social Media Can Increase Political Polarization.” Proceedings of the National Academy of Sciences, vol. 115, no. 37: 9216–9221. Link. PDF.
Media: NYTimes, Washington Post, LATimes, BBC
Winner of APSA Political Communication Section Paul Lazarsfeld Best Paper Award, September 2019.
Aldrich, John H., Haohan Chen, Victoria Dounoucos, Joshua Lerner, Pedro Magahaes, and Greg Schober. “Institutional Influences on Behavior and Selection Effects.” R&R.
Chen, Haohan, and Herbert Kitschelt. Under review. “Political Linkage Strategies and Social Investment Policies: Clientelism and Educational Policy in the Developing World.” Written for The World Politics of Social Investment, edited by Bruno Palier and Silja Haeusermann. Oxford: Oxford University Press. Working paper
Chen, Haohan. “Why the Poor Tolerate Inequality in Developing Democracies: Weak States and Clientelism.” Awarded Annual Best Prelim Exam Paper in Political Science, Duke University, 2017. Working paper
How to Teach Computational Methods to Political Scientists?
An Envisioned Curriculum of Computational Social Science for Political Scientists
Experiences at Duke Political Science
Lab Instructor and TA: Probability and Regression,* Fall 2018. Handouts and Code
Lab Instructor and TA: Advanced Regression,* Spring 2018. Handouts and Code
Instructor: Methods Bootcamp for first-year graduate students, Summer 2014, Summer 2015. LaTeX Tutorial
TA: Business, Politics, and Economic Growth, Spring 2017. Syllabus
* Graduate methods course
Experiences at Duke Statistical Science
TA: Probability (undergraduate course for statistics majors), Fall 2017.
Experiences in Computational Social Science
TA: Summer Institute in Computational Social Science, Summer 2018. Program Website
Mind Maps of Political Economy
Below are mind maps summarizing a selection of important topics in political economy. I drew them while reviewing the literature for my qualifying exam in 2015.