Hands-on Methods for Text-as-Data Research

Artur Baranov, PhD Candidate
Department of Political Science, Northwestern University

Instructions

  1. Open the URL (or scan the QR code)

  2. Prepare RStudio

January 20th, 2026

Motivation (i)

How do government and opposition parties differ in their sentiment in statements on the past, present, and future (Mueller, 2022)?

In the Meantime:

Open URL & Prepare RStudio

artur-baranov.github.io/text-as-data

Motivation (ii)

How do autocracies adapt their political messaging during periods of leadership succession (Boussalis et al., 2023)?

Motivation (iii)

Under what conditions do elites “overpraise” the ruler and imitate their rhetoric (Baturo et al., 2025)?

Text Analysis Assumptions

  • Texts represent an observable implication of some underlying characteristic of interest

    • An attribute of the author
    • A sentiment or emotion
    • Salience of a political issue
  • Texts can be represented through extracting their features

    • Most common is the bag of words assumption
    • Many other possible definitions of “features” (e.g. word embeddings)
  • A document-feature matrix can be analyzed using quantitative methods to produce meaningful and valid estimates of the underlying characteristic of interest

(Benoit, 2018)
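
To make the bag-of-words assumption and the document-feature matrix concrete, here is a minimal sketch in plain Python. It is an illustration only, not the workshop's own code (the hands-on part uses R in RStudio); the two-document corpus and the names `docs`, `tokenize`, `counts`, `features`, and `dfm` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy corpus, made up for illustration only
docs = {
    "gov": "The economy is strong and the future is bright",
    "opp": "The economy is weak and the past was better",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Bag of words: keep token counts per document, discard word order
counts = {name: Counter(tokenize(text)) for name, text in docs.items()}

# Features are the sorted vocabulary across all documents
features = sorted(set().union(*counts.values()))

# Document-feature matrix: one row per document, one column per feature
dfm = [[counts[name][f] for f in features] for name in docs]

for name, row in zip(docs, dfm):
    print(name, dict(zip(features, row)))
```

Each row is a document and each column a feature, so the matrix can be passed to standard quantitative methods. Word order is lost by design: under this assumption, "the past was better" and "better was the past" produce the same row.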

Text Analysis Principles

  1. All quantitative models for textual analysis are wrong – but some are useful
  2. Quantitative methods for text amplify resources
  3. There is no globally best method for automated text analysis
  4. Validate, validate, validate (!!!)

(Grimmer and Stewart, 2013)
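
One common way to act on the fourth principle is to compare a method's output with human-coded labels for a sample of documents. The sketch below assumes a binary sentiment task; the labels and predictions are invented for illustration, and accuracy, precision, and recall are only one possible set of validation measures.

```python
# Hypothetical data: 1 = positive, 0 = negative (invented for illustration)
hand_coded = [1, 0, 1, 1, 0, 0, 1, 0]  # labels assigned by a human coder
predicted = [1, 0, 0, 1, 0, 1, 1, 0]   # labels produced by the automated method

pairs = list(zip(hand_coded, predicted))

# Count the agreement and disagreement types for the positive class
tp = sum(1 for h, p in pairs if h == 1 and p == 1)  # true positives
fp = sum(1 for h, p in pairs if h == 0 and p == 1)  # false positives
fn = sum(1 for h, p in pairs if h == 1 and p == 0)  # false negatives

accuracy = sum(1 for h, p in pairs if h == p) / len(pairs)
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of true positives that were found

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

If agreement with the human coding is low, that is a reason to revisit the features or the method before interpreting any substantive results.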

This Workshop

  • is applied

    • follow and execute the code
    • do the exercises
    • ask questions
  • focuses on “traditional” QTA methods

    • we will cover AI only briefly
    • covering it properly would require another class
  • Time to turn to practice
