These resources cover programming languages such as R and Python, introductions to computational text analysis, and natural language processing methods such as topic modeling and word embedding models.

Guides and Tutorials

Tools and Methods

  • AntConc – A concordancing and text analysis toolkit created by Laurence Anthony.
  • CasualConc – A Mac OS X-native toolkit (AntConc’s Mac version is ported from the PC version and has some bugs).
  • Lexos – A tool for scrubbing, chunking, and tokenizing text, in addition to performing modest analysis and visualizing clusters. See: How to Create Topic Clouds with Lexos – Blog post by Scott Kleinman on using Lexos for topic modeling word clouds.
  • Voyant Tools – A simple yet powerful web-based text analysis and visualization tool.
  • Word Tree – A tool that creates word trees from a block of text.
  • TEI Publisher – Practice and view demos of text encoding with this online tool.
  • Guided Tour – A comprehensive guide to topic modeling by Scott Weingart of Carnegie Mellon University.
  • Topic Modeling Made Just Simple Enough – An introduction to topic modeling written by Ted Underwood of the University of Illinois, Urbana-Champaign.
  • Topic Modeling Toolbox – An alternative to MALLET for LDA topic modeling, developed at Stanford University.
  • Pulling Out the Stops – An article by Alexandra Schofield, Måns Magnusson, and David Mimno questioning the utility of highly customized or comprehensive stop lists.

Journal of Digital Humanities Special Issue – A 2012 special issue of JDH on topic modeling in the humanities.

  • Topic Modeling: A Basic Introduction – Introductory article by Megan R. Brett from JDH’s special issue explaining the basic concepts of topic modeling.
  • Words Alone – Article on Latent Dirichlet Allocation’s (LDA’s) limitations by Ben Schmidt.

MALLET – Website for downloading and installing MALLET, an open-source, Java-based package for Latent Dirichlet Allocation (LDA) topic modeling.

  • Topic Modeling Tutorial – Tutorial by Shawn Graham, Scott Weingart, and Ian Milligan on setting up a command-line environment for using MALLET.

GUI Tools that use MALLET:

  • Google’s Topic Modeling Tool – A graphical user interface for doing topic modeling.
  • Serendip – A system for visualizing topic models by Eric Alexander and Joe Kohlmann of the University of Wisconsin-Madison.