IRiSS 2013 Workshop: Shell, Text1, and Text2

Materials for the Text units 1 and 2 for the IRiSS CSS workshop 2013

This project is maintained by rjweiss

IRiSS CSS Workshop

In 2013 I taught a few parts of a computational social science workshop at Stanford hosted by the Institute for Research in the Social Sciences. All of the in-class demonstration code is now available for download.

You can check out the content either from the shell, if you have git installed, or with the GitHub client. If you've never used git or GitHub before, here are some instructions.

You can also download the .zip or .tar.gz. To work with the code directly, you will need software that can open IPython Notebooks.

I tried to make the code easy enough for complete Python novices to get up and running. Think of these lessons as a jumping-off point and a reference for future text analysis.

Shell

Coming soon!

Text 1 and 2

Text 1 covered encoding, file formats, and regex. Text 2 covered text cleaning, TF-IDF, and simple classification. If you don't want to download the data, you can check out the materials by following these links to the notebooks as static webpages:

  1. Encoding
  2. File formats
  3. Regex
  4. Text cleaning
  5. TF-IDF
  6. Simple text classification