These slides make up much of the content of the course. To print a set of slides, click on its link, add ‘?print-pdf’ to the end of the URL, and then use your browser’s Print function to print it or save it as a PDF.
This lecture introduces the structure of the class and shows the link between our underlying model of the world and the data that the world produces for us.
This lecture introduces the concept of the “data-generating process” and how we can use data to get at underlying truths.
Here we start working in R and get familiar with RStudio and some basic commands. These are the building blocks we’ll be continuing to use later!
R works by creating and manipulating objects. We cover the kinds of objects and the functions we can run them through here.
We start getting comfortable with data frames and tibbles, the main way of handling data in R. And we’ll need to be handling data!
In this lecture we start getting comfortable with manipulating data using the dplyr commands select(), filter(), and mutate().
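As a rough taste of what those three commands do, here is a minimal sketch. It assumes the dplyr package is installed, and it uses R's built-in `mtcars` data set and the `%>%` pipe purely for illustration; neither is necessarily what the lecture itself uses.

```r
library(dplyr)

# mtcars ships with R: one row per car model
cars_small <- mtcars %>%
  select(mpg, cyl, wt) %>%      # select(): keep only these three columns
  filter(cyl == 4) %>%          # filter(): keep only rows for 4-cylinder cars
  mutate(wt_lbs = wt * 1000)    # mutate(): add a column (wt is recorded in 1000s of lbs)

head(cars_small)
```

Each step takes a data frame and hands back a new one, which is why the commands chain together so easily.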
We don’t want to just look at raw data; we want to summarize it so we can make sense of it! Here we cover ways of describing the distributions of single variables.
You didn’t think you were going to get away without doing a little data visualization, did you?
Most of what we do in econometrics is about not just one variable, but how multiple variables interact. Let’s do some of that.
What does it mean to “explain one variable with another” and how do we do it? There are many ways, of course, but conceptually they all boil down to variations on some simple concepts we’ll be going over here.
If we want to understand whether our methods are actually capable of uncovering the data-generating process, we need a situation where we know what the answer is so we can check it. How about we just make up our own?
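Here is one way that idea might look in practice, as a minimal sketch in base R. Everything in it is made up for illustration: the true effect of 3, the sample size, and the helper name `simulate_once` are all assumptions, not anything from the lecture. We invent a world where `x` raises `y` by exactly 3, generate data from it many times, and check whether a simple difference in means lands near 3.

```r
set.seed(123)

true_effect <- 3  # we choose the "truth" ourselves, so we can check against it

# Hypothetical helper: create one fake data set and estimate the effect of x on y
simulate_once <- function(n = 500) {
  x <- rbinom(n, 1, 0.5)                   # x is randomly 0 or 1
  y <- 2 + true_effect * x + rnorm(n)      # the data-generating process we made up
  mean(y[x == 1]) - mean(y[x == 0])        # estimate: difference in mean y by x
}

# Repeat the whole exercise 1000 times and look at the estimates
estimates <- replicate(1000, simulate_once())
mean(estimates)
sd(estimates)
```

If the average estimate sits close to the true effect, the method works in this made-up world. If it does not, we have learned that the method can't recover the truth even when we know exactly what the truth is.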
This is just a recap lecture for the programming content of the course before the midterm.
This lecture introduces the fundamental problem of identifying causal effects from observational data.
We will be representing our underlying models using causal diagrams. This lecture goes over what those are and how they work.
And of course we need to be able to get the models in our own heads down on paper too, right? We need to know how to draw our own causal diagrams.
What keeps us from just being able to look at correlations in the data and call them causal? Why, it’s those nasty back doors. What are they and how can we find and close them?
We need a little muscle memory in order to be able to put models down on paper. Let’s do the work.
An important part of causal inference is being able to close back doors by controlling for variables. How does that work? When should we do it, and when shouldn’t we? What are “colliders”?
Now we’re starting to get into the econometrician’s standard toolbox. How can you possibly measure everything you need to control for? You can’t! But sometimes you can control for things that you can’t measure. That’s where fixed effects come in.
A major concept in causal inference is the idea that we’re comparing a “treated” and an “untreated” group. What do we mean by that, and is it possible to create those groups artificially using matching?
One of the most important tools for econometricians is difference-in-differences, where you compare a treated and an untreated group across time to isolate a causal effect.
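To make the "difference in differences" arithmetic concrete, here is a minimal sketch in base R using simulated data. All of the numbers (a true effect of 2, the group gap of 3, the time trend of 1.5) are assumptions chosen for illustration. The estimate is the before-to-after change in the treated group minus the before-to-after change in the untreated group.

```r
set.seed(1000)

n <- 2000
treated <- rep(c(0, 1), each = n / 2)    # first half untreated, second half treated
after   <- rep(c(0, 1), times = n / 2)   # alternate before (0) and after (1) periods
true_effect <- 2                         # made-up effect that kicks in for treated units after treatment

# Outcome: a fixed gap between groups, a shared time trend, and the treatment effect
y <- 1 + 3 * treated + 1.5 * after + true_effect * treated * after + rnorm(n)

# Mean outcome in each of the four group-by-period cells
means <- tapply(y, list(treated = treated, after = after), mean)

# (treated after - treated before) - (untreated after - untreated before)
did <- (means["1", "1"] - means["1", "0"]) - (means["0", "1"] - means["0", "0"])
did
```

The first difference removes the fixed gap between the groups, and the second removes the time trend they share, which leaves something close to the true effect.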
It’s hard to find a treatment and control group that are really the same except that one got treatment. One way you can do it is by finding two groups standing just next to each other when the scimitar cleaves juuuust between them.
I’ve thrown a lot of tools at you in the past few weeks. Let’s take a chance to slow down and try to apply them.
One last tool for the toolbox: instrumental variables. It’s like something out in the world did the randomization in our experiment for us!
When should we use IV? When should we trust it? When should we use something else from the toolbox?
Just a review period before the causal inference midterm. We’ll revisit and practice the methods and material we’ve gone over so far.
We’ve been explaining one variable with another all term. One common high-octane way of doing this is with regression. We’ll cover regression conceptually, as a preview of future classes, and see what it can do for us to aid our causal inference.
We apply regression to all of our causal inference methods here, and even get a little peek at some other ways of explaining one variable with another using machine learning.