The final project will take the form of a written description of the data to be used for your yearlong project. I expect the document to be written to the standards of a journal article, although it will necessarily be longer than the typical description of data in a journal article.

Your paper should be no longer than five pages (approx. 1500 words). Make sure that you attend to the stylistic guidelines for an APA manuscript, including headers, page numbers, and guidelines for tables and figures.

Your paper should have the following components:


In one or two paragraphs, describe the nature of the problem and your research question or questions.

Directional Statements

You are not yet required to have full-blown hypotheses, but you should describe broadly what you expect to see. What are the expected relationships between your dependent and independent variables?

Description of Survey

Describe the survey you plan to use for your study. How and when were the data collected? What was the overall sample size? How much information is available from this survey? What methods were used to correct for non-response and to “freshen” the sample?

Description of Dataset

Your dataset is a subset of a larger survey. How did you choose the subset? Are your data limited to a certain group? To a certain year or wave of data collection? Which questions are included?

Sampling Considerations

What should the reader know about the sampling design and how it relates to your dataset? Which sampling weights, primary sampling units (PSUs), and strata are you using, and why? Your explanation should be technically competent, but it should also show that you can describe what you have done in readily comprehensible terms. You also need to comment on any issues created by the selection of your particular subset.
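To make the role of sampling weights concrete, here is a minimal sketch in Python. It assumes three made-up respondents, one of whom carries a larger weight, and it is not drawn from any particular survey. The names `records`, `weighted_mean`, and `kish_effective_n` are hypothetical. The sketch covers only point estimates. Correct standard errors also require the PSU and strata information, which survey-aware software handles for you.

```python
# Hypothetical records: (outcome, sampling_weight). Values are invented for illustration.
records = [(10.0, 1.0), (20.0, 1.0), (30.0, 4.0)]


def unweighted_mean(rows):
    """Simple average that treats every respondent as representing one person."""
    return sum(y for y, _ in rows) / len(rows)


def weighted_mean(rows):
    """Average in which each respondent counts in proportion to the sampling weight."""
    total_weight = sum(w for _, w in rows)
    return sum(y * w for y, w in rows) / total_weight


def kish_effective_n(rows):
    """Kish's approximate effective sample size: (sum of w)^2 / sum of w^2."""
    total_weight = sum(w for _, w in rows)
    return total_weight ** 2 / sum(w * w for _, w in rows)


print(unweighted_mean(records))   # 20.0
print(weighted_mean(records))     # 25.0
print(kish_effective_n(records))  # 2.0
```

The gap between 20.0 and 25.0 is the kind of difference a reader needs explained in plain terms: the heavily weighted respondent stands in for more of the population, so ignoring the weight understates the mean.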

Considerations and Limitations

In this section, you should describe how your data fall short of what you would ideally like to have. What are the imperfections in the available data? Are there measurement problems? Do you have samples from the right population? Also be sure to describe patterns of missing data.
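One way to summarize patterns of missing data is to report, for each variable, the share of cases that are missing, and then to tabulate which combinations of variables tend to be missing together. The Python sketch below illustrates the idea on invented rows. The variable names `income`, `educ`, and `score` are hypothetical, and `None` stands in for a missing value.

```python
from collections import Counter

# Hypothetical analysis rows; None marks a missing value.
rows = [
    {"income": 52000, "educ": 16, "score": 3.4},
    {"income": None, "educ": 12, "score": 2.9},
    {"income": None, "educ": None, "score": 3.1},
    {"income": 41000, "educ": 14, "score": None},
    {"income": None, "educ": 12, "score": 3.8},
]
variables = ["income", "educ", "score"]


def missing_share(data, var):
    """Proportion of rows in which `var` is missing."""
    return sum(r[var] is None for r in data) / len(data)


def missing_patterns(data, names):
    """Count rows by pattern: 'X' = observed, '.' = missing, in the order of `names`."""
    return Counter(
        "".join("." if r[v] is None else "X" for v in names) for r in data
    )


for v in variables:
    print(v, missing_share(rows, v))

for pattern, count in missing_patterns(rows, variables).most_common():
    print(pattern, count)
```

In this toy example income is missing in 60 percent of rows, and the most common incomplete pattern is income missing alone. A tabulation like this lets you say whether missingness is concentrated in one variable or spread across several.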

Description of the Data

In this section, you should accomplish two things. First, for the dependent variable(s) and the key independent variables, report measures of central tendency and describe their distributions. Second, try to show in graphical and/or tabular form whether or not the data support the directional relationships you described above. Steer well clear of causal language here. Just describe the patterns that you see.
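As a rough illustration of the two tasks above, the Python sketch below computes basic descriptive statistics for an outcome and then compares group means as a simple tabular check on an expected direction. The data, the grouping labels `low` and `high`, and the function name `group_means` are all invented for illustration. The sketch is unweighted for simplicity, so estimates from a complex survey would also need the sampling weights.

```python
import statistics
from collections import defaultdict

# Hypothetical (group, outcome) pairs; values are made up.
observations = [("low", 2.0), ("low", 4.0), ("high", 4.0), ("high", 6.0), ("high", 9.0)]

outcomes = [y for _, y in observations]

# Central tendency and spread for the outcome variable.
print(statistics.mean(outcomes))    # 5.0
print(statistics.median(outcomes))  # 4.0
print(statistics.stdev(outcomes))   # sample standard deviation, about 2.65


def group_means(data):
    """Mean outcome within each group, returned as a dict keyed by group label."""
    grouped = defaultdict(list)
    for group, y in data:
        grouped[group].append(y)
    return {group: statistics.mean(values) for group, values in grouped.items()}


print(group_means(observations))
```

Here the mean is higher in the `high` group than in the `low` group. The appropriate write-up is descriptive ("the outcome is higher on average among ..."), not causal.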

Do-File and Codebook

As part of this assignment, please also submit your final do-file and a codebook describing all of the variables in your analysis dataset.