Writing an Empirical Economics Paper
This document contains instructions
for writing an empirical economics paper.
The Structure of the Empirical Paper
Empirical papers in economics have a consistent look and feel. Follow the usual outline:
Title Page: Includes title, your name, date, and anyone you want to thank for help
Abstract: In 100 words or less, state the main contribution or finding.
I. Introduction
II. Literature Review
Choose some form (e.g., chronological or thematic) to organize the literature review. Mere listing and summary of several sources is not acceptable. A good literature review interweaves the various articles in a seamless way.
III. Theoretical Analysis
Present a brief version of a model or highlight the theoretical source of the hypothesis to be tested. In many cases, you may wish to combine the literature review and theoretical analysis into a single section. For example, a paper you review may contain a version of the model you wish to adapt for your own analysis.
All data and analyses must be completely documented and available for inspection.
IV. Empirical Analysis (the main and longest part of the empirical paper)
Key Style Issues
Choosing a Question
Your paper will rely on the
Current Population Survey (CPS). Here is a sample list of questions
that can be answered with data from the CPS:
These questions can be narrowed
further. For example, you might take the first question in the list
and narrow the scope to: Do women with children work less than
comparable women without children, and, if so, how much less?
Or, with regard to question 6, you might ask: Does the return to a
college education vary according to age?
Students sometimes are simply
overwhelmed by the task of choosing a topic. If it all seems too
abstract or nothing seems to grab you, consider replication. You
find a paper that has already been published and update it with new
data. This can be easier than working with your own topic because
the published paper adds structure. You do exactly what the author
did with the latest data and then compare your results to the original
results. In addition, your literature review will include a discussion
of how the paper has fared: who cited it and how
it was received. This approach can be extremely rewarding and
interesting.
The replication strategy begins
with a search of the JSTOR journal database (www.jstor.org) for papers
in a field of interest, for example, Political Science or Economics,
using the search terms “Current Population Survey” and “ordinary
least squares.” This will improve your chances of finding a
replicable paper that used regression analysis with CPS data.
An example of such a paper is:
How Computers Have Changed the Wage Structure: Evidence from Microdata, 1984-1989
Alan B. Krueger, The Quarterly Journal of Economics, Vol. 108, No. 1. (Feb., 1993), pp. 33-60.
Stable URL: http://links.jstor.org/sici?sici=0033-5533%28199302%29108%3A1%3C33%3AHCHCTW%3E2.0.CO%3B2-Q
Literature Review
A literature review is a summary of what other people have thought about your question or questions closely related to your topic. More specifically, it should explain how others have dealt with the issues you will be addressing in your paper. The literature review usually serves two equally important purposes. First, it will explain how others have tackled your question. Second, it will provide you with some theory (economic or otherwise) which you can use in trying to answer the question or test someone else's answer.
Your literature review should be anywhere from 10% to 30% of the body of your paper (excluding, of course, references, charts, and figures). One common strategy is to present a theory or claim, discuss those papers that find support, and then discuss those that disagree and why. You should review at least three papers that have tackled your question, reporting procedures and answers. “At least three papers” is a minimum; you may need to discuss more papers. The quality of your review depends on the quality of the papers you include, how on point they are, and your ability in distilling and presenting the findings in the literature.
Search in more than just economics: statistics, population studies, and other fields may have addressed your question.
You may certainly read nonprofessional sources like Newsweek or google your research question in order to stimulate the development of a policy topic, but these sources are not suitable for upper-level undergraduate research. Do not rely on mass media sources for your literature review.
For your literature review, you need work published in professional journals. JSTOR, www.jstor.org, is an archive that contains the full text of a select group of journals in economics and other disciplines up through about four years ago (this varies from journal to journal). It is a good place to start, but you will want to go beyond JSTOR.
The references of the papers
you find can lead you to other interesting papers and make your literature
search easier. Once you find a single paper that addresses your
research question, its bibliography is a gold mine of other papers that
asked that question, or related questions.
Citation is important. After
paraphrasing findings or explicitly quoting text, give credit by simply
listing the last name of the author (use “et al.” when there
are more than two authors) and year of publication. Do not include
the entire reference in the text of your paper or in a footnote. Here
is an example: “Smith [2003] finds that more schooling lowers
the probability of smoking.”
In the references, a full citation of the Smith [2003] article is presented. A standard referencing format you can use is the Chicago style: http://www.wisc.edu/writing/Handbook/DocChicago.html
Be warned of the dangers of plagiarism. It is very easy to plagiarize someone's work unintentionally, but this fact does not make plagiarism any less serious an offense. Make certain that you either directly quote and attribute the quote, or paraphrase the source (no more than three consecutive words alike). Remember this: In general, direct quotation should be used sparingly in an economics research paper. Repeated use of direct quotation gives the impression of laziness and often disrupts your own style and method of organization.
A good strategy is to make
sure that you paraphrase the work when you are actually taking the notes
from the source, in case you forget to do so later on. Remember that
the whole point of a literature review is to present others' work—your
contribution will come a bit later. It is perfectly acceptable to say
something like, "In his recent book on medical malpractice, Smith
[2003] contends that ..."
Theoretical Section
"Why on earth is a theory
section needed in an empirical paper?" Because a complete answer
to your question must rely on theory and data. You will need some theory
to guide you in deciding which variables are relevant for your question.
Common sense alone is not a sufficient reason for including or excluding
certain variables in your analysis. Theory can also help in choosing
the functional form and in judging whether autocorrelation or heteroskedasticity
is part of the data generation process.
In some cases, the theory section
is quite clear. For example, earnings function papers have a solid theoretical
foundation that underlies the use of the semi-log functional form. If
your paper utilizes a measure of earnings as the dependent variable,
you can present a theoretical argument for using the semi-log form
(the natural log of the wage as the dependent variable) and, for
comparison, report a regression that uses the wage itself as the
dependent variable.
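As a minimal illustration, a semi-log earnings function of the kind found in the human capital literature might be written as follows. The variable names (Wage, Educ, Exper) and the quadratic in experience are assumptions made for this sketch, not a specification these instructions prescribe.

```latex
\ln(\text{Wage}_i) = \beta_0 + \beta_1 \text{Educ}_i + \beta_2 \text{Exper}_i + \beta_3 \text{Exper}_i^2 + \varepsilon_i
```

In this form, 100 times the coefficient on Educ is approximately the percentage change in the wage associated with one more year of education, holding experience constant. The comparison regression replaces ln(Wage) with Wage on the left-hand side, so its coefficients are measured in wage units rather than in approximate percentage terms.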
However, it is also possible
that there is no well developed theory for your question. In this case,
it is common to see the literature review and theoretical sections combined.
Your functional form and explanatory variables are chosen based on the
work of others.
The theoretical section is
a difficult piece of the empirical paper because some questions have
precious little theory behind them. Even those questions that
do have a solid theoretical foundation are often difficult to explain.
When deciding what to say in terms of the theory section, remember that
you are writing an empirical paper so the main function of the theory
is to justify your empirical work. In other words, use the theoretical
section to explain why you chose the particular explanatory variables
you selected and the functional forms you used.
Empirical Results
See MeasuringPay.doc (in the Basic Tools folder) for guidance in constructing a variable based on earnings.
This is the most important
part of your paper. It is always divided into two main subsections:
the data and the results.
The Data
Do not forget to provide the sources of your data and to help the reader by making a table that offers summary statistics on each variable. You should define each variable carefully and, if necessary, point out how the empirical measure deviates from its theoretical counterpart. Typical summary statistics that are offered include: max, min, average, and SD values for each variable. It is not unusual to offer histograms and other information for variables with skewed distributions. Excel is a fabulous tool here, and it is easy to get carried away. Remember, your goal should be clarity!
This subsection is the place to offer interesting information about the data. You should also point out the limitations, if any, of your data. You will want to describe your procedure in obtaining the data, making sure to point out key decisions in how you drew your sample. For example, in describing the wage variable, you might explain that you decided to remove all observations with negative values. You will want to clearly state the time period (survey month and year, if CPS data) of your data set.
Do not go into excruciatingly painful detail on every step of your data collection. These details should be included in your Excel workbook that has the data, recode information, and results.
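The sample restriction and summary table described in this subsection can be sketched in a few lines of code. The wage values below are made up for illustration, and the helper name summarize is hypothetical; the same table can of course be built in Excel.

```python
import statistics

# Hypothetical hourly wage observations; the negative value stands in for
# the kind of observation the sample restriction is meant to remove.
raw_wages = [12.50, 18.00, -1.00, 25.75, 9.80, 31.20]

# Sample restriction: keep only observations with non-negative wages.
wages = [w for w in raw_wages if w >= 0]


def summarize(values):
    """Return n, mean, SD, min, and max for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 in the denominator)
        "min": min(values),
        "max": max(values),
    }


summary = summarize(wages)
dropped = len(raw_wages) - len(wages)

print(f"Dropped {dropped} observation(s) with negative wages")
print(f"{'Variable':<10}{'n':>4}{'Mean':>8}{'SD':>8}{'Min':>8}{'Max':>8}")
print(f"{'wage':<10}{summary['n']:>4}{summary['mean']:>8.2f}"
      f"{summary['sd']:>8.2f}{summary['min']:>8.2f}{summary['max']:>8.2f}")
```

Reporting how many observations were dropped alongside the summary table documents a key sampling decision in one place.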
Presentation and Interpretation of Results
This subsection is the heart of an empirical paper. Having set out the question, reviewed the previous literature, explored the theoretical perspective, and collected data, you are finally ready to do some econometrics.
Use subheadings to lead the reader through the different levels of your analysis. You might start with a table that compares averages for two groups, then move to a regression analysis, considering a variety of specifications and different sets of explanatory variables. You may also want to have a subheading for advanced analyses, such as robust standard errors.
You do not need to report every regression you run.
You do need to run several models and use a table to report your results. The table is used to easily display the results from various models and invites comparison of coefficients. Below is a template you can use to organize your results:
Variable | Model 1: Dependent Variable | Model 2: Dependent Variable | Model 3: Dependent Variable | Model 4: ln(Dependent Variable)
Intercept | Est. Coefficient (est. SE) | Est. Coefficient (est. SE) | Est. Coefficient (est. SE) | Est. Coefficient (est. SE)
X1 | Est. Coefficient (est. SE) | Est. Coefficient (est. SE) | Est. Coefficient (est. SE) | Est. Coefficient (est. SE)
X2 | | Est. Coefficient (est. SE) | Est. Coefficient (est. SE) | Est. Coefficient (est. SE)
X3 | | Est. Coefficient (est. SE) | Est. Coefficient (est. SE) | Est. Coefficient (est. SE)
X4 | | Est. Coefficient (est. SE) | Est. Coefficient (est. SE) | Est. Coefficient (est. SE)
X4² | | | Est. Coefficient (est. SE) | Est. Coefficient (est. SE)
n | | | |
R² | | | |
In the table, the first regression has no control variables. It is a simple, bivariate regression of the dependent variable on X1. Model 2 adds three explanatory variables (presumably selected on the basis of some theoretical reasoning) and Model 3 adds a squared term for X4. The last model uses a semi-log functional form.
Notice how the table invites
comparison of the models. In the discussion of the results, you would
explain the results from each model and offer your opinion on the best
answer to your research question.
The table can be augmented
with asterisks (for statistically significant coefficient estimates)
or other information (e.g., DW statistics for autocorrelation).
You can add notes at the bottom as needed. In this table, you could
add a note that says, “The R² from Model 4 cannot
be directly compared to the R² of the first three
models.”
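If you build the table cells programmatically, a small helper can keep the layout consistent. The sketch below is only one possible convention: the function name format_cell is hypothetical, and the single asterisk for an absolute t-statistic of at least 1.96 (roughly the 5% level in a large sample, two-sided) is an assumption that you should state in a note under your own table.

```python
def format_cell(coef, se, decimals=2, critical=1.96):
    """Format one table cell: coefficient (with optional star) over (SE)."""
    t_stat = coef / se
    # Assumed convention: one star when |t| meets the critical value.
    star = "*" if abs(t_stat) >= critical else ""
    return f"{coef:.{decimals}f}{star}\n({se:.{decimals}f})"


# Two made-up estimates: the first is not significant at this threshold,
# the second is.
print(format_cell(0.663, 0.344))
print(format_cell(1.20, 0.25))
```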
How Many Decimal Places?
An important issue in reporting
regression results is the number of decimal places to use for coefficients
and other statistics. In principle the theory of significant figures
resolves this issue. However, that theory is complicated and most
papers in economics do not follow the rules of significant figures anyway.
Therefore we offer a compact, basic set of dos and don’ts.
Don’t report 1.23456789
Don’t report the many decimal
places displayed by your software. Doing this is called false precision
and is a serious mistake. It is almost never true that the number is
correct to that many decimal places, so when you report all the decimal
places you are potentially misleading your reader.
Once you understand that reporting
many decimal places is wrong, the natural question is: How many decimal
places should be reported? This turns out to be a difficult question.
In practice, economists round
by applying a variety of rules of thumb that boil down to a guiding
principle of enhancing readability. Decisions on display turn
on creating a table that is pleasing to the eye, for example one in
which every number is reported to the same relatively small number of
decimal places. Although this practice is not well grounded logically,
it does usually avoid the sin of reporting too many decimal places.
The desire to enhance readability
leads to a suggestion to avoid coefficients with many leading or trailing
zeroes. Thus, a number like 0.00123456 is typically reported as 1.23456
and the units of the variable associated with that coefficient are appropriately
modified. For example, the coefficient of 1.23456 might correspond
to Income measured in thousands of dollars and it is interpreted as
the effect of a one thousand dollar increase in income (instead of a
one dollar increase in income giving a 0.00123456 increase in predicted
Y), holding other included X variables constant. (Of course,
you might well end up reporting this number as 1.23 instead of 1.23456.)
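The effect of changing units can be checked directly. The sketch below uses made-up data and a hypothetical helper, ols_slope, for a bivariate regression; it illustrates that measuring income in thousands of dollars instead of dollars multiplies the estimated slope by 1,000 without changing the fit.

```python
import math


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data: income in dollars and some outcome y.
income_dollars = [20000, 35000, 50000, 65000, 80000]
y = [30.1, 47.9, 67.2, 84.8, 104.0]

# Same variable, measured in thousands of dollars.
income_thousands = [d / 1000 for d in income_dollars]

slope_dollars = ols_slope(income_dollars, y)
slope_thousands = ols_slope(income_thousands, y)

print(f"per dollar:           {slope_dollars:.8f}")
print(f"per thousand dollars: {slope_thousands:.5f}")
print(math.isclose(slope_thousands, 1000 * slope_dollars, rel_tol=1e-9))
```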
Do use the SE as a guide: 1.23456789 +/- 0.203040506 → 1.2 +/- 0.2
If you prefer a more logical
approach in reporting your results, we recommend that you follow a
modified version of a common practice in the hard sciences of letting
the SE be your guide. The basic idea behind this often-used approach
is that the SE is a measure of the precision of the estimated coefficient.
Thus, the SE is used to determine how many decimal places are reported.
To use the SE as a guide, scientists
employ the following simple procedure: They find the first non-zero
digit in the SE. If it is greater than one, this is the decimal place
to which they will report the coefficient. They round the SE to this
decimal place and report the estimated coefficient rounded to as many
decimal places as the SE. This is the rule applied in the example
above, where 1.23456789 +/- 0.203040506 is reported as 1.2 +/- 0.2.
Here is another example: 0.00456789 +/- 0.0089 → 0.005 +/- 0.009.
The first non-zero digit in the SE is 8, so we
round the SE to 0.009 and then we report the coefficient rounded to
that decimal place, 0.005.
If the first non-zero digit
in the SE is a one, then you apply the same rules to the next
decimal place in the SE: 12345.6789 +/- 12.3456789 → 12346 +/- 12. The first non-zero
digit in the SE is 1, so we go to the next digit, 2, and round the SE
to 12. Then we use the SE as our guide to rounding the coefficient.
Note that this rule means that 12345.6789 +/- 1234.56789 should be reported
as 12300 +/- 1200. (When you need to round up from 1 to 2, keep
the next digit, e.g., if the SE is 0.196, report the SE as 0.20.)
Here’s our modification to
the scientists’ rule of thumb: add one additional decimal place to
the results you report beyond what the above rule would give you.
Thus, 12345.6789 +/- 12.3456789 → 12345.7 +/- 12.3 and 12345.6789 +/-
1234.56789 → 12350 +/- 1230. We make this modification to deal with a disadvantage
of the scientific rule: it is hard to compute accurate t-statistics
when there are only a limited number of decimal places. Here’s an
example: Suppose the true values of the estimated SE and the estimated
coefficient are 0.344 and 0.663 respectively; then the true t-statistic
for the null that the parameter value is 0 is about 1.93. If one
were to follow the scientific rule stated above, the estimated SE and
the estimated coefficient would be reported as 0.3 and 0.7 respectively.
This would lead to a t-stat of about 2.33. Reporting an
additional decimal place gives you values of 0.34 and 0.66, which would
lead to a t-stat of 1.94.
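The rounding rule described above can be sketched as a short function. This is one possible implementation under stated assumptions: the name round_by_se and its extra_places parameter are hypothetical, inputs are passed as strings so that the exact digits are preserved, and ties are rounded half up.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_by_se(coef, se, extra_places=0):
    """Round a coefficient and its SE using the SE as the guide.

    extra_places=0 applies the scientists' rule; extra_places=1 adds the
    one additional decimal place recommended in the text.
    Returns (coefficient, SE) as plain decimal strings.
    """
    coef, se = Decimal(coef), Decimal(se)
    if se <= 0:
        raise ValueError("SE must be positive")
    place = se.adjusted()             # power of ten of the first non-zero digit
    if se.as_tuple().digits[0] == 1:  # leading digit is 1: use the next digit
        place -= 1
    place -= extra_places
    quantum = Decimal(1).scaleb(place)
    rounded = [v.quantize(quantum, rounding=ROUND_HALF_UP) for v in (coef, se)]
    return tuple(format(v, "f") for v in rounded)


# The t-statistic example from the text: compare the two reporting rules.
for extra in (0, 1):
    b, s = round_by_se("0.663", "0.344", extra_places=extra)
    print(f"extra_places={extra}: {b} +/- {s}, implied t = {float(b) / float(s):.2f}")
```

Because the function returns strings, trailing zeros that carry information (for example an SE of 0.196 reported as 0.20) survive in the output.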
To be sure, there is no consensus
on the matter of significant figures in the economics profession. One
thing that is quite clear is that reporting ten or fifteen decimal places
is silly and embarrassing. Avoid this. Some rounding must be applied
to computer output. While applying “pleasing to the eye,”
the common practice in the social sciences, is better than nothing,
you can do better by considering the likely size of the error in the
results.
Writing up your results
Here is a list of good practices
that should help you in writing up your results:
At the end of the empirical section of your paper, you should be able to draw a conclusion, even if it is a negative one. For example, you may find that there is no relationship between divorce and schooling in your data; this is still worth reporting.
Remember this: No study is
absolutely perfect, but if you have done a thorough job in your empirical
section, you should be able to reach some answer to your research question.
This conclusion will then be inserted into your introductory paragraph
in a slightly different form.
Things to Avoid
Here are common mistakes and
poor practices to avoid like the plague:
Be sure to consult with
your professor early and often.
Conclusion
In the conclusion, your job is to give the paper's greatest hits. That is, you should restate your research question, give the high points of the literature survey and theory, remind your reader of the data and the methods you used, and restate your conclusion.
You may then go on to talk about the limitations of your analysis, any data you wish you could have garnered but couldn't, and what you would have liked to have done with your analysis but couldn't given the time limitations. This needn't be a long section—do not apologize for your work, but do suggest avenues for further research.
After writing the conclusion, you should then go to the beginning of the paper and write the Introduction. It should be a snap.
"And then I turn it in?" No, not quite yet. The last thing you should do is PROOFREAD your paper. Even after spell checking the paper with your word processor, you should take the time to read it one last time before turning it in. Fix typographical errors, improve wording, and make sure the numbers make sense.
Your paper will be evaluated with the rubric on the last page of these instructions. The rubric relies on the material presented above. Follow these instructions and you will write a solid paper!
Referee Report
Rubric
Author’s Name:_______________________ Paper Title: __________________________
F | Poor | OK | Good | Super | |
Title and Abstract | |||||
Introduction | |||||
Clear statement of research question | |||||
Literature Review | |||||
Content of lit review | |||||
Organization of lit review | |||||
Citation usage and style | |||||
Theoretical Analysis (may be in lit review section) | |||||
Discussion of underlying theory | |||||
Empirical Analysis | |||||
The Data | |||||
Variable Description and Summary Data Table | |||||
Description of procedure for data collection | |||||
Data limitations and problems | |||||
Empirical Results | |||||
Discussion of DGP/error term (e.g., CEM or Random Xs) | |||||
Regression results table | |||||
Rounding numbers and ease of reading display | |||||
Interpreting Results | |||||
Explanation of a particular coefficient | |||||
If Dummy Dependent Variable, predicted probabilities | |||||
Including units in discussion of variables | |||||
Appropriate functional forms used and explained | |||||
Discussion of violations of CEM | |||||
Testing for heteroskedasticity or autocorrelation | |||||
Use of robust SEs (if heteroskedasticity present) | |||||
Hypothesis test on estimated coefficient (if applicable) | |||||
Confidence interval on estimated coefficient (if applicable) | |||||
Discussion of economic importance | |||||
Computation and explanation of elasticity (if applicable) | |||||
Comparison of results to previous work (from lit review) | |||||
Answering the research question | |||||
Conclusion | |||||
Summary of paper | |||||
Future work | |||||
References: Content and Style | |||||
Overall Aspects of the Paper | |||||
Content | |||||
Writing | |||||
Style (figures, tables, look and feel) |
Best Parts of the Paper:
Things That Need Work:
Final Evaluation:
Reviewer’s Name: ____________________