Echidne of the Snakes 

The Gender Gap. Part Two: Empirical Evidence 

You can read the first part in this series of posts here. It talks about the economic explanations that have been proposed to explain why on average men earn more than women. This post discusses the empirical evidence on the same topic and how it is collected.

By "empirical evidence" I mean all the zillions and zillions of studies that have looked at the reasons why women, on average, earn less than men by getting some real data and by analyzing it. Some of the wingnuts, including Steven Pinker in his Blank Slate book, give the impression that economists haven't done any of the relevant work, so that wingnuts can just declare whatever results are most pleasing to them. This is blatantly incorrect and makes me very angry, not just because I am a feminist, but because I am also an economist and all that hard work of economists is buried in such flippant comments on the topic.

How is this empirical evidence created? Some sciences get their evidence from laboratories, but social sciences usually can't do that. Laboratory circumstances are not like the real world and by stripping away the whole environment and the time dimension we also strip away the central questions we are trying to answer. For social sciences are social, and this means that we will never be able to put people into an empty room with, say, computers, in order to come up with answers about how their whole lives went out there, in the society. I don't mean that laboratories wouldn't provide some useful information for economists, but on the whole we are limited to doing studies that employ data collected from actual living people out there.

The quality of this data limits what we can find out by analyzing it. Some data sets are just bad, full of holes and based on unreliable sources. But even with good data sets we are always going to have the problem that many things we are interested in measuring just can't be easily measured. Take the example of education. As I mentioned in my previous post on the theories of the gender gap, one reason some people earn more than others is that they have more education. But measuring education in actual data sets usually means employing the number of years a person has gone to school and to college as the proxy, or approximation, of education. This is better than nothing, and so is the alternative of using the highest degree the person has as the proxy, but neither of these takes into account the contents of the education or the quality of the institutions the person attended.

A much more severe problem presents itself when we try to measure discrimination in these data sets, because employers or coworkers or consumers that discriminate are not going to say so in a survey, and there will be no obvious proxy for discrimination itself. Sometimes researchers can use the number of court cases filed as such a proxy, but most large data sets have nothing on discrimination. So how do we go about measuring it at all?

There are two common answers. The first, and the best one, is to use audit studies. These are studies that have been extensively used to see if firms discriminate in their hiring of workers. The idea is to take a bunch of actors (or people who can act) and to train them all to act the same with a prospective employer. They are also given similar paperwork and they are told to give the same education and experience data. In short, these people try to be exactly the same in all the characteristics that might affect whether they get hired, except whatever the characteristic is that we want to evaluate. If it is sex discrimination in hiring, we would send out both female and male actors to apply to jobs in the same firms.
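For readers who like to see the bookkeeping, here is a minimal sketch of how the results of a matched-pair audit could be tallied. The outcomes below are invented purely for illustration and do not come from any real study; the function name `audit_summary` is also my own.

```python
# Invented outcomes, for illustration only. Each tuple is one matched pair of
# testers sent to the same employer: (male_got_offer, female_got_offer).
pairs = [
    (True, True), (True, False), (True, False),
    (False, False), (False, True), (True, False),
]

def audit_summary(pairs):
    """Count how the matched pairs were treated and the net difference."""
    male_only = sum(1 for m, f in pairs if m and not f)
    female_only = sum(1 for m, f in pairs if f and not m)
    both = sum(1 for m, f in pairs if m and f)
    neither = sum(1 for m, f in pairs if not m and not f)
    # Net share of employers that favored the male tester over the female one.
    net_male_advantage = (male_only - female_only) / len(pairs)
    return {
        "male_only": male_only,
        "female_only": female_only,
        "both": both,
        "neither": neither,
        "net_male_advantage": net_male_advantage,
    }

summary = audit_summary(pairs)
print(summary)
```

Because the testers are matched on everything else, the pairs where only one of the two got an offer are the informative ones. A real study would also test whether the difference is larger than chance alone would produce.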

A study done in the restaurant industry in Philadelphia did exactly this to see if women and men who apply for server jobs are treated the same by the firms. What they found out was that the higher-priced restaurants discriminated against women. These are the restaurants where the servers would also earn the most. Thus, discrimination in hiring may cause women to earn less as restaurant servers. Incidentally, the researchers suggested that the reason for this discrimination is in customer discrimination.

Audit studies are good, and when they are done well they give us actual evidence on discrimination. The problem with audit studies is that they cannot last for years. This makes them useless in attempts to analyze promotion discrimination or differential treatment on the job.

The second answer, and the most common one, is to approach the problem from the other end. What if we could get data on people's wages and on all the variables that we know affect these wages: education, experience, whether the local job market is good, the occupation of the worker, age and so on? If we did this, and if we could standardize for all these characteristics, by holding them constant in the analysis, wouldn't we expect to find that after all these variables are taken into account there should be no gender wage difference left to be explained? And if there was such a difference, wouldn't it be due to discrimination?

This is the approach that is usually taken. It has its problems, and the main one is that if we don't have data on all the relevant variables that legitimately affect wages then any wage difference that still remains unexplained could be due to that lack of data and not due to discrimination. Or the total unexplained part could consist of some discrimination and of the lack of some important information. As information will never be perfect we are always going to have an argument about what the unexplained residual difference in wages between men and women means. But the better the data sets get the more it begins to look like discrimination. More about this later when I look at one study in greater detail.

A few more words on those tricky concepts of "holding constant" and "controlling for variables". As I mentioned in my theory post, there are several explanations for the gender gap. Each of these explanations suggests some things that might account for the wage difference. We call such things variables, because they take different values for different individuals. Age would be a variable, and so would belonging to a trade union, though the latter one is usually only coded as taking two values: yes or no. If we went through all the theories and made a list of all the variables that might affect earnings of men and women differently we would have the list of variables that we want to hold constant in our analysis.

To see what this entails, consider a simple example. I go out and buy some apples at the store. I then tell you that I spent a total of $5.30, and ask you to tell me what the price is per pound of apples. Now, you can't do this, because I haven't told you how many pounds I bought. But if I also give you the pounds of apples I bought, the problem becomes very easy.

This is the task economists have when they analyze the gender gap in wages. It is as if they start with the total shopping bill and want to find out the individual prices of all items. To get there, they need information on the amounts purchased and on the types of goods purchased and on the quality of the goods. They also need information about the stores; whether they are in a city like New York where local prices are higher or in a rural area in the South or whatever. By getting all this additional information and by fitting it into a mathematical model it is possible to arrive at a good estimate of each price. If, at the end of this analysis, some consumers' shopping bills still seem too large, then something else is going on at certain stores or with certain consumers or both.
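The shopping-bill idea can be sketched in a few lines. This is a toy version under assumptions of my own: only two goods (the pears, the quantities and the totals are made up), two shoppers who paid the same prices, and no measurement error. Real wage studies use many more variables and a statistical model rather than exact algebra.

```python
def solve_two_prices(bill_a, bill_b):
    """Recover per-pound prices from two bills.

    Each bill is (pounds_of_apples, pounds_of_pears, total_paid).
    Solves the two linear equations with Cramer's rule.
    """
    a1, p1, t1 = bill_a
    a2, p2, t2 = bill_b
    det = a1 * p2 - a2 * p1
    if det == 0:
        # E.g. both shoppers bought the goods in the same proportions.
        raise ValueError("the two bills do not pin down the prices")
    apple_price = (t1 * p2 - t2 * p1) / det
    pear_price = (a1 * t2 - a2 * t1) / det
    return apple_price, pear_price

# Hypothetical bills: 2 lb apples + 1 lb pears for $7, 1 lb apples + 3 lb pears for $11.
apple_price, pear_price = solve_two_prices((2, 1, 7.00), (1, 3, 11.00))
print(apple_price, pear_price)  # 2.0 3.0
```

Knowing only the totals ($7 and $11) would tell us nothing about the individual prices; it is the extra information on quantities that makes them recoverable.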

I don't know if that made the idea of "controlling for" or "holding constant" any easier. The point I'm making is that when we hold, say, the years of education, constant in the analysis and we still find a remaining wage difference between men and women, then that remaining difference cannot be due to education. The more variables we hold constant this way, the less unexplained residual there should be. If our data were perfect, any unexplained residual would be caused by discrimination, because we would have taken all the other causes of different wages into account.
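Here is a small sketch of what "holding education constant" does, using made-up wage records. Note that it uses simple stratification (comparing men and women only within the same education level) rather than the multiple regression the actual studies use; the idea is the same but the machinery is much simpler. All names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (sex, years_of_education, hourly_wage). Invented numbers.
workers = [
    ("M", 12, 20.0), ("M", 16, 30.0), ("M", 16, 30.0), ("M", 16, 30.0),
    ("F", 12, 18.0), ("F", 12, 18.0), ("F", 12, 18.0), ("F", 16, 27.0),
]

def raw_gap(rows):
    """Share by which women's average wage falls short of men's, no controls."""
    men = mean(w for s, _, w in rows if s == "M")
    women = mean(w for s, _, w in rows if s == "F")
    return 1 - women / men

def gap_holding_education_constant(rows):
    """Average of the within-education-level gaps (unweighted, for simplicity)."""
    by_edu = defaultdict(lambda: {"M": [], "F": []})
    for sex, edu, wage in rows:
        by_edu[edu][sex].append(wage)
    gaps = [
        1 - mean(g["F"]) / mean(g["M"])
        for g in by_edu.values()
        if g["M"] and g["F"]  # need both sexes present to compare
    ]
    return mean(gaps)

print(round(raw_gap(workers), 3))                         # 0.264
print(round(gap_holding_education_constant(workers), 3))  # 0.1
```

In this invented data the men happen to have more education, so part of the raw gap disappears once we compare like with like. What remains (10% here) cannot be due to education, which is exactly the logic of the unexplained residual.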

Let's look at one study in greater detail. I have picked the General Accounting Office (GAO) 2003 study as an example, because it is fairly recent, because it uses quite good data and because, if anything, it is biased to the right. There are many other quite similar studies with fairly similar findings. Thus, talking about this one study also covers most of the general points I'd like to make. Where it does differ from most other studies is that it also includes data on part-time workers and on some self-employed workers. This makes the data set richer for our purposes.

The GAO study uses data from

the Panel Study of Income Dynamics (PSID), a nationally representative longitudinal data set that includes a variety of demographic, family, and work-related characteristics for individuals over time. We tracked work and life histories of individuals who were between ages 25 and 65 at some point between 1983 and 2000.

Using our statistical model, we estimated how earnings differ between men and women after controlling for numerous factors that can influence an individual's earnings.

The number of included individuals is in the thousands. This means that the study is large enough to allow for some fairly fine-tuned analyses. The researchers report their results as follows:

In summary, we found:

Of the many factors that account for differences in earnings between men and women, our model indicated that work patterns are key. Specifically, women have fewer years of work experience, work fewer hours per year, are less likely to work a full-time schedule, and leave the labor force for longer periods of time than men. Other factors that account for earnings differences include industry, occupation, race, marital status, and job tenure. When we account for differences between male and female work patterns as well as other key factors, women earned, on average, 80 percent of what men earned in 2000. While the difference fluctuated in each year we studied, there was a small but statistically significant decline in the earnings difference over the time period. (See table 2 in app. II.)

Even after accounting for key factors that affect earnings, our model could not explain all of the difference in earnings between men and women. Due to inherent limitations in the survey data and in statistical analysis, we cannot determine whether this remaining difference is due to discrimination or other factors that may affect earnings. For example, some experts said that some women trade off career advancement or higher earnings for a job that offers flexibility to manage work and family responsibilities.

This is a teeny-weeny bit biased, for reasons that I am going to discuss later. But notice that the study was unable to explain about one half of the total gender gap in wages. The initial gender gap in the data set showed that women earned 44% less than men, on average. After using all the data on variables that might explain this gap, we are still left finding that women earned 20% less than men, on average, due to mysterious reasons. So it's not quite true that "work patterns are the key", unless quite a small key is enough to open the locks in the labor market.
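The "about one half" can be checked with a line of arithmetic. This sketch simply takes the two figures mentioned above (a 44% gap before controls and a 20% gap after) at face value and treats them as directly comparable, which is an assumption on my part.

```python
def unexplained_share(raw_gap, adjusted_gap):
    """Fraction of the raw gap that is still there after the controls."""
    return adjusted_gap / raw_gap

raw_gap = 0.44       # gap before controlling for anything
adjusted_gap = 0.20  # gap left after all the controls

share_left = unexplained_share(raw_gap, adjusted_gap)
share_explained = 1 - share_left
print(f"unexplained: {share_left:.0%}, explained: {share_explained:.0%}")
```

That works out to roughly 45% of the gap left unexplained and roughly 55% accounted for by the measured variables.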

So what variables did this study control for? The answer can be divided into three groups:

To determine why an earnings difference between men and women may exist, our model controlled for a range of variables, which can be grouped into three variable sets.

The first set of independent variables consisted of demographic characteristics, including gender, age, and race. We also included an education variable that indicated the highest number of years of education each respondent attained by the end of the sample period. Family-related demographic variables included marital status, number of children, and the age of the youngest child in the household. We also included other income (defined as family income minus a respondent's own personal earnings), the region where individuals lived (i.e., in the South or not), and whether they lived in a rural or urban area (i.e., in a metropolitan area or not).

The second set of independent variables pertained to past work experience. Total work experience was defined as the actual number of years an individual worked for money since age 18. This variable was computed as self-reported experience as reported in 1984 (or the year the individual entered the panel), augmented by hours of work divided by 2,000 in each subsequent year. We also included a variable measuring job tenure, defined as the length of time an individual had spent in his or her current job.

The third set of independent variables included labor market activity reported in a given survey year. Variables included hours worked in the past year, weeks out of the labor force in the past year, and weeks unemployed in the past year. For our analysis, we considered time spent unemployed and time out of the labor force as work "interruptions," but we did not include time off for one's own illness or a family member's illness, vacation and other time off, or time out because of strike. We also included a variable that accounted for an individual's full-time or part-time employment status, defined as the average number of hours an individual worked per week on his or her main job. Individuals were considered to have worked part-time if they worked fewer than 35 hours per week and full-time if they worked 35 hours or more per week. Other variables in this category included the individual's industry, occupation, and an indicator of union membership. We also accounted for self-employment status, defined as whether respondents worked for someone else, for themselves, or for both themselves and someone else.
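Two of the constructed variables in the quoted passage are simple enough to sketch directly. The 2,000-hour divisor and the 35-hour cutoff come from the quote; the function names and the example numbers are my own, and this is only my reading of the description, not the GAO's actual code.

```python
HOURS_PER_FULL_YEAR = 2000  # divisor stated in the quoted passage
FULL_TIME_THRESHOLD = 35    # weekly-hours cutoff stated in the quoted passage

def total_experience(baseline_years, hours_by_year):
    """Self-reported experience at panel entry, plus hours/2000 for each later year."""
    return baseline_years + sum(h / HOURS_PER_FULL_YEAR for h in hours_by_year)

def is_part_time(avg_weekly_hours):
    """Part-time means fewer than 35 hours per week on the main job."""
    return avg_weekly_hours < FULL_TIME_THRESHOLD

# Hypothetical respondent: 5 years of experience on entry, then one full year,
# one half-time year and one three-quarter-time year.
print(total_experience(5, [2000, 1000, 1500]))  # 7.25
print(is_part_time(34), is_part_time(35))       # True False
```

Measured this way, a year of half-time work adds only half a year of experience, so differences in hours worked feed directly into the experience variable.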

Ok. The study controlled for some variables which are measures of productivity and effort at work: education level, total work experience and tenure on the job, as well as several variants of hours worked in the recent past. The hours worked equals the pounds of apples in my earlier example, and need to be held constant to arrive at a wage measure. All the other variables mentioned are ways to test the first theory I mentioned in Part I: that men might be more productive workers. Age is controlled for partly for the same reason. If older workers are less healthy they might also be less productive. On the other hand, experience tends to go up with age (there are more years available for experience) so if age was not controlled for separately the experience variable would pick up both the effect of increasing experience and the effect of getting older and these might cancel each other out.

The study also controlled for many variables that relate to the second theory I discussed in Part I: that women prefer jobs with lower wages because such jobs might offer more flexibility which is useful for mothers who are the major caregivers of their children. Where do you see those, you might ask. Here:

Marital status, number of children and the age of the youngest child are all variables that should pick up pressures on women to focus more time on their families than on their jobs if the second theory is correct. Marital status might matter if the culture expects married women to do the household chores for their husbands, and if this makes married women more stressed and less energetic at work. The more children there are the more stress the mother should feel, and the age of the youngest child will pick up the pressure for more hands-on parenting needs.

Being a part-time worker is also related to this theory. If taking care of the children is the mother's duty then we'd expect more women to be in the part-time category. Part-time work pays less even in the per hour sense.

Finally, and this is important, the study controlled for occupation. This means that the sex-segregation in jobs is at least partially controlled for. The fact that women and men may not, on average, work in the same occupation is at least partially held constant here. Very important to point out, because it turns out to be relevant for criticizing the conclusions of the researchers.

The other variables that are controlled for relate to things like local labor market circumstances (urban vs. rural, say) and unionization rates. These can affect earnings but unless men and women have different geographic locations or unionization rates, on average, their effect should be neutral. Race is controlled for to remove any specifically racial discrimination from the final results, because this study focused on sex rather than on race discrimination.

After applying all these corrections, the study found that it could account for about one half of the existing gender gap. What does this mean?

The usual argument would be that the remaining 20% difference between the average earnings of men and women could be due to discrimination, but it could also be due to "omitted variables", things which we believe matter but which we can't measure in the data set. Remember the conclusions I quoted earlier? These:

Even after accounting for key factors that affect earnings, our model could not explain all of the difference in earnings between men and women. Due to inherent limitations in the survey data and in statistical analysis, we cannot determine whether this remaining difference is due to discrimination or other factors that may affect earnings. For example, some experts said that some women trade off career advancement or higher earnings for a job that offers flexibility to manage work and family responsibilities.

But they did standardize for a large number of things which relate to the work-and-family responsibilities of women: marital status, the number of children, the age of the youngest child, part-time work and the occupation the person has. Don't these measure flexibility at all? Isn't the usual argument that the occupations women choose are chosen because of their flexibility? Well, we are holding that choice constant here, and we still get a 20% unexplained gender gap.

It may not all be due to sex discrimination. But it's unlikely to be all due to some miraculous measure of job flexibility that isn't reflected in how flexible a job is in allowing people to work part-time or in the actual occupation the person has! Bangs head against the wall in frustration.

On the other hand, the variables that the study did control for are not necessarily non-discriminatory. Consider occupation. If women don't really "choose" their occupations but are steered into them through career counselors, schools and families, or if women are discriminated against in hiring and in promotions, then the variable "occupations" is not something we should hold constant when we analyze discrimination. Because it could be affected by discrimination itself.

I hope that this short survey has given you an idea of how we go about analyzing the gender gap in earnings. Many other studies have arrived at very similar results: showing that some of the gender gap can be accounted for by other reasons than discrimination but that there remains a large unexplained residual. Whether one believes that it is all due to omitted variables reflecting job flexibility or ability or whether one believes that at least some of it is due to unfair treatment of women in the labor force or elsewhere seems to depend on the assessor's political bias. But the audit studies do show that sex discrimination in hiring is real, and so do the many sex discrimination court cases which are decided for the plaintiffs.

I could have discussed other studies which are better in some ways than the GAO study, worse in other ways. As an example, I know of studies which include much more data on education of individuals, including SAT scores, grade point averages and the person's major while in college. The findings are not fundamentally changed by such inclusions: there still remains a large unexplained difference in earnings by gender.

My last post will discuss the right-wing's "interpretations" of all this research in a little bit more detail.

Now that I read through this I'm wondering if it is at all clear. It's hard to explain what multiple regression analysis does with just words. Do ask questions in the comments if you want clarification on any of the issues. Thanks for reading something this long and dry.