UNITED STATES DISTRICT COURT

FOR THE DISTRICT OF MASSACHUSETTS

STUDENTS FOR FAIR ADMISSIONS, INC.,

Plaintiff,

v.

PRESIDENT AND FELLOWS OF HARVARD COLLEGE (HARVARD CORPORATION),

Defendant.

Civil Action No. 1:14-cv-14176

REPORT OF DAVID CARD, Ph.D.

December 15, 2017

CONFIDENTIAL Page 1

Table of Contents

1. QUALIFICATIONS .................................................................................................................................................. 3

2. ASSIGNMENT AND SUMMARY OF OPINIONS ............................................................................................ 5

2.1. Assignment .................................................................................................................................... 5

2.2. Overview of report and summary of findings ............................................................................... 6 

3. AN OVERVIEW OF HARVARD’S APPLICANT POOL AND ADMISSIONS PROCESS .................... 12

3.1. Harvard’s admissions process is highly competitive, and academic achievement is abundant in its applicant pool ................ 12

3.2. Harvard seeks candidates with a wide range of skills beyond academic achievement ............. 16

3.3. Harvard’s decision process is labor-intensive and seeks to understand the full context of each applicant’s high school achievements ................ 23

3.4. Harvard’s ratings reflect important and otherwise unobservable information about the academic and non-academic qualifications of applicants ................ 25

3.5. Prof. Arcidiacono’s statistical model fails to account for numerous dimensions of Harvard’s admissions process ................ 31

4. ACCOUNTING FOR NON-ACADEMIC AND CONTEXTUAL FACTORS IS CRITICAL IN MODELING HARVARD’S ADMISSIONS PROCESS ................ 33

4.1. There is no statistically significant difference in admission rates for the vast majority of Asian-American and White applicants ................ 34

4.2. White applicants have relatively stronger qualifications on non-academic dimensions .......... 35

4.3. Prof. Arcidiacono’s model excludes available measures of life circumstance and context ................ 40

5. A MORE COMPLETE STATISTICAL MODEL SHOWS NO EVIDENCE OF BIAS AGAINST ASIAN-AMERICAN APPLICANTS ................ 46

5.1. Important differences between Prof. Arcidiacono’s methodology and mine ............................ 46

5.2. My enriched model finds no statistically significant evidence of bias ....................................... 62

5.3. Analysis of key subgroups of the data further contradicts SFFA’s claim of systematic bias ................ 75

5.4. Conclusion .................................................................................................................................. 79

6. AVAILABLE DATA DO NOT INDICATE THAT RACE IS A DETERMINATIVE FACTOR IN ADMISSIONS AT HARVARD ................ 81

6.1. Race is less important than other factors in admissions decisions ............................................ 82

6.2. Race is less important than unmeasured, individualized factors .............................................. 85

6.3. Prof. Arcidiacono’s claim about a “floor” for the admission rate of African-American applicants is not supported by available data ................ 87

6.4. Conclusion .................................................................................................................................. 93

7. ANALYSIS OF POTENTIAL RACE-NEUTRAL ALTERNATIVES .......................................................... 95

7.1. Race-neutral alternatives identified in academic literature and by SFFA ............................... 95

7.2. Academic research indicates that race-neutral alternatives diminish universities’ ability to select for quality ................ 97

7.3. Analysis of race-neutral alternatives using Harvard’s admissions data ................................. 103

7.4. Mr. Kahlenberg’s simulated race-neutral practices, like others considered above, could achieve a comparably diverse class only by changing the class in significant ways and compromising its quality ................ 151

7.5. Conclusion ................................................................................................................................ 153

8. APPENDIX A.......................................................................................................................................................... 155

9. APPENDIX B.......................................................................................................................................................... 171

9.1. Documents relied upon ............................................................................................................. 171

10. APPENDIX C ....................................................................................................................................................... 178

10.1. Parent occupations.................................................................................................................. 178

11. APPENDIX D ....................................................................................................................................................... 180

11.1. Primary activities .................................................................................................................... 180

12. APPENDIX E ....................................................................................................................................................... 181

12.1. Variables used in logit model of admissions .......................................................................... 181

13. APPENDIX F........................................................................................................................................................ 187

13.1. Mr. Kahlenberg’s Simulations ............................................................................................... 187

1. QUALIFICATIONS

1. I received a B.A. degree in Economics from Queen’s University (in Canada) in 1978 and a

Ph.D. in Economics from Princeton University in 1983. From 1982 to 1983, I was an Assistant

Professor at the University of Chicago Graduate School of Business. From 1983 to 1997, I held

positions as Assistant Professor and Professor of Economics at Princeton University. Since 1997, I

have been the Class of 1950 Professor of Economics at the University of California, Berkeley.

2. I have published more than 110 articles and book chapters, co-authored one book, and co-

edited seven others, including the Handbook of Labor Economics. The majority of my publications

are focused on labor economics—the field of economics that addresses questions related to

discrimination in various contexts, including education. My articles have appeared in the leading

journals in economics and econometrics, including Econometrica, the American Economic Review,

the Quarterly Journal of Economics, the Journal of Political Economy, and the Journal of

Econometrics. I served as co-editor of the American Economic Review from 2002 to 2005 and co-

editor of Econometrica from 1993 to 1997. I have also served on several editorial boards and

government advisory committees for statistical issues, including the National Academy of Sciences

Committee on National Statistics (2012 – 2015), the U.S. Census Advisory Committee (1991 –

1996), Statistics Canada’s Labour Statistics Advisory Committee (1990 – 2002), and the National

Institutes of Health Social Sciences, Nursing, Epidemiology, and Methods Review Panel (1998 –

2003).

3. My research has been recognized by several awards and prizes, including election as a

Fellow of the American Academy of Arts and Sciences in 1998, a Fellow of the Econometric Society

in 1992, and a Fellow of the Society of Labor Economics in 2004. In 1995, I received the John Bates

Clark Medal, widely regarded as one of the highest honors in the field of economics, which is

awarded by the American Economic Association to the outstanding economist in the United States

under the age of 40. In 2006, I was awarded the IZA Prize by the Institute for the Study of Labor in

Bonn for outstanding academic achievement in the field of labor economics. In 2008, I was awarded

the Frisch Medal by the Econometric Society for the best article in applied economics published in

Econometrica in the previous two years. I was the co-recipient of the 2015 BBVA Foundation

Frontiers of Knowledge Award in economics.

4. My research focuses on statistical analysis of the labor market and related data pertaining to

such issues as wages, hours of work, employment, education, and immigration. I have published

multiple studies analyzing differential labor-market outcomes across race and gender (including

questions of discrimination), as well as a study of the effects of race-conscious admissions. In my

capacity as a journal editor, member of an editorial board, and member of proposal review

committees, I have also edited, refereed, and critiqued many studies that address questions of

discrimination, education, and/or college admissions. My complete CV, which includes a list of

publications I have authored within the past ten years, is attached in Appendix A.

5. I am being compensated at my standard billing rate of $750 per hour. I have been assisted

in this matter by staff of Cornerstone Research, who worked under my direction. In addition to

compensation at my hourly rate, I receive compensation from Cornerstone Research based on its

collected billings for supporting me in this matter. None of my compensation in this matter is in any

way contingent or based on the content of my opinion in this or any other matter or the outcome of

this or any other matter. A list of my testimony in the last four years is attached in Appendix A.

2. ASSIGNMENT AND SUMMARY OF OPINIONS

2.1. Assignment

6. Harvard’s counsel have asked me to assess the following questions related to Harvard’s

admissions process, which I understand are relevant to the claims of the Plaintiff, Students for Fair

Admissions, Inc. (“SFFA”), in this matter on the basis of the complaint and SFFA’s expert reports:

• Does statistical evidence support SFFA’s claim that Harvard

discriminates against Asian-American applicants in undergraduate

admissions decisions?

• Does statistical evidence support SFFA’s claim that race is the

determinative factor in undergraduate admissions decisions for many

applicants?

• Is there statistical evidence that Harvard has engaged in racial balancing

in its undergraduate admissions process?

• How would the racial composition and other attributes of Harvard’s

admitted class be expected to change if Harvard stopped considering

race and instead pursued a variety of race-neutral ways of seeking to

increase the racial diversity of its admitted class?

• Are the analyses and conclusions offered by SFFA’s experts reliable?

7. In attempting to answer these questions, I have relied on several sources of information,

including deposition testimony in this matter, documents produced by Harvard in this matter,

database information produced by Harvard in this matter (covering all applicants to the classes of

2014 to 2019), College Board data on neighborhood and high school demographics and high school

quality produced in this matter, relevant public information and data, and academic research. I have

also reviewed the reports submitted by SFFA from Professor Peter Arcidiacono and Mr. Richard

Kahlenberg and their relevant supporting materials.

Footnote: Prof. Arcidiacono states that the list of data Harvard produced and omitted can be found at HARV00006413, HARV00006471, HARV00006541, HARV00006607, HARV00006695, and HARV00006759. A list of additional database fields produced by Harvard is available at HARV00001322 – HARV00001361.

8. Appendix B to this report lists the documents on which I relied in forming the opinions

expressed in this report.

2.2. Overview of report and summary of findings

9. SFFA’s Complaint and expert reports claim that Harvard’s undergraduate admissions

decisions exhibit bias against Asian-American applicants, that race is a determinative factor in the

Harvard admissions process for many applicants, and that Harvard can achieve its diversity goals

without considering race by using a variety of race-neutral admissions practices.

10. SFFA’s claim of discrimination against Asian-American applicants relies most

fundamentally on the premise that Asian-American applicants are admitted at a lower rate than White

applicants, while possessing higher academic credentials than White applicants on average. As I

explain in this report, however, there is a critical flaw in SFFA’s reasoning: as I understand from my

review of the documents and testimony in this matter, and as my empirical analysis corroborates,

Harvard’s admissions process values many dimensions of excellence, not just prior academic

achievement.

11. As I detail in Section 3 below, Harvard’s applicant pool is full of students with

outstanding academic credentials. More than 8,000 applicants for the class of 2019 had perfect GPAs,

approximately 3,500 applicants had perfect SAT math scores, and nearly 1,000 applicants had perfect

ACT and/or SAT composite scores. In that pool, having strong academic credentials is not sufficient

to make an applicant a strong candidate for admission. The record in this case makes clear that it is

often the non-academic aspects of a candidate’s application that determine whether the candidate is

admitted from this academically exceptional pool, that the evaluation of each candidate takes into

account the full context of his or her life experiences, and that Harvard’s ultimate goal is to admit a

student body that exhibits excellence in a variety of forms and includes students with diverse

experiences, backgrounds, skills, and interests. Harvard’s admissions data are consistent with these

facts. They show, for example, that candidates who are strong on dimensions other than academics

are rarer than academically strong candidates. They also show that candidates who receive high

ratings in at least three of the four categories rated by admissions officers (academic, extracurricular,

athletic, and personal)—referred to in this report as candidates who are “multi-dimensional”—have a

high admission rate and compose a much larger share of the admitted class than candidates who are

exceptional on just one dimension.

Footnote: Expert Report of Peter S. Arcidiacono, Students for Fair Admissions, Inc. v. President and Fellows of Harvard College, October 16, 2017 (“Arcidiacono Report”); Expert Report of Richard D. Kahlenberg, Students for Fair Admissions, Inc. v. President and Fellows of Harvard College, October 16, 2017 (“Kahlenberg Report”).

Footnote: Complaint, Students for Fair Admissions, Inc. v. President and Fellows of Harvard College (Harvard Corporation); and the Honorable and Reverend the Board of Overseers, November 17, 2014 (“Complaint”).

12. Prof. Arcidiacono reveals a significant misunderstanding of Harvard’s admissions process

by focusing so much of his analysis on academic achievement. For example, four of the six

regression models that Prof. Arcidiacono offers do not include controls for the three non-academic

ratings (extracurricular, personal, and athletic), which are central to Harvard’s evaluation of

candidates for admission. And Prof. Arcidiacono accounts in only a crude and limited way for

considerations of high school quality and socioeconomic background that Harvard uses to place in

context each applicant’s prior academic achievement. Such analyses are fundamentally flawed and

unreliable because they fail to account for the multi-dimensional evaluation Harvard employs when

rendering its admissions decisions.

13. As I explain in Section 4, Prof. Arcidiacono attempts to justify his focus on academics by

presenting a variety of basic descriptive analyses that purport to show a broad correlation between

Harvard’s academic index and non-academic qualifications that Harvard considers. He then argues

that it is reasonable to assume that Asian-American applicants are stronger than applicants of other

races in non-academic respects (including factors he cannot measure and include in his model)

because they are stronger on academic measures. That is a central assumption of his analysis—and,

as I demonstrate in Section 4, it is wrong. A more careful examination of the data shows that White

applicants are stronger than Asian-American applicants, in aggregate, across the three non-academic

dimensions that Harvard rates (athletic, extracurricular, and personal), and that they are more likely to

exhibit multi-dimensional excellence (i.e., receive high ratings in at least three of the four categories).

In fact, Prof. Arcidiacono’s own analysis shows that, across all of the non-academic variables he

includes in his regression model, White applicants in aggregate are stronger than Asian-American

applicants. Because non-academic factors are much harder to quantify than academic factors, and

thus fewer of them are observable in the Harvard admissions database, there is a strong possibility

that statistical models like those developed by Prof. Arcidiacono will exclude important non-

academic factors, and will therefore be biased in favor of finding a race-based disparity in admissions

between Asian-American and White applicants. That is, it is quite possible that if one could control

more extensively for non-academic factors, those factors—and not race—would explain any disparity

in the admission rate between Asian-American and White applicants.
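The omitted-variable mechanism described in this paragraph can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the analysis performed in this report: it uses a linear model rather than a logit model, and the group indicator `g`, the omitted factor `z`, the outcome `y`, and all numeric values are hypothetical. The true effect of `g` on `y` is set to zero, but `z` is lower on average when `g = 1`.

```python
import random


def simulate(n=20000, seed=0):
    """Draw (g, z, y) rows in which the true effect of group g on y is zero."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        g = 1 if rng.random() < 0.5 else 0   # hypothetical group indicator
        z = rng.gauss(-0.5 * g, 1.0)         # hypothetical factor, lower on average when g = 1
        y = 1.0 * z + rng.gauss(0.0, 1.0)    # outcome depends on z only, not on g
        rows.append((g, z, y))
    return rows


def group_coefficients(rows):
    """Return the OLS coefficient on g without z and with z as a control."""
    n = len(rows)
    mg = sum(r[0] for r in rows) / n
    mz = sum(r[1] for r in rows) / n
    my = sum(r[2] for r in rows) / n
    # Centered sums of squares and cross-products (an intercept is implied).
    sgg = sum((g - mg) ** 2 for g, _, _ in rows)
    szz = sum((z - mz) ** 2 for _, z, _ in rows)
    sgz = sum((g - mg) * (z - mz) for g, z, _ in rows)
    sgy = sum((g - mg) * (y - my) for g, _, y in rows)
    szy = sum((z - mz) * (y - my) for _, z, y in rows)
    without_control = sgy / sgg
    with_control = (sgy * szz - szy * sgz) / (sgg * szz - sgz ** 2)
    return without_control, with_control


without_control, with_control = group_coefficients(simulate())
print(f"coefficient on g without z: {without_control:.3f}")
print(f"coefficient on g with z:    {with_control:.3f}")
```

In this hypothetical setup, the coefficient on `g` estimated without `z` should be clearly negative even though the true effect is zero, and it should move toward zero once `z` is included as a control.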

14. In Section 4, I also explain how Prof. Arcidiacono’s models include very little

information that can account for the overall context of each candidate’s application, such as the

quality of the applicant’s high school, the applicant’s socioeconomic circumstances, and the

resources and opportunities available to the applicant as a result of his high school, neighborhood,

and family background. This contextual information is critical in the admissions process, because

Harvard recognizes that one cannot evaluate a student’s grades, standardized test scores, or other

attributes without understanding the circumstances in which the applicant grew up. For that reason,

the admissions process is designed to ensure that admissions officers have detailed knowledge of

many of the high schools and neighborhoods from which applicants apply, and that admissions

officers examine each applicant’s file in light of that context. Importantly, as I show below in Section

4, Prof. Arcidiacono failed to make use of a variety of such contextual factors that were available in

data produced to him, and that differ on average between Asian-American and White applicants.

15. In Section 5, I turn to a more formal statistical analysis of the difference in admission

rates between White and Asian-American applicants. This analysis shows that the purported “penalty

against Asian Americans” identified by Prof. Arcidiacono does not actually exist. Prof.

Arcidiacono’s finding is instead driven by two limitations of his model.

16. First, as noted above, his model does not account for numerous critical factors in the

available data that provide important context for each application, including measures of applicants’

socioeconomic status (such as the demographics of their neighborhoods), the quality of their high

schools, and other variables that can reflect differences in life experiences and opportunities. Prof.

Arcidiacono’s own models show that the factors of this type that he does include in his model help

explain the disparity in admission rates between White and Asian-American applicants, which is why

it is problematic that Prof. Arcidiacono does not control for more of them. Once Prof. Arcidiacono’s

model is modified to account for these additional factors, it finds no evidence of a racial disparity in

admissions decisions.

17. Second, Prof. Arcidiacono’s model combines data from multiple admissions cycles, thus

imposing the assumption that Harvard compares applicants across years rather than simply within

each year’s application pool. As I detail below, that assumption is unreasonable. Each admissions

cycle is different, and the data confirm as much, showing that the estimated effect of various factors

on an applicant’s probability of admission changes substantially from year to year. Importantly, when

I analyze the data year-by-year, as the evidence supports, I find that the model’s predictive accuracy

increases. My year-by-year analysis finds that the estimated effect of Asian-American ethnicity on

applicants’ probability of admission is not statistically significant in any year, or even on average

across all six years, and is actually positive in four of six years.
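One way to summarize year-by-year estimates of this kind is sketched below, assuming that a separate model has already been fitted for each admissions cycle and has produced one coefficient estimate and one standard error per year. The function name `average_effect`, the input values, and the assumption that the yearly estimates are independent are illustrative only and are not taken from this report.

```python
import math


def average_effect(estimates):
    """Average per-year coefficients and return (average, standard error, z-statistic).

    The standard error treats the yearly estimates as independent, which is an
    assumption of this sketch.
    """
    k = len(estimates)
    avg = sum(b for b, _ in estimates) / k
    se = math.sqrt(sum(s ** 2 for _, s in estimates)) / k
    return avg, se, avg / se


# Hypothetical (coefficient, standard error) pairs, one per admissions cycle.
yearly = [(0.05, 0.12), (-0.10, 0.11), (0.08, 0.13),
          (0.02, 0.12), (-0.04, 0.11), (0.06, 0.12)]

avg, se, z = average_effect(yearly)
print(f"average = {avg:.3f}, se = {se:.3f}, z = {z:.2f}")
```

Under the usual normal approximation, an average whose z-statistic is smaller than about 1.96 in absolute value is not statistically significant at the 5 percent level.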

18. It is important to note that even when I enrich the model to account for additional control

variables and to account for differences in the admissions process from year to year, the model still

does not perfectly capture all of the information on which the Harvard Admissions Committee relies

when making admissions decisions. This problem is what I refer to throughout the report as a

“missing data” problem, a problem that exists when modeling any complex decision process (like

admissions to Harvard) in which decisionmakers consider many factors that are hard to quantify. The

data I am discussing are “missing” because they are not quantified in Harvard’s database or because

they are inherently difficult to quantify. Importantly, because non-academic factors that are a relative

strength of White applicants (on average) are harder to quantify than academic factors, it is likely that

additional such factors remain missing from the model even after I enrich the model to capture more

information on non-academic factors.

Footnote: Arcidiacono Report, p. 61.

19. In Section 5, I also address Prof. Arcidiacono’s claim that Harvard’s personal and overall

ratings are biased against Asian-American applicants. In the case of the personal rating, the statistical

evidence Prof. Arcidiacono offers in support of this claim is weak for two key reasons. First, the

ordered logit models that Prof. Arcidiacono uses to try to isolate the effect of race on the personal

rating are, by his own measure of statistical reliability, weak—that is, the models explain only a

relatively small fraction of the differences across candidates in the personal ratings. A key reason for

this is that the available admissions data include only a few quantitative variables that can be used to

model variation in the personal rating. In essence, the “missing data” problem I describe above is

particularly severe for the assessment of personal ratings, which depend largely on qualitative factors

that cannot be captured in Harvard’s database. For example, testimony in the record indicates that the

applicant’s essay is an important consideration in the personal rating, but there is no quantifiable

measure of the essay in the data I analyze. This means that the disparity Prof. Arcidiacono labels

“bias” may very well be explained by factors other than race that the model does not include.

Importantly, Prof. Arcidiacono’s own model finds that the estimated negative effect of Asian-

American ethnicity on the personal rating shrinks as non-academic factors are added to the model.

This pattern suggests that the estimated effect would shrink further if one could quantify the missing

data that the Harvard admissions officers use to form their assessments.

20. Another reason to be skeptical of the reliability of Prof. Arcidiacono’s model of the

personal rating is that his model of the academic rating—which is the most reliable of any of his

ratings models—shows that Asian-American ethnicity has an estimated positive and significant effect

on that rating. So does his model of the extracurricular rating. Given these results, one of two things

must be true. Either (1) Harvard is engaging in an exceptionally unusual form of discrimination, in

which it is favoring Asian-American applicants in the academic and extracurricular ratings only to

penalize them in the personal and overall ratings, or (2) Prof. Arcidiacono’s ratings models are

simply not reliable enough to measure all of the differences between Asian-American and White

applicants on the various dimensions valued by Harvard that drive the assignment of ratings.

21. While Prof. Arcidiacono provides no reliable evidence that the personal rating is biased

against Asian-American applicants—and while excluding that rating from a model of admissions is

problematic because the rating plays a significant role in the admissions process and incorporates

data on the qualities of the applicants that are otherwise missing—I agree with Prof. Arcidiacono that

the overall rating should be excluded from the model. Testimony in this case indicates that an

applicant’s race may have a direct effect on her overall rating, and it is a well-accepted statistical

practice to exclude variables from a regression model that may themselves be directly influenced by

the variable of interest (here, race). While I have excluded the overall rating from my admissions

model, I also believe that the model of overall ratings developed by Professor Arcidiacono is too

weak to provide reliable statistical evidence of “bias” in the assignment of this rating. Like Prof.

Arcidiacono’s models of the ratings in general, the overall-rating model leaves unexplained a large

proportion of the variation in the overall ratings and cannot control for numerous factors that may

influence the overall rating and may be correlated with race.

22. Despite my view that removing the personal rating from the model is a flawed approach, I

also implement an analysis that assumes for the sake of argument that the personal rating may be

biased and removes it (as well as the overall rating) from the model altogether. This is an extremely

conservative approach, because it removes the personal rating from the model entirely—not just the

supposedly biased component of the rating—even though Prof. Arcidiacono’s own analysis shows

that, when the supposed bias is statistically eliminated from the personal rating, White applicants’

personal ratings are still on average slightly higher than those of Asian-American applicants.

Nevertheless, using this very conservative model, I still find no evidence of a statistically significant

negative association between Asian-American ethnicity and applicants’ likelihood of admission in

five of the six admissions cycles for which data are available.

23. In Section 6, I assess how the race of African-American, Hispanic, and Other (non-Asian)

minority race (AHO) candidates affects their likelihood of admission, in order to respond to Prof.

Arcidiacono’s argument that race has a large effect for such candidates. I reach several conclusions

on this issue. First, consistent with testimony from Harvard witnesses, I find that although AHO

ethnicity is associated with a significantly higher likelihood of admission, the importance of race in

explaining admission decisions is much smaller than that of many other factors Harvard considers.

Second, I show that race plays only a small role in admissions outcomes for the vast majority of

applicants. And for the small number of applicants for whom race plays a more significant role, other

non-race factors also substantially affect the applicants’ likelihood of admission. Third, I find that the

estimated effect of race for almost all AHO applicants is smaller than that of individualized

“unobservable” factors that cannot be quantified by a statistical model. Taken together, these results

suggest that, while race plays a role in admission decisions—by design—it is just one of many factors

Harvard considers in its whole-person review of each candidate. I also examine Prof. Arcidiacono’s

claim that Harvard has recently imposed a floor on the admission rate of African-American

applicants and find no evidence to support that claim.

Footnote: Other minority race applicants include applicants classified as Native American or Hawaiian/Pacific Islander under Harvard’s “old methodology,” the race definition that Prof. Arcidiacono uses throughout his report (Arcidiacono Report, p. 2 3).

24. In Section 7, I turn to a final question: are there race-neutral admissions practices that

Harvard could implement that would allow it to achieve its diversity objectives, without lowering the

quality of its class on other dimensions that it values? Using the admissions model developed in

Section 5, I simulate how various race-neutral admissions practices (both alone and in combination)

would affect the demographic and other characteristics of the admitted class. I show that Harvard

could achieve comparable ethnic and racial diversity by other means, but that doing so would

produce a student body that is less exceptional on multiple dimensions that I understand Harvard

values (such as academic credentials, extracurricular achievement, and personal qualities).

25. In performing my analysis in Section 7, I also assess the literature analyzed and

simulations offered by Mr. Kahlenberg. I generally agree with Mr. Kahlenberg that race-neutral

alternatives can sometimes be used to help universities increase racial diversity. As I explain below,

however, the relevant question here is not whether some universities could achieve diversity without

considering race but whether Harvard could do so, and furthermore whether doing so would harm

Harvard’s other institutional and educational objectives. A direct analysis of Harvard’s data is needed

to answer that question. With regard to Mr. Kahlenberg’s simulations of race-neutral alternatives, I

show using Mr. Kahlenberg’s own data that the proposed alternatives he considers either lead to a

significantly less diverse class, or to a class that is comparably diverse but far weaker in other

dimensions that I understand Harvard values, such as academic quality.

3. AN OVERVIEW OF HARVARD’S APPLICANT POOL AND ADMISSIONS PROCESS

26. The first step in my analysis is a careful review of the discovery in this case concerning

Harvard’s admissions process. The purpose of this review is to understand what factors Harvard

values when admitting students. As noted above, SFFA’s claim of bias against Asian-American

applicants relies centrally on the premise that Asian-American applicants have the strongest academic

qualifications on average across racial groups, but are admitted at a lower rate than applicants of

other races. SFFA’s expert, Prof. Arcidiacono, focuses much of his analysis on academic

qualifications. It is essential, however, to understand what other factors Harvard considers when

evaluating candidates, and how important those factors are relative to academic credentials in

explaining the variability in admissions outcomes.

27. In the remainder of this section, I summarize the key features of Harvard’s

decisionmaking process. I start with an analysis of the size of Harvard’s applicant pool and the

competitive nature of admissions decisions. I show that superb standardized test scores and GPAs are

abundant among Harvard applicants, with thousands of candidates having perfect GPAs and/or SAT

and ACT scores. It is impossible for Harvard to admit all applicants with exceptional academic

credentials, and so focusing too much on such credentials when trying to understand admissions

decisions (as Prof. Arcidiacono does) is the wrong approach.

28. I then summarize relevant information in the record that identifies the broader set of

characteristics that Harvard seeks in the students it admits. Documents and testimony show that

Harvard values candidates who can contribute to both academic and non-academic dimensions of

campus life, and that Harvard considers the full context of an applicant’s life experience (including

the quality of her high school, the characteristics of her home neighborhood, and her family

background) when deciding whom to admit. Those facts will be critical to the statistical analyses I

offer in Sections 4 and 5. An important difference between my analysis and Prof. Arcidiacono’s is

that my analysis includes a much richer set of control variables, including more detailed controls for

applicants’ socioeconomic status (as measured by the demographic characteristics of their

neighborhoods and high schools as well as their parents’ occupations) that more accurately reflect

and account for the many different factors Harvard weighs in its whole-person admissions process.

3.1. Harvard’s admissions process is highly competitive, and academic achievement is abundant in its

applicant pool

29. Harvard’s admissions process is one of the most competitive and selective in the country.

For example, more than 37,000 high school students applied to Harvard for admission to the class of

2019, but only 2,003 were admitted, leading to an admission rate of 5.37%.

According to U.S. News

and World Report, Harvard had the third-lowest admission rate among U.S. universities in Fall

2016.
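
The admission-rate arithmetic above can be sketched as follows. This is a minimal illustration, not a reproduction of the workpaper: the function and variable names are hypothetical, and because the text gives the applicant count only as "more than 37,000," the sketch treats 37,000 as a lower bound and backs out the pool size implied by the reported 5.37% rate.

```python
def admission_rate(admitted: int, applicants: int) -> float:
    """Return the share of applicants admitted, as a fraction between 0 and 1."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return admitted / applicants


def implied_applicants(admitted: int, rate: float) -> float:
    """Return the applicant count implied by an admitted count and a rate."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    return admitted / rate


# Figures quoted in the text for the class of 2019 (all applicants).
ADMITTED_2019 = 2003
REPORTED_RATE = 0.0537
# Assumption: the text says only "more than 37,000", so this is a lower bound.
APPLICANTS_LOWER_BOUND = 37_000

# With exactly 37,000 applicants the rate would be slightly above 5.37%,
# which is consistent with the true pool being somewhat larger than 37,000.
upper_bound_rate = admission_rate(ADMITTED_2019, APPLICANTS_LOWER_BOUND)
implied_pool = implied_applicants(ADMITTED_2019, REPORTED_RATE)

print(f"Rate if the pool were exactly 37,000: {upper_bound_rate:.2%}")
print(f"Pool size implied by a 5.37% rate: about {implied_pool:,.0f}")
```

The implied pool size is only as precise as the two-decimal rate reported in the text.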

30. Exhibit 1 shows the number of domestic applicants, number of domestic admitted

students, and the admission rate to Harvard for domestic applicants each year for the classes of 2014

to 2019 (the years for which admissions data were produced in this matter).

As the table shows,

Harvard’s domestic applicant pool has grown since the class of 2014 admissions cycle, while the

number of admitted domestic students has fallen, making Harvard’s admissions process for domestic

students even more competitive in recent years. More domestic candidates now apply each year for

fewer spots, and as a result Harvard’s admission rate for domestic applicants has declined

consistently over time from 8.75% to 6.61%.

Exhibit 1: Domestic applicants, admitted students, and admission rates at Harvard by year

Source: Arcidiacono Data

Note: Data are from applicants to the classes of 2014 – 2019 using Professor Arcidiacono’s expanded sample.

The admission rate of 5.37% includes all applicants and admitted students, including international students. Analyses in

the remainder of this report are limited to domestic applicants, consistent with Prof. Arcidiacono’s definition (see

workpaper).

U.S. News and World Report, “Top 100 Lowest Acceptance Rates,” available at

https://www.usnews.com/best-colleges/rankings/lowest-acceptance-rate, accessed December 7, 2017.

I follow Prof. Arcidiacono by defining “domestic” applicants as those who are U.S. citizens or permanent residents, and

in limiting my analyses to domestic applicants. Throughout my analyses, I primarily rely on Prof. Arcidiacono’s

produced, processed dataset (the “Arcidiacono Data”), which is constructed using the data produced by Harvard in this

litigation. I also use a version of Prof. Arcidiacono’s produced dataset that is augmented with additional variables from

the College Board and Harvard’s underlying data and that reflects a few technical corrections, which I refer to as the

“Augmented Arcidiacono Data.”

31. In addition to having a relatively small number of places available in its freshman class

for a large number of applicants, Harvard also has an applicant pool with extraordinary academic

qualifications. As shown in Exhibit 2, nearly 3,500 domestic applicants to the class of 2019 had

perfect math SAT scores. Additionally, more than 8,000 domestic applicants had a perfect converted

GPA (based on Harvard’s GPA index, which normalizes GPAs across high schools), 625 earned

perfect composite ACT scores, 361 earned a perfect 2400 on the SAT, and thousands had average

SAT subject test scores of 700 or higher. As shown in Exhibit 3, domestic students admitted to

Harvard’s class of 2019 had mean and median SAT scores of 2241 and 2270, respectively, and mean

and median ACT scores of 33 and 34, as well as an average converted GPA of 77 out of 80.

32. These data show that even if Harvard wanted to admit every student with elite academic

credentials, it could not. Harvard admits roughly 1,800 domestic students each year, yet thousands of

applicants have impeccable academic qualifications.

For example, based on the statistics in Exhibit

2, even if Harvard sought to admit only applicants with a perfect GPA, it would need to reject at least

6,000 such applicants and all other domestic applicants. Similarly, even if Harvard sought to admit

only applicants with a perfect Math SAT score, it would need to reject nearly 2,000 such applicants

and all other domestic applicants.
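
The capacity constraint described in this paragraph reduces to simple subtraction. The sketch below uses the rounded figures quoted in the text (roughly 1,800 domestic admits per year and more than 8,000 applicants with a perfect converted GPA), not the exact counts in Exhibit 2; the function and variable names are illustrative assumptions.

```python
def minimum_rejections(qualified: int, seats: int) -> int:
    """Smallest number of qualified applicants who must be rejected
    if every available seat went to a qualified applicant."""
    if qualified < 0 or seats < 0:
        raise ValueError("counts must be non-negative")
    return max(0, qualified - seats)


# Rounded figures from the text (assumptions, not exact Exhibit 2 counts).
DOMESTIC_SEATS = 1_800        # "roughly 1,800 domestic students each year"
PERFECT_GPA_APPLICANTS = 8_000  # "more than 8,000", so a lower bound

rejected = minimum_rejections(PERFECT_GPA_APPLICANTS, DOMESTIC_SEATS)
share_rejected = rejected / PERFECT_GPA_APPLICANTS

print(f"Perfect-GPA applicants who must be rejected: at least {rejected:,}")
print(f"Share of perfect-GPA applicants rejected: at least {share_rejected:.1%}")
```

Because the applicant count is a lower bound, the result is also a lower bound, which is why the text says "at least 6,000." The same function applies to any other single credential.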

See workpaper.

Exhibit 2: Many applicants to the class of 2019 had outstanding standardized test scores and grades

Source: Arcidiacono Data

Note: Data are from applicants to the class of 2019 using Professor Arcidiacono’s expanded sample. Harvard converts applicant GPAs to a

35–80 scale.

Exhibit 3: Admitted students have strong academic credentials

Source: Arcidiacono Data

Note: Data are from admitted students to the classes of 2014 – 2019 using Professor Arcidiacono’s expanded sample.

3.2. Harvard seeks candidates with a wide range of skills beyond academic achievement

33. Given the extraordinary academic credentials of the Harvard applicant pool each year, the

key question for any statistical analysis of the admissions process (and for assessing SFFA’s

analyses) is: What other characteristics does Harvard evaluate when trying to differentiate among

academically capable students, and how scarce are those characteristics in the applicant pool relative

to the abundance of academic credentials? In this sub-section, I summarize testimony and documents

from Harvard that detail the characteristics it seeks in individual applicants, as well as the broader

diversity in life experiences, perspectives, and interests it seeks for each class as a whole.

3.2.1. Harvard’s whole-person evaluation relies on an “expansive view of excellence,” and seeks to identify

a wide variety of “distinguishing excellences”

34. The guiding principle of Harvard’s admissions process, as I understand it, is to evaluate

each applicant as a whole person, not just in terms of her academic qualifications but in terms of all

other attributes. Documents from Harvard indicate that a central goal of Harvard’s whole-person

evaluation process is an assessment of each applicant’s potential to contribute in various ways to

Harvard’s educational environment and campus community. This process requires a careful

assessment of the aspects of each applicant that distinguish her from other applicants, as well as an

assessment of the context in which the applicant’s achievements occurred, such as the availability of

opportunities to the applicant and the difficulty of the challenges the applicant has faced. Importantly,

documents indicate that academic strength on its own is generally not sufficient to distinguish an

applicant. In fact, Harvard’s 2014 – 2015 Interviewer Handbook (“Interviewer Handbook”) notes that

35. The Interviewer Handbook summarizes what Harvard refers to as its “Search for

Distinguishing Excellences” as follows:

Interviewer Handbook, 2014 – 2015, HARV00001392 – 1438 (“Interviewer Handbook”) at HARV00001401. Other

documents from Harvard support this account of the admissions process. For example, in a presentation given to guidance

counselors at schools in the Sarasota, Florida area, Harvard admissions officer Kanoe Williams explained that test scores

are just a “small piece” of Harvard’s whole-person evaluation; that, “in general, we can tell pretty quickly if a student will

be an academic fit for our school”; and that “the lengthier part of the conversation typically focuses on intangibles, the

qualitative pieces” (Sarasota Presentation, “KLW - Sarasota Presentation,” HARV00013561 – 65 at HARV00013563 –

64).

Redacted

HIGHLY CONFIDENTIAL – ATTORNEYS’ EYES ONLY Page 17

36. The Interviewer Handbook then goes on to list a variety of examples of “distinguishing

excellences” that admissions officers look for when reviewing application files:

Interviewer Handbook at HARV00001400 – 01.

Deposition testimony indicates that the personal essay is also a key factor in evaluating personal qualities. See, for

example, Deposition of Roger Banks, May 4, 2017 (“Banks Deposition”), pp. 79–80 (“Q. And for each of those

categories, can you tell me how they were assigned a numerical score?...[A] Extracurricularly, quality of achievement,

strength of performance in any particular domain, personal qualities, some grasp of the candidate’s personality, interest in

other people, cooperation with others, a sense of responsibility as gleaned from teacher recommendations, personal

interview, personal essay, et cetera. Q. Okay. So for the last category, the—the main inputs you would look at were

recommendations, interview, and anything else? A. The candidate’s essay.”); Deposition of Brock Walsh, June 28, 2017

(“Walsh Deposition”), p. 60 (“Q How would you calculate that score?…[A.] I would like to take into consideration

whatever relevant information I had were that his essay, her essay, her interview, and the opinions about that applicant as

expressed by others.”); Deposition of Tia Ray, June 7, 2017 (“Ray Deposition”), pp. 21–22 (“Q. What are the materials

that you use—materials or considerations that go into determining this person’s score?…A. For example, content in

recommendation letters, personal essays.”).

Redacted

Redacted

3.2.2. Harvard evaluates applicants’ distinguishing excellences within the context of their full life

experience, including their high school, community, family, and other factors

37. Harvard’s assessment of each applicant’s overall qualifications and distinguishing

excellences takes into account the full context of the applicant’s life experience. My understanding is

that Harvard seeks to understand the opportunities and challenges each applicant has faced so that it

can better evaluate each applicant’s achievements and potential to contribute to Harvard. For

example, William Fitzsimmons, Harvard’s Dean of Admissions and Financial Aid, testified that the

context of each high school is particularly important when evaluating the qualifications of any given

applicant:

Given the fact that we want to understand as completely as possible

what the … applicant has accomplished both in school, out of school,

you know, throughout his or her life, getting to know the school, the

opportunities within the school, academically, extracurricularly, and in

other ways, what they might learn from fellow students, all the usual

things that you might look for in a college that would be of interest. And

also is interesting for the—helpful for readers to understand which

courses might be tougher than others, things of that sort, the full

context.

38. Marlyn McGrath, Director of Admissions, also testified that the Admission Committee’s

assessment of the context of each applicant’s family life and community is crucial to the evaluation

of her achievements:

The most important thing to say is that when an applicant has applied,

each applicant is really considered as an individual, including—whose

candidacy will always include, generally include, many factors, family

background, which will include whatever we know of race, whatever

else we know about family circumstances and education, whatever we

can know about the nature of the school and the kind of community the

student grew up in. Those context features, those features of the

student’s setting are always important to us in imagining how well he’s

achieved in the circumstances that he started with to us as a candidate.

Interviewer Handbook at HARV00001401 – 02.

Deposition of William Fitzsimmons, August 3, 2017 (“Fitzsimmons Deposition”), pp. 233–234.

Redacted

3.2.3. Documents from Harvard identify specific examples of qualifications that help applicants distinguish

themselves from others

39. To help train admissions officers and alumni interviewers to identify the types of

“distinguishing excellences” detailed above, as well as how to evaluate each candidate’s

accomplishments in context, Harvard maintains a Casebook and Casebook Discussion Guide that

highlight examples based on actual application files.

The discussion guide aims to highlight

40. Below are a variety of examples from applications in the Casebook that illustrate the wide

variety of factors Harvard considers in order to distinguish among the many academically strong

candidates in its pool. These factors include, for example, personal qualities like intellectual

arrogance or social charm, economic resources and family hardship, personal essays and interviews,

artistic qualities, maturity and ability to balance multiple commitments, and the degree of parental

involvement:

Deposition of Marlyn McGrath, Volume I, June 18, 2015 (“McGrath Deposition 2015”), pp. 231–232.

2012 Casebook, HARV00000212 – 321 (“Casebook”); Discussion Guide to the 2012 Casebook, HARV00018164 –

176 (“Casebook Discussion Guide”).

Casebook Discussion Guide at HARV00018165.

Redacted

Redacted

3.2.4. Harvard seeks diversity of life experience and perspectives for each class on numerous dimensions

41. As noted above, my understanding is that Harvard seeks to admit not just a set of

individuals with distinguishing excellences, but also a class that includes individuals with a wide

range of life experiences and perspectives.

42. For example, the 2016 Report of the Committee to Study the Importance of Student Body

Diversity, chaired by Dean of Harvard College Rakesh Khurana (“Khurana Report”), states:

The mission of Harvard College is to educate the citizenry and citizen

leaders for our society. We take this mission very seriously and firmly

believe it is accomplished through the transformative power of a liberal

arts and sciences education. That transformation begins in the classroom

with exposure to new ideas, new ways of understanding and new ways

of knowing. It is further fostered through a diverse residential

environment where our students live with peers who are studying

different subjects, who come from different walks of life, and have

different identities. This exposure to difference not only deepens a

student’s intellectual transformation, but also creates the conditions for a

social transformation as students begin to question who they are and

how they relate to others.

43. One form of diversity that Harvard seeks is racial diversity. For example, President Faust

testified: “It’s important that we have a class that represents diversity along a number of dimensions,

and race is one of those dimensions. Economic status is another. Artistic ability is another. Life

experience is another. Interest in a variety of fields that we represent is another.”

44. Dean Fitzsimmons described how the Admissions Committee considers an applicant’s

self-reported race as one among many factors as it seeks to admit a diverse class:

We know that race is one factor among many as we review each

application…There are students who might write an essay on how

formative and important race was. There are students who might not

present themselves in such a way. But as one were to look at the

application in its entirety, you could come to the conclusion that race

certainly may have been a factor in their person’s life and may help that

person be a better educator of others during college and beyond. Each

application is different, one from the next.

Casebook Discussion Guide at HARV00018165 – 169, HARV00018174 – 175.

“Report of the Committee to Study the Importance of Student Body Diversity,” HARV00008048 – 69 (“Khurana

Report”) at HARV00008048.

Deposition of Catherine Drew Gilpin Faust, March 10, 2017 (“Faust Deposition”), p. 196.

Redacted

3.3. Harvard’s decision process is labor-intensive and seeks to understand the full context of each

applicant’s high school achievements

45. Based on my review of deposition testimony and documents produced in this matter, I

understand that, to carry out its whole-person assessment of each applicant, Harvard uses a

multi-stage decision process with input from a large team of admissions officers.

Dean Fitzsimmons has described this as a “rigorous comparative process.”

46. The Admissions Committee is divided by geographic region into twenty subcommittees,

known as dockets.

Each subcommittee normally includes four to five members and a chairperson,

who are collectively responsible for the initial evaluation of all candidates from the geographic area.

Each member of a subcommittee is responsible for performing the initial read of all applications from

a set of high schools on the docket. My understanding is that admissions officers often sit on multiple

subcommittees. The admissions officer who conducts the first read of a given application (the “first

reader”) can choose to pass the application on to the subcommittee chair for review if the first reader

believes that the application merits further consideration.

Fitzsimmons Deposition, pp. 87–88.

Applicants have the option to apply “Early Action” to Harvard. Early Action applications are due in November, and if

an applicant applies early to Harvard, he may not apply to any other private university’s Early Action or Early Decision

program. Offers of admission to Early Action candidates are announced in December, and are non-binding (that is, an

applicant offered Early Action admission may still apply to other universities in the Regular Decision cycle). Early Action

applicants who are not accepted in December are either denied admission or “deferred” – that is, shifted into the Regular

Decision pool and reconsidered during the Regular Decision admissions cycle. I understand that the subcommittee and

full committee processes for Early Action applicants are primarily the same as described above for Regular Decision, but

with far fewer applications (Harvard College, “Restrictive Early Action,” available at

https://college.harvard.edu/admissions/apply/application-timeline/restrictive-early-action, accessed August 14, 2017).

William Fitzsimmons, “Guidance Office: Answers From Harvard’s Dean, Part 1,” New York Times, September 10,

2009, available at https://thechoice.blogs.nytimes.com/2009/09/10/harvarddean-part1/, accessed November 10, 2017.

Two of the twenty dockets (U and V) consist entirely of applicants from international high schools.

William Fitzsimmons, “Guidance Office: Answers From Harvard’s Dean, Part 1,” New York Times, September 10,

2009, available at https://thechoice.blogs.nytimes.com/2009/09/10/harvarddean-part1/, accessed November 10, 2017.

47. I understand that admissions officers focus on specific high schools in their geographic

regions, gain detailed knowledge of those high schools, and rely on that knowledge when evaluating

applications.

In particular, I understand that admissions officers rely on such knowledge to better

evaluate candidates within the context of the academic and non-academic opportunities and

challenges that they have encountered at their particular high schools.

As I discuss below,

accounting for high school context in a statistical model of the admissions process is critical because

it is one of the important ways in which admissions officers distinguish among candidates.

48. Once all applications from a particular docket have been reviewed, the subcommittee for

that docket meets to discuss the applications. My understanding is that during this process, the first

reader summarizes the strength of the applications he or she has read. Subcommittee members

discuss applications, and then vote on each application to recommend an action to the full

Committee. The degree of support expressed for applicants is noted to allow for comparisons with

applicants from other subcommittees.

The full Admissions Committee then meets to discuss the

candidates recommended by each subcommittee. For Regular Decision applicants, full committee

meetings take place over the course of approximately two weeks during March.

49. My understanding is that during the full committee process, the first reader, or area

person, for an application generally presents the applicant’s file to the full Committee, and may

choose to project portions of the application on a screen during the discussion so that the Committee

can review important components of the application.

For example, deposition testimony indicates

that the admissions officer presenting the case might use excerpts of visual art or music submissions

or academic papers to highlight an applicant’s skills,

and that discussions in subcommittee or in full

Committee on a single applicant may range in length up to a half hour or more.

The full Committee

compares all candidates across all subcommittees.

Deposition of Caroline A. Weaver, Volume II, March 6, 2017 (“Weaver Deposition, Volume II”), p. 221 (“If I read an

application and thought that it was a strong application, I would pass it to the chair of the docket.”).

Fitzsimmons Deposition, p. 233 (“The beginning piece of the evaluation, you know, would be as, for example, if I

covered Chicago, that I would typically be the first reader of an application from that area. Q. And, in fact, the readers

within a particular docket are divided up by high schools within the docket? A. Yes. Q. So the same reader is supposed to

read all the applications from a particular school? A. Yes. Q. Is that done so that there's better understanding of the way

the school works and the level of classes and information that is going to apply to all applicants? … A. That’s certainly

one of the reasons.”).

Fitzsimmons Deposition, pp. 233–234 (“Q. Is that done so that there’s better understanding of the way the school

works and the level of classes and information that is going to apply to all applicants? … A. That’s certainly one of the

reasons. There are others. Q. What are the others? A. Given the fact that we want to understand as completely as possible

what the applica—what the applicant has accomplished both in school, out of school, you know, throughout his or her

life, getting to know the school, the opportunities within the school, academically, extracurricularly, and in other ways,

what they might learn from fellow students, all the usual things that you might look for in a college that would be of

interest. And also is interesting for the—helpful for readers to understand which courses might be tougher than others,

things of that sort, the full context.”).

William Fitzsimmons, “Guidance Office: Answers From Harvard’s Dean, Part 1,” New York Times, September 10,

2009, available at https://thechoice.blogs.nytimes.com/2009/09/10/harvarddean-part1/, accessed November 10, 2017.

Admissions Calendar 2013 – 2014, HARV00031933.

50. According to Dean Fitzsimmons, “[t]his rigorous comparative process strives to be

deliberate, meticulous, and fair. It is labor intensive, but it permits extraordinary flexibility and the

possibility of changing decisions virtually until the day the Admissions Committee mails them.”

3.4. Harvard’s ratings reflect important and otherwise unobservable information about the academic

and non-academic qualifications of applicants

51. To help quantify and formalize the evaluation of each applicant by the Admissions

Committee, Harvard employs a numeric rating system. Each admissions officer who reviews an

application rates the applicant on four key dimensions: academic, extracurricular, athletic, and

personal.

These are referred to as “profile ratings.” Admissions officers also assign numerical

ratings to the applicant’s “school support”—that is, recommendation letters submitted by high school

teachers and guidance counselors.

Applicants who receive alumni interviews also receive ratings

from their interviewers, and some applicants may receive additional ratings from interviews by

admissions staff.

Applicants who submit recordings of musical performances may also receive a

numerical rating assigned by a member of Harvard’s music faculty.

Deposition of Chris Looby, June 30, 2017 (“Looby Deposition”), pp. 33–34 (“Q. Do you ever put a summary sheet on

a projection screen? A. Yes, we do.”).

Deposition of Roger Banks, May 4, 2017 (“Banks Deposition”), pp. 197–198 (“A. The area person would begin with

an overall summary of the case, its significant features, academically and extracurricularly, arguments to admit, and

proceed to point the committee toward evidence to support those arguments. Q. Would the members have any other

materials that they’re looking at during that conversation, or is it just what’s presented here? A. It would be what’s

presented here in addition to supplemental information, music tapes, visual art supplements, academic papers, things of

that kind.”).

Fitzsimmons Deposition, p. 157 (“But, again, there’s no way to, you know, when 40 people are listening in some cases

for half an hour or more to a single application and discussing that application, exactly why they would choose to admit

that applicant—just impossible to quantify that kind of thing.”).

Fitzsimmons Deposition, pp. 297–298 (“And so, in the end, all of those students are—have to be compared against all

of the other people from all the other dockets, and lots of times there’s new information available. You know, there could

be any number of new pieces of information, new interview or whatever, and that might make for a different case. So

every one ultimately gets compared to everyone else in the same process that I have mentioned earlier today, where you

would literally—if you were, say, the area person for a candidate from a school, there would be a docket that people could

look at but then all the information about that applicant would have to go up on the screen and you would have to make

your argument in front of the full committee.”).

William Fitzsimmons, “Guidance Office: Answers From Harvard’s Dean, Part 1,” New York Times, September 10,

2009, available at https://thechoice.blogs.nytimes.com/2009/09/10/harvarddean-part1/, accessed November 10, 2017.

52.

53. Admissions officers and alumni interviewers also assign applicants an overall rating.

Deposition testimony indicates that the overall rating (a) takes into account the profile ratings but is

not a formulaic summation or average of those ratings, and (b) can reflect other aspects of an

application that the reviewer considered but that are not captured in the profile ratings (including

race).

I understand that the numerical ratings in the database may not include certain other

assessments that the Admissions Committee may receive during the course of the admissions

process—for example, evaluations that Harvard faculty members may provide of academic work that

an applicant has submitted.

These ratings are generally assigned early in the application-reading process, so they do not always reflect

information—such as a faculty evaluation of an applicant’s academic work, or an alumni interview—that may arrive later

on (2018 Reading Procedures at HARV00015414 – 15, HARV00015423 – 24).

2018 Reading Procedures at HARV00015416.

Interviewer Handbook at HARV00001418; Interview Information Sheet Class of 2017, HARV00000008 – 09 at

HARV00000009; Deposition of Sarah Donahue, June 6, 2017 (“Donahue Deposition”), pp. 193–195 (“Q. … Do the

alumni interviewers themselves assign scores for the applicants which they interview? A. Yes. Q. And is that also on the

four-point scale or the four-number scale? A. Yes. … Q. When there are staff interviews, does the staff assign numbers in

the same way that the alumni interviewers do? … A. They are the same two categories.”).

2018 Reading Procedures at HARV00015424.

2018 Reading Procedures at HARV00015414 – 16.

2018 Reading Procedures at HARV00015415 (“Extracurricular, Community Employment, Family Commitments …

5. Substantial activity outside of conventional EC participation such as family commitments or term-time work…”).

2018 Reading Procedures at HARV00015414; Interviewer Handbook at HARV00001429; Interview Information Sheet

Class of 2017, HARV00000008 – 09.

Fitzsimmons Deposition, pp. 249–250; McGrath Deposition 2015, pp. 172–173; Deposition of Lucerito Ortiz, June 14,

2017 (“Ortiz Deposition”), pp. 28–29; Deposition of Kaitlin Howrigan, June 20, 2017 (“Howrigan Deposition”), pp. 32–

33; Deposition of Brock Walsh, June 28, 2017 (“Walsh Deposition”), pp. 61, 66–67. For testimony addressing how race

may be taken into account as one of many factors considered when assigning an overall rating, see Ray Deposition, pp.

27–28 (“Q. Is race taken into account when you give a student an overall rating? A. It depends. … Q. How so? … A. On

the individual case and the individual admissions officer,” and “Q. Why is it different—why do you take race into account

in the overall rating but in none of the other ratings? ... A. It depends on the individual case. And we may take it into

Redacted

54. Each rating is designed to capture numerous characteristics of the applicant that Harvard

values, many of which extend beyond easily quantifiable measures like test scores. For example,

documents and testimony in this case reveal that the academic rating can reflect not only the

applicant’s grades and test scores but also the admissions officer’s knowledge of the applicant’s high

school (and thus ability to place in context the applicant’s academic accomplishments, given the

applicant’s opportunities), as well as the officer’s knowledge of the strength of the candidate’s high

school curriculum, appraisals of the candidate’s academic work by Harvard faculty (to the extent

such appraisals are received before the academic rating is assigned), and the candidate’s receipt of

academic honors or awards.

It may also reflect the applicant’s writing skills.

The extracurricular

rating, likewise, reflects not only the number of extracurricular activities in which an applicant has

participated and the number of hours the applicant has devoted to those activities, but also the nature

of the applicant’s activities, whether the applicant has held leadership roles, and whether the activities

are highly selective.

55. A written set of “Reading Procedures” summarizes the protocols that admission officers

are to follow when reviewing an application and sets forth “coding guidelines” that guide how

admissions officers assign profile ratings. The coding guidelines provide standards for when to assign

each rating. For example, a “1” academic rating means:

Only about 100 applicants per year receive a 1 academic rating, despite the

large numbers of applicants with extraordinary GPA and SAT/ACT scores, reflecting the critical

importance of information beyond grades and standardized test scores that the readers incorporate

account in that overall rating to reflect the strength of the case and to provide a slight tip for some students.”); Howrigan

Deposition, pp. 35–36 (“Q. So is your answer yes, as long as you knew the student’s race, you would take it into account

[in assigning the overall rating]? … A. If the student opted to share that information on their application, that was

something that was taken into account, with hundreds of other factors that were being taken into account.”); Weaver

Deposition, Volume II, p. 194 (“Q. How does the applicant’s race factor into the overall score? … A. I wouldn’t say that

it factors in directly. Q. But it does factor in indirectly in instances? … A. An applicant’s race becomes important in cases

where the applicant makes that an important part of their folder, if it’s an important part of their identity and the way they

express themselves in their application.”).

Harvard Memo, “RE: Faculty Readings,” November 9, 2013, HARV00009879 – 80.

2018 Reading Procedures at HARV00015414; Fitzsimmons Deposition, pp. 240–241; McGrath Deposition 2015, pp.

161–162, 166, 168–169; Banks Deposition, p. 80.

Banks Deposition, p. 80 (“A. For academics, … some sense of the student’s writing skills.”).

2018 Reading Procedures at HARV00015415; McGrath Deposition 2015, pp. 163, 169–171; Donahue Deposition,

p. 160; Ray Deposition, p. 19.

2018 Reading Procedures at HARV00015414.

Redacted

CONFIDENTIAL Page 28

into the ratings.

56. The importance of the ratings in the decision process can be seen in their correlation with

admissions decisions. Exhibit 4 shows how admission rates vary for applicants with different

combinations of profile ratings. For example, it shows that candidates who are exceptionally strong in

a single dimension (reflected by an academic, athletic, extracurricular, or personal rating of 1 and no

other ratings of 1) and candidates who are multi-dimensional (i.e., have at least three profile ratings

of 2) are admitted to Harvard at rates much higher than those of candidates with no ratings of 1 or 2.

Applicants with an academic rating of 1 and no other ratings of 1 are admitted 68% of the time.

Applicants with an extracurricular, personal, or athletic rating of 1 and no other ratings of 1 also have

high admissions rates (48%, 66%, and 88% respectively). Applicants with a rating of 2 on all four

profile ratings are admitted 68% of the time. By contrast, applicants whose four profile ratings are all

3 or worse have almost no chance of admission to Harvard (0.1%).

Specific combinations of Harvard’s four profile ratings have a large effect on the admission rate

Ratings Combination                           Number of Applicants   Admission Rate

Candidates who Excel on One Dimension
1. Academic rating of 1, no other 1s                           663              68%
2. Extracurricular rating of 1, no other 1s                    453              48%
3. Personal rating of 1, no other 1s                            41              66%
4. Athletic rating of 1, no other 1s                         1,340              88%

Multi-Dimensional Candidates
5. Three ratings of 2, one rating of 3 or 4                  9,266              43%
6. Four ratings of 2                                           622              68%

Weaker Candidates
7. No ratings of 1 or 2                                     55,981             0.1%

Source: Arcidiacono Data

Note: Data are from applicants to the classes of 2014 – 2019 using Professor Arcidiacono’s expanded sample.

57. The ratings also indicate that applicants who are highly rated on non-academic dimensions are much scarcer than applicants with a high academic rating. Exhibit 5 shows that about 42% of applicants have an academic rating of 1 or 2, while fewer than 25% of applicants receive a 1 or 2 on each of the other three profile ratings. Applicants with a rating of 2 or better on at least three dimensions are even rarer: just 7% of the applicant pool. These data indicate that high ratings on non-academic dimensions (and particularly on multiple non-academic dimensions) distinguish applicants in the pool much more effectively than a high academic rating.

Strong academic ratings are more common than strong extracurricular, athletic, and personal ratings

Source: Arcidiacono Data

Note: Data are from applicants to the classes of 2014 – 2019 using Professor Arcidiacono’s expanded sample.

58. Another way to see the importance of non-academic dimensions relative to academic dimensions of excellence is to examine how important each element is in explaining which applicants are admitted. As discussed more fully below, a statistic called the Pseudo R-Squared (on which Prof. Arcidiacono relies frequently in his analysis) captures how well a variable or set of variables can explain outcomes (in this case, admissions decisions). The statistic takes on values from zero to one; the closer it is to zero for a given model, the less information the variables in that model provide about admissions decisions, while a value closer to one means the model explains a higher proportion of the variability in the actual decisions. In Prof. Arcidiacono’s expanded sample, the Pseudo R-Squared of a model that includes only the academic rating as a control variable is 0.09, while the Pseudo R-Squared values of models that include each of the three non-academic ratings as the sole control variable are 0.20 (personal), 0.09 (extracurricular), and 0.08 (athletic), and the Pseudo R-Squared for a model that includes all three non-academic ratings as control variables is 0.32. In non-technical terms, this means that non-academic factors (taken together) explain more than three times as much of the variation in admissions decisions as the academic rating does. That should not be surprising, since exceptional non-academic qualities are less common in the applicant pool than exceptional academic qualities and are thus more likely to distinguish applicants from one another.
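For readers unfamiliar with the statistic, the following is a minimal illustrative sketch of how a Pseudo R-Squared can be computed for a model with a single rating as its only control. It rests on assumptions not stated in the text: that the statistic is McFadden's version (one minus the ratio of the model's log-likelihood to that of an intercept-only model) and that the rating enters as a full set of indicator variables, in which case each fitted probability is simply the admission rate among applicants with that rating. The data are hypothetical and are not drawn from the admissions database.

```python
import math
from collections import defaultdict


def log_likelihood(outcomes, probs):
    """Bernoulli log-likelihood of 0/1 outcomes under fitted probabilities."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def mcfadden_pseudo_r2(ratings, admitted):
    """McFadden's pseudo R-squared: 1 - LL(model) / LL(intercept-only model).

    Assumes the rating enters as a full set of indicator variables, so each
    applicant's fitted probability is the admit rate among applicants who
    share that rating. Requires at least one admit and one rejection overall.
    """
    n = len(admitted)
    overall_rate = sum(admitted) / n
    ll_null = log_likelihood(admitted, [overall_rate] * n)

    cells = defaultdict(lambda: [0, 0])  # rating -> [admits, applicants]
    for r, y in zip(ratings, admitted):
        cells[r][0] += y
        cells[r][1] += 1
    fitted = [cells[r][0] / cells[r][1] for r in ratings]
    ll_model = log_likelihood(admitted, fitted)
    return 1.0 - ll_model / ll_null


# Hypothetical data (not from the case record): 100 applicants, three ratings.
ratings = [1] * 10 + [2] * 40 + [3] * 50
admitted = [1] * 7 + [0] * 3 + [1] * 8 + [0] * 32 + [1] * 1 + [0] * 49
r2_informative = mcfadden_pseudo_r2(ratings, admitted)
print(round(r2_informative, 3))
```

A rating that does not separate admitted from rejected applicants at all yields a value of zero under this definition, which is the sense in which values near zero indicate that a variable carries little information about decisions.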

59. Consistent with the discussion above, Exhibit 6 shows that only 12% of admitted students

are “one-dimensional stars” with a rating of 1 on one dimension but fewer than three ratings of 2 or

better, while 46% are multi-dimensional applicants with three or four ratings of 2 or better, and 31%

have two ratings of 2 and two ratings of 3. These statistics are yet another way to show the value that

Harvard places on applicants who distinguish themselves on multiple dimensions.

The vast majority of admitted students excel in multiple dimensions

Source: Arcidiacono Data

Note: Data are from applicants to the classes of 2014 – 2019 using Professor Arcidiacono's expanded sample. Category 2 also includes five

applicants who received two ratings of 1 and two ratings of 3.

60. One final point about the ratings warrants mention. Prof. Arcidiacono argues that the athletic rating “has little impact on admissions outside of recruited athletes,” and that “once athletes are taken out, the relationship between the athletic rating and admissions is weak.” These assertions directly contradict both testimony and documents from Harvard, as well as the admissions data.

See workpaper.

Arcidiacono Report, p. 5, footnote 5.

Arcidiacono Report, p. 24, footnote 31.

61. For example, as noted above, the Interviewer Handbook explicitly notes that athletic

ability can be a “distinguishing excellence” and is

This “tip” is not limited to recruited varsity athletes; it also reflects the

value Harvard places on recreational athletics and an applicant’s potential contribution to life in

Harvard’s residential Houses. For example, the Interviewer Handbook notes:

The Reading Procedures also note

62. Harvard’s admissions data confirm the importance of the athletic rating. For example,

applicants with an athletic rating of 2 have an admission rate of 12%. That is substantially higher than

the overall admission rate of approximately 7%, and is the same as the admission rate of applicants

with an academic rating of 2. Further, as shown above, receiving a rating of 2 on all four profile

ratings is associated with an admission rate of 68%, while receiving a rating of 2 on the three non-

athletic ratings and a rating of 3 or worse on the athletic rating is associated with an admission rate of

only 48%. This contrast provides further evidence of the incremental importance of an athletic rating

of 2.

3.5. Prof. Arcidiacono’s statistical model fails to account for numerous dimensions of Harvard’s admissions process

63. Prof. Arcidiacono’s analysis clearly fails to reflect the complexity of the admissions

process described above.

64. First, although Harvard values academic achievements, academic qualifications are only one factor in the evaluation of each candidate, and applicants with exceptional academic records are abundant in the Harvard applicant pool. Harvard’s whole-person evaluation extends beyond test scores, GPA, and other measures of prior academic achievement. Yet Prof. Arcidiacono focuses overwhelmingly on the relative academic strength of Asian-American applicants. For example, in four of his six regression specifications, Prof. Arcidiacono does not include controls for the three non-academic ratings (extracurricular, personal, and athletic). Such models are incapable of accounting for the admissions process detailed above, and shed no useful light on the issues in this case.

Interviewer Handbook at HARV00001401.

Interviewer Handbook at HARV00001402.

2018 Reading Procedures at HARV00015415.

See workpaper.

Sarasota Presentation, “KLW - Sarasota Presentation,” HARV00013561 – 65 at HARV00013563.

Redacted

65. Second, as I detail in the next section, it is difficult to quantify and include in a statistical

model many of the non-academic and contextual factors that Harvard’s admissions process values.

That is particularly important for assessing racial disparities in admission because, as I show in the

next section, there are significant racial differences in the non-academic and contextual factors that

are measured in Harvard’s admissions database and that Prof. Arcidiacono chooses not to include in

his model. That suggests there may well also be racial differences in the many other non-academic

factors (like the personal essay) that are not observable in the database and that are important to the

admissions process given the large pool of applicants with extraordinary academic achievements.

4. ACCOUNTING FOR NON-ACADEMIC AND CONTEXTUAL FACTORS IS CRITICAL IN MODELING HARVARD’S ADMISSIONS PROCESS

66. Before turning to my formal statistical analyses in Sections 5, 6, and 7, in this section I

discuss several facts that Prof. Arcidiacono has overlooked (or misunderstood), that provide

important context for the more technical analysis that follows in the remainder of this report, and that

illustrate the flaws in Prof. Arcidiacono’s arguments.

67. I start by examining the differences in admission rates between Asian-American and

White applicants. One of SFFA’s central claims in this matter is that Asian-American applicants are

admitted at a lower rate than White applicants. As I show below, however, that is not true if one

focuses on applicants who are neither lineage applicants, nor recruited athletes, nor children of

Harvard faculty and staff, nor included on the Dean’s and Director’s interest lists—all categories of

applicants that Prof. Arcidiacono believes should be removed from an analysis of bias. Fully 95% of

applicants fall outside those categories.

And among that 95% of applicants, Asian-American

applicants are admitted at slightly higher rates than White applicants.

68. I then explain why the data do not support one of the central assumptions of Prof.

Arcidiacono’s analysis—that Asian-American applicants are stronger on all dimensions of quality,

including non-academic characteristics. As I detail below, White applicants are stronger on average

than Asian-American applicants across the three non-academic profile ratings combined, and are

stronger (in aggregate) across all of the non-academic variables that can be observed in the database

and that are included in Prof. Arcidiacono’s model. As noted earlier, the observable measures of non-

academic achievement are also limited: it is clear from the documents and testimony in this case that

Harvard is using other information such as recommendation letters and the applicants’ personal

essays to form its assessments of each candidate’s non-academic strengths. This information is

“missing data” that cannot be observed in the admissions database. If the racial gaps in these missing

data are similar to the racial gaps in the observed measures of non-academic achievement, then Prof.

Arcidiacono’s model is biased in favor of finding an adverse effect of Asian-American ethnicity on

applicants’ probability of admission, since it omits variables that, if included as controls, would

decrease the size of (or eliminate entirely) the estimated negative effect of Asian-American ethnicity.

69. Finally, Prof. Arcidiacono’s model includes very little information to account for the overall context of each candidate’s application (such as the quality of the applicant’s high school, the socioeconomic characteristics of the applicant’s high school and neighborhood, and the applicant’s family background), even though Prof. Arcidiacono had access to data that shed light on all those factors. Using these data, I highlight a variety of average differences between White and Asian-American applicants that I understand to be relevant to Harvard’s whole-person analysis. For example, White and Asian-American applicants tend to come from different sets of high schools and different parts of the country, have parents with different occupational backgrounds, and have different intended careers. All of these factors provide important context for reviewing applications.

See workpaper.

4.1. There is no statistically significant difference in admission rates for the vast majority of Asian-American and White applicants

70. SFFA’s claim of bias relies heavily on the premise that Asian-American applicants are

admitted at lower rates than White applicants despite having stronger qualifications. But as Prof.

Arcidiacono acknowledges in his report, when exploring whether there is bias against Asian-

American applicants, it is important to account for the fact that Harvard’s admissions process gives

special consideration (independent of race) to children of Harvard or Radcliffe alumnae or alumni

(referred to as “lineage applicants,” and which Prof. Arcidiacono refers to as “legacy applicants”),

applicants recruited to play a varsity sport at Harvard, and children of Harvard faculty or staff

members. The Dean and Director of Admissions also maintain “interest lists” of applicants; I

understand that there are no particular criteria for inclusion on those lists but that they might include,

for example, applicants that the Dean or Director have encountered at recruiting events, as well as

applicants related to donors to Harvard or lineage applicants.

Indeed, Prof. Arcidiacono removes

applicants in those categories from what he calls his “baseline” sample, before exploring the question

of bias.

71. Exhibit 7 shows the admission rates by race, once applicants in the categories noted above

are excluded from the sample. It shows that Asian-American applicants are admitted at a slightly

higher rate than White applicants (though the difference is not statistically significant). Although the

numbers in Exhibit 7 do not settle the question of whether there is bias against Asian-American

applicants (because they do not account for the full set of characteristics of each applicant), the fact

that the difference in admissions rates disappears by controlling for just these factors raises serious

questions about SFFA’s allegations of bias. The remainder of this section explores a variety of other

important factors that differ between White and Asian-American applicants and that, once accounted

for, eliminate the alleged disparity in admission rates.
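The text does not state which test underlies the finding that the difference is "not statistically significant." As a hedged illustration only, the sketch below assumes a conventional two-sided two-proportion z-test; the function name and the applicant counts are hypothetical and are not taken from Exhibit 7 or the underlying data.

```python
import math


def two_proportion_z_test(admits_a, n_a, admits_b, n_b):
    """Two-sided z-test for equality of two admission rates.

    Returns (z, p_value). Uses the pooled rate under the null hypothesis
    that both groups share the same underlying admission probability.
    """
    rate_a = admits_a / n_a
    rate_b = admits_b / n_b
    pooled = (admits_a + admits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts (assumptions for illustration, not case data):
# group A: 520 admits of 10,000; group B: 500 admits of 10,000.
z_stat, p_val = two_proportion_z_test(520, 10_000, 500, 10_000)
print(round(z_stat, 2), round(p_val, 2))
```

Under these made-up counts, a gap of 0.2 percentage points falls well within what sampling variation alone could produce, which is what "not statistically significant" conveys at conventional thresholds.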

Fitzsimmons Deposition at pp. 264–267, 278.

Prof. Arcidiacono also removes from his baseline sample applicants who apply during the Early Action cycle

(Arcidiacono Report, p. 2). I do not follow that approach here. I understand that the process for evaluating Early Action

applications is the same as that for evaluating Regular Decision applications except that Early Action applications are

evaluated earlier and have the potential to be deferred to the Regular Decision pool.

Admission rates for applicants who are not lineage applicants, athletic recruits, children of Harvard faculty or staff, or on Dean’s or Director’s Interest List

Source: Arcidiacono Data

Note: Data are from applicants to the classes of 2014 – 2019 in Professor Arcidiacono’s baseline sample with Early Action applicants.

4.2. White applicants have relatively stronger qualifications on non-academic dimensions

72. A central assumption in Prof. Arcidiacono’s analysis is that, because Asian-American

applicants are stronger on academic dimensions, they are also stronger on non-academic

dimensions—including those dimensions not accounted for by his model.

This assumption leads

Prof. Arcidiacono to focus much of his analysis on academic qualifications, and to conclude that any

difference in admission rates not accounted for by his model must be caused by “bias” against Asian-

American applicants. As I show in this sub-section, however, a proper interpretation of the available

data indicates that Prof. Arcidiacono’s assumption is incorrect. White applicants are in fact stronger,

on average, on non-academic factors that Harvard values.

73. Exhibit 8 shows that Asian-American applicants tend to have higher academic ratings and slightly higher extracurricular ratings than White applicants, while White applicants tend to have higher personal and athletic ratings and are more likely to be multi-dimensional (i.e., more likely to have a rating of 2 or better on at least three of the four profile ratings). Importantly, the average difference between Asian-American and White applicants on extracurricular ratings (the one non-academic rating on which Asian-American applicants perform better than White applicants) is smaller in magnitude than the average differences in athletic and personal ratings (on which White applicants perform better than Asian-American applicants).

Throughout his report, Prof. Arcidiacono presents a variety of analyses that show how non-academic ratings correlate with Harvard’s academic index. He does not, however, directly examine whether Asian-American applicants are stronger than White applicants collectively across all non-academic factors in his model. My analysis in this section explores that question.

As I discuss in Section 5 below, even if one grants Prof. Arcidiacono’s assumption that personal ratings are biased against Asian-American applicants, his own analysis shows that White applicants still have higher personal ratings even after statistically eliminating the supposed bias.

White and Asian-American applicants excel in different dimensions: percentage of applicants with ratings of 2 or better

Source: Arcidiacono Data

Note: Data are from applicants to the classes of 2014 – 2019 using Professor Arcidiacono’s expanded sample. Ratings of 2- and above are

classified as “2 or Better” in this analysis. +/- rating designations are available in the data beginning with the class of 2019.

74. Exhibit 9 presents another way to measure the difference between Asian-American applicants and White applicants on non-academic characteristics, one that accounts for the collective strength of each applicant across all three non-academic profile ratings. It shows the proportion of applicants with a given academic rating whose cumulative non-academic rating (that is, the sum of the extracurricular, athletic, and personal ratings) is seven or less. A cumulative non-academic rating of seven or less indicates a candidate who is very strong across all three non-academic dimensions. The cutoff of seven is also highly informative about admissions probabilities: applicants whose non-academic ratings add up to seven or less have a 38% admission rate, while those with a higher sum have only a 4% chance of admission.

75. Exhibit 9 shows that, for a given academic rating, White applicants are much more likely to have strong non-academic ratings than Asian-American applicants. For example, for applicants with an academic rating of 1, 25% of White applicants have very strong non-academic ratings, compared to only 16% of Asian-American applicants (roughly one-third fewer). Similarly, among the large group of applicants with an academic rating of 2 (representing nearly half of Asian-American and White applicants), 14% of White applicants, but only 8% of Asian-American applicants, have very strong non-academic ratings. This gap in non-academic achievement is critically important. As detailed in Section 3, because academic qualifications are abundant in the applicant pool, it is the non-academic dimensions that often distinguish academically strong applicants from each other. Exhibit 9 shows that, for a given level of academic achievement, White applicants are substantially more likely to have higher ratings across the three non-academic dimensions taken together.

Applicants with athletic or extracurricular ratings of 5 and 6 are excluded from this analysis because those ratings indicate that there were special circumstances that caused the applicant to have fewer athletic or extracurricular accomplishments, such as significant family commitments or a physical disability. Applicants with profile ratings of 7, 8, or 9 are excluded from this analysis because those are not valid ratings according to the reading procedures (2018 Reading Procedures at HARV00015414 – 15). In my regression analysis, I treat such ratings as missing.
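The cumulative non-academic rating and its exclusion rules can be sketched as follows. This is an illustrative reading of the description in the text, not the workpaper code: the function names are hypothetical, ratings are assumed to be whole numbers (ignoring +/- designations), and excluded applicants are represented by `None`.

```python
VERY_STRONG_CUTOFF = 7  # a sum of seven or less is treated as "very strong"


def cumulative_non_academic(extracurricular, athletic, personal):
    """Sum of the three non-academic ratings, or None if the applicant is excluded.

    Exclusions assumed from the text:
      - any rating outside 1..6 (ratings of 7, 8, or 9 are not valid), and
      - athletic or extracurricular ratings of 5 or 6 (special circumstances).
    """
    for rating in (extracurricular, athletic, personal):
        if rating not in range(1, 7):
            return None
    if extracurricular in (5, 6) or athletic in (5, 6):
        return None
    return extracurricular + athletic + personal


def is_very_strong(extracurricular, athletic, personal):
    """True/False for included applicants; None for excluded applicants."""
    total = cumulative_non_academic(extracurricular, athletic, personal)
    if total is None:
        return None
    return total <= VERY_STRONG_CUTOFF


# Hypothetical applicants (extracurricular, athletic, personal):
print(is_very_strong(2, 2, 3))  # sum of 7
print(is_very_strong(3, 3, 2))  # sum of 8
print(is_very_strong(2, 5, 2))  # athletic rating of 5: excluded
```

Because lower ratings are better, the cutoff selects applicants with low sums; returning `None` for excluded applicants keeps them out of both the numerator and the denominator when proportions are computed.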

See workpaper.

Academic research has found that Asian-American high school students are more likely to apply to selective

institutions than White high school students, even controlling for academic qualifications. In other words, even

accounting for academic qualifications, a different sample of Asian-American and White high school students apply to

institutions like Harvard. This differential behavior in the college application process is one possible reason why, on

average, White and Asian-American applicants in the Harvard pool might exhibit different qualifications across the

different dimensions Harvard evaluates. Sandra Black, Kalena Cortes, and Jane Lincove, “Apply Yourself: Racial and

Ethnic Differences in College Application,” NBER Working Paper #21368, 2015; Sandra Black, Kalena Cortes, and Jane

Lincove, “Academic Undermatching of High-Achieving Minority Students: Evidence from Race-Neutral and Holistic

Admissions Policies,” American Economic Review: Papers & Proceedings, 105(5), 2015, pp. 604–610; Amanda Griffith

and Donna Rothstein, “Can’t Get There from Here: The Decision to Apply to a Selective College,” Economics of

Education Review, 28(5), 2009, pp. 620–628; David Card and Alan Krueger, “Would the Elimination of Affirmative

Action Affect Highly Qualified Minority Applicants? Evidence from California and Texas,” Industrial and Labor

Relations Review, 58(3), 2005, pp. 416–434.

In Appendix C of his report, Prof. Arcidiacono argues that “Harvard applies the label ‘Standard Strong’

disproportionately to Asian-American applicants” and that “Asian-American applicants who are labeled this way are

substantially more qualified academically than ‘Standard Strong’ applicants from other racial groups.” However, if one

considers strength more broadly (as measured by the sum of all four profile ratings), Asian-American and White

applicants who are labeled “Standard Strong” are equally strong. See workpaper.

For a given academic rating, White applicants tend to have better non-academic ratings than Asian-American applicants

Source: Arcidiacono Data

Note: Data are from applicants to the classes of 2014 – 2019 in Professor Arcidiacono’s expanded sample.

76. Exhibit 10 presents yet another way to measure the relative strength of White and Asian-

American applicants on non-academic factors, using Prof. Arcidiacono’s own model (specifically, his

Model 6, with the overall rating excluded). In Table 7.3 of his report, Prof. Arcidiacono constructs an

“admissions index,” attempting to quantitatively summarize the overall qualifications of applicants

based on all of the factors in his model.

In Exhibit 10, I have reproduced that same analysis but

focusing only on the non-academic factors in his model. That is, I have removed from his admissions

index the effect of the academic rating, grades, and all standardized test scores. The exhibit shows

that, using Prof. Arcidiacono’s own metric, Asian-American applicants are more likely than White

applicants to have weaker non-academic qualifications (i.e. be in deciles 1 to 5), and that White

applicants are more likely than Asian-American applicants to have strong non-academic

qualifications (i.e. be in deciles 9 and 10). The same pattern is observed if I repeat this analysis but

estimate the non-academic admissions index using Prof. Arcidiacono’s Model 5, which excludes the

personal rating.

In other words, Prof. Arcidiacono’s own models show that White applicants are

stronger than Asian-American applicants on non-academic dimensions, and this finding holds even if

personal ratings (which Prof. Arcidiacono alleges are biased) are excluded from the non-academic

qualifications.
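The construction of a non-academic index and its deciles can be sketched in general terms. The sketch assumes the index is the sum of estimated coefficients times applicant characteristics, with race and academic terms left out, and that applicants are then ranked into ten equal-sized groups. The variable names and coefficient values below are hypothetical placeholders, not estimates from any model in this case, and the sketch omits the special handling of applicants whose characteristics guarantee admission or rejection.

```python
def non_academic_index(features, coefficients, excluded):
    """Linear index from coefficients, skipping any variable named in `excluded`."""
    return sum(
        coefficients[name] * value
        for name, value in features.items()
        if name in coefficients and name not in excluded
    )


def assign_deciles(scores):
    """Rank scores into deciles 1 (lowest) through 10 (highest)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


# Hypothetical coefficients and one hypothetical applicant.
coefs = {"academic_rating_2": 1.5, "extracurricular_2": 0.9, "personal_2": 1.2}
applicant = {"academic_rating_2": 1, "extracurricular_2": 1, "personal_2": 0}
academic_terms = {"academic_rating_2"}

index_value = non_academic_index(applicant, coefs, academic_terms)
print(index_value)
print(assign_deciles([0.3, 2.1, 1.0, 0.9, 1.7, 0.1, 2.4, 0.5, 1.2, 1.9]))
```

Dropping the academic terms before ranking is what makes the resulting deciles a measure of non-academic strength alone.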

Arcidiacono Report, p. 68, Table 7.3.

See workpaper.

White applicants rank higher than Asian-American applicants on non-academic admissions index

Source: Arcidiacono Data

Note: Data are from applicants to the classes of 2014 – 2019 in Professor Arcidiacono’s expanded sample. The non-academic admissions

index is constructed in the same fashion as Professor Arcidiacono’s overall admissions index, using his model 6 without the overall rating

to calculate applicants’ probability of admission. Applicants with characteristics that guaranteed rejection or admission were assigned to

the bottom or top decile, respectively. In addition to excluding the effect of race, as Professor Arcidiacono did, I exclude the effects of the

academic rating and academic variables (such as Academic Index, SAT scores, and GPA).

77. As noted above, these facts are critical to the interpretation of Prof. Arcidiacono’s model

and SFFA’s broader claim of bias. Throughout his report, Prof. Arcidiacono argues that, because

Asian-American applicants have stronger academic credentials (on average) than White applicants,

he can safely assume that they are also stronger than White applicants on dimensions of quality—

mostly non-academic—that cannot be measured by his statistical model. If that were true, it would

imply that adding more variables to Prof. Arcidiacono’s model to further control for differences

between White and Asian-American applicants would only increase the estimated negative effect of

Asian-American ethnicity on applicants’ probability of admission. But, as I have shown above, Prof.

Arcidiacono’s assumption is demonstrably incorrect.

78. In fact, although Asian-American applicants are stronger than White applicants (on

average) on quantifiable measures of academic performance, they are (on average) less strong than

White applicants on observable non-academic measures (Harvard’s ratings and Prof. Arcidiacono’s

admissions index). Because non-academic factors are harder to quantify and include in the model

than academic factors, any statistical model of the Harvard admissions process is likely to be

missing more information about non-academic factors than about academic factors.

And if the racial gap in the missing non-academic factors is similar to the racial gap in the measured

non-academic factors (i.e., with Asian-American applicants having less strong qualifications than

White applicants), then a statistical model of the admissions process will be predisposed to find a

negative effect of Asian-American ethnicity on applicants’ likelihood of admission, even though the

racial disparity in admission rates may actually be due to racial differences in the missing non-

academic information.
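The omitted-variable mechanism described in paragraphs 77 and 78 can be illustrated with a stylized calculation. Everything in the sketch is an assumption made for illustration: two groups labeled "A" and "B", a binary academic level, a binary non-academic level, and made-up probabilities. By construction the admission rule ignores group membership, yet a gap appears within each academic level whenever the non-academic dimension is left out.

```python
from fractions import Fraction as F

# Hypothetical admission rule: depends only on academic level (a) and
# non-academic level (n), never on group membership.
ADMIT_PROB = {
    (1, 1): F(60, 100), (1, 0): F(10, 100),
    (0, 1): F(20, 100), (0, 0): F(1, 100),
}

# Hypothetical share of each group with strong non-academic qualifications,
# by academic level. Group "B" is assumed stronger on this dimension.
SHARE_STRONG = {
    "A": {1: F(20, 100), 0: F(10, 100)},
    "B": {1: F(30, 100), 0: F(15, 100)},
}


def admit_prob(group, a, n):
    """Admission probability; the group argument is deliberately unused."""
    return ADMIT_PROB[(a, n)]


def rate_given_academic(group, a):
    """Admission rate within an academic level when n is not controlled for."""
    share = SHARE_STRONG[group][a]
    return share * admit_prob(group, a, 1) + (1 - share) * admit_prob(group, a, 0)


def gap_omitting_n(a):
    """Group A minus group B admission rate, holding only academics fixed."""
    return rate_given_academic("A", a) - rate_given_academic("B", a)


def gap_controlling_n(a, n):
    """Group A minus group B admission rate, holding both dimensions fixed."""
    return admit_prob("A", a, n) - admit_prob("B", a, n)


for a in (1, 0):
    print(a, float(gap_omitting_n(a)), [float(gap_controlling_n(a, n)) for n in (1, 0)])
```

The sketch shows only the direction of the effect under its own assumptions: if the unmeasured factors instead favored group "A", omitting them would understate rather than overstate any gap.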

4.3. Prof. Arcidiacono’s model excludes available measures of life circumstance and context

79. In the remainder of this section, I highlight a set of important contextual factors—that is,

factors that reflect the wide range of applicant characteristics that may inform admissions officers’

evaluation of each application—that Prof. Arcidiacono excludes from his model and that differ, on

average, between Asian-American applicants and White applicants. As I will show in Section 5, these

contextual factors help explain the disparity in admission rates between Asian-American and White

applicants, and when added to Prof. Arcidiacono’s model lead to the conclusion that there is no

statistically significant negative effect of Asian-American ethnicity on applicants’ likelihood of

admission.

80. As detailed above in Section 3, Harvard seeks to assess the quality of each applicant, in

academic and non-academic respects, in light of the context provided by any available information

about the challenges the applicant has faced, the resources at her disposal, and the opportunities she

has (or has not) encountered. A major limitation of Prof. Arcidiacono’s model is that it includes very

few variables to account for these contextual factors. For example, Prof. Arcidiacono includes only a

very limited set of socioeconomic variables, in addition to control variables that account for only

broad differences across types of neighborhoods and high schools, as reflected in high school and

neighborhood “cluster” numbers assigned by a proprietary algorithm of the College Board. Prof.

Arcidiacono does not make use of the more detailed data about each individual high school and

neighborhood that were produced along with the College Board’s “cluster” identifiers and that inform

the College Board cluster assignments.

For example, his model includes controls for 29 high school

clusters, yet there are more than 14,000 high schools represented in the Harvard applicant pool.

Using the more detailed high school and neighborhood characteristics data can add meaningful

information to the model.

As I show in this section, this modeling decision by Prof. Arcidiacono is problematic because Asian-American and White applicants (in aggregate) come from different sets of high schools and different regions of the country, have different career goals, and have different family backgrounds (such as parental occupations).

Prof. Arcidiacono’s model includes the following socioeconomic controls: an indicator of whether the admission officer believed the applicant to be “disadvantaged,” an indicator of whether the applicant applied for a waiver of the application fee, an indicator of whether the applicant applied for financial aid, an indicator of whether the applicant is in the first generation of his family to attend college, and indicators of the applicant’s mother’s and father’s educational attainment.

See workpaper.

Additionally, Prof. Arcidiacono excludes a variable in the Harvard data indicating the type of high school an applicant attended (Archdiocese, Public, or Private). I include this variable in my models.

4.3.1. Detailed controls for differences across high schools and neighborhoods

81. As shown in Exhibit 11, Asian-American and White applicants come from very different

sets of high schools. Nearly half of the high schools represented in the applicant pool have either

White applicants but no Asian-American applicants or Asian-American applicants but no White

applicants. This means that, without better controls in the model for high school characteristics, Prof.

Arcidiacono is missing an important difference between the two groups.
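The overlap statistic described in paragraph 81 can be illustrated with a minimal sketch. The record layout, the group labels, and the function name `share_single_group_schools` are assumptions made for illustration only; they are not taken from the workpapers, and the sample records are invented. The sketch also assumes that the denominator is the set of high schools with at least one applicant from either group.

```python
from collections import defaultdict


def share_single_group_schools(applicants, group_a="Asian-American", group_b="White"):
    """Share of high schools (among those with at least one applicant from
    either group) whose applicants come from only one of the two groups."""
    groups_by_school = defaultdict(set)
    for school_id, race in applicants:
        if race in (group_a, group_b):
            groups_by_school[school_id].add(race)
    if not groups_by_school:
        return 0.0
    # A school counts as "single group" when only one of the two labels appears.
    single = sum(1 for groups in groups_by_school.values() if len(groups) == 1)
    return single / len(groups_by_school)


# Invented records: (hypothetical high school identifier, race of applicant)
sample = [
    ("HS1", "White"), ("HS1", "Asian-American"),
    ("HS2", "White"),
    ("HS3", "Asian-American"),
    ("HS4", "White"), ("HS4", "White"), ("HS4", "Asian-American"),
]
share = share_single_group_schools(sample)
```

In the invented sample, two of the four schools have applicants from only one group, so the computed share is one half.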

Asian-American and White applicants come from different high schools

Source: Arcidiacono Data

Note: Sample consists of Professor Arcidiacono’s expanded sample.

For example, high school characteristics include the high school’s mean SAT score or the percentage of students in the

high school who require financial aid for college. For a complete list of high school and neighborhood characteristics

included in my model, see Appendix E. The College Board high school and neighborhood data report many high school

and neighborhood characteristics based upon only the set of students from a given high school who take the SAT.

Because the SAT is more common in some areas and the ACT in others, in my model I allow for these variables to have

different effects in states where the SAT is more common than in states where the ACT is more common.

82. Asian-American and White applicants also come from different geographic regions.

Asian-American applicants are more concentrated on the East and West coasts and in major cities. In

fact, approximately 29% of all Asian-American applicants come from California dockets (dockets A,

C, and Z), as compared to only 14% of White applicants.

Exhibit 12 highlights these differences by

showing a map of all locations of Asian-American and White applicants’ high schools. The blue dots

indicate locations with White applicants but no Asian-American applicants (during the 2014 – 2019

admissions cycles). As is clear from Exhibit 12, there are a large number of blue dots in the central

and rural areas of the U.S.

White applicants are more dispersed across the U.S. and rural areas

Source: Augmented Arcidiacono Data

Note: Sample consists of Prof. Arcidiacono’s extended dataset for the classes of 2014 – 2019. Each blue dot represents the city of a high

school from which at least one White applicant applied. Each red dot represents the city of a high school from which at least one Asian-

American applicant and one White applicant applied.

83. Although Prof. Arcidiacono controls for an applicant’s admissions docket (i.e., broad geographic region), he does not control for the much more detailed neighborhood attributes available in the College Board data, such as the median income of the neighborhood (defined as a census tract or collection of census tracts) or the proportion of students in a neighborhood who apply to an out-of-state college. Nor does he control for whether the applicant attends high school in a rural area or the type of high school (public, private, or Archdiocese).

See workpaper.

4.3.2. Other proxies for life experience, opportunities, and ambitions

84. In addition to ignoring the detailed available data on applicants’ high schools and

neighborhoods, Prof. Arcidiacono also fails to include in his model several available variables that

reflect differences in applicants’ family background and life goals.

85. For example, Prof. Arcidiacono ignores data on parental occupations, a critical measure of

family background. As noted above, family background provides important context for each

applicant’s achievements. Exhibit 64 and Exhibit 65 (Appendix C) show that the parents of Asian-

American and White applicants tend to have different types of occupations.

33% of fathers and 16%

of mothers of Asian-American applicants work in the fields of “Computer and Mathematical,” “Life,

Physical, Social Science,” or “Architecture and Engineering,” while only 16% and 5% (respectively)

of fathers and mothers of White applicants work in those fields.

86. Such differences can reflect not just differences in a family’s economic prosperity but also

differences in applicants’ life experiences. For example, if the son of a professional writer and the son

of a police officer display talent in writing, Harvard might regard the latter’s talent as more

impressive than the former’s. The same might be true of the daughter of professional scientists and

the daughter of factory workers, both of whom exhibit talent in a scientific field. In fact, one of the

examples from Harvard’s casebook (discussed above in Section 3.2) specifically notes parental

occupation as relevant context for evaluating the applicant’s achievements:

87. Prof. Arcidiacono also excludes from his model other available data on applicants’ family

background, including whether an applicant’s mother or father is deceased, whether one or both of

the applicant’s parents attended an Ivy League university, whether the applicant was born outside the

United States, whether the applicant has lived outside the United States, whether the applicant is a

permanent resident of the United States, and the hours an applicant spent working at a job.

In the Harvard database, applicants report parental occupations using either a Bureau of Labor Statistics (BLS) code or

a Common Application code. Reported parental occupation codes are harmonized by mapping Common Application

codes to major and minor groups in the BLS’ Standard Occupational Classification System. Major and minor groups are

then combined into broad occupational categories.

Information on how much time an applicant spent working at a job and on whether an applicant was born outside the

United States or lived outside the United States is available only for applicants to the classes of 2017 to 2019 and so can

be included in my year-by-year model but not in Prof. Arcidiacono’s pooled model.

Redacted

88. Finally, Prof. Arcidiacono does not include a variable for intended career in his model.

Exhibit 13 shows the differences in intended careers between Asian-American and White applicants.

Asian-American applicants are much more likely to intend to pursue a career in medicine or health,

while White applicants are much more likely to intend to pursue careers in the arts, communications,

design, social service, government, or law. The difference in the intended career of medicine or

health is particularly stark—White applicants are 37% less likely than Asian-American applicants to

pursue this intended career, an intended career with the lowest admission rate (5%). As detailed

above in Section 3.2, an applicant’s future plans and fields of interest can be critical to the assessment

of how the applicant will contribute to the Harvard community both inside and outside the

classroom.

For example, the Casebook Discussion Guide notes the following about one candidate:

White and Asian-American applicants have different intended careers

Source: Augmented Arcidiacono Data

Note: Data are from applicants to the classes of 2014 - 2019 in Professor Arcidiacono’s expanded sample. The “Other” category includes

applicants whose intended careers are academic, library, religion, trade, other, or unknown. Categories for intended careers can vary year

to year.

Another factor that reveals an applicant’s interests is the type of extracurricular activities on which the applicant has

focused in high school. Prof. Arcidiacono does not include any measure of the type of extracurricular activities in his

model. As shown in Appendix D, there are significant differences across racial groups in applicants’ primary activities

(defined as those listed first or second on the application). Applicants are instructed to list the activities most important to

them first on the Common Application. Information on activities is available in Harvard’s data only for applicants to the

classes of 2017 to 2019.

Casebook Discussion Guide at HARV0018166.

Redacted

89. Prof. Arcidiacono’s decision to ignore available information related to non-academic

considerations, including contextual factors, is particularly curious because his own regression

models indicate that such variables can help explain differences in admission rates between Asian-

American and White applicants. For example, as he adds measures of academic achievement to his

model (moving from Model 1 to Model 2), the estimated negative association between Asian-

American ethnicity and likelihood of admission increases. But as he adds more variables that capture

the context of each candidate’s application—such as broad high school and neighborhood

demographics, and ratings that capture non-academic characteristics of the applicant—the estimated

negative effect of Asian-American ethnicity shrinks substantially (Model 4 to Model 6).

90. In the next section of this report, I show that the same general pattern holds in my model:

As I add to the model the additional non-academic variables discussed in this section, the estimated

negative effect of Asian-American ethnicity on applicants’ likelihood of admission disappears. This

finding is consistent with the hypothesis that what Prof. Arcidiacono labels a “bias” against Asian-

American applicants in fact reflects not racial discrimination but differences in non-academic factors

that Harvard considers in its whole-person evaluation.

Arcidiacono Report, Appendix B, Table B.7.2.

5. A MORE COMPLETE STATISTICAL MODEL SHOWS NO EVIDENCE OF BIAS AGAINST

ASIAN-AMERICAN APPLICANTS

91. As detailed in Sections 3 and 4 above, because Harvard’s whole-person admissions

process heavily considers non-academic and contextual factors that are often hard to measure, a

statistical model that can reliably estimate the effect of race on Harvard’s admissions decisions

should seek to include as much reliable information about such factors as possible. In this section, I

develop such a statistical model by starting with Prof. Arcidiacono’s model and then expanding his

set of control variables to include a richer set of characteristics that he did not include in his model,

and that more fully capture the many factors that Harvard considers in its process. I further revise the

model by allowing the coefficients it estimates for different control variables, which reflect the

effects of different applicant attributes on the probability of admission, to vary from year to year. I

then use this more complete model in the remainder of this report to address several questions at issue

in this matter.

92. The first question I examine is whether the alleged negative association between Asian-

American ethnicity and applicants’ likelihood of admission persists when more information is

included in the model. I find that it does not. When more variables are added to the model to capture

differences in key contextual factors (high school, neighborhood, and family background), and when

the model is estimated year-by-year to account for differences in the admissions process from year to

year, the alleged negative effect of Asian-American ethnicity disappears and the predictive accuracy

of my model increases.

93. I then turn to a second question in Section 6: to what extent does an applicant’s race or

ethnicity matter in the admissions process, relative to the many other factors Harvard considers in its

whole-person analysis? While I find that race is significantly associated with the likelihood of

admission for some applicants, the role it plays is less significant than that of other factors included in

my model, as well as that of factors not observable in the model.

94. Before delving into the details of my analysis, I first discuss several important

methodological issues that arise when building an admissions model, with a focus on differences

between my approach and Prof. Arcidiacono’s.

5.1. Important differences between Prof. Arcidiacono’s methodology and mine

95. Prof. Arcidiacono uses a statistical model known as a multivariate logit regression to

estimate the relationship between race and admissions outcomes, while controlling for a variety of

factors that Harvard considers in admission decisions.

The use of a multivariate logit model makes

sense. Multivariate regression analysis is a widely accepted and common statistical technique in both

academia and litigation.

Courts have relied on multivariate regression analysis in a variety of

discrimination matters. In fact, the Reference Manual on Scientific Evidence dedicates an entire

chapter to multivariate regression analysis, including applications to questions of discrimination.

A logit model is a type of multivariate regression model that is appropriate where, as here, the outcome

of interest—in this case admission to Harvard—is binary, taking values of either zero (not admitted)

or one (admitted).
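For reference, a logit model of the kind described in paragraph 95 is conventionally written as follows. The notation ($\alpha$, $\beta$, $\gamma$, $X_i$) is introduced here for illustration and does not appear in either report; the sketch is also simplified to a single Asian-American indicator, with White applicants as the comparison group.

```latex
\[
\Pr(\text{Admit}_i = 1 \mid \text{AsianAmerican}_i, X_i)
  = \frac{\exp\!\left(\alpha + \beta\,\text{AsianAmerican}_i + X_i'\gamma\right)}
         {1 + \exp\!\left(\alpha + \beta\,\text{AsianAmerican}_i + X_i'\gamma\right)}
\]
```

Under this notation, $X_i$ collects the control variables for applicant $i$, and $\beta$ is the coefficient whose sign and statistical significance are at issue.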

96. Even though I agree with Prof. Arcidiacono’s general approach, I disagree with several of

the specific modeling decisions he makes when building his model. In the remainder of this section, I

discuss these methodological decisions and explain why Prof. Arcidiacono and I reach different

conclusions.

5.1.1. Inclusion of additional control variables

97. A basic tenet of econometric research is that the selection of control variables should be

informed by the research question at hand and the specific outcome that is being modeled.

Thus, the

first step in my analysis is to add to Prof. Arcidiacono’s fullest models (Models 5 and 6) any

variables missing from his models that Harvard considers in the admissions process.

98. As detailed in Sections 3 and 4 above, the most important feature of Harvard’s decision process that Prof. Arcidiacono’s model does not account for is the substantial consideration Harvard gives to non-academic factors that help distinguish among the large number of academically strong applicants in its pool, including a wide variety of contextual factors that account for the life experience and background of each candidate (e.g., her high school, community, and family background).

William H. Greene, Econometric Analysis (Pearson, 2008), pp. 773–774 (“The probit and logit models are still the most common frameworks used in econometric applications.”); Kenneth E. Train, Discrete Choice Methods with Simulation (The Cambridge University Press, 2009), p. 34 (“By far the easiest and most widely used discrete choice model is logit.”).

James H. Stock and Mark W. Watson, Introduction to Econometrics (Pearson, 2015), p. 189 (“The multiple regression model … permits estimating the effect … of changing one variable while holding the other regressors constant… provides a way to isolate the effect.”); William H. Greene, Econometric Analysis (Pearson, 2008), pp. 8–10 (“The linear regression model is the single most useful tool in the econometrician’s toolkit. The multiple linear regression model is used to study the relationship between a dependent variable and one or more independent variables. One of the most useful aspects of the multiple regression model is its ability to identify the independent effects of a set of variables on a dependent variable.”).

Daniel L. Rubinfeld, Reference Manual on Scientific Evidence: Third Edition (The National Academies Press, 2011), pp. 305–307 (“Regression analysis has been used most frequently in cases of sex and race discrimination, antitrust violations, and cases involving class certification.”).

James H. Stock and Mark W. Watson, Introduction to Econometrics (Pearson, 2015), pp. 232–234 (“The starting point for choosing a regression specification is thinking through the possible sources of omitted variable bias… A control variable is not the object of interest in the study; rather it is a regressor included to hold constant factors that, if neglected, could lead the estimated causal effect of interest to suffer from omitted variable bias.”).

99. The first panel in Exhibit 14 shows the variables that Prof. Arcidiacono includes in his

fullest models (Models 5 and 6), while the second panel lists the additional variables I include in my

model. Both sets of variables are organized into several broad groups: race, base controls (a category

that includes personal and financial variables, such as an applicant’s gender, docket, and parents’

education), Harvard profile ratings (academic, extracurricular, personal, and athletic), other ratings

(such as those assigned by admissions officers to recommendation letters from teachers or guidance

counselors, or those assigned by alumni interviewers), measures of academic qualifications, high

school and neighborhood characteristics, and interaction terms (such as interactions of race with

gender that are included in Prof. Arcidiacono’s models).

As shown, the additional variables that I

add to my model include intended career; staff interview ratings; richer controls for high school and

neighborhood characteristics;

parents’ occupations; applicant’s hours worked (at a job); controls for

specific combinations of profile ratings, specific combinations of teacher ratings, and specific

combinations of alumni interview ratings; indicators for participation in different types of primary

extracurricular activities; and indicators for having parents who attended an Ivy League college,

having parents who attended Harvard for graduate school, having a mother or father who is deceased,

being a permanent resident of the United States, having been born in the United States, and having

lived outside the United States.

Appendix E provides a complete list of variables used in my model with detailed definitions.

For the College Board high school and neighborhood variables most likely to be missing for applicants in the sample

(neighborhood median income, proportion of neighborhood residents below poverty line, and neighborhood median

housing value), I assign the mean value of the variable to those applicants who are missing data and include an indicator

variable identifying those for whom the mean was assigned. This approach of imputing missing values is analogous to

that used by Prof. Arcidiacono in his report.
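The imputation approach described in the preceding footnote (assign the mean to missing values and add an indicator for imputed observations) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `impute_mean_with_indicator`, the use of `None` to mark a missing value, and the sample figures are all hypothetical and are not drawn from the workpapers.

```python
def impute_mean_with_indicator(values):
    """Replace missing entries (None) with the mean of the observed entries.

    Returns the filled list and a parallel 0/1 list marking imputed entries,
    which can enter a regression as a missing-data indicator variable.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in values]
    imputed_flag = [1 if v is None else 0 for v in values]
    return filled, imputed_flag


# Invented neighborhood median incomes; None marks a missing value.
incomes = [60000.0, None, 90000.0, None]
filled, flag = impute_mean_with_indicator(incomes)
```

Including the indicator alongside the filled variable lets the model estimate a separate shift for applicants whose value was imputed, rather than treating the assigned mean as if it had been observed.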

In my year-by-year models, there are not enough data to estimate separately the effect of some of the specific ratings that are very rare in the data. To resolve this problem, rather than include a separate dummy variable for each individual rating category, I include a dummy variable for each unique combination of the four profile ratings, each unique combination of the two teacher ratings, and each unique combination of the alumni interviewer ratings. Unique ratings combinations with fewer than 100 observations are grouped with other ratings combinations such that the combination is in a group that has an admission rate most similar to that of the combination. I have confirmed that this approach has no substantive effect on the estimated size of the Asian-American coefficient and yields nearly the same predictive accuracy as Prof. Arcidiacono’s approach in the pooled model. (See workpaper.) Additionally, as I explain below, my year-by-year models using this approach more accurately predict Harvard’s admissions decisions than Prof. Arcidiacono’s pooled model. This approach has the additional benefit that it can account for any potential “interaction” effects associated with specific combinations of the four ratings profiles.

100. I also make a series of more technical corrections to Prof. Arcidiacono’s variables and sample. First, Prof. Arcidiacono includes a number of variables that he interacts with race and gender in his model. An “interaction” variable simply multiplies one variable by another variable, to show how the presence or absence of the second variable modifies the effect of the first. For example, one could model the effect of male gender on the likelihood of admission, the effect of Asian-American ethnicity, and the effect of male gender combined with Asian-American ethnicity (that is, the extent to which being male decreases or increases the effect of being Asian American, and vice versa). Since there are hundreds of potential interactions one could add to the admissions model, and it is not computationally feasible to include all of them, it is unclear why Prof. Arcidiacono chose to include specific interactions (for example, allowing the effect of gender and “disadvantaged” status to vary by race) and not others. Decisions to add interactions to a model like Prof. Arcidiacono’s are typically guided by a clear economic theory or methodological goal. The typical approach in a model trying to isolate the effect of Asian-American ethnicity on admissions outcomes would be to include an interaction between race and disadvantaged status only if the effect of being disadvantaged is different for Asian-American and White applicants (or, equivalently, if the effect of race is different for disadvantaged and non-disadvantaged applicants). Prof. Arcidiacono’s results, however, show that is not the case. In my model, I remove the interactions with race and gender. This is a more transparent approach that requires fewer subjective judgments about which of the hundreds of interactions that can be included in such a model should be included.

Arcidiacono Report, p. 62.
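The role of an interaction term, as described in paragraph 100, can be written out in a stylized logit index. The symbols below are illustrative notation introduced here, not notation taken from either report.

```latex
\[
\text{index}_i = \alpha
  + \beta_1\,\text{Male}_i
  + \beta_2\,\text{AsianAmerican}_i
  + \beta_3\,(\text{Male}_i \times \text{AsianAmerican}_i)
  + X_i'\gamma
\]
```

Under this notation, the association of Asian-American ethnicity with the index is $\beta_2$ for female applicants and $\beta_2 + \beta_3$ for male applicants. If $\beta_3 = 0$, the interaction term adds nothing and the specification reduces to one without it.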

Control variables used in logit models of admission

Professor Arcidiacono’s Models 5 and 6

Race Variables
    Race (White, African-American, Hispanic, Native American, Hawaiian/Pacific Islander, Asian-American, and Missing)

Base Control Variables
    Year, gender, docket
    First generation college
    Disadvantaged, fee waiver, and financial aid
    Mother and father education level
    Early Action†
    Athlete†, legacy†, double legacy†
    Child of Harvard faculty or staff†
    Dean or Director’s interest list†

Profile Ratings
    Academic, extracurricular, and athletic ratings
    Personal rating (Model 6 Only)

Other Ratings
    Teacher ratings
    Alumni interview ratings
    Guidance counselor rating
    Overall rating (Model 6 Only)*

Academic Variables
    ACT/SAT Math and Verbal, Average SAT Subject Test Score
    Converted GPA and indicator for value of 35
    Academic Index Quadratic

High School and Neighborhood Characteristics
    High school cluster ID*
    Neighborhood cluster ID*

Interactions
    Race and intended concentration interacted with gender*
    Disadvantaged, early decision†, and legacy† interacted with race*
    Missing data indicators interacted with race*

Missing Data Indicators
    Alumni interviewer rating*, cluster IDs*, and Average SAT Subject Test Score

Additional Variables in Card Models

    Mother and father occupation
    Mother or father deceased
    Parent attended Ivy League college
    Rural applicant
    Intended career
    School type (public, private, Archdiocese)
    Parent attended Harvard Graduate School
    Born in United States, lived outside of United States
    Permanent resident of United States
    Primary extracurricular activity indicators
    Total hours of work
    Profile rating combinations
    Alumni interview ratings combinations
    Teacher ratings combinations
    Staff interview ratings
    High school characteristics (such as average SAT math)
    Neighborhood characteristics (such as median income)
    SAT state indicator

Legend:
    * Removed from Card Models.
    † Included in expanded sample model only.

Source: Augmented Arcidiacono Data

101. I also correct a variety of technical errors in Prof. Arcidiacono’s sample and control

variables:

• Prof. Arcidiacono treats profile ratings of 7, 8, and 9 as low ratings, but 7, 8, and 9 ratings do not appear in the reader guidelines and thus are more likely erroneous data entries. I treat such entries as missing ratings.

• Prof. Arcidiacono drops applicants with blank teacher ratings from his regressions, rather than including them in the “missing” category of his teacher ratings variables. I include such entries in the “missing” category.

• Prof. Arcidiacono makes an error when importing the ACT science scores. I correct this error so that they are imported correctly.

• I remove sample conditions related to the overall rating since the overall rating is excluded from all of my models and thus there is no need to exclude applicants with low or missing overall ratings.

5.1.2. A year-by-year model is more appropriate than a pooled model

102. Another important methodological flaw in Prof. Arcidiacono’s approach is his decision

to pool admissions data across years. This decision is flawed for several reasons.

103. First, the admissions process at Harvard is, by its nature, an annual process. Each

applicant is compared to other applicants who applied in that year. A pooled analysis does not reflect

how the process actually works, because it effectively compares applicants from different years to

each other.

104. Second, a closely related problem with a pooled model is that it imposes the assumption that every factor in the admissions process has the same effect from year to year. Given that the applicant pool changes from year to year, it is quite possible that the relative abundance and scarcity of relevant factors can also change, which can cause the value Harvard places on any given factor to also change from year to year. Below I provide several examples of how this dynamic might play out.

2018 Reading Procedures at HARV00015414 – 15.

As noted above, I also exclude the overall rating from all of my models. As discussed above, according to deposition testimony in this case, race can influence the overall rating. Since my analysis seeks to isolate the incremental effect of race on admissions decisions, it is inappropriate to include any variables that can themselves be affected by race. Removing the overall rating from my model is a conservative approach because White applicants have slightly higher overall ratings, on average, than Asian-American applicants. My analyses show that, even without the inclusion of overall rating in the models, there is no evidence of bias in admissions decisions.

105. One example is that, during the period for which I have data, Harvard saw a shift in

applicants’ intended concentrations. (See Exhibit 15.)

Because Harvard seeks to admit a

class that is diverse with respect to intended concentrations, the effect of an applicant’s intent to

concentrate in a given field might well change when the aggregate interests of the applicant pool as a

whole vary over time. Thus, for example, an applicant’s intention to concentrate in the humanities

might distinguish an applicant more or less depending on the overall mix of intended concentrations

in the applicant pool in that year.

One potential factor contributing to this shift in intended concentrations is that in 2007, Harvard elevated the Division

of Engineering and Applied Sciences to the School of Engineering and Applied Sciences. John A. Paulson School of

Engineering and Applied Sciences, “Timeline,” available at https://www.seas.harvard.edu/about-seas/history-

seas/timeline, accessed November 20, 2017.

Redacted

The mix of intended concentrations for Harvard applicants has changed over time

Source: Augmented Arcidiacono Data

Note: Sample consists of applicants to the classes of 2014 – 2019 in Prof. Arcidiacono’s corrected expanded sample. Applicants with

missing and “Unspecified” intended concentrations are excluded from this chart.

106. Another example is that the definition of the dockets (the geographical divisions Harvard

uses in its admissions process) changed during the time period for which I have data. Starting with

the class of 2015, Harvard introduced the J docket. For the classes of 2015 – 2019, the J docket

included applicants from Arkansas, Kansas, Kentucky, Mississippi, Missouri, Western New York,

Oklahoma, Tennessee, and West Virginia, but for the class of 2014 those applicants were distributed

across other dockets.

Prof. Arcidiacono’s model cannot account for this change because it estimates the effect of an applicant’s docket placement on admission only after pooling years together. Thus, his model estimates docket effects incorrectly because it conflates the two different definitions of dockets across years.

See workpaper.

107. Additionally, as I discuss in greater detail in Section 7 below, Harvard did not employ an

Early Action admissions process for the classes of 2014 and 2015. Starting with the class of 2016, it

reinstated Early Action.

Prof. Arcidiacono’s model cannot account for these changes because he

pools all the data together into a single model. As a result, the estimated effect of each variable in his

model is calculated using two different admissions regimes—one in which Early Action admissions

existed and one in which it did not. That is problematic for both his expanded and baseline samples.

Excluding Early Action applicants, as he does in his baseline sample, is not sufficient to correct for

the problem, because there is no way to identify which applicants would have applied Early Action

had Early Action existed.

108. Variation in the admission rate across the six admission cycles for applicants with the

same profile ratings combinations provides further justification for estimating the model year-by-

year. For example, consider applicants with ratings of 2 on all four dimensions (academic,

extracurricular, personal, and athletic). Applicants with this ratings combination have an admission

rate that varies between 61% and 77% depending on the admissions cycle.

By pooling data across years, Prof. Arcidiacono’s model assumes these ratings have the same effect in each year.
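The year-to-year comparison described in paragraph 108 amounts to computing, for a fixed combination of the four profile ratings, the admission rate separately for each admissions cycle. A minimal sketch follows. The record layout, the function name `admit_rate_by_year`, and the sample records are hypothetical and invented for illustration; they do not reproduce the figures reported above.

```python
from collections import defaultdict


def admit_rate_by_year(records, combo):
    """Admission rate per class year among applicants whose four profile
    ratings (academic, extracurricular, personal, athletic) equal `combo`."""
    admitted = defaultdict(int)
    total = defaultdict(int)
    for year, ratings, was_admitted in records:
        if ratings == combo:
            total[year] += 1
            admitted[year] += int(was_admitted)
    # Only years with at least one matching applicant appear in the result.
    return {year: admitted[year] / total[year] for year in sorted(total)}


# Invented records: (class year, profile ratings tuple, admitted?)
records = [
    (2014, (2, 2, 2, 2), True), (2014, (2, 2, 2, 2), False),
    (2015, (2, 2, 2, 2), True), (2015, (2, 2, 2, 2), True),
    (2015, (2, 2, 2, 2), True), (2015, (2, 2, 2, 2), False),
    (2015, (3, 3, 3, 3), False),
]
rates = admit_rate_by_year(records, (2, 2, 2, 2))
```

A pooled specification would summarize such a ratings combination with a single coefficient, whereas a year-by-year specification allows the implied rate to differ across cycles.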

109. To formally test whether the effect of various applicant characteristics on applicants’

likelihood of admission is sufficiently similar across years to justify using a “pooled” model as Prof.

Arcidiacono does, I have employed a standard statistical test known as a Wald test (or a chi-squared

test). That test is designed to evaluate the null hypothesis that applicant characteristics have identical

effects on likelihood of admission from year to year. I find that the Wald test rejects that null

hypothesis here, indicating that a pooled model is inappropriate.

Additionally, as I will discuss in

more detail below, I find that my year-by-year models are better able to predict admission, a further

justification for using year-by-year models rather than a pooled model. Given these results, and the

fundamental fact that Harvard’s admissions decisions are made separately for each year, the

Harvard Office of Institutional Research presentation, “Admissions and Financial Aid at Harvard College,” February

2013, HARV00031687 – 1772 (“OIR Presentation”) at HARV00031695.

See workpaper.

To implement this test, I start with Prof. Arcidiacono’s Model 6 for the expanded, pooled sample, and exclude the

overall rating and interactions with race and gender. I then interact all other control variables with each of his dummy

variables for each year to directly test whether the effects of the control variables change from year to year. In

implementing the test, I combine Native American and Hawaiian/Pacific Islander applicants with Hispanic applicants and

combine personal ratings of 1 and 2 into one dummy variable because there are too few applicants in those categories to

allow me to interact each variable with each separate year dummy. See workpaper.

methodologically sound approach is to estimate a separate model for each year.
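For illustration only, the logic of a Wald test of equality across years can be sketched for a single coefficient. This is a simplified stand-in, not the joint test over all interacted controls described above: it assumes the yearly estimates are independent, and every number in it is hypothetical.

```python
# Illustrative Wald-type test of whether one coefficient is equal across years.
# All numbers below are hypothetical; they are not estimates from the report.

# 5% critical values of the chi-squared distribution, indexed by degrees of freedom.
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def wald_equality_test(estimates, std_errors):
    """Test H0: the coefficient is identical in every year.

    Assumes the yearly estimates are independent, so the Wald statistic
    reduces to a precision-weighted sum of squared deviations from the
    pooled (precision-weighted) estimate.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    statistic = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    dof = len(estimates) - 1
    return statistic, dof, statistic > CHI2_CRIT_5PCT[dof]


# Hypothetical logit coefficients on one rating dummy in six admissions cycles.
yearly_coefs = [1.10, 1.45, 0.95, 1.60, 1.20, 1.75]
yearly_ses = [0.12, 0.11, 0.13, 0.12, 0.10, 0.12]

stat, dof, reject = wald_equality_test(yearly_coefs, yearly_ses)
print(f"Wald statistic = {stat:.2f} on {dof} degrees of freedom; reject H0: {reject}")
```

A statistic above the critical value means the yearly estimates differ by more than sampling error alone would suggest, which is the sense in which a rejection argues against pooling.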

5.1.3. Definition of race

110. In my analysis, I generally use the same method for classifying applicants by race as

Prof. Arcidiacono uses, to ensure comparability of results. However, when estimating a separate

model for each year, I have to combine one race group with another because there are very

few applicants of a particular race (e.g., Native American or Hawaiian/Pacific Islander) in any one

year.

In Prof. Arcidiacono’s model, applicants are classified into mutually exclusive categories of

White, African-American, Hispanic, Native American, Hawaiian/Pacific Islander, Asian-American,

and Missing.

In my year-by-year models, I use the following mutually exclusive race categories:

(1) White, (2) African-American, (3) Hispanic, Native American, or Hawaiian/Pacific Islander,

(4) Asian-American, and (5) Missing.

I combine Native American and Hawaiian/Pacific Islander

applicants with Hispanic applicants because the increased probability of admission associated with

Native American and Hawaiian/Pacific Islander ethnicity is most similar to the increased probability

of admission associated with Hispanic ethnicity.

To ensure that my estimate of the alleged bias

against Asian-American applicants is robust to this change, I have tested whether using this adjusted

definition of race has any substantive effect on the Asian-American coefficient within Prof.

Arcidiacono’s pooled sample. It does not.

111. I have also considered the possibility (raised by Prof. Arcidiacono) that the fact that

some applicants to Harvard are not classified as belonging to any racial group (i.e., are “Missing”

race) might lead to an underestimate of the alleged bias against Asian-American applicants. For the

purpose of this analysis, I use other variables in the Harvard data with information about an

applicant’s race that Prof. Arcidiacono did not use in creating his definition of race. For example, if

Prof. Arcidiacono himself finds evidence that it is better to estimate the model separately for each year. For example, he

presents a model in his report using the expanded sample in which he interacts year and race (thus allowing each race to

have a separate effect in each year). That model finds that the effect of race differs in a statistically significant fashion

across years (Arcidiacono Report, Appendix B, Table B.8.1).

Prof. Arcidiacono also combines smaller race groups when he estimates a model that interacts his race categories with

year (Arcidiacono Report, p. 69, footnote 69).

This race definition is a variable available in the Harvard database. In 2010, Harvard began using an additional

methodology that allows applicants who self-identified with more than one race to be counted in more than one category

(Deposition of Elizabeth Yong, March 24, 2017 (“Yong Deposition”), pp. 134–137).

I also combine Hawaiian/Pacific Islander applicants with Asian-American applicants, rather than grouping them with

Hispanic applicants, in a sensitivity analysis of my preferred model (discussed in more detail below).

See workpaper.

an applicant reports her race to the College Board when taking the SAT, it is provided to Harvard

along with the test score. Using these variables, it is possible to identify the race of many applicants

that are classified as missing race in Prof. Arcidiacono’s analysis. In fact, I am able to classify nearly

70% of the 10,000 applicants classified as having a “missing” race.

When I re-estimate Prof.

Arcidiacono’s model with these additional applicants’ race information filled in, I find that in fact his

estimates of the effect of Asian-American ethnicity become slightly less negative, not more.

100

That

directly contradicts Prof. Arcidiacono’s claim that the exclusion of these applicants’ races likely

causes his model to underestimate the bias against Asian-American applicants.

5.1.4. Prof. Arcidiacono’s Models 1-4 are not reliable

112. Prof. Arcidiacono offers six different models to estimate the effect of Asian-American

ethnicity on the probability of admission. My analysis builds exclusively on Prof. Arcidiacono’s

Models 5 and 6, for two reasons.

113. First, Prof. Arcidiacono states that “Model 5 is the most useful of [his] models for

determining the effect/impact of race in admissions decisions,”

101

and Mr. Kahlenberg uses Model 6

as his preferred model for simulating race-neutral admissions practices. SFFA’s own experts thus

agree that Models 5 and 6 are the most reliable.

114. Second, as explained above, Models 1, 2, 3, and 4 are unreliable because they do not

account for any of Harvard’s ratings on non-academic dimensions. As detailed in Section 3 above,

Harvard’s admissions process considers a wide variety of non-academic factors, and non-academic

excellence is rarer in the Harvard applicant pool than academic excellence. Harvard’s profile and

school-support ratings play an essential role in capturing non-academic information, much of which

is not otherwise quantified. Because Prof. Arcidiacono’s Models 1–4 ignore that critical information,

they cannot reliably estimate the effect of race.

115. Exhibit 16 helps illustrate this point. It reports the Pseudo R-Squared value for each of

Prof. Arcidiacono’s Models 1–6. The Pseudo R-Squared statistic provides a useful summary measure

of the extent to which the variables included in a model explain the outcome being modeled (in this

case, admission to Harvard). It can take on values ranging from zero to one; the closer it is to one, the

more the model explains about Harvard’s admission decisions. Models 1–4 have Pseudo R-Squared

See workpaper. Although I understand that admissions officers rely on applicants’ self-identification of their race on

the application (see Deposition of Grace Cheng, April 7, 2017, pp. 114–115; Banks Deposition, p. 190; Fitzsimmons

Deposition, pp. 239–240), I use race information reported by the applicant on the SAT, SAT II, and ACT tests for the

limited purpose of this sensitivity analysis. I do not include this additional race information in the rest of my models.

100

See workpaper.

101

Arcidiacono Report, p. 62.

values of 0.34 or lower—very low, and much lower than the Pseudo R-Squared values of Models 5

and 6, which jump to 0.57 and 0.65, respectively (for the expanded sample). That is because Models

1–4 ignore critical information on which Harvard relies when making admission decisions.

Exhibit 16: Explanatory power of Professor Arcidiacono’s logit models of admission

Source: Arcidiacono Report, Appendix B, Tables B.7.1 and B.7.2.
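As a rough illustration of how a Pseudo R-Squared summarizes explanatory power, the sketch below computes one common variant, McFadden's, which is one minus the ratio of the model's log-likelihood to that of a model that predicts only the overall admission rate. The choice of McFadden's definition is an assumption (the text does not say which variant is reported), and the data are hypothetical.

```python
import math


def log_likelihood(outcomes, probs):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probs)
    )


def mcfadden_pseudo_r2(outcomes, probs):
    """1 - LL(model) / LL(null), where the null model predicts the base rate."""
    base_rate = sum(outcomes) / len(outcomes)
    ll_null = log_likelihood(outcomes, [base_rate] * len(outcomes))
    ll_model = log_likelihood(outcomes, probs)
    return 1.0 - ll_model / ll_null


# Hypothetical data: 1 = admitted, 0 = rejected.
admitted = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
weak_model = [0.30, 0.20, 0.15, 0.25, 0.25, 0.20, 0.15, 0.20, 0.15, 0.15]
strong_model = [0.85, 0.05, 0.02, 0.10, 0.80, 0.05, 0.03, 0.04, 0.03, 0.03]

print(f"weak model:   {mcfadden_pseudo_r2(admitted, weak_model):.2f}")
print(f"strong model: {mcfadden_pseudo_r2(admitted, strong_model):.2f}")
```

A model whose predicted probabilities barely differ from the base rate scores near zero; one that assigns high probabilities to actual admits and low probabilities to everyone else scores closer to one.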

5.1.5. The expanded sample is more appropriate than the baseline sample

116. Prof. Arcidiacono presents his models using two different samples—one that he refers to

as the “baseline sample” and one that he refers to as the “expanded sample.” The baseline sample

removes lineage applicants, recruited athletes, children of Harvard faculty and staff, candidates who

appear on the Dean’s or Director’s interest lists, and Early Action applicants. My analyses rely on the

expanded sample, for several reasons.

117. First, as a general matter, Harvard compares all of its applicants in each year to all other

applicants in the pool for that year; it does not conduct separate admissions processes for discrete

subsets of the pool. Harvard seeks a diverse class in each year on any number of dimensions—

academic, extracurricular, geographic, racial and ethnic, and so on. Thus, the fact that some

candidates with particular attributes (such as lineage applicants or recruited athletes) have a higher

likelihood of admission does not mean that they should be completely excluded from the analysis.

Such candidates are still compared to other candidates on all dimensions, and their candidacy can

affect how other decisions are made. By throwing such information out of the analysis, the model

cannot use that information to explain why other applicants were or were not admitted.

118. This methodological flaw is of particular concern in Prof. Arcidiacono’s decision to

remove from his baseline sample applicants for Early Action admission. This decision is inconsistent

with how Harvard’s admissions process works. It is my understanding that Harvard does not have a

different standard for admission in the Early Action process, and most applicants who apply early and

are not admitted have their applications “deferred” to the Regular Decision phase,

102

where they

compete with applicants who did not apply early.

103

Removing early applicants from the sample thus

has the effect of modeling only part of the regular admissions cycle, excluding many applicants with

whom the included applicants are competing for spots.

119. Second, as noted above, the Early Action process did not exist in two of the years of data

used in Prof. Arcidiacono’s model. Thus, for years in which there was no Early Action process (the

class of 2014 and 2015 admissions cycles), Prof. Arcidiacono’s “baseline sample” includes a different

set of applicants than in years for which Early Action was available. Further, because Prof.

Arcidiacono pools data across all years and then excludes Early Action applicants for years in which

Early Action existed, his baseline sample combines multiple years of data that have different

definitions of a “baseline” sample, creating a pooled sample that is inconsistent. That is a major

problem with his “baseline” sample and pooled model.

120. Finally, because it is important to estimate the models separately by year, limiting the

sample to Prof. Arcidiacono’s “baseline” sample unnecessarily reduces the sample size of the year-

by-year models, which reduces the power and precision of the models.

5.1.6. The importance of factors that Harvard values but that are not measured in the data

121. As detailed throughout this report, Harvard’s admissions process considers non-

academic factors that are relatively scarce in the applicant pool and difficult to quantify in a

regression model. Even after enriching my admissions model to capture a variety of such factors that

are missing from Prof. Arcidiacono’s model (and to improve its predictive power relative to Prof.

Arcidiacono’s model), my model still does not perfectly explain all of Harvard’s admissions

decisions. This implies that there are additional factors not measured by my model that are important

102

See workpaper.

103

McGrath Deposition 2015, p. 210 (“Q. And then everything we just said about the information that gets presented to

the subcommittee is the same for regular action as it is for early action? A. Yes.”); Weaver Deposition, Volume II, pp.

172–173 (“Q. Besides the timing, what other variations are there between …early action and regular action? … A. There

are differences between the two in the sense of timeline and the quantity of applications; however, the process and the

way in which a folder moves through the process is similar.”); Ray Deposition, p. 55 (“Q. When you go to subcommittee

in the regular action review process, …do you follow the same format that you did in early action review? … A. Yes. …

Q. And do you typically give the same designations for students—namely, admitted, wait list, rejected, FAO hold—

during the subcommittee process and regular action review? ... A. Yes. The only different action is that there is no defer

action in regular action.”).

to Harvard’s admissions decisions. The omission of such factors from the model presents a classic

example of a problem known as “omitted variable bias,” or what I have referred to above as the

“missing data” problem.

104

122. Omitted variable bias occurs whenever a regression model omits variables that (1) are

correlated with the variable of interest and (2) affect the outcome variable. In that circumstance, the

effect of the omitted variable on the outcome may incorrectly be attributed to the variable of interest.

Here, the variable of interest is race, so the omission of variables that are correlated with race and

affect admissions outcomes—such as the non-academic factors discussed throughout this report—can

lead the model to misattribute to race differences in admissions outcomes that are in fact attributable

to the omitted variables.
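The mechanism can be illustrated with a small simulation. The sketch below uses ordinary least squares rather than a logit model, and entirely hypothetical data: a 0/1 group indicator that has no true effect on the outcome, and an unmeasured factor that is correlated with the group and does drive the outcome. The variable names are illustrative assumptions.

```python
import random


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def residualize(y, x):
    """Residuals from regressing y on x (with an intercept)."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


random.seed(0)
n = 5000
# group: the variable of interest (0/1). It has NO true effect on the outcome.
group = [random.randint(0, 1) for _ in range(n)]
# factor: an unmeasured trait that is correlated with group and drives the outcome.
factor = [0.8 * g + random.gauss(0, 1) for g in group]
outcome = [1.0 * f + random.gauss(0, 1) for f in factor]

# Short regression omits the factor, so its effect is loaded onto group.
biased = slope(group, outcome)

# Long regression controls for the factor (via the Frisch-Waugh-Lovell steps:
# partial the factor out of both group and outcome, then regress the residuals).
controlled = slope(residualize(group, factor), residualize(outcome, factor))

print(f"group effect, factor omitted:  {biased:.2f}")
print(f"group effect, factor included: {controlled:.2f}")
```

With the factor omitted, the regression attributes a sizable effect to the group indicator even though none exists by construction; once the factor is controlled for, the estimated group effect is close to zero.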

123. Statistical methods can help quantify the importance of unmeasured, individualized

factors in the decision process relative to factors that are more easily measured. These methods can

help us understand the degree to which factors outside of the model might bias the results, and/or

explain the reasons a specific applicant was ultimately admitted or denied admission. Below are four

widely accepted methods that I will use in the remainder of this section, and that will be important in

showing that Prof. Arcidiacono’s model is missing critical information.

• Measures of overall fit and predictive accuracy: These statistics measure

how well the model explains, or predicts, the outcome of interest (in this

case, admission to Harvard). I will rely primarily on two such metrics.

The first is Pseudo R-Squared, a measure of how well the variables

included in the model explain the outcome. The second is the fraction of

admitted applicants for whom the model correctly predicts the actual

admission outcome.

105

• Predicted probability of admission for each individual applicant:

Whereas the metrics discussed above reflect how well the model

104

James H. Stock and Mark W. Watson, Introduction to Econometrics (Pearson, 2015), pp. 183–184 (“If the regressor is

correlated with a variable that has been omitted from the analysis and that determines, in part, the dependent variable,

then the OLS estimator will have omitted variable bias.”); Sharmila Choudhury, “Reassessing the Male-Female Wage

Differential: A Fixed Effects Approach,” Southern Economic Journal 60(2), 1993, pp. 327–340 at p. 327 (“The

conventional approach of economists has been to estimate earnings as a function of various socio-economic

characteristics. The observed wage gap is decomposed into a part explained by productivity related factors and an

unexplained residual, traditionally labelled as discrimination. While it is possible that the unexplained variation earnings

is the result of discrimination, it is also possibly the result of model misspecification ... we address the misspecification

that could possibility arise from omitted variables…”)

105

Because the logit model estimates a probability of admission for each applicant, I compute this statistic by ranking

applicants from highest to lowest predicted probability of admission and considering the top-ranked applicants to be

admitted, such that the number of predicted admitted students matches the number of actual admitted students.

explains admissions outcomes in aggregate, the importance of

unmeasured factors in any individual admissions decision can be

quantified using the estimated probability of admission for each

individual applicant. For example, if the model generates an estimated

probability of admission close to zero for an applicant who is actually

admitted, or vice versa, it suggests that there are unobserved factors that

substantially affected the admission outcome. A particularly useful

exercise is to compare the predicted probability of admission for any

given applicant to the final admission decision. The difference between

the predicted probability of admission and the actual admission decision

is a measure of the importance of unobserved factors that are valued by

admissions officers but not included in the model. The larger the

difference, the more important unobserved factors were in the final

decision.

• Sensitivity of coefficients to inclusion/exclusion of additional control

variables: Another way to assess the influence of unmeasured factors on

a given outcome variable is to estimate “sensitivity” analyses that

include different sets of control variables, testing how the effect of a

particular variable of interest changes when different sets of controls are

included. Prof. Arcidiacono employs this analysis himself when using

his Models 1–6, as will I in order to better understand the effect of

factors that cannot be included in my estimation.

• Subgroup analysis: A closely related method for assessing the

importance of unmeasured factors is subgroup analysis. If racial bias is

the cause of a disparity between racial groups in an outcome like

admission to Harvard, then one would expect to see the disparity persist

across all relevant subgroups, time periods, and outcomes in the data.

For example, a bias against applicants of a particular race should affect

men and women of that race alike, and should affect members of that

race across all years, since race is consistent across gender and time. On

the other hand, if the racial disparity is caused by unobserved factors

rather than by bias, it is much more likely that the disparity will vary

across subgroups because, simply by chance, the relative strength and

weakness of each racial group on unmeasured factors will differ by

subgroup. Similar logic applies if the disparity at issue is not consistent

across different outcome measures: if the admissions process is in fact

biased, there should be consistent evidence of bias not just in ultimate

admissions decisions but in other types of outcomes that reflect the

judgment of admissions officers, such as profile ratings. I employ these

types of analyses in Section 5.3 below.
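The first two methods listed above can be sketched as follows, using hypothetical applicants and predicted probabilities. The function names are illustrative assumptions, and ties in predicted probability are not treated specially.

```python
def share_of_admits_predicted(actual, predicted_prob):
    """Fraction of actual admits that fall among the top-ranked applicants.

    Applicants are ranked by predicted probability, and the top N are treated
    as predicted admits, where N is the number of actual admits.
    """
    n_admits = sum(actual)
    ranked = sorted(range(len(actual)), key=lambda i: predicted_prob[i], reverse=True)
    predicted_admits = set(ranked[:n_admits])
    hits = sum(1 for i in predicted_admits if actual[i] == 1)
    return hits / n_admits


def unexplained_gap(actual, predicted_prob):
    """Absolute gap between each actual decision (0/1) and its predicted probability."""
    return [abs(a - p) for a, p in zip(actual, predicted_prob)]


# Hypothetical applicants: 1 = admitted, 0 = not admitted.
actual = [1, 0, 1, 0, 0, 1, 0, 0]
predicted_prob = [0.90, 0.10, 0.05, 0.60, 0.02, 0.70, 0.03, 0.20]

accuracy = share_of_admits_predicted(actual, predicted_prob)
gaps = unexplained_gap(actual, predicted_prob)
print(f"share of admits correctly predicted: {accuracy:.2f}")
print(f"largest unexplained gap: applicant {gaps.index(max(gaps))}, gap {max(gaps):.2f}")
```

In this toy example the applicant at index 2 is admitted despite a predicted probability of 0.05, the kind of case in which factors outside the model appear to have driven the decision.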

5.1.7. The importance of average marginal effects

124. One final technical note warrants discussion. In the appendix tables of his report, Prof.

Arcidiacono reports only the logit coefficients of the race variables from his regression models.

Those coefficients show the marginal effect of a given variable (e.g., an indicator for Asian-American

ethnicity) on the logarithm of the odds (the so-called log-odds) of admission, rather than the marginal

effect of a given variable on the probability of admission for a given candidate. Importantly, in a logit

model, the marginal effect of any given variable on an applicant’s probability of admission varies

depending on that applicant’s other characteristics, and there is no single parameter that measures the

gap in admission probabilities between different subgroups. As a result, simply reporting the logit

coefficient for a given variable does not convey the effect of that variable across all applicants in the

relevant population (here, Asian-American applicants).

125. For example, consider an applicant to Harvard who has an academic rating of 4 or worse.

She will have very little chance of admission given Harvard’s high academic standards. Thus, even if

she was very active in her high school, served as president of the student government, and

volunteered at numerous community organizations (all characteristics Harvard values), she would

still have very little chance of admission. Those factors will have essentially zero marginal effect on

her probability of admission. On the other hand, if the same candidate had an academic rating of 1 or

2, then the marginal effect of her strong extracurricular and community service record on her

probability of admission would be much larger. This is what is referred to as a “non-linear” effect—

the effect of the student’s non-academic achievements depends on whether her academic

qualifications are strong enough for her to be in the running.
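A short numerical sketch makes the non-linearity concrete. The coefficient and baseline log-odds below are hypothetical values chosen only to illustrate the shape of the logistic function.

```python
import math


def logistic(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def change_in_probability(baseline_log_odds, coefficient):
    """Change in probability from adding `coefficient` to the baseline log-odds."""
    return logistic(baseline_log_odds + coefficient) - logistic(baseline_log_odds)


# Hypothetical values: the same logit coefficient applied to two applicants.
extracurricular_coef = 1.0
weak_academics_log_odds = -8.0    # almost no chance of admission to begin with
strong_academics_log_odds = -1.0  # already in the running

weak_change = change_in_probability(weak_academics_log_odds, extracurricular_coef)
strong_change = change_in_probability(strong_academics_log_odds, extracurricular_coef)
print(f"weak academics:   {100 * weak_change:.2f} percentage points")
print(f"strong academics: {100 * strong_change:.2f} percentage points")
```

The same coefficient moves the first applicant's probability by a small fraction of a percentage point and the second applicant's by many percentage points, which is why a logit coefficient alone does not describe the effect on the probability of admission.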

126. Because of these non-linear effects, the typical way to summarize the marginal effect of

a variable in a logit regression is to report its average marginal effect across all individuals who

possess the trait in question—rather than simply reporting its logit coefficient, as Prof. Arcidiacono

does.

106

For example, in the hypothetical above, one would report the average marginal effect of a

106

A. Colin Cameron and Pravin K. Trivedi, Microeconometrics: Methods and Applications (Cambridge University

Press, 2009), pp. 467, 501 (“[T]here are several ways to compute an average marginal effect. It is best to use …the

sample average of the marginal effects…Typically these [marginal effects] are then averaged over individuals to give an

average marginal effect[.]”); William H. Greene, Econometric Analysis (Pearson, 2008), p. 775 (“For computing marginal

effects, one can evaluate the expressions at the sample means of the data or evaluate the marginal effects at every

observation and use the sample average of the individual marginal effects…Current practice favors averaging the

individual marginal effects when it is possible to do so.”).

candidate’s extracurricular achievements across all candidates. If the average marginal effect of a

given variable is not statistically different from zero, one can conclude that on average the variable

does not have a significant effect on the outcome of interest. For this reason, I report all effects of

race in my report as average marginal effects.
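As a minimal sketch of the calculation, assuming a 0/1 indicator variable and hypothetical values throughout, an average marginal effect can be computed by switching the indicator on and off for each individual who has the trait and averaging the resulting changes in predicted probability.

```python
import math


def logistic(log_odds):
    return 1.0 / (1.0 + math.exp(-log_odds))


def average_marginal_effect(other_log_odds, coefficient):
    """Average change in predicted probability from switching the indicator on.

    `other_log_odds` holds, for each individual who has the trait, the part of
    the log-odds that comes from all other variables in the model.
    """
    effects = [
        logistic(base + coefficient) - logistic(base)
        for base in other_log_odds
    ]
    return sum(effects) / len(effects)


# Hypothetical individuals with the trait, and a hypothetical coefficient.
other_log_odds = [-7.0, -5.0, -3.0, -1.0, 0.5]
coefficient = -0.3

ame = average_marginal_effect(other_log_odds, coefficient)
print(f"logit coefficient: {coefficient}")
print(f"average marginal effect: {100 * ame:.2f} percentage points")
```

The result is expressed in percentage points of probability rather than in log-odds, and it depends on the distribution of the other characteristics among the individuals over whom it is averaged.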

127. Another shortcoming of Prof. Arcidiacono’s approach to reporting the logit coefficients

is that, because he estimates the effect of Asian-American ethnicity separately for men and women

and for those who are and are not identified by Harvard’s admissions officers as disadvantaged, most

of his analysis does not quantify the overall effect of ethnicity for the full set of Asian-American

applicants.

107

The Asian-American logit coefficient (-0.367) that he reports in his appendix table

B.7.1 and discusses in Section 3.7 of his report actually refers to the effect of Asian-American

ethnicity only for male applicants who are not disadvantaged, not the effect for the general

population of Asian-American applicants, including those who are disadvantaged and those who are

female.

108

To calculate the effect on the log-odds of admission of Asian-American ethnicity for non-

disadvantaged female applicants, for example, one must add together the Asian-American coefficient

and the Asian-American*female coefficient, yielding an effect on the log-odds of only -0.089—less

than one-quarter the size of the effect that Prof. Arcidiacono misleadingly reports.

109

Calculating an

average marginal effect, as I do throughout this report, corrects this problem by reporting a single,

average estimated effect of Asian-American ethnicity on likelihood of admission across all Asian-

American applicants.
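The arithmetic of combining a main effect with an interaction term can be reproduced directly from the two coefficients quoted above. The sketch covers only the gender interaction for non-disadvantaged applicants; the other interactions in the model are left out, and the function name is illustrative.

```python
# Logit coefficients as quoted in the text from Prof. Arcidiacono's Table B.7.1.
ASIAN_AMERICAN = -0.367
ASIAN_AMERICAN_X_FEMALE = 0.278


def log_odds_effect(female):
    """Effect of the ethnicity indicator on log-odds for a non-disadvantaged applicant."""
    return ASIAN_AMERICAN + (ASIAN_AMERICAN_X_FEMALE if female else 0.0)


male_effect = log_odds_effect(female=False)
female_effect = log_odds_effect(female=True)
print(f"non-disadvantaged men:   {male_effect:.3f}")
print(f"non-disadvantaged women: {female_effect:.3f}")
print(f"ratio (women / men):     {female_effect / male_effect:.3f}")
```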

5.2. My enriched model finds no statistically significant evidence of bias

128. I now turn to the results of my statistical model. As detailed above, I start with Prof.

Arcidiacono’s model and then include a richer set of control variables that he does not include in his

model and that more fully account for the substantial consideration Harvard gives to non-academic

factors, including contextual factors like high school, neighborhood, and family background. I then

use the model to test whether the disparity between Asian-American and White admission rates can

be explained by factors in the model other than race. As I show below, once additional relevant

factors are included in the model, Asian-American ethnicity has no consistent statistically significant

107

Prof. Arcidiacono’s Table 7.2 is an exception.

108

This also applies to Prof. Arcidiacono’s Table B.7.2 and various other tables reporting logit coefficients in his report,

such as those for his ratings regressions. Additionally, he includes interactions between race and missing variable

indicators for variables such as SAT II average, alumni interview rating and College Board cluster identifiers, so the

coefficients he reports are actually for Asian-American non-disadvantaged male applicants who are not missing these

covariates.

109

-0.089 = -0.367 + 0.278, summing the logit coefficients on Asian-American and female*Asian-American from Prof.

Arcidiacono’s Table B.7.1.

negative effect on applicants’ likelihood of admission.

5.2.1. With better and more complete control variables included in Prof. Arcidiacono’s regression model,

there is no statistically significant gap in admission rates between Asian-American and White applicants

129. Exhibit 17 presents one of the key findings of my analysis: The alleged effect of Asian-

American ethnicity on applicants’ likelihood of admission is statistically insignificant even in a

model that pools all applicants across years as Prof. Arcidiacono does.

130. Each row in Exhibit 17 reports the average marginal effect of Asian-American (relative

to White) ethnicity for a particular specification of Prof. Arcidiacono’s Model 6, including the

additional changes I make to Model 6 described above. The average marginal effect is the average

change in the estimated probability of admission associated with being Asian-American as opposed

to White, calculated across all Asian-American applicants in the sample.

Exhibit 17: Pooled logit models of admission do not show evidence of bias against Asian-American applicants

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Table shows the average marginal effect of race on admission for Asian-American applicants. The Card pooled model uses Professor

Arcidiacono’s corrected expanded sample; all other models use Professor Arcidiacono’s expanded sample. Marginal effects are calculated

relative to White applicants (using the same definition of race as Professor Arcidiacono). * indicates significance at the 5% level. Marginal

effects are reported as percentage point values.

131. The first row is calculated directly from Model 6 in Prof. Arcidiacono’s report. It shows

that the average marginal effect of Asian-American ethnicity in Model 6 is -0.46. This means that,

relative to the average White applicant, the average Asian-American applicant has a lower probability

of admission to Harvard—by 0.46 percentage points—controlling for all of the variables in Prof.

Arcidiacono’s model. This effect is statistically significant.

132. The second row also relies on Prof. Arcidiacono’s Model 6, but removes the overall

rating, which should not be included in any model that is attempting to estimate the effect of race

because (as discussed above) the overall rating may be influenced by an applicant’s race. In this

specification, the average marginal effect of Asian-American ethnicity becomes more negative, at

-0.58, and remains statistically significant. The third row then removes Prof. Arcidiacono’s

interactions of race with other variables (such as disadvantaged status, gender, and missing variable

indicators), and also removes interactions of gender with other variables (such as intended

concentration), which for the reasons discussed above should not be included. In this specification,

the average marginal effect of being Asian-American changes only slightly to -0.53 and remains

statistically significant. Moving forward, when I refer to Prof. Arcidiacono’s model, I will refer to the

version in row 3, as that is the version that I will build on as I enrich the model.

133. The fourth row of Exhibit 17 reports the key results of my enriched model, where I begin

with Prof. Arcidiacono’s model in row 3 and add the additional control variables detailed in Exhibit

14 above, including better measures of high school quality, high school and neighborhood

demographics, socioeconomic status, and staff interview ratings.

110

134. When I include these additional variables, the average estimated marginal effect of

Asian-American ethnicity falls by over 70% to -0.14, and—crucially—it becomes statistically

insignificant at the conventional 5% significance level. In other words, the model finds that there is

no statistically meaningful effect of Asian-American ethnicity on applicants’ likelihood of admission,

controlling for all of the variables in my enriched model.

111

135. Exhibit 18 shows in more detail how the average marginal effect of Asian-American

ethnicity falls as I add additional controls to the pooled model. The addition of information on

parental occupations causes the average marginal effect to fall from -0.59 to -0.41. Adding detailed

high school and neighborhood information (on top of the parental occupation information) causes the

effect to fall further to -0.29.

112

Further expanding the set of controls to include all the additional

controls I use in my model (e.g., intended career, staff interview ratings, and an indicator of whether

the applicant was born in the United States) causes the effect to fall still further, to -0.14, and to

become insignificant. At each step, I test whether the variables I have added are jointly statistically

110

Some variables in my model (but not Prof. Arcidiacono’s model), such as the detailed College Board high school and

neighborhood characteristics and the rural indicator, are unavailable for some applicants (primarily those on international

dockets or those who are home-schooled). Thus, when I add these variables to the model, applicants missing this

information are no longer included in the regression sample. Such applicants account for only 4.85% of the sample (see

workpaper). Prof. Arcidiacono includes such applicants in his sample by assigning them all to the same high school and

neighborhood clusters. This inappropriately groups applicants from varied backgrounds (such as those who are home-

schooled in the United States and those attending international high schools) into the same cluster identifier.

111

Treating Hawaiian/Pacific Islander applicants as Asian-American applicants attenuates the estimated effect even

further to -0.09 (see workpaper).

112

I reviewed the 60 individual high school and neighborhood variables available in the College Board data and found

several were redundant (with one another or with information available in the Harvard database) or had other limitations

that warranted their not being included.


significant, and in each case they are.

Additional control variables attenuate the estimated effect of Asian-American ethnicity

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Data are from applicants in Professor Arcidiacono’s corrected expanded sample. Marginal effects are calculated relative to White

applicants. * indicates significance at the 5% level. Marginal effects are reported as percentage point values. Other variables include

intended career, school type, parent attended Ivy League college, parent attended Harvard graduate school, parent living or deceased

status, rural indicator, permanent resident indicator, and staff interview rating.
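For illustration only, the following sketch shows one common way an average marginal effect of a group indicator can be computed from a logit model and expressed in percentage points. The coefficient, the linear index values, and the function names are hypothetical assumptions; they are not drawn from the data or workpapers underlying the exhibits.

```python
import math

def logistic(z):
    """Logistic function mapping a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def average_marginal_effect(controls_index, group_coef):
    """Average change in predicted probability from switching the indicator on.

    controls_index: linear index from all other controls, one value per
                    applicant (hypothetical values, not from the report).
    group_coef:     logit coefficient on the group indicator (hypothetical).
    Returns the effect in percentage points.
    """
    diffs = [logistic(xb + group_coef) - logistic(xb) for xb in controls_index]
    return 100.0 * sum(diffs) / len(diffs)

# Hypothetical linear indices for five applicants (lower = lower admit odds).
example_index = [-4.0, -3.0, -2.5, -1.0, 0.5]
ame_pp = average_marginal_effect(example_index, group_coef=-0.2)
print(f"Average marginal effect: {ame_pp:.2f} percentage points")
```

Because the logistic function is nonlinear, the same coefficient implies a different change in probability for each applicant, which is why the effect is averaged over applicants rather than read directly from the coefficient.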

136. As detailed above in Sections 3 and 4, a fundamental problem with Prof. Arcidiacono’s

models is that they put a great deal of weight on academic variables by including both the academic

rating and the various quantitative academic measures that inform that rating, but they include less

information on the critical non-academic factors (including contextual factors like high school,

neighborhood, and family background) that Harvard considers, and that differ on average between

White and Asian-American applicants. As shown in the prior two exhibits, when I address this

concern and include more variables that can capture differences across candidates in life experience

and circumstance, the disparity between Asian-American and White admission rates is fully

explained by the set of control variables in the model.

137. This result should not be surprising because a similar pattern is present (albeit to a lesser

degree) in Prof. Arcidiacono’s own models. Specifically, as he adds non-academic variables to his

model, including measures of socioeconomic status and non-academic ratings, the alleged negative

effect of Asian-American ethnicity is attenuated.

113

My enriched model has the same feature; it

simply adds a more inclusive set of measures of such factors into the model.

138. This is still a pooled model, as opposed to the year-by-year models that I consider

methodologically superior and that I discuss below. In other words, even if I accept Prof.

Arcidiacono’s methodological choice to use a pooled model, the addition of proper control variables

to his model negates any statistically significant negative effect of Asian-American ethnicity.

113

Arcidiacono Report, Appendix B, Table B.7.2.

139. Before moving on, I want to respond to one additional argument Prof. Arcidiacono

makes that is related to this point. Prof. Arcidiacono points to documents produced in this litigation

from Harvard’s Office of Institutional Research (OIR), summarizing statistical analyses performed by

that office, as supposedly corroborating his findings and his methodology. A careful review of the

relevant analyses, however, indicates that OIR’s research methodology actually supports my

methodological approach over Prof. Arcidiacono’s. Specifically, the documents indicate that OIR

understood that its models were “basic” and “preliminary” and that, like Prof. Arcidiacono’s, they

were missing important factors in the admissions process—particularly non-academic factors. For

example, one of the documents states that “[t]here are a variety of factors that quantitative data is

likely to miss or ratings not capture,” and then lists as examples “[e]xceptional talent,” “[t]he role of

context cases,” “[t]he role of the personal statement/essay,” and “[m]easures of socioeconomic

status.”

114

In other words, OIR’s documents recognize the same limitations in its analysis that I

recognize in Prof. Arcidiacono’s, and thus provide further support for my approach of expanding the

set of control variables to help the model better control for the many non-academic factors that are

important to the admissions process.

5.2.2. When the model is estimated year-by-year, it finds no evidence of a statistically significant negative

effect of Asian-American ethnicity

140. As detailed in Section 5.1.2, in my opinion the correct way to model admissions

decisions at Harvard is to examine each year separately. Prof. Arcidiacono’s model does not do that;

instead, it imposes the unrealistic assumption that Harvard’s admissions process compares applicants

across years and that each factor has the same effect in every year. The reality of the admissions

process is quite different. Candidates compete only against the other candidates applying in that year,

and Harvard’s admissions decisions in each year depend on the specific set of applicants in the pool

that year.

115

Moreover, as noted earlier, certain factors (like the use of Early Action) change from

year to year.

141. In Exhibit 19, I report results for the year-by-year models, my preferred methodology.

What I find is generally consistent with the pooled model. The average marginal effect of Asian-

114

OIR Presentation at HARV00031722.

115

A student who is admitted in a prior year but chooses to defer his admission, or a student offered deferred admission in

a prior year, is considered part of the admitted class in the year for which he will enroll but is still compared, in the

admissions process, only against other applicants in the year when he originally applied.


American ethnicity on applicants’ likelihood of admission across all six years of data

116

is statistically

indistinguishable from zero (-0.02), just like the average marginal effect in the pooled model,

indicating no statistical evidence of bias.

117
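The averaging of year-by-year estimates can be sketched as follows. This is a minimal illustration that assumes equal weighting across years and statistical independence of the yearly estimates; the report does not state either assumption, and the estimates and standard errors below are hypothetical numbers rather than values from Exhibit 19.

```python
import math

def average_effect(estimates, std_errors):
    """Equal-weighted mean of yearly estimates and its standard error.

    Assumes the yearly estimates are independent, so the variance of the
    mean is the sum of the yearly variances divided by n squared.
    """
    n = len(estimates)
    mean = sum(estimates) / n
    se = math.sqrt(sum(s * s for s in std_errors)) / n
    return mean, se

def significant_at_5pct(estimate, se):
    """Two-sided z-test using the normal critical value 1.96."""
    return abs(estimate / se) > 1.96

# Hypothetical percentage-point estimates for six admissions cycles.
yearly = [0.2, -0.4, 0.1, 0.3, -0.5, 0.0]
yearly_se = [0.25, 0.24, 0.26, 0.25, 0.27, 0.26]

avg, avg_se = average_effect(yearly, yearly_se)
print(f"average = {avg:.3f}, se = {avg_se:.3f}, "
      f"significant = {significant_at_5pct(avg, avg_se)}")
```

Under these assumptions the averaged estimate has a smaller standard error than any single year, so the average uses all years of data while each year's model is still estimated separately.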

142. However, by estimating the model year-by-year, I also gain some important information.

Specifically, in four of the six years the coefficients on Asian-American ethnicity are actually small

and positive—in other words, Asian-American ethnicity (relative to White ethnicity) is associated

with a higher likelihood of admission in those years, controlling for all other factors. The years with

positive estimated effects include three of the four years since the reinstatement of Early Action with

the class of 2016 cycle.

118

116

My pooled model generates a single estimate of the average marginal effect of Asian-American ethnicity on

applicants’ likelihood of admission. By contrast, my year-by-year model generates six different estimates—one for each

class. To ensure that my year-by-year estimates are comparable with Prof. Arcidiacono’s pooled estimate, I average the

six year-by-year estimates to obtain an average effect across all six years of data. This approach allows me to use all the

available years of data but estimate models that more accurately reflect Harvard’s admissions process.

117

This result also holds if I include average Advanced Placement exam scores in the 2017 – 2019 models (the only years

for which they are available in the data). Prof. Arcidiacono excludes these from his pooled model analysis because they

were only available in later years, but he argues that excluding such measures likely causes him to underestimate bias

since these are measures on which Asian-American applicants are relatively strong (Arcidiacono Report, pp. 77–78). His

dataset contains a variable for average AP exam scores for the classes of 2018 and 2019. I increase the coverage of this

variable to include 2017 AP scores (which are stored in a different field) and include the expanded variable in my year-

by-year models for 2017, 2018, and 2019. See workpaper.

118

If I estimate this model treating Hawaiian/Pacific Islander applicants as Asian, the estimated effect becomes positive

(though still statistically insignificant) on average across the six years. See workpaper.


Year-by-year logit models of admission show no consistent or statistically significant evidence of

bias against Asian-American applicants

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Table shows the average marginal effect of race on admission for Asian-American applicants relative to White applicants using

Professor Arcidiacono’s corrected expanded sample. * indicates significance at the 5% level. Marginal effects are reported as percentage

point values.

143. The predictive accuracy of my year-by-year enriched model is higher than that of all of

Prof. Arcidiacono’s models. As shown in Exhibit 20, my preferred model with the additional

information correctly predicts the admissions outcome for 74% of applicants, while Prof.

Arcidiacono’s preferred model (Model 5) correctly predicts the outcome for only 67% of applicants.


Card model has higher predictive accuracy than Prof. Arcidiacono’s preferred model

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Table shows the total share of admitted students correctly predicted by the model. Card models use Professor Arcidiacono’s

corrected expanded sample; all other models use Professor Arcidiacono’s expanded sample. Predictions assume that the applicants are

admitted in order of their predicted probability of admission from the model.
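The prediction rule described in the note above can be sketched as follows. This is an illustration under stated assumptions, not the workpaper code: the applicant pool is hypothetical, the function name is invented for the example, and the predicted class is taken to be the top N applicants by predicted probability, where N is the actual number of admits.

```python
def share_of_admits_predicted(pred_probs, admitted):
    """Share of actual admits that a rank-order prediction also admits.

    pred_probs: model-predicted admission probabilities, one per applicant.
    admitted:   actual outcomes (True if admitted), in the same order.
    The predicted class is the top-N applicants by predicted probability,
    where N is the actual number of admits.
    """
    n_admits = sum(admitted)
    ranked = sorted(range(len(pred_probs)),
                    key=lambda i: pred_probs[i], reverse=True)
    predicted_admits = set(ranked[:n_admits])
    hits = sum(1 for i in predicted_admits if admitted[i])
    return hits / n_admits

# Hypothetical pool of eight applicants with three actual admits.
probs = [0.90, 0.10, 0.65, 0.05, 0.40, 0.70, 0.02, 0.30]
actual = [True, False, False, False, True, True, False, False]
accuracy = share_of_admits_predicted(probs, actual)
print(f"Share of admits correctly predicted: {accuracy:.0%}")
```

For a year-by-year model, a ranking of this kind would presumably be applied within each admissions cycle, since applicants compete only against others in the same year; that detail is an assumption of this sketch.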

144. My model includes all applicants, including those who are waitlisted and then admitted

or denied admission from the waitlist. Prof. Arcidiacono presents an analysis comparing the share of

applicants of each race who were waitlisted and then denied admission to the admission rate of all

applicants of each race. He suggests that the fact that Asian-American applicants are more likely to

be denied admission after having been waitlisted, while having the lowest overall admission rate,

reflects bias against Asian Americans.

119

That analysis is fundamentally incomplete and misleading,

and cannot be taken as evidence of bias, because it does not account for the many qualifications that

differ on average between Asian-American and White applicants. My admission model discussed

above, which includes all applicants (including those who were waitlisted) and does account for

differences in qualifications, finds no evidence of bias against Asian-American applicants.

5.2.3. Prof. Arcidiacono’s analysis does not support the conclusion that the personal rating is biased

145. The models discussed above include as a control variable Harvard’s personal rating.

Using an ordered logit model that predicts personal ratings, Prof. Arcidiacono has argued that the

personal rating is biased against Asian-American applicants. Based on this result, he then argues that

the inclusion of the personal rating in the model is inappropriate. As discussed in Section 2 above,

there are several reasons why Prof. Arcidiacono’s statistical evidence of bias in the personal rating is

weak and does not justify the exclusion of the personal rating from his model. Here, I expand on this

issue.

119

Arcidiacono Report, pp. 31–32.

146. First, Prof. Arcidiacono’s model of personal ratings cannot reliably explain the

assignment of personal ratings. The Pseudo R-Squared value of the model is 0.28, which is quite low;

for example, Prof. Arcidiacono’s more reliable model of the academic rating has a Pseudo R-Squared

value of 0.56.

120

Additionally, the model has very low predictive accuracy. Of the 47 applicants in

Prof. Arcidiacono’s sample who have personal ratings of 1, his model correctly predicts their rating

zero percent of the time, and of the 30,976 applicants with a rating of 2, it correctly predicts their

rating only 45% of the time.

121
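As a rough illustration of what a Pseudo R-Squared value measures, the sketch below computes McFadden's version for a binary outcome. Both choices are assumptions made for the example: the report does not state which Pseudo R-Squared definition is used, and the ratings models at issue are ordered logits rather than binary models. The outcomes and fitted probabilities are hypothetical.

```python
import math

def log_likelihood(probs, outcomes):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    return sum(math.log(p) if y else math.log(1.0 - p)
               for p, y in zip(probs, outcomes))

def mcfadden_pseudo_r2(probs, outcomes):
    """1 - LL(model) / LL(null), where the null predicts the sample mean."""
    base_rate = sum(outcomes) / len(outcomes)
    ll_null = log_likelihood([base_rate] * len(outcomes), outcomes)
    ll_model = log_likelihood(probs, outcomes)
    return 1.0 - ll_model / ll_null

# Hypothetical outcomes and two sets of fitted probabilities.
y = [1, 0, 0, 1, 0, 0, 0, 1]
weak_fit = [0.45, 0.35, 0.40, 0.40, 0.30, 0.35, 0.40, 0.35]
strong_fit = [0.90, 0.10, 0.15, 0.85, 0.05, 0.10, 0.20, 0.80]

weak_r2 = mcfadden_pseudo_r2(weak_fit, y)
strong_r2 = mcfadden_pseudo_r2(strong_fit, y)
print(f"weak model:   {weak_r2:.2f}")
print(f"strong model: {strong_r2:.2f}")
```

A model whose fitted probabilities barely differ from the overall base rate scores near zero, while a model that separates the outcomes well scores closer to one.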

147. As detailed above, a common methodological challenge in assessing the potential for

racial bias using regression models is that a model almost always excludes some relevant

information. This concern is particularly significant in attempting to model Harvard’s personal rating,

which considers many individualized and hard-to-quantify factors (i.e., the “missing data” I discuss

above). Thus, if a regression estimates that race affects applicants’ personal ratings, there is a serious

question whether that estimated effect might actually be explained not by race but by racial

differences in some factor that is not included in the model and that affects the personal rating—in

other words, by omitted-variable bias (or “missing data”). One clear example of such missing data is

an applicant’s personal essay, which according to documents and testimony in this case is an

important consideration in the determination of the personal rating.

122

148. As discussed above, one way to determine if the missing data problem is affecting the

estimated effects of race in a particular model is to consider how the estimated effect in the model

changes as more of the available variables are added to the model. Importantly, Prof. Arcidiacono’s

own regression results show that the estimated effect of Asian-American ethnicity on the personal

rating shrinks as non-academic factors are added to his model of the personal rating. This pattern

suggests that, were more information available, the alleged effect could shrink further. For example,

in Table B.6.7 of Prof. Arcidiacono’s report, the coefficient of Asian-American ethnicity is -0.542 in

Model 3 before he has added controls for neighborhood and school background and for the relevant

ratings that feed into the personal rating. When he adds those controls (in his Model 5), the

coefficient falls to -0.366.

123

If the model could account for unobserved factors like the personal

120

Arcidiacono Report, Appendix B, Table B.6.5 and Table B.6.7.

121

See workpaper.

122

See, for example, Banks Deposition, pp. 79–80 (“Q. And for each of those categories, can you tell me how they were

assigned a numerical score?...[A] Extracurricularly, quality of achievement, strength of performance in any particular

domain, personal qualities, some grasp of the candidate’s personality, interest in other people, cooperation with others, a

sense of responsibility as gleaned from teacher recommendations, personal interview, personal essay, et cetera. Q. Okay.

So for the last category, the—the main inputs you would look at were recommendations, interview, and anything else? A.

The candidate’s essay.”); Walsh Deposition, p. 60 (“Q. How would you calculate that score?…[A.] I would like to take

into consideration whatever relevant information I had were that his essay, her essay, her interview, and the opinions

about that applicant as expressed by others.”); Ray Deposition, pp. 21–22 (“Q. What are the materials that you use—

materials or considerations that go into determining this person’s score?…A. For example, content in recommendation

letters, personal essays.”).

123

Arcidiacono Report, Appendix B, Table B.6.7.


essay, the gap could fall further.
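The omitted-variable mechanism described above can be illustrated with a small simulation. This is a sketch under loudly stated assumptions: it uses a linear regression rather than an ordered logit, the data are simulated rather than drawn from the case record, and the variable `unmeasured` is a hypothetical stand-in for any factor (such as essay quality) that affects the rating but is absent from the model. By construction, group membership has no direct effect on the simulated rating.

```python
import random

def demean(values):
    """Subtract the sample mean from each value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def short_and_long_coefficients(y, d, x):
    """Return the OLS coefficient on d without and with x as a control."""
    yd, dd, xd = demean(y), demean(d), demean(x)
    s_dd = sum(a * a for a in dd)
    s_xx = sum(a * a for a in xd)
    s_dx = sum(a * b for a, b in zip(dd, xd))
    s_dy = sum(a * b for a, b in zip(dd, yd))
    s_xy = sum(a * b for a, b in zip(xd, yd))
    short = s_dy / s_dd  # regression of y on d alone
    det = s_dd * s_xx - s_dx ** 2
    long_ = (s_dy * s_xx - s_xy * s_dx) / det  # coefficient on d, controlling for x
    return short, long_

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 2000
group = [i % 2 for i in range(n)]  # hypothetical 0/1 group indicator
# Hypothetical unmeasured factor that is lower on average for group 1.
unmeasured = [rng.gauss(0.0, 1.0) - 0.5 * g for g in group]
# The outcome depends only on the unmeasured factor, not on group membership.
rating = [u + rng.gauss(0.0, 1.0) for u in unmeasured]

short_coef, long_coef = short_and_long_coefficients(rating, group, unmeasured)
print(f"group coefficient without the control: {short_coef:.3f}")
print(f"group coefficient with the control:    {long_coef:.3f}")
```

In this simulation the group coefficient is clearly negative when the factor is omitted and close to zero once it is included, even though the simulated process contains no group effect at all.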

149. Another sign that Prof. Arcidiacono’s regression models of the personal and overall

ratings are not capturing actual bias against Asian-American applicants is that his models find a

statistically significant positive effect of Asian-American ethnicity on the academic and

extracurricular ratings. As noted above in Section 5.1.6, such a pattern calls into question whether the

effects his models attribute to race are more properly explained by factors that are missing from his

models (either because he does not include them or because they are unobservable). If Harvard were

in fact biased against Asian-American applicants, it would make little sense for Harvard to give an

unexplained advantage to Asian-American applicants in the academic and extracurricular ratings. On

the other hand, if Harvard were not biased, but the ratings models were simply missing relevant

variables that explain the differences across race in ratings assignments, it would not be surprising to

see an inconsistent pattern of “bias” across the profile ratings.

150. Further, as detailed in Section 3, the essential function of the ratings is to quantify the

otherwise unobservable information about applicants that admissions officers discern from their

intensive review of each file. It is therefore unsurprising that regression models struggle to reliably

explain the ratings; the whole point of the ratings is to capture information that is hard to measure.

151. Despite my view that Prof. Arcidiacono’s analysis does not support an inference that the

personal rating is biased against Asian-American applicants, I have also conducted an analysis that

assumes for the sake of argument that the personal rating is biased, and therefore removes it from the

model. This approach is an extremely conservative analysis that overcorrects for any concern of bias

in the personal rating, because it completely removes from the model the personal rating (a factor on

which White applicants, in aggregate, are relatively stronger than Asian-American applicants), rather

than removing only the allegedly discriminatory component of the rating. In fact, Prof. Arcidiacono’s

Table 6.1––which uses his personal ratings regression to calculate the share of Asian-American

applicants who would receive a rating of 1 or 2 under the assumption that there was no bias in the

personal rating––shows that White applicants are still, on average, a bit more likely than Asian-

American applicants to have a personal rating of 1 or 2.

124

152. As Exhibit 21 shows, even in this very conservative model that ignores an important

dimension of the admissions process on which White applicants are relatively strong, I still find only

weak and inconsistent evidence of a disparity between Asian-American and White admission rates.

Specifically, I find no evidence of a significant negative effect of Asian-American ethnicity in five of

the six years of data I analyze.

124

Arcidiacono Report, p. 57.


Logit model of admissions removing personal rating

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Table shows the average marginal effect of race on admission for Asian-American applicants relative to White applicants using Prof.

Arcidiacono’s corrected expanded sample. * indicates significance at the 5% level. Marginal effects are reported as percentage point values.

153. Additionally, Exhibit 22 shows the average marginal effect of Asian-American ethnicity

if I remove the only class for which there is a statistically significant negative effect (the class of

2018) from my sensitivity analysis that excludes the personal rating. When I focus my analysis on the

five admissions cycles other than 2018, the estimated effect of Asian-American ethnicity in each of

those five years is statistically insignificant and the overall, average estimated effect across all five

years becomes statistically insignificant (falling by 21% relative to the estimated effect over all six

years). In other words, even if I exclude the personal rating from the model, there is no statistically

significant gap in admissions between Asian-American applicants and White applicants outside of the

2018 admissions cycle.


Excluding 2018, logit model of admissions without personal rating shows no evidence of bias

against Asian-American applicants

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Table shows the average marginal effect of race on admission for Asian-American applicants relative to White applicants using Prof.

Arcidiacono’s corrected expanded sample. * indicates significance at the 5% level. Marginal effects are reported as percentage point values.

154. Before moving on, I want to respond to three other arguments offered by Prof.

Arcidiacono in support of his claim that the personal and overall ratings are biased. First, Prof.

Arcidiacono’s model of the overall rating, like his model of the personal rating and other non-

academic ratings, is weak; it has a Pseudo R-Squared value of just 0.34.

125

Given the evidence

detailed above that the estimated negative effect of Asian-American ethnicity on applicants’

probability of admission shrinks as available non-academic qualifications are added to the model, and

given that non-academic qualifications are harder to measure than academic qualifications, the small

negative effect that the model attributes to Asian-American ethnicity is not reliable evidence of bias;

it is entirely possible and even likely that the effect is attributable to omitted non-academic variables.

Additionally, Prof. Arcidiacono’s overall rating model has very poor predictive accuracy. Of the 109

applicants in Prof. Arcidiacono’s sample who have overall ratings of 1 (including pluses and

minuses), his model correctly predicts their rating only 18% of the time, and of the 8,124 applicants

with a rating of 2 (including pluses and minuses), it correctly predicts their rating only 28% of the

time.

126

Further, as explained above, I have not included the overall rating in any of my regressions

because it is the one rating that may be influenced by applicants’ race (in the sense that, for example,

the overall ratings of African-American, Hispanic, or Other (AHO) applicants may reflect the

contribution they would make to the racial diversity of the student body). As I have shown above,

even without the overall rating in my regression, I find no evidence of systematic bias in Harvard’s

admissions process against Asian-American applicants.

125

Arcidiacono Report, Appendix B, Table B.6.8.

126

See workpaper.

155. Second, Prof. Arcidiacono suggests that the school support (teacher and guidance

counselor) ratings assigned by Harvard are biased against Asian-American applicants because he

observes that Asian-American applicants with the strongest academic qualifications (defined as those

in the top deciles (4-10) of the academic index) are less likely to receive strong school support ratings

relative to applicants of other races.

127

Again, this conclusion depends on Prof. Arcidiacono’s

assumption that candidates who are strong on academic factors are also strong on non-academic

factors—an assumption that, as discussed above, is not supported by the available data. The teacher

and guidance counselor ratings reflect strength across both academic and non-academic dimensions.

Thus, the small gap between Asian-American and White applicants’ school support ratings may well

be attributable to the fact that Asian-American applicants tend on average to be weaker than White

applicants on the available measures of non-academic factors that Prof. Arcidiacono’s analysis

explicitly ignores by focusing on only deciles of the academic index.

156. Third, Prof. Arcidiacono also suggests that differences between the alumni overall and

personal ratings and Harvard’s admissions officers’ overall and personal ratings show that Harvard’s

personal and overall ratings are biased. But that argument once again depends on Prof. Arcidiacono’s

regression models of the ratings—which, again, are quite low in predictive accuracy and do not

reliably control for the many hard-to-measure factors that are likely very important to the

determination of the ratings. In addition, the alumni and admissions-officer ratings are based on different

sources. An alumni personal rating reflects only the alumni interviewer’s brief interaction with the

applicant, whereas the personal rating assigned by Harvard admissions officers considers not just the

alumni interview (to the extent it has occurred before the rating is assigned, which is often not the

case) but also the candidate’s essays, teacher recommendations, secondary school report, and so on.

Alumni ratings are also much more generous in general. For example, 62% of applicants receive an

alumni personal rating of 1 or 2, while only 23% of the sample receive a personal rating of 1 or 2.

128

Moreover, the personal ratings given by the Harvard admissions officers explain much more about

Harvard’s admissions decisions than the alumni interviewer personal ratings do. For Prof.

Arcidiacono’s expanded sample, the Pseudo R-Squared value of a model that controls for only the

personal rating is 0.19, while a model that controls for only the alumni personal rating has a Pseudo

R-Squared value of just 0.08.

129

Given all of this, it is not particularly surprising that there exist

differences in the size of various coefficients across the two models.

127

Arcidiacono Report, p. 48.

128

See workpaper.

129

See workpaper.


5.3. Analysis of key subgroups of the data further contradicts SFFA’s claim of systematic bias

157. To further analyze SFFA’s claim that Harvard’s admissions process discriminates

against Asian-American applicants, I have also examined how the estimated effect of Asian-

American ethnicity differs across time periods and subgroups of the applicant pool.

158. As discussed above in Section 5.1.6, a common methodological challenge when using

regression analysis to test for discrimination is that regressions typically cannot account for all

relevant factors that differ between two groups of people—in this case, between Asian-American and

White applicants. Further, as detailed in Sections 4 and 5 above, it is quite likely that both Prof.

Arcidiacono’s and my regression analyses do not fully account for the many non-academic factors

that are critical to admissions decisions in Harvard’s whole-person process (though my analysis

accounts for such factors more fully than Prof. Arcidiacono’s does). As a result, any gap that exists

between Asian-American and White applicants (or any group of applicants) may in fact reflect

average differences across race on factors not accounted for in the model.

159. One way to examine whether a racial disparity is attributable to bias is to assess whether

it is robust and consistent across subgroups and time periods in the data. If discrimination against

Asian-American applicants were the cause of the racial disparity in admission rates, one would

expect to see a systematic and robust racial difference in admission rates across all relevant

subgroups and time periods. By contrast, if the gap instead reflects differences across race in factors

that Harvard considers when making admission decisions—but that are missing from the model—it is

much more likely that the gap will vary across subgroups because, simply by chance, some subgroups

in the data are likely to be particularly strong or weak, in aggregate, on factors not accounted for in

the model.

160. In this section, I highlight a few patterns in the data that suggest the latter hypothesis is

more plausible. Specifically, as I discuss below, I find that the alleged effect of Asian-American

ethnicity is particularly small (and in fact positive rather than negative in most years—though

statistically insignificant) for two very large subgroups of Asian-American applicants—female

Asian-American applicants and Asian-American applicants applying from California dockets. I also

discuss how the fluctuation in the effect of Asian-American ethnicity on admissions from year to year

is inconsistent with the claim that Harvard’s admissions process is biased.

5.3.1. Asian-American ethnicity is associated with, if anything, a higher likelihood of admission for female

applicants

161. When my model is estimated only on female applicants, Asian-American ethnicity is


associated with a slightly higher probability of admission (though the difference is not statistically

significant). Exhibit 23 shows the results of my model for just the female sample. The effect of

Asian-American ethnicity is positive in five of six years and overall (and insignificant across the

board).

Average marginal effect of Asian-American ethnicity on admission is insignificant for Asian-

American women

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Table shows the average marginal effect of race on admission, for Asian-American applicants relative to White applicants, using

Professor Arcidiacono’s corrected expanded sample. * indicates significance at the 5% level. Marginal effects are reported as percentage

point values.

162. This pattern is particularly interesting because Asian-American women are stronger on

non-academic dimensions than Asian-American applicants as a whole. Exhibit 24 shows that while

Asian-American men are stronger than Asian-American women on the academic rating, Asian-

American women are stronger on two of the three non-academic ratings, including the personal

rating. Additionally, Asian-American women are more likely to be multi-dimensional (i.e., have three

or more ratings of 2 or better) than Asian-American men. In other words, Asian-American women are

a bit less strong on academics than Asian-American men, but make up for it by being relatively

stronger on other dimensions.

163. The fact that Asian-American female applicants are stronger on non-academic factors

than Asian-American male applicants, are more multi-dimensional than Asian-American male

applicants, and, if anything, may have a small advantage over White female applicants is consistent

with my interpretation that any unexplained gap between Asian-American and White applicants in


the models is in fact driven by average differences in unmeasured non-academic factors, rather than

by discrimination against Asian-American applicants.

Asian-American female applicants are stronger on non-academic measures and more multi-

dimensional than Asian-American male applicants

Source: Arcidiacono Data

Note: Data are from Asian-American applicants to the classes of 2014 – 2019 in Professor Arcidiacono’s corrected expanded sample.

Ratings of 2- and above are classified as “2 or better” in this analysis. +/- rating designations are available in the data beginning with the

class of 2019.

5.3.2. Asian-American ethnicity is associated with a higher likelihood of admission for applicants on

California dockets

164. I also find that Asian-American ethnicity is associated with a slightly (though not

statistically significantly) higher probability of admission for applicants on California dockets, a

useful focal point for analysis because nearly 30% of Asian-American applicants are on California

dockets.

165. If Harvard’s admissions process sought to limit the number of Asian-American

applicants, it would be unlikely to favor Asian-American applicants relative to White applicants in

the region in which Asian-American applicants are most concentrated. Yet, when I estimate my logit


model on applicants from California dockets only, I find that Asian-American applicants are, if

anything, slightly more likely to be admitted than White applicants with the same observable

characteristics. This result does not suggest that Harvard is biased in favor of Asian-American

applicants on California dockets; it suggests, instead, that any perceived negative effect of Asian-

American ethnicity in the national pool is more likely explained by factors omitted from the model

that vary across regions.

Admission rates for Asian-American applicants on California dockets are, if anything, higher than

those of White applicants once available factors are controlled for

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Sample consists of applicants to the classes of 2014 – 2019 in Professor Arcidiacono’s corrected expanded sample who are applying

from California dockets. Average marginal effects are calculated from the Card Model. * indicates significance at the 5% level. Marginal

effects are reported as percentage point values.

166. Exhibit 25 presents the estimated marginal effect of Asian-American ethnicity for

applicants on California dockets. That effect is positive in five of six years and overall (and

insignificant in all years). These findings provide further evidence that Harvard’s admissions process

exhibits no evidence of systematic discrimination against Asian-American applicants relative to

White applicants.

5.3.3. Evidence of the alleged disparity is also inconsistent across years

167. As noted above, my admissions model also exhibits year-by-year variation in the

estimated effect of Asian-American ethnicity on applicants’ likelihood of admission. For example, in

my preferred specification, the estimated effect of Asian-American ethnicity is negative in some

years and positive in others, with four of the six years exhibiting a positive (albeit still statistically

insignificant) association between Asian-American ethnicity and applicants’ likelihood of admission,

and two of the six years a negative (albeit still statistically insignificant) association.

168. Even in my sensitivity analysis in which I exclude the personal rating from the model,

the estimated effect of Asian-American ethnicity is not consistent from year to year. As noted above,

the estimated effect of Asian-American ethnicity is statistically significant only in the class of 2018

admissions cycle, and when that cycle is excluded, the average estimated effect across the other five

years is not statistically significant. Additionally, the estimated effect of Asian-American ethnicity

changes from positive to negative between years, with two of the six years being positive in this

model.

169. If Harvard’s admissions process were biased against Asian-American applicants

throughout this whole time period (as SFFA alleges), one would expect to see a more consistent pattern

from year to year. The fact that the alleged “bias” fluctuates above and below zero from year to year

is more consistent with applicant pools from different years having a slightly different mix of

unmeasured, non-academic factors across ethnic groups that the model cannot perfectly account for,

than it is with the allegation of systematic bias against Asian-American applicants.

5.4. Conclusion

170. In this section, I have developed a statistical model that improves Prof. Arcidiacono’s

model by including in it a wide variety of factors that Harvard considers when making admissions

decisions and that Prof. Arcidiacono did not include in his model. My model also more accurately

reflects Harvard’s yearly admissions process in which applicants are compared only to other

applicants applying in the same year and not to applicants applying in other years.

171. I find no evidence of systematic bias against Asian-American applicants relative to

White applicants, after controlling for the many differences between these groups. While Asian-

American applicants tend to have stronger academic qualifications, White applicants tend to be

stronger on non-academic dimensions. Prof. Arcidiacono’s model places a great deal of weight on

academic qualifications (including both the academic rating and the academic factors that inform that

rating), while omitting information related to each candidate’s life circumstances, including detailed

variables describing each high school and neighborhood in the data. When I add such measures to the

model to better account for the differences across all dimensions that Harvard considers (and for

which I have data), I find no statistically significant negative effect of Asian-American ethnicity

relative to White ethnicity on applicants’ probability of admission. Furthermore, the estimated effect

of Asian-American ethnicity relative to White ethnicity is positive in four of the six years.

172. I also estimate a version of my model that assumes, as Prof. Arcidiacono alleges, that the

personal rating is biased against Asian-American applicants. I show that, even if the personal rating is

completely excluded from the model, there is at most weak evidence of a negative effect of Asian-

American ethnicity on applicants’ likelihood of admission. In any event, Prof. Arcidiacono’s findings

of bias in the personal rating are weak for several reasons. His models of the various ratings (aside

from the academic rating) have low explanatory power. Additionally, he finds a significant and

positive effect of Asian-American ethnicity on two of the four profile ratings, which casts doubt on

whether the results actually reveal racial bias rather than simply the effect of unobservable factors

that differ across race. Collectively, the results of Prof. Arcidiacono’s ratings regressions are more

consistent with the absence of relevant difficult-to-quantify information from the database (or from

Prof. Arcidiacono’s models) than with systematic bias against Asian-American applicants.

173. Finally, I find that the alleged disparity between Asian-American and White admission

rates is inconsistent from subgroup to subgroup and from year to year. I find particularly weak

evidence of bias against female Asian-American applicants and Asian-American applicants on

California dockets. If anything, Asian-American applicants in those two groups are admitted at

slightly higher rates than comparable White applicants, controlling for relevant factors. Since 30% of

Asian-American applicants are on California dockets, and half are female, it is hard to reconcile those

findings with SFFA’s claim that Harvard intentionally and systematically discriminates against

Asian-American applicants on the basis of their race. I also find that the effect of Asian-American

ethnicity fluctuates from year to year, and is positive in four of six years. I am not aware of any basis

to believe that Harvard’s process was somehow biased in some years but not others. Again, these

results—taken together—suggest that any estimate of a negative effect of Asian-American ethnicity

at the national level reflects not racial discrimination but rather the effect of factors that are omitted

from the model because they cannot be quantified, and that vary across genders, regions, and years.

6. AVAILABLE DATA DO NOT INDICATE THAT RACE IS A DETERMINATIVE FACTOR IN

ADMISSIONS AT HARVARD

174. In this section, I turn to a different research question regarding the importance of race in

Harvard’s admissions process. Using the regression model developed in Section 5 above, I explore

the size of the estimated effect of an applicant’s race or ethnicity on her likelihood of admission,

relative to the effect of the many other factors Harvard considers in its whole-person analysis.

175. Exhibit 26 summarizes the estimated average marginal effect of each racial category on

an applicant’s likelihood of admission. As already discussed in Section 5 above, the estimated effect

of Asian-American ethnicity is statistically indistinguishable from zero in every year. The estimated

effect of African-American ethnicity ranges from 5.20 percentage points to 7.43 percentage points,

and averages 6.12 percentage points, while the estimated effect of Hispanic and Other races (such as

Native American, and Hawaiian/Pacific Islander) ranges from 3.12 percentage points to 4.16

percentage points, and averages 3.73 percentage points.

Average marginal effect of race on the probability of admission

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Table shows the estimated average marginal effect of race on admission, for each listed race, using Professor Arcidiacono’s

corrected expanded sample. * indicates significance at the 5% level. Marginal effects are reported in percentage point values.

176. In the remainder of this section, I offer a variety of analyses that provide context for how

important or unimportant race is relative to other factors in the admissions process. I find that (a) the

importance of race in explaining admissions decisions is substantially smaller than that of other key

factors Harvard considers; (b) even when race plays a role in admissions decisions, other applicant

attributes play a significant role as well; and (c) the effect of race is smaller than that of

individualized, unmeasured factors that are independent of race. All of these facts indicate that,

although race plays a role in admissions decisions, it is only one of a variety of factors considered and

is not the determinative factor.

6.1. Race is less important than other factors in admissions decisions

177. A starting point for estimating the effect of race relative to other factors in the

admissions process is to compare how effectively race explains admissions outcomes, relative to

other important factors Harvard considers in admissions. If race were a determinative factor, then

knowing an applicant’s race would allow one to predict with a high degree of certainty whether or not

the applicant is admitted.

178. Exhibit 27 reports the Pseudo R-Squared value for regressions of admissions outcomes

for the class of 2019 that include only racial categories as control variables, as well as regressions that

include only control variables other than race. As discussed above, Pseudo R-Squared is a statistic

that captures how well a variable (or set of variables) can explain admission decisions. It takes on

values from zero to one and is meant to approximate the share of the variation in actual admission

decisions that can be explained by the variables in the model. As shown in Exhibit 27, a regression

that includes only the variables for racial categories has a tiny Pseudo R-Squared value—just 0.002.

That means that race alone explains almost nothing about admissions outcomes. For comparison’s

sake, the profile ratings collectively explain a much larger proportion of the variability in admissions

outcomes (Pseudo R-Squared value of 0.33). School support ratings and alumni interview ratings

have Pseudo R-Squared values of 0.19 and 0.13, respectively. Even contextual factors that I include

in my model but that Prof. Arcidiacono does not include in his—such as College Board high school

and neighborhood variables, parental occupation, and intended career—explain more about

admissions decisions than race.
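
The statistic described above can be illustrated with a short sketch. It assumes McFadden's version of Pseudo R-Squared (one minus the ratio of the fitted model's log-likelihood to that of an intercept-only model); which variant underlies the exhibit is not restated in this paragraph, so that choice is an assumption. For a logit model whose only regressors are category indicators, the fitted probability for each applicant equals the admission rate of the applicant's category, so no numerical optimization is needed. The function names and toy data are hypothetical and are not drawn from the admissions data.

```python
import math
from collections import defaultdict


def bernoulli_loglik(outcomes, probs):
    """Log-likelihood of 0/1 outcomes under the given predicted probabilities."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        chance = p if y == 1 else 1.0 - p
        # A fitted probability of exactly 0 or 1 only occurs here when every
        # outcome in the category agrees with it, so chance is 1 and log is 0.
        total += math.log(chance)
    return total


def mcfadden_pseudo_r2(outcomes, groups):
    """McFadden pseudo R-squared for a model with only category indicators.

    Assumes the sample contains both admitted (1) and rejected (0) outcomes,
    so the intercept-only log-likelihood is nonzero.
    """
    overall_rate = sum(outcomes) / len(outcomes)
    ll_null = bernoulli_loglik(outcomes, [overall_rate] * len(outcomes))

    # Admission rate within each category is the fitted probability.
    tallies = defaultdict(lambda: [0, 0])  # category -> [admits, applicants]
    for y, g in zip(outcomes, groups):
        tallies[g][0] += y
        tallies[g][1] += 1
    fitted = [tallies[g][0] / tallies[g][1] for g in groups]
    ll_model = bernoulli_loglik(outcomes, fitted)

    return 1.0 - ll_model / ll_null


# Hypothetical toy data: two categories with nearly identical admission rates.
toy_outcomes = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
toy_groups = ["a"] * 5 + ["b"] * 6
print(mcfadden_pseudo_r2(toy_outcomes, toy_groups))
```

When admission rates barely differ across categories, the fitted model is almost no better than the intercept-only model and the statistic is close to zero; when category membership fully determines the outcome, it equals one.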

Many factors better explain admission decisions than race

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Data are from applicants to the class of 2019 in Professor Arcidiacono’s corrected expanded sample.

179. In Table 7.1 of his report, Prof. Arcidiacono shows that, according to his model, many

Asian-American applicants with a 25% estimated likelihood of admission would have an estimated

likelihood of admission of over 90% if they were African-American.

130

That is a misleading and

incomplete way to measure the relative importance of race for at least two reasons.

180. First, Prof. Arcidiacono has misleadingly selected a particular combination of applicant

characteristics for which the effect of race is largest. Exhibit 28 provides a fuller analysis that

examines the effect of race across all applicants, rather than a single example. It shows the average

estimated effect of race on probability of admission for African-American and Hispanic and Other

applicants (relative to White applicants) according to my year-by-year model for each decile of the

admissions index—a metric Prof. Arcidiacono has used in his analysis that measures the predicted

probability of admission absent consideration of race. As is clear, the higher likelihood of admission

associated with African-American ethnicity averages 13 percentage points or less for applicants in the

first nine deciles (that is, 90% of African-American applicants). For applicants in the highest decile

130

Arcidiacono Report, pp. 65–66.

(the strongest applicants), it averages 47 percentage points. For applicants of Hispanic or Other (non-

Asian) minority race, the estimated effect of race averages seven percentage points or less for

applicants in the first nine deciles and 29 percentage points for applicants in the highest decile.

Average marginal effect of race is small for the vast majority of AHO applicants

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Sample consists of applicants to the classes of 2014 – 2019 in Professor Arcidiacono’s corrected expanded sample. Deciles are

constructed by race based on the predicted probabilities of admission when the race factor is turned off. Marginal effects are calculated

relative to White applicants using Card year-by-year admissions model. Marginal effects are reported as percentage point values.
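
The decile pattern described above follows from the shape of the logit function, which the sketch below illustrates under simplifying assumptions. It treats the effect of race as a single constant shift in log-odds (the value beta = 2.0 is hypothetical, not an estimate from any model in this report), takes a set of hypothetical race-off admission probabilities, and averages the resulting change in probability within each decile of the race-off probability. All names and numbers are illustrative only.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def marginal_effect(p_race_off, beta):
    """Change in admission probability from adding beta to the log-odds."""
    return sigmoid(logit(p_race_off) + beta) - p_race_off


def average_effect_by_decile(race_off_probs, beta):
    """Average marginal effect within each decile of the race-off probability."""
    if len(race_off_probs) < 10:
        raise ValueError("need at least 10 applicants to form deciles")
    ranked = sorted(race_off_probs)
    n = len(ranked)
    averages = []
    for d in range(10):
        chunk = ranked[d * n // 10:(d + 1) * n // 10]
        averages.append(sum(marginal_effect(p, beta) for p in chunk) / len(chunk))
    return averages


# Hypothetical race-off probabilities between 0.2% and 20%.
hypothetical_probs = [0.002 * (i + 1) for i in range(100)]
for decile, effect in enumerate(average_effect_by_decile(hypothetical_probs, 2.0), 1):
    print(decile, round(100 * effect, 1), "percentage points")
```

Because the same log-odds shift moves a probability near zero very little, the average effect is small in the lower deciles and grows for applicants who are already comparatively likely to be admitted.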

181. Second, the applicants with the largest estimated positive effect of race on their

likelihood of admission are the strongest applicants—i.e., those whose estimated likelihood of

admission is in the top 10% of the applicant pool absent consideration of race. Race is not a

“determinative” factor for such applicants, even if it has a significant positive effect on their

likelihood of admission, because they are strong in other respects. One way to see this fact is that

77% of AHO admitted students have at least two profile ratings of 2 or better.

131

Applicants with at

least two profile ratings of 2 or better already have an admission rate of 23%.

132

131

See workpaper.

132

See workpaper.

6.2. Race is less important than unmeasured, individualized factors

182. It is also possible to compare the effect of race in Harvard’s admissions process to that of

individualized, unmeasured factors—that is, factors not captured by the model. One way to do that is

to examine the predicted probability of admission for each applicant and compare that to the actual

admission decision for the applicant.

183. As discussed in Section 5.1.6 above, if the model generates a predicted probability of

admission close to zero for a candidate who was rejected or a predicted probability of admission

close to one for a candidate who was admitted, one can conclude that the variables in the model allow

the researcher to be very confident about that applicant’s admissions outcome. If, however, the model

generates a predicted probability of admission of, say, 0.10 for a given candidate who was actually

admitted, one can conclude that the variables in the model do not allow the researcher to explain with

any degree of certainty why the applicant was admitted. In other words, it is the unquantifiable

factors that ultimately determined whether the candidate was admitted. More generally, one can

quantify the importance of such factors by using the “error” term from the model—that is, the actual

admission outcome (1=admitted) minus the estimated admission outcome (0.10)—which measures

the relative importance of factors specific to that individual that are not included in the model.

184. To give a concrete example, consider an applicant who is not admitted and whose SAT

scores and GPA are so low that it is essentially impossible for the applicant to be admitted. For such

an applicant, one can conclude that unquantified factors not present in the model are not a major

factor in the decision—the observable information on academic achievements is sufficient to

understand the decision. The applicant’s estimated likelihood of admission will be close to zero, and

the applicant’s actual admissions outcome will be zero (not admitted), so the error term will be very

small. On the other hand, consider an applicant with an academic rating of 3, an extracurricular rating

of 2, and a personal rating of 2. Suppose the model predicts the applicant has a 40% chance of

admission, and ultimately she is in fact admitted. What I conclude from such information is that other

factors that are specific to that candidate that are not observed in the model explain 60% of the

outcome—the difference between the applicant’s actual likelihood of admission (100%) and her

estimated likelihood of admission according to the model (40%).

185. By comparing the marginal effect of race for any given applicant to the error in the

model, it is possible to compare the role of race in the admissions process to the role played by

unobserved factors that are independent of race. Exhibit 29 shows that the portion of the admissions

decision attributable to unobserved characteristics of each individual applicant is greater than the

effect of race for 100% of Asian-American applicants, for 94% of African-American applicants, and

for 96% of Hispanic or Other applicants. In other words, in nearly all cases, race matters less to an

applicant’s admissions outcome than individualized factors that are not in the model.

Average marginal effect of race is small compared to importance of unobserved characteristics

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Data are from applicants to the classes of 2014 – 2019 in Prof. Arcidiacono’s corrected expanded sample. Absolute deviations and

marginal effects are reported as percentage point values. Table shows the average marginal effect of race on admission relative to White

applicants. Absolute deviation is computed by taking the absolute value of the difference between the actual admitted status and the

predicted probability of each applicant. Absolute deviation is compared with the absolute value of the marginal effect for each applicant.
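
The comparison described in the note can be sketched as follows. For each applicant, the absolute deviation is the absolute difference between the actual outcome (0 or 1) and the predicted probability, and it is compared with the absolute value of that applicant's marginal effect of race. The records below are hypothetical and serve only to show the arithmetic; the function names are likewise illustrative.

```python
def absolute_deviation(admitted, predicted_prob):
    """Absolute gap between the actual outcome (0 or 1) and the prediction."""
    return abs(admitted - predicted_prob)


def share_where_unobserved_exceeds_race(applicants):
    """Fraction of applicants whose absolute deviation exceeds |marginal effect|.

    applicants: list of (admitted, predicted_prob, marginal_effect_of_race).
    """
    count = sum(
        1
        for admitted, predicted_prob, race_effect in applicants
        if absolute_deviation(admitted, predicted_prob) > abs(race_effect)
    )
    return count / len(applicants)


# Hypothetical applicants: (admitted, predicted probability, marginal effect).
toy_applicants = [
    (1, 0.40, 0.05),  # admitted with a 40% prediction: deviation of 0.60
    (0, 0.02, 0.01),  # rejected with a 2% prediction: deviation of 0.02
    (1, 0.90, 0.30),  # admitted with a 90% prediction: deviation of 0.10
    (0, 0.10, 0.00),  # rejected with a 10% prediction: deviation of 0.10
]
print(share_where_unobserved_exceeds_race(toy_applicants))
```

The first toy record mirrors the example in paragraph 184: an admitted applicant with a predicted probability of 40% has an absolute deviation of 60 percentage points.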

186. Exhibit 30 shows the effect of race relative to other observed and unobserved factors

focusing only on applicants who were admitted to Harvard. Each bar shows the relative effect of

three different groups of factors in the model: race, observable factors other than race, and

unobservable factors that are specific to individuals and not captured in the model. Even for African-

American admitted students, race explains less than half (42%) of the variability in admissions

outcomes. For Hispanic or Other minority race applicants, race explains only 26% of the variability

in admissions outcomes. In other words, non-race factors play a large role.

Non-racial factors play the dominant role in admissions decisions

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Data are from students admitted to the classes of 2014 – 2019 in Prof. Arcidiacono’s corrected expanded sample. Average effect of

race is computed as the average marginal effect of race on admission relative to White applicants. Average effect of non-race observable

characteristics is computed as the average difference between the predicted probability and the marginal effect of race. Average effect of

unobservable characteristics is computed as the mean absolute deviation. Absolute deviation is computed by taking the absolute value of

the difference between the actual admitted status (0 or 1) and the predicted probability of admission for each applicant.
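
The three components defined in the note can be computed as in the sketch below. It relies on an observation that follows from those definitions but is an inference, not a description of how the exhibit was produced: for an admitted student the actual outcome is 1, so the marginal effect of race, the remainder of the predicted probability, and the absolute deviation sum to one, and their averages can be read as shares. The data and names are hypothetical.

```python
def decompose_admitted(admitted_students):
    """Average race, non-race observable, and unobservable components.

    admitted_students: list of (predicted_prob, marginal_effect_of_race)
    for students who were admitted (actual outcome equal to 1).
    """
    n = len(admitted_students)
    race = sum(effect for _, effect in admitted_students) / n
    other_observed = sum(prob - effect for prob, effect in admitted_students) / n
    unobserved = sum(abs(1 - prob) for prob, _ in admitted_students) / n
    return {
        "race": race,
        "non_race_observable": other_observed,
        "unobservable": unobserved,
    }


# Hypothetical admitted students: (predicted probability, marginal effect).
toy_admits = [(0.6, 0.2), (0.9, 0.1)]
parts = decompose_admitted(toy_admits)
print(parts)
print(sum(parts.values()))
```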

6.3. Prof. Arcidiacono’s claim about a “floor” for the admission rate of African-American applicants is

not supported by available data

187. Prof. Arcidiacono also asserts that, starting with the class of 2017, Harvard intentionally

sought to match the admission rate for African-American applicants to the admission rate for all

applicants. That assertion is not supported by available data.

188. Prof. Arcidiacono claims that the impetus for this practice was that, beginning with the

class of 2017, “Harvard adopted a new methodology for coding race and ethnicity that was consistent

with federal standards for reporting of race and ethnicity.”

133

Under that methodology—known as the

Integrated Postsecondary Education Data System (IPEDS) methodology—students who identify as

133

Arcidiacono Report, p. 27.

African-American and another race are counted as “multiracial,” not as African-American. The

IPEDS methodology contrasts with Harvard’s historical method for classifying race (the “Old

Methodology”), which categorizes any applicant who identifies as African-American as African-

American, whether or not that applicant also identifies with another racial or ethnic group. It also

contrasts with Harvard’s current preferred method for classifying race (the “New Methodology”),

which counts applicants in as many racial categories as they choose to identify on their applications

(so that an applicant who identified as African-American and, say, Asian-American would be counted

in both categories). Professor Arcidiacono argues that the IPEDS methodology “prompted concern at

Harvard that the new reporting would understate the number of African-American admits to

Harvard.”

134

He argues that this concern drove Harvard to impose a floor on the African-American

admission rate.

189. In this section, I consider both the substance of Prof. Arcidiacono’s claim and

whether the data are consistent more broadly with the idea that Harvard is imposing a floor on the

admission rate for African-American applicants.

190. As an initial matter, Prof. Arcidiacono does not explain why Harvard would care about

manipulating the admission rate of candidates who are African-American according to the IPEDS

methodology. First, Harvard does not publicly release admission rates by race, so it is unclear why

Harvard would be sensitive to the public perception of its admission rates by race.

135

Second, when

Harvard publicly announces the racial composition of the admitted and matriculating classes (as

opposed to the admission rates), it does so using its own definitions of race—first the “Old

Methodology” (used since at least the class of 1980) and now the “New Methodology” discussed

above. Harvard does not publicly report racial statistics using the IPEDS methodology.

136

Finally, the

IPEDS methodology was not new in the class of 2017 admissions cycle; in accordance with federal

reporting requirements, Harvard had already been reporting race to the government using the IPEDS

134

Arcidiacono Report, p. 28.

135

Fitzsimmons Deposition, pp. 453–454 (“Q. Does Harvard publicly report its admission rate by ethnicity? A. No.”).

136

Yong Deposition, p. 138 (“Q. But you don’t use the IPEDS methodology? A. Not for press releases.”); Fitzsimmons

Deposition, pp. 100–101 (“Q. When Harvard reports its results in the Harvard Gazette, does it use the IPEDS

methodology or the new methodology to describe the ethnic characteristics of the class? A. It would use the new

methodology.”); Table, Aggregate applicant data 1980 – 2018, HARV00023177 – 8.

methodology since the 2010–11 school year (entering class of 2014), three years earlier.

137

191. Furthermore, Prof. Arcidiacono’s selective focus on the admission rate as defined using

the IPEDS methodology presumably reflects the fact that admission rates calculated using Harvard’s

own preferred methodologies do not show the effect he regards as problematic—in other words, the

admission rate for African-American candidates (as defined using the New Methodology and Old

Methodology) does not match the overall admission rate. For example, for the class of 2016, the

African-American admission rate based on both the “New Methodology” and the “Old Methodology”

was nearly a half point below the admission rate of all other applicants.

138

192. In addition, if Harvard had lowered its admission standards to ensure an artificially high

admission rate for African-American applicants, one might expect to see a decline in the relative

quality of African-American admitted students starting in the class of 2017. No such decline

occurred.

139

Further, the estimated positive effect of African-American ethnicity on applicants’

likelihood of admission (based on my regression analysis in Exhibit 26) is generally smaller for

applicants to the classes of 2017 to 2019 than for applicants to the classes of 2014 and 2015. If

Harvard implemented a floor for the admission rate of African-American students starting with the

class of 2017, the regression model should show a larger positive association between African-

American ethnicity and likelihood of admission in later years than in prior years—not a smaller one.

193. Finally, Harvard has produced aggregate admission data by race, using its Old

Methodology, that extend back to 2000. Using that aggregate data I can examine the fluctuations

from year to year in admissions decisions by race, and assess whether such fluctuations are in any

way consistent with a “floor” in admissions for African-American applicants, and/or a substantive

change starting with the class of 2017.

194. Exhibit 31 through Exhibit 34 report the year-to-year fluctuations in the racial

composition of the admitted class. There is no evidence that Harvard has sought to achieve a

consistent proportion of African-American students. To the contrary, the share of admitted students

who are African-American fluctuates considerably from year to year, by as much as 14%. Similar

patterns exist for all races. For example, despite SFFA’s claims that Harvard seeks to limit the share

of its class that is Asian-American, Exhibit 32 shows that the share of the class that is Asian-

American has fluctuated significantly.

137

Harvard Memo, “A Note on the Collection and Reporting of Data on Race and Ethnicity,” HARV00065450 – 52 at

HARV00065450 – 51; “Resources for Implementing Changes to Race/Ethnicity Reporting in IPEDS,” National Center

for Education Statistics, available at https://nces.ed.gov/ipeds/Section/Resources, accessed December 1, 2017.

138

See workpaper.

139

See workpaper.

The fraction of admitted students who are White fluctuates over time

Source: HARV00001848 – 1850; Augmented Arcidiacono Data

Note: Sample consists of domestic applicants who are classified as White under the Old Methodology.

The fraction of admitted students who are Asian-American fluctuates over time

Source: HARV00001848 – 1850; Augmented Arcidiacono Data

Note: Sample consists of domestic applicants who are classified as Asian-American under the Old Methodology.

The fraction of admitted students who are African-American fluctuates over time

Source: HARV00001848 – 1850; Augmented Arcidiacono Data

Note: Sample consists of domestic applicants who are classified as African-American under the Old Methodology.

The fraction of admitted students who are Hispanic or Other fluctuates over time

Source: HARV00001848 – 1850; Augmented Arcidiacono Data

Note: Sample consists of domestic applicants who are classified as Hispanic or Other under the Old Methodology.

6.4. Conclusion

195. As detailed above, I find little evidence that race is a determinative factor in the

admissions process. Specifically, I find that race explains much less about applicants’ likelihood of

admission than numerous other factors Harvard considers.

196. I also examine Prof. Arcidiacono’s claim that the marginal effect of race can be quite

large for certain individual candidates. I find that the marginal effect of race averages 13 percentage

points or less for 90% of African-American applicants and averages 7 percentage points or less for

90% of Hispanic or Other applicants. And for the small number of applicants for whom race plays a

more significant role, other non-race factors also substantially affect the applicants’ likelihood of

admission. Further, I find that the average marginal effect of race is less than that of individualized,

unmeasured factors that are independent of race. For admitted AHO applicants in particular, race

explains only about 34% of the variation in admissions outcomes.

140

See workpaper.

197. Finally, I also consider Prof. Arcidiacono’s claim that Harvard began manipulating its

admission rate for African-American applicants—as defined using the IPEDS methodology—starting

with the class of 2017 admissions cycle. It is highly implausible that Harvard would attempt to

manipulate that particular statistic, which it does not release to the public. And, indeed, a review of

the data using Harvard’s preferred methods for categorizing applicants by race does not show the

effect Prof. Arcidiacono observes.

7. ANALYSIS OF POTENTIAL RACE-NEUTRAL ALTERNATIVES

198. In this section, I address the question of how the racial composition and other attributes

of Harvard’s admitted class would be expected to change if Harvard stopped considering race and

instead pursued a variety of race-neutral ways of seeking to increase the racial diversity of its

admitted class. My findings indicate that, to the extent race-neutral practices can enable Harvard to

achieve racial diversity, they would do so only by altering other characteristics of the admitted class

that I understand matter to Harvard.

199. I begin by surveying the academic literature on race-neutral alternatives—including

papers suggested by SFFA and its expert, Richard Kahlenberg—in order to identify admissions

practices that have been posited to be effective at increasing racial diversity. I also discuss the

literature evaluating these practices and whether they could work at a highly selective university like

Harvard. Next, I use Harvard’s admissions data to demonstrate how various potential race-neutral

admissions practices would be expected to affect the racial composition and other attributes of

Harvard’s admitted class. Finally, I discuss Mr. Kahlenberg’s analysis of race-neutral alternatives.

200. I reach several conclusions. First, Harvard already engages in extensive race-neutral

efforts to increase the racial diversity of its student body. Second, consistent with the academic

literature, I find that Harvard’s use of additional race-neutral efforts to increase racial diversity would

not likely enable it to achieve a comparably diverse class if it did not consider race in admissions. To

the extent that the use of race-neutral alternatives did enable Harvard to achieve a comparably diverse

class, it would likely have a substantial deleterious effect on the quality of the admitted class along

many dimensions. Finally, I find that Mr. Kahlenberg’s proposed race-neutral alternatives do not

depart from this pattern—that is, they either are ineffective at generating a racially diverse class, or

would significantly alter the composition of the admitted class along other dimensions.

7.1. Race-neutral alternatives identified in academic literature and by SFFA

201. To develop a comprehensive list of race-neutral alternatives that Harvard could consider,

I first considered the race-neutral alternatives identified by SFFA and its expert, Mr. Kahlenberg,

then explored the academic literature for additional race-neutral alternatives that SFFA potentially

overlooked. In this section, I summarize the race-neutral alternatives I found.

7.1.1. Race-neutral alternatives identified by SFFA and its expert

202. In Section X of the Complaint and in the Kahlenberg Report, SFFA and Mr. Kahlenberg

list a series of race-neutral alternatives that have been identified in academic literature, and that they

believe would allow Harvard to achieve racial diversity without considering race.

203. First, SFFA and Mr. Kahlenberg suggest that Harvard should eliminate admissions

practices that supposedly diminish racial diversity—namely (1) the consideration of whether an

applicant’s parents attended Harvard or Radcliffe (i.e., whether the applicant is a “lineage applicant”),

(2) the consideration of whether an applicant’s parents are members of Harvard’s faculty or staff,

(3) the practice of offering applicants deferred admission to a class subsequent to the one for which

they applied, (4) the alleged consideration of whether an applicant’s family has contributed or has the

ability to contribute financially to Harvard, (5) the practice of tracking the admissions status of

candidates of particular interest to Harvard’s Dean and Director of Admissions, and (6) the practice

of Early Action admissions. Mr. Kahlenberg also suggests that removing consideration for recruited

athletes could help foster racial diversity, though he does not include this practice in his preferred

simulation (explaining that it “is sometimes perceived as radical”).

141

204. Second, SFFA and Mr. Kahlenberg suggest that Harvard should increase the

consideration it affords in the admissions process to students of lower socioeconomic status. Mr.

Kahlenberg also suggests that, to do so, Harvard should make available to admissions officers

whatever information its Financial Aid Office may possess about applicants’ family income and

wealth.

142

205. Third, SFFA and Mr. Kahlenberg suggest that Harvard should increase the financial aid

it offers, on the theory that doing so would attract more applicants and matriculants of lower

socioeconomic status.

143

206. Fourth, SFFA and Mr. Kahlenberg suggest that Harvard adopt geography-based

preferences, such as a “percent plan” under which it would admit the top students from each high

school or each ZIP code.

144

207. Fifth, SFFA and Mr. Kahlenberg suggest that Harvard increase its efforts to recruit a

141

Kahlenberg Report, pp. 31–34, p. 41, and p. 46.

142

Kahlenberg Report, pp. 23–29.

143

Kahlenberg Report, pp. 29–31.

144

Kahlenberg Report, pp. 36–39.

diverse applicant pool.

145

208. Sixth, SFFA and Mr. Kahlenberg suggest that Harvard could increase racial diversity by

accepting more transfer applicants, particularly from community colleges.

146

7.1.2. Additional race-neutral alternatives identified in the literature

209. I also reviewed the academic literature discussing race-neutral alternatives, and this

review indicates that SFFA’s list of race-neutral alternatives is generally comprehensive. One race-

neutral strategy for increasing racial diversity that SFFA does not mention but that is discussed in the

academic literature is reducing or eliminating consideration of standardized test scores. That strategy

is predicated on the theory that standardized tests may advantage students who attend better schools

and have more resources for test preparation, who are more likely to be White or Asian-American.

147

For completeness, I include this practice in my analyses below.

7.2. Academic research indicates that race-neutral alternatives diminish universities’ ability to select for

quality

210. Many academics have studied the efficacy of race-neutral alternatives in generating a

high-quality, racially diverse student body without considering race in the admissions process. While

there is general agreement that race-neutral alternatives can help increase racial diversity relative to

an admissions regime that does not consider race, there is little empirical evidence that race-neutral

alternatives have produced diverse student bodies comparable to those attained under race-conscious

regimes at selective institutions, where researchers note that race-neutral policies may be less

effective.

148

Furthermore, the literature indicates that the replacement of race-conscious admissions

with race-neutral alternatives introduces an unavoidable tradeoff between the quality and racial

145

Kahlenberg Report, pp. 39–40.

146

Kahlenberg Report, pp. 41–42.

147

John Brittain and Benjamin Landy, “Reducing Reliance on Testing to Promote Diversity,” in The Future of Affirmative

Action, ed. Richard Kahlenberg (Century Foundation Press, 2014), pp. 160–174 at p. 161; Anthony P. Carnevale, Stephen

J. Rose, and Jeff Strohl, “Achieving Racial and Economic Diversity with Race-Blind Admissions Policy,” in The Future

of Affirmative Action, ed. Richard Kahlenberg (Century Foundation Press, 2014), pp. 187–202 at pp. 189 and 193.

148

Halley Potter, “Transitioning to Race-Neutral Admissions: An Overview of Experiences in States Where Affirmative

Action Has Been Banned,” in The Future of Affirmative Action, ed. Richard Kahlenberg (Century Foundation Press,

2014), pp. 75–90 at pp. 88–89; Thomas J. Kane, “Racial and Ethnic Preferences in College Admissions,” Ohio St. Law

Journal 59, 1998, pp. 971–996 at pp. 972 and 992; Sean Reardon, Rachel Baker, and Daniel Klasik, “Race, income, and

enrollment patterns in highly selective colleges, 1982-2004,” Center for Education Policy Analysis, Stanford University,

2012, pp. 1–25 at p. 4.

diversity of an admitted class. In essence, the literature concludes, universities that attempt to achieve

racial diversity without considering race have lesser ability to choose the highest-quality class than if

they were able to consider race.

149

7.2.1. Race-neutral alternatives do not achieve the same level of racial diversity as race-consciousness at

selective universities

211. Economic research on alternative admissions policies firmly supports the effectiveness

of race-conscious admissions in achieving racial diversity at selective institutions. The economic

literature studying attempts to produce racial diversity without considering race tends to focus on the

efficacy of race-neutral alternatives in generating a substantial fraction of African-American and

Hispanic students, largely because those are the groups whose representation falls most significantly

when universities remove consideration of race from admissions.

212. Thomas Espenshade and Chang Chung (2005), for example, conduct simulations for

three elite private research universities (which they do not identify). They find that eliminating the

consideration of race in admissions would notably reduce the share of African-American and

Hispanic students among admitted students, and that consideration of lineage and athletic-recruit

status has little effect on African-American and Hispanic representation.

150

213. Economic research regarding bans on race-conscious admissions in Texas, Florida,

California, and Washington suggests that those bans adversely affected racial diversity, especially at

more selective schools.

151

A separate analysis of the California ban studied the efficacy of a battery of

alternative admissions practices, including a preference for applicants of low socioeconomic status.

149

Jimmy Chan and Erik Eyster, “Does Banning Affirmative Action Lower College Student Quality?,” American

Economic Review 93(3), 2003, pp. 858–872 at pp. 858–859; Mark Long, “Is There a ‘Workable’ Race-Neutral

Alternative to Affirmative Action in College Admissions?,” Journal of Policy Analysis and Management 34(1), 2015, pp.

162–183 at p. 167; Mark Long, “The Promise and Peril for Universities Using Correlates of Race in Admissions in

Response to the Grutter and Fisher Decisions,” ETS White Paper, 2015, pp. 1–31 at p. 13; Glenn Ellison and Parag

Pathak, “The Efficiency of Race-Neutral Alternatives to Race-Based Affirmative Action: Evidence from Chicago’s Exam

Schools,” NBER Working Paper #22589, 2016, pp. 1–59 at p. 51; Roland Fryer, Glenn Loury, and Tolga Yuret, “An

Economic Analysis of Color-Blind Affirmative Action,” The Journal of Law, Economics, & Organization 24(2), 2007,

pp. 319–355 at pp. 319–320; Roland Fryer and Glenn Loury, “Affirmative Action and Its Mythology,” The Journal of Economic

Perspectives 19(3), 2005, pp. 147–162 at pp. 150–153.

150

Thomas Espenshade and Chang Y. Chung, “The opportunity cost of admission preferences at elite universities,” Social

Science Quarterly 86(2), 2005, pp. 293–305 at p. 298.

151

Peter Hinrichs, “The effects of affirmative action bans on college enrollment, educational attainment, and the

demographic composition of universities,” The Review of Economics and Statistics 94(3), 2012, pp. 712–722 at p. 712.

None of the alternative practices analyzed was able to produce a student body with diversity

comparable to that predating the ban on race-conscious admissions practices.

152

7.2.2. Much of the literature focuses on universities far less selective than Harvard

214. While some universities have used race-neutral alternatives with moderate success in

achieving racial diversity, those universities tend to be far less selective than Harvard, making it

easier for them to attract applicants who do not reduce the quality or alter the character of the student

body. Halley Potter, a colleague of Mr. Kahlenberg on whose work he relies, studied eleven flagship

state universities that were barred from using race in admissions. Of those eleven schools, seven were

able to achieve African-American and Hispanic enrollment comparable to that attained before the

ban; four were not. Importantly, the three most selective schools in the sample—UC-Berkeley,

Michigan, and UCLA, the schools most similar to Harvard—were among the four schools not able to

attain pre-ban levels of representation for African-American and Hispanic students.

153

As Potter

explains, scholars have yet to identify race-neutral strategies that work well for selective institutions:

Selective colleges have a smaller pool of qualified applicants to begin

with, and these applicants are more likely to be considering a variety of

in- and out-of-state college options. As a result, selective colleges may

face greater challenges in terms of recruiting additional applicants from

underrepresented demographics… [I]dentifying effective diversity

strategies for selective campuses under race-neutral admissions is an

important area for future research.

154

215. Instead of focusing on the efficacy of race-neutral alternatives at selective institutions,

Mr. Kahlenberg chooses to highlight the handful of large, less selective public schools that (he

argues) were able to employ race-neutral alternatives to attain diverse classes comparable to those

before the consideration of race was banned. Examples include Texas A&M, the University of

Washington, the University of Nebraska, the University of Arizona, and the University of Georgia.

155

152

Daniel Koretz, Michael Russell, Chingwei David Shin, Cathy Horn, Kelly Shasby, “Testing and Diversity in

Postsecondary Education: The Case of California,” Education Policy Analysis Archives 10(1), 2002, pp. 1–39 at pp. 27–

28.

153

The fourth university that failed to regain pre-ban levels of representation for both African-American and Hispanic

students was the University of New Hampshire. (Halley Potter, “Transitioning to Race-Neutral Admissions: An Overview

of Experiences in States Where Affirmative Action Has Been Banned,” in The Future of Affirmative Action, ed. Richard

Kahlenberg (Century Foundation Press, 2014), pp. 75–90 at p. 89.)

154

Halley Potter, “Transitioning to Race-Neutral Admissions: An Overview of Experiences in States Where Affirmative

Action Has Been Banned,” in The Future of Affirmative Action, ed. Richard Kahlenberg (Century Foundation Press,

2014), pp. 75–90 at p. 89.

155

Kahlenberg Report, p. 6.

But those schools’ experience sheds little light on how race-neutral alternatives would fare at

Harvard, a far smaller and far more selective institution. As I show below, any strategies used by

larger and less selective universities, such as percent plans or increased transfers from community

colleges, are likely to generate a pool of applicants who are less qualified than Harvard’s current

applicants.

7.2.3. The literature shows that an increased preference for applicants of lower socioeconomic status can

achieve racial diversity only at the cost of reducing the quality of the admitted class on a range of

dimensions that I understand Harvard considers important

216. Mr. Kahlenberg places heavy emphasis on the idea that universities can achieve racial

diversity without considering race by according a significant admissions preference to applicants of

low socioeconomic status (SES). In my view, however, the literature does not support that

conclusion. (Nor, as I will discuss later, do simulations using Harvard’s data.)

217. It is widely understood as a matter of economic theory that if a university is forced to

target an imperfect correlate of race to achieve racial diversity, it is less able to choose the highest-

quality class than if it considered race directly.

156

Giving a strong admission preference to low-SES

candidates can indirectly generate racial diversity because some SES measures are correlated with

race. But because SES is not a perfect proxy for race, universities must place a significant weight on

SES measures to obtain substantial racial diversity—above and beyond what would be optimal for

creating a high-quality class in other dimensions. Even when the link between SES and race is strong,

this high degree of emphasis on SES factors can significantly alter the characteristics of the admitted

class.

218. Mr. Kahlenberg cites literature selectively in attempting to diminish the well-supported

principle that targeting correlates of race will always be a more costly way to generate racial diversity

(in terms of the costs it imposes on other attributes of the admitted class) than considering race itself.

156

Jimmy Chan and Erik Eyster, “Does Banning Affirmative Action Lower College Student Quality?,” American

Economic Review 93(3), 2003, pp. 858–872 at pp. 858–859; Mark Long, “Is There a ‘Workable’ Race-Neutral

Alternative to Affirmative Action in College Admissions?,” Journal of Policy Analysis and Management 34(1), 2015, pp.

162–183 at p. 167; Mark Long, “The Promise and Peril for Universities Using Correlates of Race in Admissions in

Response to the Grutter and Fisher Decisions,” ETS White Paper, 2015, pp. 1–31 at p. 13; Glenn Ellison and Parag

Pathak, “The Efficiency of Race-Neutral Alternatives to Race-Based Affirmative Action: Evidence from Chicago’s Exam

Schools,” NBER Working Paper #22589, 2016, pp. 1–59 at p. 51; Roland Fryer, Glenn Loury, and Tolga Yuret, “An

Economic Analysis of Color-Blind Affirmative Action,” The Journal of Law, Economics, & Organization 24(2), 2007,

pp. 319–355 at pp. 319–320; Roland Fryer and Glenn Loury, “Affirmative Action and Its Mythology,” The Journal of

Economic Perspectives 19(3), 2005, pp. 147–162 at pp. 150–153.

For example, Mr. Kahlenberg cites the work of Richard H. Sander and Aaron Danielson, who (in Mr.

Kahlenberg’s words) suggest that “richer measures of socioeconomic status … significantly increased

the correlation between race and socioeconomic status and the racial dividend of class-based

affirmative action.”

157

But Mr. Kahlenberg fails to note that these same authors also assert that “[i]t is

axiomatic that no race-neutral factor or system can be as efficient as using race itself to achieve racial

diversity through an admissions program … The high academic costs of the larger SES preferences in

these models would, we think, render it unpalatable to most selective schools.”

158

219. Mr. Kahlenberg also cites Matthew N. Gaertner’s 2014 study of race-neutral alternatives

at the University of Colorado to support the claim that preferences for applicants of low

socioeconomic status can “achieve even more racial diversity than using racial preferences.”

159

But

Mr. Kahlenberg neglects Gaertner’s warning that such policies are complicated to implement and

may lower the academic quality of the admitted class and the likelihood of success for admitted

students.

160

220. Mr. Kahlenberg draws on the work of Anthony P. Carnevale, Stephen J. Rose, and Jeff

Strohl, who simulate several race-blind admissions regimes. The authors do find that these

approaches can produce racial diversity, but only “if elite colleges are willing to risk lower average

test scores … and thereby lower graduation rates.”

161

221. Mr. Kahlenberg also cites work by Anthony P. Carnevale and Stephen J. Rose to support

his claim that “top universities could nearly quadruple the proportion of students from the bottom

157

Kahlenberg Report, p. 19.

158

Aaron Danielson and Richard H. Sander, “Thinking Hard About ‘Race-Neutral’ Admissions,” University of Michigan

Journal of Law Reform 47(4), 2014, pp. 967–1020, at pp. 968 and 995.

159

Kahlenberg Report, p. 12.

160

Matthew N. Gaertner, “Advancing College Access with Class-Based Affirmative Action,” in The Future of Affirmative

Action, ed. Richard Kahlenberg (Century Foundation Press, 2014), pp. 175–186 at pp. 183–184 (“Table 14.5 suggests

that on average, class-based admits can be expected to perform worse in college than typical undergraduates…These

patterns should not be terribly surprising, given that class-based admits are ‘borderline’ applicants—students on the cusp

of admission whose academic credentials are not stellar, and whose personal qualities weigh more heavily in an

admissions decision[]” and “Across outcomes, strictly overachieving class-based admits can be expected to perform quite

well—better, in fact, than typical undergraduates. The forecasts for strictly disadvantaged admits, however, are not as

encouraging. Their GPAs, graduation rates, and earned credit hours lag far behind the baseline.”).

161

Anthony P. Carnevale, Stephen J. Rose, and Jeff Strohl, “Achieving Racial and Economic Diversity with Race-Blind

Admissions Policy,” in The Future of Affirmative Action, ed. Richard Kahlenberg (Century Foundation Press, 2014), pp.

187–202 at p. 188.

socioeconomic half … without any change in graduation rates.”

162

But he fails to note that in the

simulation he references, African-American representation fell by a third, suggesting that the

simulated admissions regime was ineffective at producing racial diversity even if it generated

socioeconomic diversity.

163

Indeed, Carnevale and Rose concluded that “ultimately there is no better

way to guarantee a certain level of racial diversity than by employing race per se” and that “[w]hile

socioeconomic preferences help produce some racial diversity, a credible procedure that can

reproduce the level of racial diversity that exists in society today without purposely singling out

African Americans and Hispanics at some point in the selection process has yet to be found.”

164

222. Finally, Mr. Kahlenberg also cites the work of Sigal Alon, highlighting a set of Alon’s

simulations, which, he argues, show that “if the most selective 115 American universities instituted

broad reform—including effectively eliminating lineage, athletic, and racial preferences—a

socioeconomic boost ‘could not only replicate the current level of racial and ethnic diversity at elite

institutions but even increase it.’”

165

But Alon’s simulations do not consistently show that African-

American and Hispanic representation would meet or exceed the levels achieved by considering race.

Furthermore, in the one simulation where the fraction of African-American and Hispanic admitted

students exceeds the levels achieved by considering race, Alon notes that the “price” of this racial

diversity “would be a decline in academic selectivity.”

166

Alon also notes that those policy changes

would substantially increase the cost of providing financial aid.

167

223. Far from buttressing his claim that preferences for low-SES applicants can enable

selective colleges to increase racial diversity without harming the quality of their student bodies, the

literature Mr. Kahlenberg cites specifically highlights the challenges and costs of such policies for a

selective school like Harvard.

7.2.4. Conclusion

224. In sum, my review of the literature indicates that while race-neutral alternatives can be

used to increase racial diversity relative to a regime that does not consider race at all, (1) they

typically do not produce diverse student bodies comparable to those attained using race-conscious

162

Kahlenberg Report, p. 14.

163

Anthony P. Carnevale and Stephen J. Rose, “Socioeconomic Status, Race/Ethnicity, And Selective College

Admissions,” in America’s Untapped Resource: Low Income Students, ed. Richard Kahlenberg (Century Foundation

Press, 2004), pp. 101–156 at p. 148.

164

Anthony P. Carnevale and Stephen J. Rose, “Socioeconomic Status, Race/Ethnicity, And Selective College

Admissions,” in America’s Untapped Resource: Low Income Students, ed. Richard Kahlenberg (Century Foundation

Press, 2004), pp. 101–156 at pp. 150 and 153.

165

Kahlenberg Report, p. 13.

166

Sigal Alon, Race, Class, and Affirmative Action (New York, NY: Russell Sage Foundation, 2015), pp. 254–256.

167

Sigal Alon, Race, Class, and Affirmative Action (New York, NY: Russell Sage Foundation, 2015), p. 256.

admissions at selective institutions, and (2) both as a theoretical matter and in practice, they reduce

the quality of the admitted class. As I show in the remainder of this section, my analysis of Harvard’s

admissions data bears out the consensus in the literature: Harvard is unlikely to be able to achieve a

comparably diverse student body without considering race and without decreasing the overall quality

of the admitted class on a variety of dimensions.

7.3. Analysis of race-neutral alternatives using Harvard’s admissions data

225. In this section, I evaluate how Harvard’s class would change under race-neutral

alternatives identified in the academic literature discussed above, including those alternatives

suggested by the Complaint and Mr. Kahlenberg. I employ two methodological approaches in my

analysis.

226. First, I simulate how the use of certain race-neutral alternatives would be expected to

change the demographic and other characteristics of the admitted class. Consistent with the broader

academic literature, I find that any of the race-neutral alternatives proposed by SFFA or Mr.

Kahlenberg that would achieve a class with comparable ethnic and racial diversity would do so only

by changing other attributes of the class in ways that I understand matter to Harvard.

227. Second, for race-neutral practices that Harvard has already employed or experimented

with in the past (i.e., increased financial aid and the elimination of Early Action admissions), I

examine historical data to assess whether further changes could help achieve racial diversity. I find

that (a) eliminating Early Action is unlikely to foster additional racial diversity, and (b) given

Harvard’s current financial aid and recruiting practices, further expansions in financial aid and

recruiting are unlikely to increase racial diversity.

7.3.1. Eliminating consideration of race in the admissions process

228. To simulate the effect of removing consideration of race from the admissions process, I

begin by estimating my preferred year-by-year model (developed in Section 5) for applicants to the

class of 2019. I then turn off the estimated coefficients on the race variables, allowing me to simulate

what class would be admitted if Harvard did not consider race in the admissions process. In my

simulation, the share of African-American students in the admitted class would drop from 14% to

6%. The fraction of Hispanic or Other students would fall from 14% to 9%. The fraction of admitted

students who are Asian-American would rise from 24% to 27%. And the fraction of admitted

students who are White would rise from 40% to 48%.

168

See Exhibit 36 for an illustration of these changes.
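To make the mechanics of "turning off" a coefficient concrete, the following is a minimal sketch in Python, assuming a simple logit model. The variable names (`academic_rating`, `race_indicator`), the coefficient values, and the intercept are hypothetical and chosen only for illustration; they are not estimates from the model described in this report.

```python
import math


def admission_probability(applicant, coefficients, intercept):
    """Logit model: probability = 1 / (1 + exp(-index))."""
    # The admissions index is the intercept plus the weighted sum of the
    # applicant's characteristics.
    index = intercept + sum(
        coefficients[name] * applicant.get(name, 0.0) for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-index))


def turn_off(coefficients, variables):
    """Return a copy of the coefficients with the named variables set to zero."""
    return {
        name: (0.0 if name in variables else value)
        for name, value in coefficients.items()
    }


# Hypothetical values for illustration only (not taken from the report).
coefficients = {"academic_rating": 1.2, "race_indicator": 0.8}
intercept = -4.0
applicant = {"academic_rating": 2.0, "race_indicator": 1.0}

baseline = admission_probability(applicant, coefficients, intercept)
race_neutral = admission_probability(
    applicant, turn_off(coefficients, {"race_indicator"}), intercept
)
```

In this sketch, every other coefficient is left at its estimated value, so the only change to an applicant's predicted probability comes from the variables that are set to zero.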

7.3.2. Eliminating deferred admission and consideration of whether an applicant is a lineage applicant, a

child of Harvard faculty or staff, a recruited athlete, or on the Dean’s or Director’s interest lists

229. Mr. Kahlenberg suggests that one way a school like Harvard could attempt to increase

racial diversity would be to eliminate admissions practices that allegedly benefit White applicants.

The specific practices identified by Mr. Kahlenberg and addressed in his simulations include:

Harvard’s practice of offering deferred admission to a small group of candidates, conditional on their

taking a year off before matriculating; consideration of whether an applicant’s parents attended

Harvard or Radcliffe (i.e., whether the applicant is a “lineage” applicant); consideration of whether an

applicant is the child of Harvard faculty or staff; consideration of whether an applicant is a recruited

athlete; and the use of the Dean’s and Director’s interest lists. Mr. Kahlenberg does not remove

consideration of whether an applicant is a recruited athlete in his preferred simulation (explaining that

such an approach “is sometimes perceived as radical”), but I simulate the effect of that change in

order to ensure that I am considering all potentially available race-neutral alternatives.

230. Using my preferred year-by-year model of admissions from Section 5, I simulate how

the elimination of these practices would affect Harvard’s admitted class. My method closely follows

that used by Mr. Kahlenberg in his report. First, I estimate my model of admissions using data on

applicants to the class of 2019.

169

Then, I simulate the effect of eliminating consideration of race,

lineage status, athletic-recruit status, whether an applicant is the child of Harvard faculty or staff, and

whether an applicant is on the Dean’s or Director’s interest lists. (I do so by replacing the estimated

coefficient of the relevant variables—e.g., a coefficient estimating the effect of being a lineage

applicant on an applicant’s likelihood of admission—with zero.) I then simulate the class that would

be admitted using each applicant’s predicted probability of admission in this modified model of

admissions, and examine how the composition of the simulated class compares to that of the actual

admitted class. Note that this method also eliminates the practice of deferred admission because it

simulates filling all seats in the entering class with students who apply in a given year.
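The passage above does not spell out how predicted probabilities are converted into a simulated class. One possible approach, assumed here purely for illustration, is to treat each applicant's predicted probability as an expected fraction of a seat and to compute each group's share of the expected seats. The function name `expected_shares`, the group labels, and the probabilities below are hypothetical.

```python
from collections import defaultdict


def expected_shares(applicants):
    """Compute each group's expected share of the simulated admitted class.

    `applicants` is an iterable of (group, predicted_probability) pairs.
    Each probability is treated as an expected fraction of one seat.
    """
    seats_by_group = defaultdict(float)
    for group, probability in applicants:
        seats_by_group[group] += probability
    total_seats = sum(seats_by_group.values())
    return {group: seats / total_seats for group, seats in seats_by_group.items()}


# Hypothetical applicants for illustration only.
applicants = [("A", 0.50), ("A", 0.25), ("B", 0.25)]
shares = expected_shares(applicants)
print(shares)  # {'A': 0.75, 'B': 0.25}
```

Comparing the shares computed from the modified model with those of the actual admitted class would then show how the simulated composition differs.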

231. As Exhibit 35 shows, removing consideration of factors that allegedly benefit White

applicants does little to generate racial diversity. The simulated class has more White students and

many more Asian-American students, but markedly fewer African-American, Hispanic, or Other

(AHO) students than Harvard now admits.

169

Simulation results for earlier years are qualitatively similar, and can be found in the backup for the relevant exhibit.

Simulated racial composition of the admitted class, after eliminating the consideration of race,

lineage, athletic-recruit status, whether an applicant’s parents are Harvard faculty and staff, and

the Dean’s and Director’s interest lists

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Sample consists of applicants to the class of 2019 in Prof. Arcidiacono’s corrected expanded sample who are in my preferred year-by-

year regression model. Simulation eliminates consideration of race, lineage status, recruited-athlete status, whether an applicant’s parents

are Harvard faculty or staff, whether an applicant is on the Dean’s or Director’s interest list, and the proportion of the applicant’s high

school and neighborhood that is African-American, Hispanic, and White. In addition, recruited athletes are reassigned to rating

combinations in the regression sample that contain the next highest athletic rating.

7.3.3. Increasing the weight placed on socioeconomic factors

232. Mr. Kahlenberg also suggests that Harvard could attain a diverse class without

considering race if it increased its consideration of applicants’ socioeconomic status in the admissions

process. To examine the likely effect of doing so, I again use my preferred admissions model to

conduct a series of simulations. The simulations build on the results presented in the prior section by

considering what would happen if admissions officers at Harvard did not consider race, but gave

greater consideration to various indicators of lower socioeconomic status. To conduct the

simulations, I proceed in several steps.

233. First, I estimate my preferred model of admissions. Then, I remove consideration of race,

lineage status, recruited-athlete status, whether an applicant is the child of Harvard faculty or staff,

and whether an applicant is on the Dean’s or Director’s interest list.

170

Next, I simulate an increased

preference for students who possess various measurable indicators of lower socioeconomic status. I

do this by artificially increasing the probability of admission for applicants who meet one or more of

the following criteria: (1) the admissions officer reading the applicant’s file considered the applicant

to be “disadvantaged,” (2) neither of the applicant’s parents attended college (i.e., the applicant is

considered a first-generation college student), (3) the applicant requested a waiver of the application

fee, or (4) the estimated median family income of students in the applicant’s neighborhood is at or

below $65,000 (Harvard’s threshold for zero parental contribution).

171

234. In the simulations, I introduce a low-SES boost that is proportional to the number of the

criteria that an applicant meets. An applicant who meets all four criteria, for example, gets the full

low-SES boost, while an applicant who meets only two criteria gets a boost equal to one-half of the

full boost. I start by setting the full boost at two additional points to an applicant’s admissions index

(i.e., the input into the logit function that determines her probability of admission). This is about half

the size of the boost simulated by Mr. Kahlenberg.

172

It is about one-quarter the size of the increase in

the admissions index associated with having exceptional profile rating combinations (those with

admission rates between 80% and 96%), and it is nearly one-third the size of the advantage associated

with having very strong profile ratings combinations (those with admission rates between 54% and

67%).

173

An increase in an applicant’s admissions index translates into an increase in her predicted

probability of admission, but the size of the increase to her predicted probability of admission

depends on her initial predicted probability of admission. For example, adding a low-SES boost of 2

points to the admissions index for a candidate with a 1% predicted probability of admission raises her

predicted probability of admission to 7%. Adding a boost of 2 points to the admissions index for a

candidate with a 50% predicted probability of admission, however, would increase her predicted

probability of admission to 88%.

174
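The arithmetic in this example can be checked with a short sketch. It assumes only what the text states, namely that the admissions index is the input to a standard logit function; the function names are illustrative and are not taken from the workpapers.

```python
import math


def logit(p: float) -> float:
    """Convert a probability into an admissions index (log-odds)."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Convert an admissions index (log-odds) back into a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def boosted_probability(p: float, boost: float) -> float:
    """Probability of admission after adding `boost` points to the index."""
    return inv_logit(logit(p) + boost)


# The same 2-point boost moves a 1% candidate far less than a 50% candidate.
for p in (0.01, 0.50):
    print(f"{p:.0%} -> {boosted_probability(p, 2.0):.0%}")
# prints:
# 1% -> 7%
# 50% -> 88%
```

Because the logit function is nonlinear, the same boost in index points produces different changes in probability depending on where the applicant starts.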

235. As noted above, I start by assuming a low-SES boost of 2 for an applicant who possesses

all four of the indicators of low-SES status. If an applicant meets fewer than four of the criteria listed

170

Following Mr. Kahlenberg’s approach, I recode athletes with an athletic rating of 1 to have an athletic rating of 2,

assigning them to the appropriate ratings combination in the regression sample.

171

I exclude from this simulation an indicator of whether the applicant applied for financial aid, because three-quarters of

all applicants apply for aid, rendering it a poor proxy for socioeconomic disadvantage. The estimated median family

income figures come from data acquired from the College Board and made available to SFFA and its experts. In these

data, an applicant’s neighborhood is determined based on the applicant’s address. A neighborhood is defined by the

College Board and consists of one or more contiguous census tracts.

172

Mr. Kahlenberg simulates a preference that is half the size of the athletic recruit coefficient in Prof. Arcidiacono’s

model.

173

These two sets of rating combinations have the highest admission rates across all rating combinations in my regression

sample. See workpaper.

174

See workpaper.

HIGHLY CONFIDENTIAL – ATTORNEYS’ EYES ONLY Page 107

above, her baseline low-SES boost is lower. If she meets three of the criteria above, she receives a

boost of 1.5; if she meets two of the criteria above, she receives a boost of 1; if she meets one of the

criteria above, she receives a boost of 0.5.

236. To evaluate the impact of increasing the magnitude of the boost, I then scale up the size

of each applicant’s low-SES boost by a factor of 2 (denoted by 2x), 3 (denoted by 3x), and so on. For

example, in later simulations where I refer to a 2x low-SES boost, I mean that an applicant in that

simulation who satisfies all four low-SES criteria receives a boost of 4 points (doubled from the

baseline of 2 points); an applicant who satisfies three criteria receives a boost of 3 points (doubled

from the baseline of 1.5 points); and so on.
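The boost schedule described in paragraphs 234 through 236 can be sketched as follows. This is a minimal illustration under stated assumptions: the `Applicant` record and its field names are hypothetical and do not correspond to variables in the underlying data; only the four criteria, the $65,000 threshold, and the 0.5-point increment per criterion come from the text.

```python
from dataclasses import dataclass

INCOME_THRESHOLD = 65_000    # neighborhood median family income cutoff, in dollars
BOOST_PER_CRITERION = 0.5    # baseline (1x) index points per criterion met


@dataclass
class Applicant:
    # Hypothetical field names, for illustration only.
    disadvantaged: bool              # flagged "disadvantaged" by the reader
    first_generation: bool           # neither parent attended college
    fee_waiver: bool                 # requested an application fee waiver
    neighborhood_median_income: float


def criteria_met(a: Applicant) -> int:
    """Count how many of the four low-SES criteria the applicant meets."""
    return sum([
        a.disadvantaged,
        a.first_generation,
        a.fee_waiver,
        a.neighborhood_median_income <= INCOME_THRESHOLD,
    ])


def low_ses_boost(a: Applicant, multiplier: int = 1) -> float:
    """Index points added for an applicant under a given boost multiplier."""
    return BOOST_PER_CRITERION * multiplier * criteria_met(a)


all_four = Applicant(True, True, True, 60_000)
three = Applicant(True, True, True, 90_000)
print(low_ses_boost(all_four), low_ses_boost(all_four, 2))  # 2.0 4.0
print(low_ses_boost(three), low_ses_boost(three, 2))        # 1.5 3.0
```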

237. My method differs somewhat from Mr. Kahlenberg’s method for simulating increased

weight on socioeconomic factors. In his race-neutral alternative simulations—which examine the

effect of multiple race-neutral practices, not just an increased preference for low-SES applicants—

Mr. Kahlenberg simulates eliminating consideration of whether an applicant has been identified by

admissions officers as disadvantaged, whether the applicant is a first-generation college student,

whether the applicant applied for financial aid, and whether the applicant requested a fee waiver, but

then simulates an increased preference only for students who are identified as disadvantaged. My

approach is more inclusive and flexible, simulating an increased boost for a broader set of students of

lower SES, with the size of the boost for each applicant varying with the number of indicators of low

socioeconomic status that she exhibits.

238. Exhibit 36 illustrates how the racial composition of the admitted class would be expected

to change if Harvard placed varying degrees of additional weight on the low-SES attributes noted

above and eliminated the practices (discussed above) that are alleged to benefit White applicants. The

first two bars in Exhibit 36 report the racial composition of the actual class and the simulated

class in a world where Harvard eliminates consideration of race without undertaking additional race-

neutral approaches to increase racial diversity. The third bar shows what would be expected to

happen if, in addition to eliminating consideration of race and factors that allegedly benefit White

applicants, Harvard gave each low-SES applicant an additional boost of the size discussed above in

paragraph 235. The next bar shows what would happen if Harvard doubled the maximum low-SES

boost, and so on.

175

Exhibit 37 summarizes the simulated change in the racial composition of the

admitted class under this alternative admissions regime as compared to the actual admitted class of

2019.

239. As noted above, if Harvard were to eliminate consideration of race, the share of African-

175

The full range 1-10x is located in the backup to the exhibit, as are results for the classes of 2014 – 2018, which are

qualitatively similar.


American students in the admitted class would be expected to drop from 14% to 6%. The fraction of

Hispanic students would also fall, while the fraction of Asian-American and White students would

rise. If Harvard then applied a low-SES boost (as described above) and eliminated the practices

alleged to benefit White applicants, the share of African-American students in the admitted class

would be expected to remain constant at 6%—still radically below the current level. The fraction of

Hispanic students would rise, from 9% to 11%, remaining about 20% lower than in the actual class.

The fraction of White admitted students would fall back to a level comparable to the current class,

while the fraction of Asian-American students in the admitted class would grow from 27% to 31%.

Harvard would need to increase the low-SES boost to more than six times the baseline (i.e., to a

maximum boost of more than 12 points) in order for the expected proportion of African-American students among

admitted students to approximate the current level. At that point, hundreds of low-SES applicants

would be receiving an incremental boost larger than that given to candidates with the most

exceptional academic, extracurricular, personal, and athletic ratings.

176

Increasing the weight placed on socioeconomic characteristics could help generate racial diversity

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Sample consists of applicants to the class of 2019 in Prof. Arcidiacono’s corrected expanded sample who are in my preferred year-by-

year regression model. Simulation eliminates preferences for race, lineage status, recruited-athlete status, whether an applicant’s parents

are Harvard faculty and staff, whether the applicant appears on the Dean’s or Director’s interest list, and the proportion of the applicant’s

high school and neighborhood that is African-American, Hispanic, and White. In addition, recruited athletes are reassigned to rating

combinations in the regression sample which contain the next highest athletic rating. Applicants with certain socioeconomic

characteristics are given a low-SES boost by adding a value to their admission index. The value is equal to 0.5 multiplied by a given integer

multiplier, multiplied by the number of characteristics an applicant displays out of the following: disadvantaged, requested a fee waiver,

first generation college student, neighborhood median income less than or equal to $65,000.

176

See workpaper.


Estimated change in racial composition after removing consideration of race, increasing weight on

socioeconomic characteristics, and eliminating the practices alleged to benefit White applicants

Source: Augmented Arcidiacono Data; College Board Cluster Data; U.S. Census Data

Note: Sample consists of applicants to the class of 2019 in Prof. Arcidiacono’s corrected expanded sample who are in my preferred year-by-

year regression model. Simulation eliminates consideration of race, lineage status, recruited-athlete status, whether an applicant’s parents

are Harvard faculty and staff, whether the applicant appears on the Dean’s or Director’s interest list, and the proportion of the applicant’s

high school and neighborhood that is African-American, Hispanic, and White. In addition, recruited athletes are reassigned to rating

combinations in the regression sample that contain the next highest athletic rating. Applicants with certain socioeconomic characteristics

are given a low-SES boost by adding a value to their admission index. The value is equal to 0.5 multiplied by a given integer multiplier,

multiplied by the number of characteristics an applicant displays out of the following: disadvantaged, requested a fee waiver, first

generation college student, neighborhood median income less than or equal to $65,000.

240. Increasing the size of the low-SES boost in this manner would also be expected to lead to

changes in the admitted class in other respects, many of which Harvard might well consider

deleterious. For example, if Harvard were to increase the size of the low-SES boost by four or five

times (enough for the combined share of AHO students in the expected class to resemble that of the

current class, but still not enough to restore a comparable share of African-American students)

177

and eliminate the practices alleged to benefit White applicants, numerous measures of excellence in

177

If Harvard increased the size of the low-SES preference four times relative to the baseline, that would yield a

preference for students identified as “disadvantaged” that is roughly equivalent to the preference Mr. Kahlenberg gives

them in his simulations. At five times the baseline preference, students identified as “disadvantaged” receive about one

and a half times the preference in my model as in Mr. Kahlenberg’s simulations. In addition, under my more flexible

simulation, students who are first-generation or who receive fee waivers also receive a boost about the same size on

average as the one accruing to applicants identified as “disadvantaged” (see workpaper).

Predicted Class Without Consideration of Race and Factors that Allegedly Advantage White
Applicants, Change from Actual Class

                        Actual      Low-SES Boost
                        Admitted
Race                    Class       1x      2x      3x      4x      5x      6x      10x
1. White                676         +34     -4      -44     -87     -127    -162    -236
2. Asian-American       402         +120    +118    +113    +106    +98     +90     +71
3. Hispanic or Other    233         -51     -18     +19     +60     +99     +133    +204
4. African-American     234         -130    -112    -92     -71     -51     -34     +4
5. Race Missing         134         +27     +16     +4      -7      -18     -26     -43

Predicted Class Without Consideration of Race and Factors that Allegedly Advantage White
Applicants, % Change from Actual Class

                        Actual      Low-SES Boost
                        Admitted
Race                    Class       1x      2x      3x      4x      5x      6x      10x
1. White                40%         +5%     -1%     -7%     -13%    -19%    -24%    -35%
2. Asian-American       24%         +30%    +29%    +28%    +26%    +24%    +22%    +18%
3. Hispanic or Other    14%         -22%    -8%     +8%     +26%    +42%    +57%    +88%
4. African-American     14%         -55%    -48%    -39%    -30%    -22%    -15%    +2%
5. Race Missing         8%          +20%    +12%    +3%     -6%     -13%    -20%    -32%


Harvard’s class would drop substantially. This can be seen in Exhibit 38, which summarizes changes

across a variety of characteristics of the admitted class. For example, the fraction of admitted students

receiving an academic rating of 1 or 2 would be expected to drop by an amount between 13% and

22%. The fraction of students receiving top extracurricular and personal ratings would also fall, and

the fraction with top athletic ratings would be cut by a third. In addition, as the magnitude of the low-

SES boost increases to 3x the baseline boost and beyond, the fraction of admitted students who are

Asian-American begins to fall, rather than rise.

241. The admitted class would also be expected to look markedly different in other

dimensions. The fraction of students intending to concentrate in the humanities and social sciences

would be expected to fall, while the fraction intending to concentrate in biological sciences would be

expected to rise. The fraction of admitted students who are children of Harvard and Radcliffe alumni

would fall, as would the number of admitted students who are children of Harvard faculty and staff.

The number of athletic recruits would drop by half.


Increasing the weight placed on socioeconomic characteristics would be expected to markedly alter

the characteristics of Harvard’s admitted class

Predicted Class Without Consideration of Race and Factors that Allegedly Advantage White Applicants

                                                     Actual     3x Low-SES Boost          4x Low-SES Boost          5x Low-SES Boost
                                                     Admitted   Predicted                 Predicted                 Predicted
                                                     Class      Value      % Change       Value      % Change       Value      % Change
Outcome Measures                                     [A]        [B]        ([B]-[A])/[A]  [C]        ([C]-[A])/[A]  [D]        ([D]-[A])/[A]

Race
 1. White                                            676        632        -7%            589        -13%           549        -19%
 2. Asian-American                                   402        515        +28%           508        +26%           500        +24%
 3. Hispanic or Other                                233        252        +8%            293        +26%           332        +42%
 4. African-American                                 234        142        -39%           163        -30%           183        -22%
 5. Race Missing                                     134        138        +3%            127        -6%            116        -13%

Academic
 6. Average Composite SAT Score                      2244       2213       -1%            2189       -2%            2164       -4%
 7. Average Composite ACT Score                      33.1       33.0       -0.5%          32.7       -1%            32.4       -2%
 8. Average Converted GPA                            77.0       77.2       +0.3%          77.1       +0.1%          76.9       -0.1%
 9. Average Academic Index                           228        227        -0.4%          225        -1%            224        -2%

Fraction with Profile Rating of 1 or 2
10. Academic                                         76%        72%        -5%            66%        -13%           59%        -22%
11. Extracurricular                                  62%        61%        -2%            57%        -9%            52%        -17%
12. Personal                                         71%        68%        -5%            64%        -11%           59%        -17%
13. Athletic                                         27%        19%        -30%           18%        -33%           17%        -38%

Applicant Characteristics
14. Number of Lineage Students                       259        104        -60%           86         -67%           68         -74%
15. Number of Double Lineage Students                72         24         -67%           19         -73%           15         -79%
16. Number of Recruited Athletes                     180        89         -51%           88         -51%           88         -51%
17. Number of Children of Harvard Faculty and Staff  44         20         -54%           17         -61%           13         -69%
18. Number of Students on Dean’s and Director’s
    Interest Lists                                   Redacted
19. Number of Female Students                        839        848        +1%            851        +1%            855        +2%
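The percent-change columns in this exhibit follow the formula shown in the column headers, ([B]-[A])/[A]. A minimal sketch of that calculation, using the White and African-American rows as inputs, is given below; the function name is illustrative, and rounding to the nearest whole percent is an assumption about how the exhibit was prepared.

```python
def pct_change(actual: float, predicted: float) -> int:
    """Percent change from the actual class, rounded to a whole percent."""
    return round((predicted - actual) / actual * 100)


# Actual admitted counts and predicted counts under the 3x, 4x, and 5x boosts.
rows = {
    "White": (676, [632, 589, 549]),
    "African-American": (234, [142, 163, 183]),
}

for race, (actual, predicted_values) in rows.items():
    changes = [pct_change(actual, p) for p in predicted_values]
    print(race, changes)
# prints:
# White [-7, -13, -19]
# African-American [-39, -30, -22]
```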
