Impact Measurement: A Cautionary Tale

Evaluation has often collaborated with racism, plutocracy, neocolonialism, and genocide. We must end our collusion.

23 min read · Aug 10, 2020

Where power dynamics are unequal, measurement is just another means of control. (image by Francis Galton, 1883)

A History of Harm

About six months ago, I decided to write a history of impact measurement, the means by which the benefits and harms of social change work are assessed. It seemed innocent enough. I’d get more insight into the field in which Do Big Good was working and hopefully pick up some useful frameworks and methods along the way.

Impact measurement as I knew it was a world of nerdy altruists like myself — researchers and consultants drawing up logic models and theories of change, identifying indicators and metrics, collecting and analyzing data, collaborating with nonprofits and businesses, assessing whether social change goals were being achieved. They were agents of accountability and evidence, ensuring the forward march of progress.

A few days into the project, I found myself writing the sentence, “We have the unfortunate habit of getting caught in the tailwinds of racists, plutocrats, neocolonialists, and génocidaires.” I stopped writing.

[I]mpact measurement is middle class people demonstrating to rich people the worthiness of poor people to receive some small portion of the funds expropriated from them….

I wasn’t ready to acknowledge the extremely troubling past of impact measurement or to admit that those troubling dynamics continue into the present. I wasn’t ready to admit that too often impact measurement is middle class people demonstrating to rich people the worthiness of poor people to receive some small portion of the funds expropriated from them through colonialism and capitalism.

[W]e need to change norms such that impact measurement counters — rather than reinforces — unequal power dynamics….

Yet the often sordid history of impact measurement needs to be told. We need to come to terms with our past and take an honest look at how it affects the present. We must decide which parts of that past work should continue and which should definitely end. Most importantly, we need to change norms so that impact measurement counters — rather than reinforces — unequal power dynamics between foundations and grantees and between investors and social enterprises.

Impact Measurement: A Timeline

1890s–1900s: Measuring Society’s Ills
1910s–1940s: Exclusion to Extermination
1950s–1960s: Prevention and Performance
1970s–1980s: Predatory Generosity
1990s: Age of Self-Reflection
2000s–2010s: Impact is Good
2020s Onward: Ending Our Collusion

It is my hope that, by understanding our history, we can become agents of transformation in our field.

1890s–1900s : Measuring Society’s Ills

Influences: The Progressive Era, Scientific Management, The Industrial Revolution
Innovations: Measuring and treating social problems at scale
Shortcomings: Methods still in their infancy

At the close of the 19th century, social reform began to meld with the scientific management principles birthed by the Industrial Revolution. The world suddenly seemed less chaotic and more knowable. For the first time, measurement of social benefits and harms seemed both possible and important.

Popular culture was now mass produced. There was mass production of misery, too.

It was the first time that society was designed at scale. From manufactured goods, like the cars rolling out of Henry Ford’s factories, to the new records played on phonographs and gramophones, popular culture was now mass produced.

This 1890 photograph dramatizes the squalid urban education system Progressives sought to reform. (Jacob Riis)

There was mass production of misery, too. Smallholder farmers converged on cities as migrants and as immigrants. No longer able to grow their own food, they were at the mercy of factory owners.

This labor surplus, combined with lax regulation, meant dangerous and low-paying work (see Grover Shoe Factory disaster, 1905) and squalid living conditions (see Jacob Riis’ bestseller How the Other Half Lives, 1890).

It was this mass production of social harm that inspired Progressives to take action beyond the local ambit of charities and almshouses. They sought to measure and remedy social problems at social scale.

Progressives… sought to remedy social problems at social scale.

This spirit was personified by Joseph Mayer Rice, an education reformer who wanted schools to provide more enriching material to students. The privileged son of Bavarian immigrants, he was convinced that not only was rote memorization robbing children of time to learn art and music, it was also an ineffective means of teaching basic literacy and numeracy.

Joseph Mayer Rice (via Juana López Moreno)

Rather than simply making an argument or relying on a dramatic case study, as the muckraking journalists of the era did, he designed and carried out unprecedented social research. Rice’s 1895 study of the spelling skills of 33,000 fourth- to eighth-graders found no connection between time spent on spelling drills and student performance on spelling tests. He used this evidence to advocate for a more efficient and enriching curriculum.

In 1897, Émile Durkheim, now recognized as the founder of sociology, took a similar empirical approach to understanding the social ill of suicide. His work re-framed suicide as a social problem linked to social factors, such as a lack of connection to others, rather than an individual’s personal shortcomings.

Social problems were seen not as the problems of individuals, but as problems of society itself.

Progressives saw social problems not as the problems of individuals, but as the problems of society itself. If goods could be mass produced, then social good could be as well.

1910s–1940s : Exclusion to Extermination

Influences: Eugenics, Nazism, Fascism, Xenophobia
Innovations: Pseudo-scientific theories supporting white supremacy
Shortcomings: Without ethical guardrails, beliefs about the objective nature of social measurement led to mass harm and death.

When World War I broke out in 1914, demand for mass evaluation and testing increased as governments sought to determine who was psychologically fit to fight. Yet mass testing, which Rice used to such positive effect in his analysis of education methods, was now also used pseudo-scientifically to reinforce racism, ableism, and xenophobia.

Within the period of only a few decades, applications of mass solutions to poor education were reborn as Final Solutions to end entire populations.

Without clear guardrails about what constituted social good, ideas about how to better the human condition led to ideas about bettering physical human bodies, which became the “science” of eugenics. Within the period of only a few decades, applications of mass solutions to poor education were reborn as Final Solutions to end entire populations.

In 1908, Henry Goddard translated a French intelligence test and began applying it to immigrants on Ellis Island four years later. From the results, he concluded that 87% of Russians, 83% of Jews, 80% of Hungarians, and 79% of Italians were “feeble-minded.” These quantitative and supposedly objective mass measurements were used to pass the xenophobic Immigration Act of 1924, which halted immigration from Asia and set quotas on immigrants from outside Western Europe.

The “science” of eugenics aimed at eradicating social ills from poverty to imbecility at the level of individual humans.

This idea that intelligence testing revealed intrinsic features of social groups soon became the Eugenics Movement, which sought to improve the quality of humanity by ensuring the “quality” of individual humans. The “science” of eugenics aimed at eradicating social ills, from poverty to “imbecility,” at the individual level. To promote these ideas, the British Eugenics Education Society was founded in 1907, with the American Eugenics Society following suit in 1921.

Henry Goddard promoted the “science” of eugenics as a means of eradicating a number of social ills, from poverty to “imbecile[s].” (“*The Kallikak Family: A Study in the Heredity of Feeble-Mindedness*,” 1912)

Without human rights protections, these ideas initially led to the forced sterilization of mental patients in countries including Belgium, Brazil, Canada, Japan and Sweden in the 1920s and 1930s, later culminating in the genocide of Jews, Roma, LGBTQ+ people, and disabled people during the Holocaust in World War II.

The initial idealism that drove Progressives to undertake mass social measurements became perverted by rampant racism and other bigotries that were popular at the time. Without checks on the evaluators’ power, measurement became a tool for social exclusion and even death.

Idealism that drove Progressives to undertake mass social measurements became perverted by rampant racism and other bigotries….

The practice of mass measurement and mass treatment needed constraint to ensure that no one group’s good intentions could result in the extermination of other groups.

1950s–1960s : Prevention and Performance

Influences: Postmodernism, Management Science, The Atomic Age
Innovations: Development of the first modern methods of management, impact measurement as measurement of harm
Shortcomings: Reduced interest in measuring positive social change

Following the horrors of World War II, the great powers chose to create institutions like the United Nations to set ethical standards for international conduct. In 1950, scientific racism was formally denounced by UNESCO. “For all practical social purposes,” they wrote, “‘race’ is not so much a biological phenomenon as a social myth.”

Academics saw that the genocides… were partially the result of a misplaced belief in the unassailable factual nature of social measurement.

Academics saw that the genocides of the war were partially the result of a misplaced belief in the unassailable factual nature of social measurement. As academics adopted more humble beliefs about the certainty of knowledge, measurement for social good grew more modest as well.

In 1966, sociologists Peter L. Berger and Thomas Luckmann posited that people interact to create shared meaning, rather than discovering absolute truths (The Social Construction of Reality). In 1962, Thomas Kuhn wrote that shifts in prevailing frameworks, which he called “paradigms,” could cause the meaning of data to change (The Structure of Scientific Revolutions).

Measurement for social good grew more modest in this period… [and] differed markedly from the ambitious world-changing drive of the Progressives… and the Fascists.

Nuclear contamination became a common concern in the post-war period, as these images from a medical guide show. (The Pictorial Medical Guide, 1951)

At the end of World War II, the US had demonstrated the destructiveness of nuclear power by dropping atomic bombs on Japanese cities. Just as social measurement in the late 19th and early 20th centuries was influenced by the machines of the Industrial Revolution, so too was mid-century measurement influenced by ideas surrounding nuclear power. When nuclear power was put to civilian use to generate electricity, great care was taken in assessing its effects.

It is in this context that we first see the term “monitoring and evaluation” (M&E), now well-known to those who measure social impact. For example, a 1959 manual on low power reactors recommended a “continuous program of monitoring and evaluation of all possible radiation and contamination.” A 1958 report by General Electric and a 1966 reactor design guide likewise used this phrase in the context of nuclear power. In the nuclear context, M&E was necessary to quickly identify small errors that could lead to catastrophic results. This focus on impact as a harm later became Impact Assessment, a formal process of determining possible negative economic, social, and environmental effects of proposed government policies.

In the nuclear context, monitoring and evaluation was necessary to quickly identify small errors that could lead to catastrophic results.

This narrow and harm-focused approach to measuring impact differed markedly from the ambitious world-changing drive of both the Progressives and the Fascists. Rather than seeking to achieve positive social impact (by their own definitions) at a national or global scale, mid-century nuclear scientists were seeking to narrowly prevent harm — not by measuring the effect of their tools on the public, but by measuring the activities and outputs of the reactors themselves. If their monitoring methods worked, there would be no need to measure impact on society. Impact on society was exactly what nuclear scientists were trying to prevent.

Impact on society was exactly what nuclear scientists were trying to prevent.

*Peter Drucker*, “the man who invented management” (Hindsite Interactive)

While nuclear power was carefully constrained, equally momentous economic changes were viewed with ebullient enthusiasm. The post-war period saw the beginning of the “golden age of capitalism,” a time of worldwide economic expansion that did not end until the early 1970s.

Just as scientific management emerged from changes in commerce in the late 19th century, now professional management accompanied the rise of corporations in the post-war period. In these new types of organizations, managers sought to measure the activities and outputs of their firms to maximize profit.

Nudged by books like Concept of the Corporation (1946) by Peter Drucker and parodied in popular films like The Apartment (1960), management science was born out of operational methods used by Allied armies during World War II and also echoed labor management systems used on slave plantations in the mid-19th century. These methods were then put to use to measure and manage corporate employees.

M&E made the leap from nuclear energy production to business performance.

A 1968 conference presentation for the Industrial Relations Research Association argued that more data was needed “for monitoring and evaluation of manpower programs.” In this context, M&E made the leap from nuclear energy production to business performance.

1970s–1980s : Predatory Generosity

Influences: Neocolonialism, Neoliberalism, Positivism, The Cold War
Innovations: Purpose-built methods for evaluating social programs at scale are developed, such as the logical framework (logframe).
Shortcomings: Evaluation reinforces colonial power dynamics.

Internationally, the Cold War saw the replacement of colonialism with neocolonialism, led by loans. The explicitly extractive colonial policies of the 16th–19th centuries ended with the wave of independence movements in the mid-20th century. Yet the relationship between colonized and colonizer remained imbalanced.

Former colonizers were eager to win the “Third World” of formerly colonized countries to the side of capitalism….

Former colonizers in the West were eager to win the “Third World” of formerly colonized countries to the side of capitalism, rather than communism. This came in the form of loans, largely from the World Bank, which was founded in 1944 to provide temporary loans to poor countries that could not obtain them on the open market.

Akosombo Dam, Ghana. President Nkrumah set up a bidding war between the US and USSR in the 1950s to fund it. (Britannica)

President Truman gave these loans an altruistic framing in his 1949 inaugural address. “The old imperialism — exploitation for foreign profit — has no place in our plans,” he said. “What we envisage is a program of development…”

Development aid was an attempt to do social good on a global scale. It was also manipulative and self-serving, designed to bribe countries into joining the Western sphere of influence, rather than that of the Soviet Union.

In the 1950s, for example, President Nkrumah of Ghana engineered competition between the “First World” (US, UK) and “Second World” (USSR and Communism-aligned nations) to build the Akosombo Dam (above). Suddenly, Western powers began to use infrastructure aid as an effective tool of neocolonialism.

Development aid was an attempt to do social good on a massive global scale. It was also manipulative and self-serving….

Though the World Bank had lent out less than $1 billion in its first 24 years, it disbursed more than $11 billion during the 1970s alone. Massive generosity in one view. Massive manipulation in another.

What happened to the money? In some cases, as with the Akosombo Dam, funds were used as intended, but with enormous negative social impact. The dam, for example, forcibly relocated 80,000 people from 700 villages.

Money was misspent or disappeared. The duty to repay did not.

Unfortunately, many of these loans were little more than bribes and went knowingly to governments that were dictatorial and kleptocratic. In 1978, for example, the Bank originated a loan to finance “an irrigation project that will boost rice production” in Vietnam. A confidential report admitted, however, that the real reason for “the employment problem” was that the government was terrorizing farmers who refused to be removed from their land as part of “reorganization.”

Money was misspent or disappeared. The duty to repay did not. By the late 1990s, 41 countries were considered to be highly indebted, 80% of them in Africa. Some spent as much as 60% of their national budgets on debt service. As the New York Times noted in 1999, that left “little to invest in basic education, health, rural roads and other programs that help people escape from poverty.”

The West’s first attempt at creating social benefit at global scale had achieved tremendous unintended harm.

The West’s first attempt at creating social benefit at global scale had achieved tremendous unintended harm. It had maintained, and in some cases even worsened, poverty in countries that accepted the loans. Some figures, such as the scholar and financier Dambisa Moyo, said that development aid was so toxic, it should be stopped as soon as possible.

A classic logframe, highlighting the predictability assumptions (input > output) embedded in the method. (G. Coleman, 1987)

The problem of negative impact was not a result of lack of measurement. By the late 1970s and early 1980s, international institutions like the United Nations were beginning to publish best practice guides on M&E. These methods were used to measure the impact of loan-funded projects.

In 1969, the United States Agency for International Development (USAID) first promulgated the Logical Framework Approach (LFA), which it began using in the 1970s. The central artifact of LFA is the now well-known (and often updated) logical framework or “logframe” diagram, a simple table in which activities, outputs, purposes and goals of a project are described. This diagram was later updated to add inputs as well (see image) and has been updated many times over the years to both add and remove phases.

Implicit in the logframe approach is the assumption of perfect predictability.

Implicit in the logframe approach is the assumption of perfect predictability. Terms like “input” and “output” bring to mind the uniformity of outcomes associated with a manufacturing process (hello, Peter Drucker!). This process need only be monitored for correct operation to ensure that social change outcomes are generated.

These specialists (usually white, usually Western) would then monitor implicitly unreliable loan recipients.

In an undated photo, a USAID employee (center) leads an M&E team during a site visit in Lutembwe Forest, Zambia. (USAID)

These methods became particularly popular in the 1980s and sought to prevent the negative impacts of irresponsible lending. In 1993, lecturer John Cameron wrote that “[f]ormal monitoring and evaluation (M&E) had its origins in correcting… the bad… practices of the 1960s and 1970s” when M&E was about “policing, as much as assisting” loan recipients.

This policing did not resolve inequities. Collecting high-quality evidence required specialists trained in precise research methods. These specialists (usually white, usually Western) would then monitor implicitly unreliable loan recipients.

M&E methods of the 1960s and 1970s were about “policing, as much as assisting” loan recipients.

It was also during the 1980s that Western evaluators began to increase their power and formalize their profession. In 1986, the American Evaluation Association was founded to “improve evaluation practices and methods, increase evaluation use, promote evaluation as a profession, and support the contribution of evaluation to… effective human action.”

A university-trained white person… was seen as the most capable of evaluating the social impact of a project carried out in any other country.

Here, M&E was still aligned with neocolonialism and white supremacy. A university-trained white person from a Western country was seen as the most capable of evaluating the social impact of a project carried out in any other country. Armed with empirical tools and credentials, the professional evaluator could assess impact on any foreign soil.

1990s : Age of Self-Reflection

Influences: The Anti-Globalization Movement, The Information Age, Post-Positivism
Innovations: Power dynamics are foregrounded; new equity-focused methods are developed.
Shortcomings: Evaluation mitigates unequal power dynamics rather than undoing them.

Critiquing power dynamics in evaluation (Guba and Lincoln, 1989)

At the end of the 1980s, critiques of this obviously flawed approach to impact measurement began to emerge. In their 1989 book Fourth Generation Evaluation, Egon Guba and Yvonna Lincoln challenged the privileged position of evaluators, writing that “[i]f only the evaluator and client are privileged to decide on the questions to be asked, the instrumentation to be employed, the mode of data analysis and interpretations… other stakeholders will be denied.” They called for a new post-positivist approach to evaluation in which power is shared between the evaluator and evaluatee and the “key dynamic is negotiation.” “[F]indings are not ‘facts,’” they proclaimed. Rather, evaluation results are “created through an interactive process that includes the evaluator (so much for objectivity!)” [Emphasis in original text.]

“[F]indings are not ‘facts.’” They are “created through an interactive process that includes the evaluator.”

Evaluators were becoming broadly and publicly self-critical. They were beginning to develop methods that explicitly took into account — and sought to undo — power imbalances and injustices.

Bringing down the Berlin Wall. The fall of Communism as a global threat to Capitalism led to a more liberal view of development and evaluation. (Anthony Suau)

The emerging global political context of the decade aided these nascent self-critiques of impact evaluation. The Berlin Wall fell in November of 1989, ending the Cold War and dramatically weakening the national security concerns that had motivated global development efforts over the past forty years.

The 1990s opened onto a freer and less fearsome world — one in which transforming power dynamics was seen as just, rather than dangerous or naive.

The 1990s opened onto a freer and less fearsome world — one in which transforming power dynamics was seen as just, rather than dangerous or naive. Evaluation specialists began to publish a second generation of texts, taking a broader view than the previous power-blind compliance approach of the 1960s — 1970s.

[A]sking “Who counts reality?” is another way of asking “Whose reality counts?”

New methods, such as Participatory Monitoring and Evaluation (PM&E), began to emerge. In their 1998 literature review, “Who Counts Reality?”, Marisol Estrella and John Gaventa centered the power dynamics of evaluation, noting that “questions of ‘who measures’ results [and] ‘who defines’ success” had become critical to the field. They acknowledged that asking “Who counts reality?” is another way of asking “Whose reality counts?”

2000s–2010s: Impact is Good

Influences: Design Thinking, Supercapitalism, Social Justice
Innovations: Sophisticated measurement frameworks, tools, and practices
Shortcomings: Evaluation is still usually the rich employing the upper middle class to evaluate impacts on the poor.

All this critique had an effect, but the result was more a superficial reform of procedures than a true alteration of power dynamics. Evaluation is still usually the rich (corporations and private philanthropy) employing the upper middle class (people with master’s degrees) to evaluate the poor (recipients and beneficiaries).

The term impact measurement itself is a child of the term impact investing… [an] attempt to save capitalism from itself.

The Five Dimensions of Impact, an elegant and nuanced framework for thinking about impact measurement. (Impact Management Project)

The term impact measurement itself is a child of the term impact investing, which emerged in 2007 as part of an effort to reform capitalism by “measuring social and environmental performance with the same rigor as that applied to financial performance.”

Terms like “triple bottom line” — coined by consultant John Elkington in 1998 and referring to financial, environmental and social factors — and frameworks like ESG (environmental, social, governance), coined around the same time, are also part of the attempt to save capitalism from itself. The promulgation of the Sustainable Development Goals (SDGs) by the United Nations in 2015 provided a simple rubric for what social good actually meant and became the preferred framework for impact investment and measurement.

The 17 Sustainable Development Goals were adapted from the previous Millennium Development Goals (United Nations)

Because it has the greatest access to capital, impact investing has the greatest range of impact measurement tools (e.g., the free yet complex IRIS+ metrics set), intellectual frameworks (e.g., Impact Management Project’s five dimensions of impact), and practices (ongoing management of impact activities instead of only summative evaluation).

Yet Impact Management and Measurement (IMM) is still part of the capitalist project. It does not problematize, except perhaps in whispers, the fact that accumulation of wealth by clients is a fundamental cause of the very poverty and vulnerability those clients seek to remedy through their investments.

[A]ccumulation of wealth by clients is a fundamental cause of the very poverty and vulnerability those clients seek to remedy through their investments.

In 2020, even Harvard Business Review, the most prominent intellectual outlet of global capitalism, admitted that impact investing would not save capitalism. Even as the wealthy fund good works, the poor get poorer and systemic injustices remain.

Anand Giridharadas, author of the 2018 book Winners Take All, has become a kind of public intellectual on this topic. “On the one hand, there’s more activity by elites to make the world better than maybe ever before,” Giridharadas noted at a 2019 event. Unfortunately, “The numbers have not only not gotten better, they’re getting worse: 1% of Americans own 90% of the country’s wealth.” He compared impact investing to McDonald’s offering salads: a performative addition, rather than a true fix of a harmful system.

Philanthropy achieves a similar result. In his 2018 book Decolonizing Wealth, Edgar Villanueva writes that “[t]he basis of traditional philanthropy is to preserve wealth…” Because foundations are protected from taxation, he argues that wealth has been “twice stolen, once through the exploitation of natural resources and cheap labor, and the second time, through tax evasion.”

Continuum of impact investment and philanthropy (Rockefeller Philanthropy Advisors)

There are also many variations between impact investment and philanthropy (see diagram above), and each form of capital transfer has slightly different expectations and results. For example, traditional investing demands a financial return, whereas traditional philanthropy does not. Yet both depend on the wealth accretion that hyper-capitalism makes possible.

[M]easurement acts as an accomplice to capitalist face-saving by providing evidence and credibility.

In both impact investing and philanthropy, measurement acts as an accomplice to capitalist face-saving by providing evidence and credibility. The question impact measurement asks is: How well is this $20,000 grant from a foundation with $20 million in assets achieving its goal of increasing childhood literacy/protecting wetlands/ending homelessness?

The more important question is, should we even have a system where individuals are allowed to accrue massive wealth and give it out at a rate of 5% a year according to their own unaccountable preferences?

At the same time, many firms and professionals, foremost among them the Equitable Evaluation Initiative, led by Jara Dean-Coffey, are actively working to “conceptualize, implement, and utilize” evaluation “in a manner that promotes equity.” To achieve this shift, they believe it is necessary to work with nonprofits, foundations, and consultants to:

Acknowledge that evaluation reflects a paradigm that cloaks privilege and racism as objectivity.
Explore the ways in which current practices in foundations, nonprofits and among consultants can be barriers to the adoption of equitable evaluation.
Elevate evaluative thinking… to be a leadership competency and organizational capacity.
Move beyond methodological approaches and evaluator demographics to address culture and context.
Continue to diversify and expand the talent pool of evaluators.

2020s Onward: Ending Our Collusion

I am left to wonder under what circumstances Do Big Good should be doing impact measurement at all. I do believe that using participatory and iterative design principles to plan and adapt is critical to successful social projects. Yet I worry that impact measurement calcifies and reinforces the unjust systems to which it belongs.

How can accountability to results, which is critical to achieving social change, not reinforce inequities of power and wealth?

How can accountability to results, which is critical to achieving social change, not reinforce inequities of power and wealth? Here are five commitments we at Do Big Good are making to change the power dynamics of impact measurement in our own work.

1) We will not measure impact in ways that reinforce the power of the rich over the poor.

Traditional accountability reinforces existing inequities in power and wealth. We won’t participate in that anymore. (freepik)

Measurement is fundamentally about accountability, both to results and to people. The problem is that the traditional accountability of impact measurement has a downward trajectory. People with more power hold people with less power accountable.

Traditional accountability reinforces inequities in power and wealth. We won’t be doing it anymore.

This is the direction of accountability when grantees are responsible to funders or when social enterprises are accountable to investors. Traditional accountability reinforces inequities in power and wealth. We won’t be doing it anymore.

❎ Traditional Accountability: Responsibility of one with less power to one with more power (e.g., of a homeless shelter to its funder)
✅ Transformative Accountability: Responsibility of one with more power to one with less power (e.g., of a homeless shelter to a homeless person who is its client)

Instead we will support transformative accountability, helping organizations of all types more fully live up to their responsibilities to people with less power. This could mean:

Helping a foundation be more accountable to its grantees
Helping a nonprofit be more accountable to its beneficiaries
Helping an impact fund be more accountable to the firms in which it invests
Helping C-level executives be more accountable to their employees

2) We will measure impact in ways that flip power dynamics.

We will only co-design measurement systems that formalize the responsibilities of those with more power toward those with less power. (iconicbestiary)

Following this logic, we will only co-design measurement systems that formalize the responsibilities of those with more power toward those with less power. Reframing the examples above, we would:

Work with grantees to develop metrics by which their funder would hold itself accountable to grantees’ needs.
Work with beneficiaries to develop metrics by which a nonprofit would hold itself accountable to beneficiaries’ needs.
Work with social enterprises to develop metrics by which an impact fund would hold itself accountable to the needs of the firms in which it invests.
Work with employees to develop metrics by which executives hold themselves accountable to employee needs.

In short, we would work with those who have less power to develop metrics and evaluation systems that hold those with more power to account.

Flip the script on… impact measurement…. [C]reate new types of institutions.

If this feels different, that is because it is. It flips the script on the power dynamics of impact measurement. It creates new types of institutions.

3) We will measure impact using redesign principles.

*Redesign* is a participatory process to make existing systems more equitable. Going forward, we commit to doing participatory design in this way. (pawpixel.com)

Do Big Good has always measured impact using principles of participatory design, a creative process through which the people experiencing a problem are also the ones to develop the solution.

This has meant, for example, developing the Impact Cascade model with the civic tech volunteers of Code for America, evaluating election software with election officials, and exploring how to measure the impact of organizing with community organizers.

✅ Participatory Design: Creative process through which the people experiencing a problem are also the ones to develop the solution. (e.g., creating a new evaluation tool with the people being evaluated)

Now we are committing to going further. In 2016, Caroline Hill, Michelle Molitor, and Christine Ortiz coined the term redesign to describe a creative process through which people suffering under a faulty “solution” to a shared problem create a new solution that actually works for them. Going forward, we at Do Big Good commit to doing participatory design in this way too.

✅ Redesign: Creative process through which people suffering under a faulty “solution” to a shared problem create a new solution that actually works. (e.g., altering public safety systems from the perspective of BIPOC)

I’ve written an overview of redesign here. You can also read Hill, Molitor, and Ortiz’s piece here and take a free online course on the redesign approach, called equityXdesign, here. Equity-Centered Community Design (ECCD), a redesign approach developed by Antionette Carroll and Creative Reaction Lab, also offers critical methods and framing.

4) We will constantly reassess the validity of our work.

Through constant self-reflection we will hold ourselves accountable to using measurement to flip power structures… or we will stop doing that work. (freepik)

To ensure our commitment to the above principles, we will constantly reflect on whether our work in impact measurement is part of the problem or part of the solution.

My argument in this history is that too often (perhaps even most often) impact measurement has been part of the problem of white supremacy and extractive economics rather than part of the solution of dismantling and redesigning those systems.

[W]e will constantly reflect on whether our work in impact measurement is part of the problem or part of the solution.

No one in this field enters it or stays in it to perpetrate harm. Quite the opposite. Every single person I have met in impact measurement is passionately committed to making the world a better place. That is the reason they decided to do this work in the first place. But intentions are not enough. If our work legitimizes or reinforces harmful systems, we must alter our practice.

5) If we are unable to measure impact without colluding with harmful systems, we will leave this work.

If we are unable to work in a way that does not collude with harm, we will withdraw our labor from unjust systems by leaving them. (studiogstock)

Finally, if we are unable to work in a way that does not collude with harm, we will withdraw our labor from these unjust systems.

[A] lot of well-meaning white people with master’s degrees will need to analyze their complicity….

If we cannot do impact measurement in a way that actively dismantles and redesigns unjust power structures, we will stop measuring impact and find new work that is transformative. As the great teacher and facilitator Tamara Lynn once told me, “find where you are wanted and needed.”

If we cannot do impact measurement in a way that actively dismantles and redesigns unjust power structures, we will stop measuring impact….

I believe there is hope for impact measurement to realize its own goal of being a force for positive change. To get there, however, a lot of well-meaning white people with master’s degrees will need to analyze their complicity with white supremacy and capitalist economics, and change their practice. I count myself in this group and am committed to doing that work as I lead Do Big Good. I invite you to join me.

TL;DR

1) Since its conceptual emergence in the 1890s, impact measurement has colluded with white supremacy and extractive economics.

2) We who measure impact must end our collusion. Redesign offers a useful framework for transformational measurement.

3) If we are unable to measure impact in a way that does not collude with harm, we must withdraw our labor from this system.

Mer Joyce is the founder and principal of Do Big Good, a Seattle-based co-design studio. You can reach her at Mer AT DoBigGood DOT com.