Weaponized data: How the obsession with data has been hurting marginalized communities


Hi everyone, I just came back from giving a keynote speech in Vancouver, Canada, complete with pictures of baby animals. I am condensing the key concepts here. A couple of notes before we tackle today’s exciting topic. First, I want to thank my awesome colleague Dr. Jondou Chen for introducing me to the term “weaponized data.” If I ever start up an alternative rock band, I am going to invite Jondou, and we’ll call it “Weaponized Data.” Sample lyrics: “From the start/you returned begrudging correlation/to my foolish causation/like an icepick to my heart.”

Second, for the grammar geeks out there—and I am one—I’m going to do something blasphemous and use “data” as both a singular and a plural noun in this post, depending on context. I know, I know, technically “data” is the plural for “datum,” so we should be saying, “The data are inconclusive” and not “The data is inconclusive.” Kind of like “media” is the plural of “medium” and “panda” is the plural of “pandum.” But, fellow grammar geeks, we must choose our battles. Let us save our energy to fight, with patience and compassion, crimes against decency like “that time works for John and I” and “you were literally on fire during your presentation.”

So, data. Data is pretty awesome. As a proud nerd, I love a good set of data and can spend endless hours looking at a sexy chart full of numbers. If data were turned into a syrup, I would put it on my soy ice cream all the time, because it is just so sweet. In the past few years, there has been more and more pressure on nonprofits to produce good data. Getting more and better information on practices and outcomes can only be good for our sector.

However, like fire or Jager Bombs, data can be used for good or for evil. When poorly thought out and executed, data can be used as a weapon to screw over many communities. Usually this is unintentional, but I’ve seen way too many instances of good intentions gone horribly awry where data is concerned. Here are a few challenges we need to pay attention to regarding the game of data, which is a lot like The Game of Thrones, but with way less frontal nudity:

General challenges with data

The illusion of objectivity: Data is supposed to be objective; however, humans are subjective, and they collect and interpret data; therefore, there’s no such thing as objective data. Considering the wealth of data on climate change and the effectiveness of immunization, why do we still have deniers of global warming, and people who still refuse to vaccinate their kids? People, for better or usually worse, will find data that support their established positions, and ignore everything else. And those who create methods and instruments for data collection (such as standardized tests of student performance) are also affected by their own backgrounds and biases, often leading to flawed data that are then used to make decisions.

The delusion of validity: Since many nonprofits just don’t have the resources to gather robust, scientifically accurate data, and yet all of us are forced to gather it somehow, a lot of the information we gather is not really useable. We’ll say things like “95% of the students in our program increase their English proficiency by 25% or more, based on standardized pre/post-tests.” That sounds great, and we’ll throw that into a grant proposal, but deep down, many of us realize that’s BS, due to a host of confounding variables and biases (such as, for instance, the Selection Bias: maybe kids who are generally more motivated joined our program, and they would have done well regardless of whether they were in our program or not).
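For the fellow nerds, here is a minimal sketch of how the Selection Bias can manufacture a flattering number. Everything in it is made up for illustration: the motivation score, the 20-point gain scale, and the noise are all assumptions, not data from any real program. The simulated program has zero effect by construction, and the kids who joined still come out ahead, simply because the more motivated kids were more likely to sign up.

```python
import random

random.seed(42)  # fixed seed so the illustration is repeatable


def simulate_student():
    """Return (joined_program, test_score_gain) for one hypothetical student."""
    motivation = random.random()              # 0 = low, 1 = high (assumed scale)
    joined = random.random() < motivation     # motivated kids join more often
    # The gain depends ONLY on motivation plus noise; the program adds nothing.
    gain = 20 * motivation + random.gauss(0, 5)
    return joined, gain


def mean(values):
    return sum(values) / len(values)


students = [simulate_student() for _ in range(10_000)]
joined_gains = [gain for joined, gain in students if joined]
other_gains = [gain for joined, gain in students if not joined]

avg_joined = mean(joined_gains)
avg_other = mean(other_gains)

if __name__ == "__main__":
    print(f"Average gain, kids in the program:     {avg_joined:.1f}")
    print(f"Average gain, kids not in the program: {avg_other:.1f}")
```

A pre/post-test comparison of these two groups would credit the program with the whole gap, even though the code never gives the program any effect at all.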

The assumption of generalizability: People are so varied, and yet we have a tendency to assume that findings in studies can be generalized to everyone, and we make decisions based on those assumptions. Just because a study finds that looking at pictures of baby animals increases productivity, it does not mean this applies to everyone. They tested on one particular group, so it may be true for that group, but not necessarily true for other groups. So many studies do not focus on kids of color, for example, and yet assume that the results would generalize to them, and then decisions are made that affect these kids.

The dangers of simplification: The field that we are in is complicated, and there is severe danger when we try to simplify things too much. We lose out on the richness of our work, and we jeopardize programs that are effective in ways we may not be thinking about. For example, a kid drops out of an after-school program midway through the year. The program, of course, cannot count that kid toward meeting an outcome. However, who is to say what effects the program did have on this student? Maybe without even his short participation in the program, he would have engaged in some terrible stuff, such as dealing drugs or getting addicted to “True Blood” or something. How do we measure things like that?

The focus on the technical versus the adaptive: Data usually just reveal short periods in history, as longitudinal studies are time consuming and expensive. The risk of that is that sometimes we fail to see whole systems and ecosystems and how different elements affect one another. Solutions based on these data, then, may tend to focus on short-term gains vs. systems change. For example, data may say that doubling down on math will increase math scores among low-income students, which leads to an increase in math time in programs, sometimes at the expense of art, sports, etc. This does not take into account things like poverty, language barriers, the importance of “soft” skills, and other factors, or the fact that arts and sports may in the long run increase motivation and thus overall academic achievement.

The focus on “accountability” as a way to place blame. As I talked about in “Why we should rethink Accountability as an organizational and societal value,” accountability has been about placing blame. It is an extrinsic motivation, and it has caused a lot of harm. An example is in the field of public education, where kids are forced to take endless standardized tests in the name of accountability and data. They are losing time that should be spent actually learning, and because of inequitable resources, the data are not accurate. Low-income kids, for example, often don’t have access to computers at home; kindergarteners may never have touched a computer mouse before, and yet they’re taking tests on computers. Here’s John Oliver’s hilarious and also depressing take on the subject. Kids are so traumatized by some tests that they are throwing up during them, leading some test makers to standardize procedures for how to handle booklets that have been thrown up on.

Weaponized Data

A very serious danger regarding data is when it is deployed without consideration for cultural and other competencies. Poorly thought-out data, unfortunately, is rampant in our sector. Again, I don’t think anyone has bad intentions when using data, but that does not prevent data from being used to cause harm. This has happened numerous times in history, such as “scientific data” being used to perpetuate things like phrenology, eugenics, and Apartheid. Here are ways that I’ve seen data being weaponized in our sector:

Data is used to hoard resources, perpetuating Trickle-Down Community Engagement: If an organization does not have resources to collect data, then it does not have the data to collect resources. I call this the Data-Resource Paradox, and it mirrors the Capacity Paradox, where a smaller organization cannot get resources because it does not have capacity, so it cannot build capacity to get resources. Unfortunately, marginalized communities (communities of color, disabled communities, rural, LGBTQ, etc.) are left in the dust because they simply cannot compete with more established organizations to gather and deploy data. I have seen this repeatedly, recently with a City levy grant that forces nonprofits to have not just one, but two years of strong data before they’re even eligible to apply for funds that are, ironically, supposed to be going to low-income communities. Larger, more mainstream, and usually well-meaning, organizations get the resources, and unfortunately many do not have connections to the diverse communities targeted, so they “trickle-down” some of the funding to smaller organizations, perpetuating a vicious cycle. (Please see “Are you or your org guilty of Trickle-Down Community Engagement?”)

Data is used as a gatekeeping strategy: Similarly, I’ve seen data being used to prevent strategies from being deployed. In Seattle, for example, we have been talking about this education and opportunity gap for ages. It seems all these data-backed strategies have not been working to close this achievement gap for the past three decades. Yet, when community leaders advocate for trying something new, the response is often, “Yeah, that sounds good, but where’s the data proving that will work?” When I pushed for closer collaborations between schools and nonprofits, for example, with funding going equitably to nonprofits to do family engagement, another committee member smirked and said, “Show me the studies that prove funding nonprofit partners directly will lead to results in school.” Dude, what I’m proposing MAY not work, but what you have been supporting HAS not worked. Yet, because he was able to pull up studies faster than I could, he was able to sway the rest of the group.

Imperfect data is used as a convenient way to make tough decisions. For example, a local university decided they would discontinue a staff position focused on recruiting Asian students, stating that the data shows that there are enough Asian students, so there’s no need to focus on recruiting them. However, when the data is disaggregated, it shows Southeast Asian kids were underrepresented. The data, flawed as it was, was a convenient way for the school to make a decision and cut down on costs. (Luckily, the community banded together and pushed back, using the disaggregated data, and the position has been reinstated to focus specifically on Southeast Asian students).

Data is used to pathologize whole communities: As Dr. Jondou puts it: “There is a dangerous pattern of behavior that emerges for researchers and consumers of research looking at data comparing groups. The first time we look at the data, we see a difference between groups. The next we look, we see a problem. Then we look again and we assign responsibility. Finally, we pathologize entire groups of people.” Look at this study published in the Washington Post: “Researchers visited the children’s homes twice: when they were nine months old and again when they were 2 to 3 years old. On the first visit, the researchers assessed babies’ ability to manipulate simple objects, such as a rattle, and use and comprehend words; on the second visit, they assessed the toddlers’ memory, vocabulary and basic problem-solving skills.” From this, they concluded: “The research suggests that prekindergarten may be too late to start trying to close persistent academic achievement gaps between Latino and white students.”

Rochelle Gutierrez of the University of Illinois at Urbana-Champaign talks about this “Gap-Gazing” as a fetish many researchers have, and a harmful one that offers “little more than a static picture of inequity, supporting deficit thinking and negative narratives about students of color and working-class students […] and promoting a narrow definition of learning and equity” and proposes “a new focus for research on advancement (excellence and gains) and interventions for specific groups.” Why exactly are White students the gold standard for all kids to aspire to, when they all have different strengths and needs?

Really, testing kids at 9 months old and then when they are 2 or 3 and concluding that one group is definitely deficient and that it’s hopeless for them even before kindergarten? No one disputes the fact that there are differences between different groups of kids. But the conclusions place the blame and responsibility on 2-year-olds and their parents instead of looking holistically at all the systems that affect families, including poverty, education funding, curriculum, different learning styles across different cultures, etc.

De-Weaponizing Data

The emphasis on data has been both good and bad. When used right, data, like fire, can be used to warm and illuminate. When used wrong, it can burn whole communities:

Consider contexts and who is driving the data: The problem of people who are not from the affected communities making decisions for those who are is very prevalent in our field, and the work around data is no exception. Who created the data? Was the right mix of people involved? Who interpreted the data? The rallying cry among marginalized communities is “Stop talking about us without us,” and this applies to data collection and interpretation.

Pay for data, especially data created by communities most affected by inequity: Getting good data takes resources. Expecting nonprofits to produce high-quality data on a small budget is futile and distracting. Nonprofits must pay for data, and funders need to support it and take some risks, especially with smaller nonprofits that may not yet have the track record on data collection. If you require data, pay for it. (Same goes if you require financial audits).

Pay for more than just data: Thanks to the push for data, we’ve been seeing lots and lots of shiny data reports. Unfortunately, few people in the field have time to read these reports, much less put the information to actual use. As I mentioned in “Capacity 9.0: Fund people to do stuff and get out of their way,” reports and toolkits and data-dissemination summits are useless unless the people in the field are there to put them to use. Fund people.

Disaggregate data: As discussed above, it is easy to lump myriad different communities into fewer categories, but the loss in accuracy is not only frustrating, it is extremely damaging and inequitable. Where you can, be thoughtful about the categories you use to organize groups of people. It makes a difference.
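To make the point about disaggregation concrete, here is a small sketch in the spirit of the university example earlier in this post. The numbers are entirely hypothetical (I made them up; they are not the university's figures), and the two category labels are just an assumed way of splitting the group. The lumped-together rate looks perfectly healthy, while the split-out rate tells a very different story.

```python
# Hypothetical figures, invented purely for illustration.
# "enrolled" = students enrolled; "population" = college-age residents nearby.
subgroups = {
    "Southeast Asian": {"enrolled": 100, "population": 5_000},
    "All other Asian subgroups": {"enrolled": 1_400, "population": 10_000},
}


def aggregate_rate(groups):
    """Enrollment rate when every subgroup is lumped into one category."""
    total_enrolled = sum(g["enrolled"] for g in groups.values())
    total_population = sum(g["population"] for g in groups.values())
    return total_enrolled / total_population


def disaggregated_rates(groups):
    """Enrollment rate computed separately for each subgroup."""
    return {name: g["enrolled"] / g["population"] for name, g in groups.items()}


overall = aggregate_rate(subgroups)
by_group = disaggregated_rates(subgroups)

if __name__ == "__main__":
    print(f"Lumped together: {overall:.1%}")
    for name, rate in by_group.items():
        print(f"{name}: {rate:.1%}")
```

With these made-up numbers, the single combined figure would suggest there is nothing to fix, and only the per-group breakdown shows which community is being left out.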

Combine data with community engagement: Data alone, considering how flawed it is, is not enough to motivate communities affected by inequity. I’ve seen whole efforts skip community engagement efforts on the belief that the data is strong enough to convince everyone to get on a bandwagon. Those efforts usually fail.

Redefine what constitutes good data: Oftentimes, it is not that marginalized communities don’t have the data, it is just that the data they do have does not conform to this mainstream definition of what data is or how it should be presented. So a study with t-tests and Pearson r’s and stuff is considered “good” data, but testimonials from dozens of people directly affected by issues is considered less desirable qualitative data? The definition of data, as well as of “capacity” and “readiness” and other concepts, has been perpetuating inequity and needs to be changed.

Re-examine comparison groups: Are they really necessary? Why is one group held up as the standard for all other groups? Instead, focus on individual groups’ intrinsic strengths and challenges and growth. Comparisons may be needed, but it’s important to be thoughtful about them to avoid oversimplifying things and falling into deficit mindsets.

Let me know your thoughts. I have to run. I’m inspired to write more lyrics for this band I might form.



  • Devra Thomas

    I just want to hug you for this one, Vu. Especially the point about “what I’m proposing may not work but what you’re funding does not work.”

    How much “data” could we glean from using Lean and Agile principles in our work? That is, having a minimum viable product/service, getting it into the hands of actual customers/users, and then using their real feedback to make the next iteration? Sure, one still has to find the start-up funds for a small MVP, but then when it’s time to scale, we’re approaching funders with a “yes, we know this will work without a huge study” concept.

    • verucaamish

      Why is it that the nonprofit sector seems to idolize tech but then can commit to one of the core principles of tech – fail early and often?

      • Devra Thomas

        I’m not sure I follow your statement. Do you mean nonprofits don’t do that enough? If so, I agree, and it’s possibly because of either a perceived backlash of retracted funding due to “failure” or because of a perception that the inherent stakes of most nonprofits are too large to risk failing on even a small project.

        • verucaamish

          Sorry if it’s badly worded. I do agree that the culture of “best practices” discourages that kind of “failure.” I agree with you it’s fear that if it didn’t work the stakes are too high on projects, both on outcomes and lost funding. I would LOVE to share with funders everything I learned through failed project but so much of my reporting ends up with spin to show it worked and we need more money.

  • Mat Despard

    Great post Vu. The “what works” movement to fund – you guessed it – what works as in evidence-based practice has the potential to increase resource disparities between large and small nonprofits. Insistence that programs have evidence to support their effectiveness runs directly counter to the process of innovation. Focusing too much on outcomes runs counter to design thinking.

    What about qualitative data? This is data too, right? Turn to qualitative methods when you need to attempt to capture complexity.

    Like Vu, I’m a huge fan of data. But data (both qual and quant) is merely an input for organizational learning and innovation.

  • http://www.lorch.ca Rhonda Lorch

    Another great post – thanks Vu. My concern is that in the quest for numbers we have forgotten about human input. To my mind, the human stories need to be given weight in decision making. The numbers don’t capture the subtleties that can describe successful outcomes. And I love the distinction between trying something that may not work vs continuing to do something that isn’t working – not much to lose is there?

  • Beth O’Connor

    Dude – it’s Jager Bombs, not Yager Bombs

    • http://nonprofitiwithballs.com/ Vu Le

      Note to self: Spell-check after midnight…(now corrected in article)

  • Stacy Ashton

    What? You were in Vancouver? Which conference? Sorry I missed it 🙁

  • Denise Fosse

    I love this post! I remember being at a community meeting and hearing an older person of color say, “I guess we could be studied to death….because that’s exactly what’s happening”. Meaningful study is important, but I think more important is to realize that there is no ONE solution to any problem. Because in the Western world we try to “scale” everything for so-called efficiency, we lose creativity and simply impose on those we are purportedly trying to help. Let’s make all of our systems more flexible and accessible–cut down on rules and let’s not forget that money can solve major problems if it is deployed in the right way.

  • Patrick Taylor

    Totally agree.

    There is a worrying trend in philanthropy to focus on metrics and data. While this can be used as a tool for good, just as often it is used as a way of seeming like you are doing something by giving someone else more work to do and/or justifying your existence by forcing grantees to track a lot of data points (“Hey, can you track these 100 data points so I can prove to my board how smart I was to give you a grant?”)

    There was an article in the Economist recently about how the rise of impact investing has led philanthropy and their charity partners to focus on short-term results and low-hanging fruit rather than longer-term, more intractable problems and solutions. Data can be incredibly helpful in helping understand the larger picture, and in seeing patterns that may not be obvious at the ground level. However, numbers are not going to magically solve our problems. We only have to look at the recent stock market meltdown to understand how brilliant people focused only on the numbers can make massive mistakes that have huge consequences.

  • http://sheenatabraham.wordpress.com Sheena

    Thank you for talking about using data to pathologize whole communities. It happens far too often.

  • Nicole

    I’ve found it hard to respond to applications that ask whether your programming reaches a certain % of some demographic. If your program helps any % of those who are underserved, or living below the poverty line, or who have a disability – isn’t that enough for consideration? How do you respond to that without compromising the integrity of what you do? Furthermore, I so agree with the need for capacity to carry out tasks that help with data-gathering. It is hard to do well and it takes so much time! Oftentimes, our staff only has time for data collection as an afterthought. Any suggestions on working through this?

  • Jessica Lynn

    This definitely makes me rethink how my nonprofit looks at data. Also, baby burrowing owl.

  • Paul

    Excellent post! Have you read Dr. Barry Kibel’s book, “Success Stories as Hard Data: An Introduction to Results Mapping” 1999, Kluwer Academic / Plenum Publishers, NY? Heard this fellow speak–talk about an intellect!–he was speaking using power point–and adjusting the slides as he spoke and gained new insights into his work–without hesitation or interrupting his talk!! Whoa–talk about multi-tasking on steroids. He had a formula that assigned points to a worker’s action. Say, 1 point for handing out leaflets. 2 points for tabling. 6 points for putting on a forum, etc. Haven’t kept track on what ever happened to this idea of Results Mapping. I am thinking that given time, teachers could develop a rubric for evaluating portfolios that may satisfy the feds and NCLB??? Nah, that would make sense—therefore, never happen. DRAT!!

  • jimlabbe

    Incisive piece.

    Here are some tools I have found useful for community-based research and data collection:

    Power to our People
    Participatory Research Kit: Creating Surveys

    Finding Nature in Your Neighborhood
    A Field Mapping Protocol for Community
    Based Assessment of Greenspace Access


  • Les Kutas

    Any data collector should pay for data they gather and the payment should go to the person on whom it is being collected. Individuals and aggregations should “own” their own data and should be compensated. Free email, free Facebook accounts etc. should not count as compensation. Medical data should be yours and not the doctors. This change would provide a (low level) basic income as proposed in the current Swiss referendum.

  • Sam

    I recently saw this article about the effects of poverty on the brain and thought back to your blog post here. http://www.newyorker.com/tech/elements/what-poverty-does-to-the-young-brain

  • CKS

    In addition to the challenges of weighing and interpreting the data that comes with the proliferation of “big data,” I’d add to your weaponization concerns that this can just be yet another way for the majority to suppress minority complaints. Assuming it is easy to collect data among the well-resourced, data-rich majority population, then it is also easy to justify something that is working well for them. Never mind that it might be perpetuating terrific inequities elsewhere: majority rules! People often forget that democracy, which data should be in the service of, is not strictly about majority rule, but rather is about striking a balance between the needs of the many and the needs of the few. Big data, as you’ve rightly pointed out, is most often to be found in the service of the many… or rather, those with many dollars.

    While we all laud evidence-based policy decisions, it’s best to recall who has the most access to the evidence. Great post!