Updated: Jun 8
In our first blog post in this series we walked through the characteristics of white supremacy, identified where they show up in our current evaluation practices and offered up alternatives, and outlined how we could seek to align our evaluation practices with anti-racist approaches. Since then, we’ve had many more conversations on our team, presented to peer evaluator networks on the topic, designed a workshop on white supremacy in evaluation, and intentionally evolved the language we use to describe how settler colonial perspectives show up in our work.
As promised, we’re working on a series of blog posts that dive deeper into how our team is thinking about and acting on equity in evaluation. Here is Part 2: Holistic Rigor.
In a way, I felt like I finally found words for what I had been sensing when I read the 2016 American Statistical Association warning about the misuse of p-values and statistical significance. Thanks to Michael Quinn Patton for first bringing this to my attention.
Over 800 statisticians signed on to encourage practitioners to stop using the term "statistically significant" because it misleads people into believing there is more certainty in their findings than there actually is, and because it does not reflect the world's complexity. Instead, they put forth a set of principles for sound statistical analysis.
Let’s sit with that for a minute.
In April, when COVID-19 was new and just as challenging as it is today, our team shared our ideas for the role of evaluation in times of crisis. The principle of “good enough” resonated strongly in our subsequent conversations with philanthropic and nonprofit partners.
There was an openness to rethinking the assumptions of our evaluation field. Nonprofit leaders made it clear that they needed rapid feedback and information to pivot in real time, as opposed to years-long research that could be published in a peer-reviewed journal. They needed data collection processes and analyses that weren’t overly focused on perfection.
This feedback about “good enough” led our team to reconsider usefulness to the client and to push our definitions of rigor. This thinking was reinforced and expanded by a workshop we participated in with Dr. Mindelyn Anderson of Mirror Group, who emphasized that statistical soundness is only one component of rigor. Community-centered evaluation practices are another critical component of an authentically rigorous approach.
At EC, we’ve settled on an idea of rigor that balances different ways of knowing; we appreciate other colleagues for pushing our thinking here. As a team of many social workers, we spend a lot of time advocating for the inclusion of local community expertise in evaluation. We also believe in science. We’re of the mind that we need them both.
The field of professional evaluation has traditionally valued academic knowledge or the evidence base over community knowledge, so we seek to achieve more balance by promoting more participatory forms of evaluation. Ultimately we’re looking to honor the multiple ways of knowing and constellation of perspectives embedded throughout our evaluations. We use some of these phrases to help guide our definition of rigor:
It is only rigorous if it is useful to our partners
It is only rigorous if it builds partners’ capacity
It is only rigorous if it is participatory
It is only rigorous if it is equitable
How should adopting this expanded definition of rigor shift our practices for the long term? Who inspires your understanding of rigor? What else should our team at Emergence Collective be thinking of as evaluators around rigor?