Effects of Good/Bad Data Ethics
Take a look at this comic by Mona Chalabi.
Now take a look at this presentation of a CNN news poll.
Chalabi’s comic and CNN’s exit poll table both collapse race into a small set of categories. Doing so protects the identities of participants, but it can also fail to represent, or even misrepresent, the people the data is about. Sharing data isn’t just about final datasets such as spreadsheets or transcripts; what you share also includes the visualizations you use to communicate the results.
If either of these examples were the final data you shared, what would collapsing these categories mean when weighed against the ethical principles of justice and beneficence and the risk of disclosure?
How might removing selected information from a dataset, including stripping out identifiers, distort it so that it no longer represents what it was intended to represent?
Listen to about 12 minutes (28:28–40:18) of this podcast from Data & Society, which gives voice to Indigenous populations.
Through these stories about individuals, the speaker notes that sometimes to be named is to be counted and acknowledged, and that naming can enable a group to seek justice. Let populations speak for themselves in their own voices rather than boiling their qualitative individuality down into a data point. Talk with their communities to learn how best to uphold their privacy and how best to reflect their interests and priorities.
Here are some resources that provide guidelines:
- CARE Principles for Indigenous Data Governance
- Operationalizing the CARE and FAIR Principles for Indigenous data futures (2021 article)
- The San Code of Research Ethics (developed in 2017 by an Indigenous group in Cape Town, South Africa)
- NIH policy for Responsible Management and Sharing of American Indian/Alaska Native Participant Data (2022)
The Equitable Open Data Report (2017), by the Detroit Digital Justice Coalition and the Detroit Community Technology Project, grew out of direct feedback from Detroit residents about the benefits and harms they saw in Detroit’s Open Data Portal. That feedback was consolidated into recommendations to support conversations and inform policy provisions related to the city’s collection, dissemination, and use of open data.
As identified by Tarrant et al. (2020), you can assess the risk of disclosure with the following questions (a simple scoring sketch follows the list):
- What is the probability of an attacker attempting to re-identify an individual?
- What is the probability of an attacker succeeding in re-identifying an individual?
- What are the consequences to the individual if they are re-identified?
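One way to make these questions concrete is to score each one and combine the scores. The Python sketch below is only an illustration of that idea, not a method from the ODI report; the scales, weights, and thresholds are assumptions you would need to adapt to your own data and community context.

```python
# Illustrative disclosure-risk scoring sketch (not from the ODI report).
# Scales, labels, and thresholds are assumptions for demonstration only.

def disclosure_risk(p_attempt: float, p_success: float, consequence: int) -> dict:
    """Combine the three risk-of-disclosure questions into a rough rating.

    p_attempt   -- estimated probability (0-1) that anyone tries to re-identify
    p_success   -- estimated probability (0-1) that an attempt would succeed
    consequence -- harm to the individual if re-identified, scored 1 (minor) to 5 (severe)
    """
    likelihood = p_attempt * p_success   # chance a re-identification actually happens
    score = likelihood * consequence     # weight that likelihood by severity of harm
    if score < 0.5:
        rating = "lower risk"
    elif score < 2.0:
        rating = "moderate risk - consider further de-identification"
    else:
        rating = "higher risk - do not share in this form"
    return {"likelihood": likelihood, "score": score, "rating": rating}

# Example: a small, detailed dataset about a marginalized community
print(disclosure_risk(p_attempt=0.6, p_success=0.5, consequence=4))
# {'likelihood': 0.3, 'score': 1.2, 'rating': 'moderate risk - consider further de-identification'}
```

However you score it, the numbers are only a prompt for judgment; the conversation with the affected community still matters more than the arithmetic.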
An impact model (Markham 2020) can be a helpful assessment tool for breaking down the ethical considerations of your dissemination practices. Think about each of the following impact areas with regard to lower- to higher-level granularity and shorter- to longer-term impact (a small organizing sketch follows the list):
- Immediate treatment of people
- Side effects resulting from research data
- Use of data after or beyond initial analysis
- Long term forecasting of data use
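If it helps to keep notes against this model in a structured form, one minimal sketch is shown below; the field names, scales, and example entries are assumptions for illustration, not part of Markham’s model itself.

```python
# Minimal sketch for organizing notes against the four impact areas above.
# Field names, scales, and example entries are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class ImpactNote:
    area: str         # one of the four impact areas listed above
    granularity: str  # "lower" to "higher" level of granularity
    horizon: str      # "shorter" to "longer" term of impact
    notes: str        # your assessment for this dissemination practice

assessment = [
    ImpactNote("Immediate treatment of people", "higher", "shorter",
               "Consent language covers sharing of de-identified transcripts."),
    ImpactNote("Use of data after or beyond initial analysis", "lower", "longer",
               "Repository license permits reuse; community review requested first."),
]

for note in assessment:
    print(f"{note.area} ({note.granularity} granularity, {note.horizon} term): {note.notes}")
```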
Sources:
- Markham, A. (2020). An “Impact Model” for ethical assessment. IRE 3.0 Companion 6.4 (pp. 76–77). Association of Internet Researchers. https://aoir.org/reports/ethics3.pdf
- Tarrant, D., Thereaux, O., & Mezeklieva, V. (2020, June). Anonymising data in times of crisis [Report]. Open Data Institute (ODI). https://theodi.org/article/anonymising-data-in-times-of-crisis/