Learning Data Ethics for Open Data Sharing

What Goes Into a Data Repository Record?

Let’s look at a data repository record together. Then you can look at a record yourself!



What are some first things you notice?

You may have noticed things at the top first: the title, the date the latest version of the record was published, who the researchers are, and that this data record has a DOI. A DOI in particular adds permanence and findability to the data.

Next, you may have seen the actions this record offers: download, analyze online, and access restricted data. Access restricted data? This should clue you in that you may not be able to get to the data easily. The notes box on the right side, highlighted in red, should also alert you that there will be some files you can’t access immediately.

Are there any files you can access, you wonder? To find out, go to the Data & Documentation tab. Here you will find documentation files about the data, which can help you evaluate whether this data would be useful for you to reuse.


Another way to learn about this dataset is to go back to the At a Glance tab and look through the available metadata. You will see a robust summary, which includes the purpose of the data project, the major topics covered, and the research questions the data attempted to answer. This particular summary gives some details about the locational scope, sample, date range, and variables involved. It appears that several types of data are included: demographic characteristic variables, variables about actions taken, exam characteristic variables, injury characteristic variables (which are likely grouped by a controlled vocabulary of anatomical categories), suspect characteristic variables, and legal resolution variables. Scanning through this modest but still rich amount of detail in the summary goes a long way toward telling you what to expect from this dataset, without having to download a file at all.


Besides the brief summary of the data, you can see more helpful details, such as geographic coverage keywords and the date range in which the data was actually collected. (Note that the date of data collection is different from both the publication date of the repository record and the last updated version date!) You can also see methodology details such as the time method of data collection (cross-sectional, i.e., a single point in time) and the data type (administrative and medical records, clinical data).

Something special about ICPSR as a data repository is that it also links to various publications (not just those by the authors) that have used the data. You could access those publications to see whether there are any additional uses, methods, or other data used in combination worth noting.

All this is to say that this amount of detail, from the open metadata and the open documentation files, may lead you to decide you would like to proceed with requesting restricted access. Now to look into what hoops you would have to jump through to request this data…

Now you try it!

Select a data repository record from this list of restricted data records on ICPSR, or if you want a challenge, from this list of restricted data records on OpenICPSR (you’ll notice these repository records are not as good, because researchers self-submit them rather than getting curatorial staff support), and think about the questions below.

What are you able to access, and what are you not able to access?

What do you learn about the dataset? Can you at least answer these 5 things below:

What’s good about this repository record? (See if you can connect it to FAIR principles and ethical principles.)

Can you tell why this dataset has been restricted? (For example, can you tell what sort of sensitive information is likely in here?)
Why might this data be restricted rather than closed completely?

Is there anything about the metadata, or the documentation, that you could think of that would make the data be more FAIR, or more ethical?
