The Red Hen Anonymizer and the Case Protocol for De-identifying Audiovisual Recordings

Speakers

Mr. Yash Khasbage, Indian Institute of Technology–Hyderabad, and Prof. Mark Turner, Case Western Reserve University

Abstract

Advances in science often depend on wide ranges of data. If machine learning is the rocket, data are the fuel. In genomics, astrophysics, materials science, energy, and many other fields, science is improving because of new methods for amassing, wrangling, and sharing data. But researchers who study higher-order human cognition are often prevented from sharing various kinds of data because of concerns about privacy. Researchers have boilerplate systems for cybersecurity (systems like Box, guides to DFARS compliance, etc.) so that the original data are shared only with those who are authorized to view them. Researchers also have boilerplate expectations for de-identifying text, such as eliminating columns of information from the CSV files that result from computer-mediated behavioral experiments or surveys. But we have no established policy for sharing de-identified audiovisual recordings. New technology now lets researchers de-identify both voice and appearance and produce JSON output that gives body-pose, face, and hand keypoints in numerical form, suitable for computer search, machine learning, etc. The Red Hen Anonymizer is a new tool for de-identification. We introduce it and present its features. We also introduce the guidelines that the Case Western Reserve University Institutional Review Board (IRB) has established for its use. Those guidelines are referred to as “the Case Protocol for Audiovisual Recordings.”