The Intelligent Sight and Sound Project dataset is the first multimodal archive of insights developed in the United States for the prediction of chronic cancer pain. Through this work, we’re helping NIH pioneer research to accelerate AI-based pain models that shed light on diverse, real-life clinical data about patients’ lived experiences of illness, diagnosis, and treatment.
In creating this solution with NIH, our team built seven baseline machine-learning models using conventional and fusion neural networks. The best-performing multimodal tool for chronic pain detection fuses multiple signals from different data types instead of relying on facial images alone. To avoid disruptions in data gathering due to the COVID-19 pandemic, our team created a smartphone app that enabled NIH to continue to collect patient data. Patients submitted their medical narratives through self-recorded videos, from which we extracted multiple signals.
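To make the idea of fusing signals concrete, here is a minimal sketch of late fusion, where each modality produces its own pain probability and the scores are combined by a weighted average. This is a deliberately simpler stand-in for the fusion neural networks mentioned above, not the project's actual models. The function name `late_fusion`, the modality names, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Dict


def late_fusion(probs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-modality pain probabilities into one weighted-average score.

    Only modalities that have both a probability and a weight are used, so a
    clip with a missing signal (for example, no usable audio) still gets a score.
    """
    shared = [m for m in probs if m in weights]
    if not shared:
        raise ValueError("no modality has both a probability and a weight")
    total_weight = sum(weights[m] for m in shared)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(probs[m] * weights[m] for m in shared) / total_weight


# Hypothetical per-modality outputs for one video clip (illustrative values only).
clip_probs = {"face": 0.62, "audio": 0.80, "text": 0.70}
# Hypothetical weights; in practice these would be tuned on validation data.
clip_weights = {"face": 1.0, "audio": 1.0, "text": 2.0}

fused_score = late_fusion(clip_probs, clip_weights)
print(f"fused pain score: {fused_score:.3f}")
```

A fusion neural network would typically learn how to combine the signals rather than use fixed weights, but the sketch shows why a multimodal score can stay usable when one signal, such as the facial image, is weak or missing.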
Today our team continues to train new models and update existing ones as patients enter the study, with our work extending to the integration of text and thermal imagery captured in the clinic. The data now includes more than 500 smartphone videos and nearly 200,000 video frames, and the amount of data will continue to grow over time. After NIH recruits all patients for the trial and we update models for the entire set of subjects, the models will run in dynamic fashion using full-motion video.
NIH has implemented controls to govern the release of the dataset to safeguard personally identifiable information within facial images. As the nation’s largest repository of cancer pain information, the dataset will be open to AI researchers on a case-by-case basis for specific medical AI research projects. In the future, our research team will investigate the possibility of using multispectral generation of facial pain videos as well as creating a new way to secure sensitive medical data through the application of federated learning to train models.
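As a rough illustration of why federated learning is attractive for sensitive medical data, the sketch below shows the aggregation step of federated averaging: each site trains locally and shares only model parameters, which a coordinator combines in proportion to each site's sample count. This is a generic textbook technique under stated assumptions, not a description of what the research team will build. The function name `federated_average` and the example values are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    site_params: Sequence[Sequence[float]],
    site_sizes: Sequence[int],
) -> List[float]:
    """Average parameter vectors from several sites, weighted by sample count.

    Raw patient data never leaves a site; only the parameter vectors are shared.
    """
    if not site_params or len(site_params) != len(site_sizes):
        raise ValueError("need one sample count per site")
    dim = len(site_params[0])
    if any(len(p) != dim for p in site_params):
        raise ValueError("all parameter vectors must have the same length")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    return [
        sum(p[i] * n for p, n in zip(site_params, site_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical sites with locally trained two-parameter models.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)
```

In a full system this step would repeat over many rounds, with the averaged parameters sent back to each site for further local training.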