Healthcare-Based Multimodal Recovery Prediction for the Servicemember
![Graphic showing a military servicemember on the left and a full hospital scenario on the right.](/sites/default/files/styles/ifde_wysiwyg_full/public/project/image/2024-07/Multimodal%20recovery%20lead%20image.png?itok=5KnOjw7J)
Servicemembers engaging in mission-critical activities may become injured or exposed to hazardous chemical, biological, radiological, nuclear, or explosive agents throughout their service. To keep these servicemembers safe and ensure mission success, armed forces medics and unit commanders need to assess an individual’s post-symptomatic health status to decide when they may return to the field.
A team at Lincoln Laboratory is developing an explainable machine learning model that uses multimodal data of both real and synthetic origin, such as electronic health records (EHRs), clinical notes, genetic data, biomarkers, and images, to accurately predict a servicemember’s personalized, post-symptom recovery following an injury or surgery. The model will allow for monitoring of specific health metrics and is envisioned as a tool that helps unit commanders and armed forces medics make objective decisions regarding the health of their servicemembers. Additionally, the tool could be adapted for civilian use to predict recovery timelines and hospital discharge.
Medical recovery is a complex process that depends on the nature of the trauma or sickness as well as an individual’s medical history, state of health, and genetics. Training predictive algorithms on such a process requires a large and diverse dataset, and accessing a military dataset of sufficient size and diversity is a challenge. Instead, the team is employing de-identified data from real civilian patients at Brigham and Women’s Hospital, in collaboration with Dr. Vesela Kovacheva (MD, PhD) from the Department of Anesthesiology, Perioperative Pain Management, and augmenting the data with synthetic data generated by statistical methods and machine learning models.
There are no existing models to predict health recovery, nor publicly available data specific to this use case. So far, the team has identified 52 predictor variables of postoperative recovery duration and generated 250 synthetic clinical preoperative assessments using large language models. They have also developed a named-entity recognition pipeline to extract these predictor variables from unstructured text to create structured data, which will be used to develop the recovery model. The team's next steps include evaluating the quality of their synthetic data as well as building the recovery model on synthetic data and validating it against both real and synthetic data.
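To illustrate the idea of turning unstructured assessment text into structured predictor variables, the following is a minimal sketch only, not the team's pipeline. It swaps in simple rule-based pattern matching where a real named-entity recognition system would typically use a trained model. The four variables (`age_years`, `bmi`, `asa_class`, `smoker`), the patterns, and the sample note are all hypothetical and invented for illustration; the project's 52 predictor variables are not listed here.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical predictor variables and patterns (assumptions for illustration;
# the actual variables and extraction method are not described in the article).
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "age_years": re.compile(r"(\d{1,3})[- ]year[- ]old", re.IGNORECASE),
    "bmi": re.compile(r"\bBMI\s*(?:of|is|:)?\s*(\d+(?:\.\d+)?)", re.IGNORECASE),
    "asa_class": re.compile(r"\bASA\s*(?:class|status)?\s*(I{1,3}|IV|V)\b"),
    "smoker": re.compile(r"\b(non-?smoker|smoker)\b", re.IGNORECASE),
}

# Convert each matched string into a typed value for the structured record.
CONVERTERS: Dict[str, Callable[[str], object]] = {
    "age_years": int,
    "bmi": float,
    "asa_class": str.upper,
    "smoker": lambda text: text.lower() == "smoker",
}


def extract_predictors(note: str) -> Dict[str, Optional[object]]:
    """Return one structured record; variables not found in the note stay None."""
    record: Dict[str, Optional[object]] = {name: None for name in PATTERNS}
    for name, pattern in PATTERNS.items():
        match = pattern.search(note)
        if match is not None:
            record[name] = CONVERTERS[name](match.group(1))
    return record


if __name__ == "__main__":
    # Invented example note, not drawn from any real or synthetic dataset.
    sample_note = (
        "58-year-old male, non-smoker, BMI 27.4, ASA class II, "
        "scheduled for elective knee arthroscopy."
    )
    print(extract_predictors(sample_note))
```

Keeping missing variables as `None` rather than dropping them gives every note the same set of columns, which makes the resulting records straightforward to assemble into a table for model training.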
The team is currently looking to engage with Department of Defense partners to integrate the model with military datasets and transition this work into military medical treatment facilities. Please contact Isabel Smokelin for more information.