In the race to find vaccines to end the pandemic, pharma companies worldwide concentrated tremendous effort on the task and embraced innovation wherever possible.
One such innovation is Saama’s AI-based Smart Data Query (SDQ) data management solution, developed in partnership with Pfizer. The objective was to automate query management, using machine learning to predict discrepancies, identify reasons for those discrepancies, and auto-generate query text with ‘human-in-the-loop’ oversight.
The diagram below shows how Pfizer data managers worked in tandem with the evolving AI to quickly, and in many cases automatically, resolve clinical queries in their COVID-19 studies.
The workflow followed these five steps:
- Site investigators feed information via eCRFs into the sponsor’s EDC system. Because the EDC is seamlessly integrated with SDQ, the AI can immediately review the data and make predictions without the need for data duplication or overwrite approval. Data managers can then review the predictions for accuracy.
- Queries are automatically generated for the eCRFs stored in the EDC. The AI bins the queries by confidence level and provides links to the discrepant data points, along with a first-pass, automated response.
- Data managers can view each data discrepancy and the AI-generated response in the SDQ user interface. If the AI response is correct, the data manager signs off on the change (step 6 in the diagram) and the query is raised in the open state on the eCRF in the EDC.
- Meanwhile, the SDQ AI registers that its suggested query text was appropriate, and the underlying SDQ algorithms improve.
- If the auto-generated query text is partially correct or incorrect, the reviewer can edit the response in the SDQ user interface (step 7 in the diagram) before raising the query. The review itself is quick and intuitive: just a few clicks and you’re on to the next query. Because the SDQ interface is closely linked with the EDC, auto-generated queries are available for review almost immediately after eCRFs come into the EDC system. Whether the reviewer rejects the text outright or amends it, SDQ learns from the feedback and gets better at generating automated responses.
This entire process, from eCRF entry to auto-query generation and review, takes just minutes and gets faster as more clinical data is passed through and reviewed in SDQ.
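To make the human-in-the-loop review concrete, here is a minimal Python sketch of the pattern described above: predictions are binned by confidence, and each reviewer decision (accept, edit, or reject) is recorded as feedback. It is an illustration only, not Saama's implementation. Every name in it (`PredictedQuery`, `confidence_bin`, `review`), the bin cut-offs, and the sample query texts are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Decision(Enum):
    ACCEPT = "accept"  # reviewer signs off on the generated text as-is
    EDIT = "edit"      # reviewer amends the text before raising the query
    REJECT = "reject"  # reviewer discards the prediction


@dataclass
class PredictedQuery:
    data_point_id: str  # hypothetical link back to the discrepant data point
    query_text: str     # auto-generated first-pass query text
    confidence: float   # model score in [0, 1]


def confidence_bin(score: float) -> str:
    """Map a score to a coarse bin. The cut-offs are illustrative only."""
    if score >= 0.9:
        return "high"
    if score >= 0.6:
        return "medium"
    return "low"


def review(query: PredictedQuery, decision: Decision,
           edited_text: Optional[str] = None) -> Tuple[Optional[str], dict]:
    """Apply a reviewer decision.

    Returns the text to raise as a query (None if rejected) and a
    feedback record that a later training run could learn from.
    """
    if decision is Decision.EDIT and not edited_text:
        raise ValueError("an edit decision needs the amended query text")
    if decision is Decision.ACCEPT:
        final_text = query.query_text
    elif decision is Decision.EDIT:
        final_text = edited_text
    else:
        final_text = None
    feedback = {
        "data_point_id": query.data_point_id,
        "suggested_text": query.query_text,
        "final_text": final_text,
        "decision": decision.value,
    }
    return final_text, feedback


# Made-up predictions, grouped by confidence bin for the reviewer.
predictions = [
    PredictedQuery("AE-0001", "Onset date is before the consent date. Please verify.", 0.95),
    PredictedQuery("CM-0042", "Dose unit is missing. Please provide.", 0.72),
    PredictedQuery("MH-0007", "Term may duplicate an earlier entry. Please confirm.", 0.41),
]
bins = {}
for p in predictions:
    bins.setdefault(confidence_bin(p.confidence), []).append(p.data_point_id)
print(bins)
```

In this sketch the feedback record keeps both the suggested and the final text, so accepted, edited, and rejected predictions can all serve as training signal, mirroring the two feedback paths described in the workflow.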
While we continue to evaluate the performance of SDQ, and while the AI engine continues to get better as it’s trained on more data, the preliminary results have been impressive:
- High-Volume Reconciliation: More than 105 million data points were reconciled in just 4 months.
- 15X Time Savings: The median time from data capture to query generation fell from 25.4 calendar days (for all vaccine studies) to just 1.7 days. The total time saved in reviewing data and automating query text is estimated to be between 2,800 and 3,500 hours.
- Unstructured Text Analysis: Using natural language processing, more than 750,000 free-text sentences and phrases were successfully processed to help detect adverse event signals and reconcile them with medical histories and case report forms.
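The 15X figure follows directly from the two medians quoted above. A quick check in Python, using only those reported numbers (the variable names are ours):

```python
baseline_days = 25.4  # median, data capture to query generation, all vaccine studies
sdq_days = 1.7        # median with SDQ

speedup = baseline_days / sdq_days
days_saved = baseline_days - sdq_days

print(f"{speedup:.1f}x faster, {days_saved:.1f} days saved at the median")
# prints "14.9x faster, 23.7 days saved at the median"
```

The ratio comes to roughly 14.9, which rounds to the 15X quoted.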