By Malai Sankarasubbu, VP of AI Research
In response to the COVID-19 pandemic, the White House, the Allen Institute for AI (AI2), and leading research groups have created a publicly available database that contains more than 45,000 scholarly articles about COVID-19, SARS-CoV-2, and related coronaviruses. The COVID-19 Open Research Dataset (CORD-19) is intended to help the medical community keep up with the latest research and find the most accurate answers to questions related to the virus and its impact.
To help the global research community use this database even more effectively, Saama’s AI Team built a semantic search capability on top of CORD-19, which we’ve contributed to the EndPandemic National Data Consortium.
To obtain the highest search accuracy possible, we applied a deep learning reading comprehension model—ALBERT QA, trained for Pharma—using the most advanced natural language processing (NLP) technology available. With components from our Life Science Analytics Cloud (LSAC), we were able to construct the search engine in just a couple of days. This highly accurate search engine shows every article header and text snippet that contains the search term entered. Our hope is that researchers, doctors, and even the general public can get the most enlightening answers and insights from the scholarly papers in the database, which is updated on a weekly basis.
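For readers curious about the display behavior described above, here is a minimal sketch of returning each article header with the text snippets that contain the search term. It is an illustration only, not Saama's implementation: a plain case-insensitive keyword match stands in for the ALBERT QA reader, and the names `split_sentences`, `search`, and `SAMPLE_ARTICLES`, as well as the placeholder article text, are assumptions made for this example.

```python
import re
from typing import Dict, List, Tuple

# Made-up placeholder records; real CORD-19 entries have a richer schema.
SAMPLE_ARTICLES: List[Dict[str, str]] = [
    {
        "title": "Placeholder article A",
        "body": "This is made-up text. The incubation period is discussed here. "
                "Nothing else matters.",
    },
    {
        "title": "Placeholder article B",
        "body": "Another made-up body. It mentions transmission only.",
    },
]


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def search(articles: List[Dict[str, str]], term: str) -> List[Tuple[str, List[str]]]:
    """Return (title, matching snippets) for every article that mentions `term`.

    Assumption: a case-insensitive substring match stands in for the
    reading comprehension model mentioned in the post.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    results = []
    for article in articles:
        snippets = [s for s in split_sentences(article["body"]) if pattern.search(s)]
        # Keep the article if its body or its title contains the term.
        if snippets or pattern.search(article["title"]):
            results.append((article["title"], snippets))
    return results


if __name__ == "__main__":
    for title, snippets in search(SAMPLE_ARTICLES, "incubation"):
        print(title)
        for snippet in snippets:
            print("  -", snippet)
```

A reading comprehension model would replace the keyword match with a step that scores candidate passages against a natural-language question, but the shape of the result (header plus supporting snippet) stays the same.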
We’re proud of our contribution and the accuracy it delivers. The easier it is to access better information, the faster we can all put this pandemic behind us. Please try it out and ask your coronavirus questions.