Article Blog April 30, 2020 4 minute read

Clinical Research Problem-Solving: Improving Data Quality

What if you had an automated solution that thinks like a data manager?

Garbage in, garbage out. The phrase, often attributed to IBM’s George Fuechsel and originally used to describe the importance of good computer programming, also rings true for clinical research professionals, who need quality data in to get value out.

From First Patient Visit through Database Lock, data quality issues can make or break the success of a clinical study, in terms of meeting milestones on time and, ultimately, getting regulatory approval. Quality data from start to finish helps clinical operations professionals make better decisions, mitigate risk, and measure the success of plans and goals more effectively.

Ensuring data quality, however, is a daunting task. Manual entry and review processes are prone to errors, especially in these unprecedented times, when most of us are preoccupied with concerns about the public health crisis. Even in normal conditions, data managers spend tremendous amounts of time generating queries to reconcile data discrepancies. Most queries take between 24 hours and seven days to resolve, largely because clinicians have other responsibilities in addition to the trials they’re supporting.

Introducing the Data Quality Engine

In an effort to streamline the data management process and ensure ongoing data quality, Saama has developed a Data Quality Engine (DQE) that runs on top of your source system data tables. The DQE parses through each row, attaching data quality flags whenever pre-defined data rules are violated.

The DQE proactively notifies data managers of data errors, missing data, or aged data, so they can take appropriate action quickly. If you get alerted that your CTMS data is erroneous, for example, you have a clear jumping off point for rapid issue resolution.

The DQE is a data double check that automatically flags anything that might be missed, or messed up, during a manual review. With the DQE backing up your data management team, you go a long way towards improving oversight and vendor accountability.

Ultimately, the DQE will help data managers achieve source data verification (SDV) faster and get more accurate insight into the progression of clinical trials.

Data Quality Engine—Rules and Flags

Data Quality Rule | Description
Mandatory Check | Verifies that an entered value is not null. Example: If a Subject Enrollment Date is null, the DQE will flag the row with an Error or Warning.
Data Conformity | Verifies that an entered value meets expected data requirements. Example: If Site Open Date is set after Site Closed Date, the DQE will flag the row with an Error or Warning.
Referential Integrity | Verifies that an entered value can be referenced by other relational tables. Example: If Visit ID from Subject Visit doesn’t match the Visit ID from the Form Table, the DQE will flag the row with an Orphan.
Uniqueness Conformity | Verifies that an entered value is unique across the rows of the source table. Example: If a Visit ID or Form ID is not unique, the DQE will flag the row with an Error or Warning.
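
To make the four rule types concrete, here is a minimal Python sketch of how such checks might look. It is an illustration under stated assumptions, not Saama's implementation: the sample rows, column names, and function names are all hypothetical, and the choice of "Error" rather than "Warning" for each rule is assumed for simplicity.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows; all table and column names are illustrative.
sites = [
    {"site_id": "S1", "open_date": date(2020, 1, 10), "closed_date": date(2020, 6, 1)},
    {"site_id": "S2", "open_date": date(2020, 7, 1), "closed_date": date(2020, 3, 1)},
]
subject_visits = [
    {"visit_id": "V1", "enrollment_date": date(2020, 2, 1)},
    {"visit_id": "V2", "enrollment_date": None},
    {"visit_id": "V2", "enrollment_date": date(2020, 2, 5)},
    {"visit_id": "V9", "enrollment_date": date(2020, 2, 9)},
]
forms = [{"form_id": "F1", "visit_id": "V1"}, {"form_id": "F2", "visit_id": "V2"}]


def mandatory_check(row, column):
    """Mandatory Check: flag the row when a required value is null."""
    return "Error" if row.get(column) is None else None


def conformity_check(row):
    """Data Conformity: flag a site whose open date falls after its closed date."""
    opened, closed = row.get("open_date"), row.get("closed_date")
    if opened is not None and closed is not None and opened > closed:
        return "Error"
    return None


def referential_check(row, column, valid_keys):
    """Referential Integrity: flag an Orphan when the key has no match elsewhere."""
    return "Orphan" if row.get(column) not in valid_keys else None


def uniqueness_check(rows, column):
    """Uniqueness Conformity: flag every row whose value appears more than once."""
    counts = Counter(r.get(column) for r in rows)
    return ["Error" if counts[r.get(column)] > 1 else None for r in rows]


form_visit_ids = {f["visit_id"] for f in forms}
site_flags = [conformity_check(s) for s in sites]
mandatory_flags = [mandatory_check(v, "enrollment_date") for v in subject_visits]
orphan_flags = [referential_check(v, "visit_id", form_visit_ids) for v in subject_visits]
unique_flags = uniqueness_check(subject_visits, "visit_id")
```

In this sample, the second site fails conformity, the second visit fails the mandatory check, both V2 rows fail uniqueness, and V9 is an orphan because no form references it.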

See Your Data Quality at a Glance

When you install the DQE, a Data Quality Dashboard displays your data quality metrics across all source systems for each study, so you can easily identify data entry issues and track the exact source of data quality rule violations. 

High-level data summary cards show the total count of data records, the count of good and bad data records, your data quality rate across all your enabled systems, and the last time files were checked. Any row flagged with an Error, Orphan, or Warning is considered a bad data record.
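
As a rough illustration of the arithmetic behind such summary cards, the sketch below counts bad records from a list of row flags. The article does not define the data quality rate, so the formula used here (good records divided by total records) is an assumption, and the function and field names are hypothetical.

```python
# Flags that mark a row as a bad data record, per the article.
BAD_FLAGS = {"Error", "Orphan", "Warning"}


def summarize(flags):
    """Return record counts and an assumed quality rate (good / total)."""
    total = len(flags)
    bad = sum(1 for f in flags if f in BAD_FLAGS)
    good = total - bad
    # The rate is undefined for an empty data set, so return None in that case.
    rate = good / total if total else None
    return {"total": total, "bad": bad, "good": good, "quality_rate": rate}


# One flag per row; None means the row passed every configured rule.
summary = summarize([None, "Error", None, "Orphan", "Warning", None, None, None])
```

With eight rows and three flagged, this yields five good records and a rate of 0.625 under the assumed definition.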

Additionally, you can get an overview of which records failed on which rule or flag, to see which rule types are failing the most. A summary table showcases the count of all records grouped by flag for each enabled source system, and a pie chart displays the count of records across all source systems that have failed on a particular data rule.
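
The two groupings described above can be sketched with simple counters. This is only an illustration: the record layout is hypothetical, and "EDC" is an assumed second source system used for the example alongside the CTMS mentioned earlier.

```python
from collections import Counter

# Hypothetical flagged records; the keys and the "EDC" system are illustrative.
records = [
    {"system": "CTMS", "rule": "Mandatory Check", "flag": "Error"},
    {"system": "CTMS", "rule": "Data Conformity", "flag": "Warning"},
    {"system": "CTMS", "rule": "Mandatory Check", "flag": "Error"},
    {"system": "EDC", "rule": "Referential Integrity", "flag": "Orphan"},
    {"system": "EDC", "rule": "Uniqueness Conformity", "flag": "Error"},
]

# Summary table: record count grouped by flag for each source system.
by_system_and_flag = Counter((r["system"], r["flag"]) for r in records)

# Pie chart input: record count per rule across all source systems.
by_rule = Counter(r["rule"] for r in records)

# The rule type that fails most often, with its count.
top_rule = by_rule.most_common(1)[0]
```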

In addition to these high-level views, the Data Quality Dashboard provides granular details to help verify the quality of your data sets. A Record Failure Summary identifies the exact source systems, source tables, and DQ Flags your records are labeled with, and you can get specific rule violation details for selected source systems and tables.

Here’s a list of the types of rule violation details captured:

Source Column Name: Source table column name
Source Column Value: Source table entered value
Rule Type: DQ rule configured on that source table column
Error Reason: Description of DQ rule failure
Severity: DQ flag shown when DQ rule has been violated
DQ Timestamp: Time when the DQ rule was last checked
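
One way to picture a single rule violation detail is as a small record with these six fields. The sketch below is an assumption about shape only: the class name, field names, types, and sample values are hypothetical and are not taken from the product.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RuleViolation:
    """Hypothetical container mirroring the six detail fields listed above."""
    source_column_name: str
    source_column_value: object  # the entered value, which may be None
    rule_type: str
    error_reason: str
    severity: str  # the DQ flag, e.g. "Error", "Warning", or "Orphan"
    dq_timestamp: datetime


# Illustrative values for a failed Mandatory Check.
violation = RuleViolation(
    source_column_name="subject_enrollment_date",
    source_column_value=None,
    rule_type="Mandatory Check",
    error_reason="Subject Enrollment Date is null",
    severity="Error",
    dq_timestamp=datetime(2020, 4, 30, 12, 0, tzinfo=timezone.utc),
)

# Convert to a plain dict, for example to render one row of a details table.
detail = asdict(violation)
```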

Every day, the pandemic is putting a spotlight on how we can transform the clinical trial industry through better business intelligence. Our new Data Quality Engine—among other solutions attached to our AI-powered Life Science Analytics Cloud (LSAC)—will be an important enabler of more successful clinical studies in the months and years ahead. As technology insights empower study leaders and data management teams to make better decisions, will you be ready? 

Saama can put you on the fast track to clinical trial process innovation.