Health data ingestion in an on-demand world.

Population health platforms are custodians of citizens' health data, and data ingestion is at the heart of many of them. In an on-demand world, where people expect technology and data to make their lives easier and more efficient, it’s time for these platforms to join the party and make personal health data more accessible and meaningful to the people creating it.


Digital health is expanding beyond the four walls of hospitals and clinics, so traditional population health platforms now face the challenge of including unconventional data sources, such as social, environmental, claims, genomic and device data, to create a more personalised health plan for an individual.

As these data sources grow, people expect greater access to their data and more choice about what happens to it. They also want to know that their information is safe. Any modern, reliable population health platform must not only have the technology to collect these new sources of data and support easy access to them, but must also provide robust control, privacy and consent management for the information it collects.

Health Level 7 (HL7) primary standards have traditionally been the most prevalent interoperability standards for exchanging health and medical transactions, and they have shaped much of the health data ingestion paradigm for the last 20 years. When the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009 was introduced in the United States 10 years ago, it dramatically increased the adoption of Electronic Health Records (EHRs) and subsequently drove a huge increase in the use of the HL7 standards.

However, with the recent shift towards unconventional data sources, there has been a huge increase in the volume and types of data collected, stretching beyond the typical domain of HL7. This has put pressure on organisations to review their data ingestion strategies and has created a marketplace for health ingestion platforms that are investing in new ways of collecting, storing and processing this new health information.

Until recently, data ingestion paradigms called for extract, transform and load (ETL), in which data was taken from the source (typically in an HL7 format), transformed to fit the requirements of the target system, and then loaded as part of a data pipeline. As new data types and sources have become available in vast quantities, many systems are adapting this paradigm to load raw data into a data lake first, changing the paradigm from ETL to extract, load and transform (ELT). This shift has many benefits: it allows data to be transferred to a target system more quickly, it removes the complex transformation logic typically required as part of the data pipeline, and it gives organisations the freedom to develop transformations over time. Data can be pulled and transformed when required, getting the most out of the raw data as the platform evolves.
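
To make the difference concrete, here is a minimal Python sketch of the two orderings. It is an illustration under stated assumptions, not a production pipeline: the sample message, the in-memory lists standing in for a warehouse and a data lake, and the function names (transform, etl_ingest, elt_ingest, elt_query) are all hypothetical. The only HL7 detail relied on is that HL7 v2 messages are pipe-delimited segments in which PID-3 carries the patient identifier, PID-5 the name and PID-7 the date of birth.

```python
# Hypothetical sample HL7 v2 message: segments are separated by carriage returns.
RAW_HL7 = "\r".join([
    r"MSH|^~\&|LAB|HOSP|PHP|PHP|202001011200||ADT^A01|MSG0001|P|2.5",
    "PID|1||12345^^^HOSP^MR||DOE^JANE||19800101|F",
])


def transform(raw):
    """Pull a few fields out of the PID segment of an HL7 v2 message."""
    for segment in raw.split("\r"):
        fields = segment.split("|")
        if fields[0] == "PID":
            # Pad so a name with no given-name component still unpacks.
            family, given = (fields[5].split("^") + [""])[:2]
            return {
                "patient_id": fields[3].split("^")[0],
                "family_name": family,
                "given_name": given,
                "birth_date": fields[7],
            }
    raise ValueError("no PID segment found")


def etl_ingest(raw, warehouse):
    """ETL: transform in the pipeline, then load only the shaped record."""
    warehouse.append(transform(raw))


def elt_ingest(raw, source, lake):
    """ELT: load the payload untouched, tagged with where it came from."""
    lake.append({"source": source, "payload": raw})


def elt_query(lake, source="hl7v2"):
    """ELT: transform on demand, when a consumer asks for the data."""
    return [transform(r["payload"]) for r in lake if r["source"] == source]


if __name__ == "__main__":
    warehouse, lake = [], []
    etl_ingest(RAW_HL7, warehouse)
    elt_ingest(RAW_HL7, "hl7v2", lake)
    # A non-HL7 payload can land in the lake without any new pipeline logic.
    elt_ingest('{"steps": 8200}', "wearable", lake)
    print(warehouse)
    print(elt_query(lake))
```

In the ELT version the raw message is still in the lake after the query, so a different transformation can be written later without re-ingesting anything from the source.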

In conjunction with this shift in data ingestion, artificial intelligence (AI) capabilities have been introduced that allow organisations to get value out of their data at a much quicker pace. There was already a wealth of health information that was not being used to its full value, and the introduction of new health data types only increases the need for ‘smart systems’ that can process and make use of the new data emerging every day. AI branches such as Machine Learning (ML) and Natural Language Processing (NLP) will be a crucial part of future health data ingestion, delivering value unheard of today. ML algorithms may be able to train on the raw data available in data lakes and automate much of the transformation required, reducing data ingestion time and complexity and helping to curate a library of self-tuning transformations. NLP may be used to identify, tag and codify the vast amounts of personal health information captured in unstructured health documents, turning it into meaningful information.
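
As a rough picture of what "identify, tag and codify" means for an unstructured note, the sketch below uses a plain dictionary lookup. This is deliberately not an NLP model, only a stand-in that shows the shape of the output such a system would produce. The vocabulary, the labels and the tag_note function are hypothetical; a real platform would map terms to a recognised clinical terminology rather than to these made-up labels.

```python
import re

# Hypothetical vocabulary: phrase -> (category, normalised label).
# These labels are placeholders, not codes from any real terminology.
VOCAB = {
    "type 2 diabetes": ("condition", "diabetes_mellitus_type_2"),
    "hypertension": ("condition", "hypertension"),
    "metformin": ("medication", "metformin"),
}


def tag_note(text):
    """Find known phrases in free text and return them as positioned tags."""
    tags = []
    for phrase, (category, label) in VOCAB.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            tags.append({
                "start": match.start(),
                "end": match.end(),
                "category": category,
                "label": label,
            })
    # Return tags in the order they appear in the note.
    return sorted(tags, key=lambda t: t["start"])


if __name__ == "__main__":
    note = "Patient has Type 2 diabetes and hypertension; started metformin."
    for tag in tag_note(note):
        print(tag)
```

A trained model would replace the lookup with something that copes with misspellings, abbreviations and negation ("no history of hypertension"), which a fixed phrase list cannot handle.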

There are boundless opportunities for deriving more value out of health data with this shift in paradigm and technology, saving clinicians time and improving health outcomes. Rapidly turning raw data into information that can be evaluated and reasoned on is key, as new data will often surface new insights, creating significant value and growth for an organisation over time. Population health platforms must embrace the exponential growth in the types and volume of health data available, and act to review and adapt their health data ingestion strategies to meet the demands of the future.