Big Data and Data Science in Critical Care
The digitalization of the health-care system has resulted in a deluge of clinical big data and has
prompted the rapid growth of data science in medicine. Data science, which is the field of study
dedicated to the principled extraction of knowledge from complex data, is particularly relevant
in the critical care setting. The availability of large amounts of data in the ICU, the need for
better evidence-based care, and the complexity of critical illness make the use of data science
techniques and data-driven research particularly appealing to intensivists. Despite the
increasing number of studies and publications in the field, thus far there have been few
examples of data science projects that have resulted in successful implementations of
data-driven systems in the ICU. However, given the expected growth in the field, intensivists
should be familiar with the opportunities and challenges of big data and data science. The
present article reviews the definitions, types of algorithms, applications, challenges, and future
of big data and data science in critical care.
CHEST 2018; 154(5):1239-1248
KEY WORDS: big data; critical care; data science; machine learning; prediction models
The digitalization of the health-care system is changing the way we practice medicine and
conduct clinical research.1,2 The widespread implementation of electronic health records
(EHRs) is paving the way for big data research and is bringing the world of data science to the
patient’s bedside.2-4 Within the health-care system, the ICU presents a particularly
convincing case for using data science to improve patient care.5 The evidence supporting many
of the interventions performed in the ICU is scarce, and practice variability is abundant.5,6 In
addition, the complexity of critical illness makes the traditional reductionist approach to
medical research insufficient; that is, single-drug intervention trials or single pathway
biomarker studies are unlikely to satisfy the clinical realities of the ICU.5 Critical care
research requires an integrative approach that embraces the complexity of critical illness and
the computational technology and algorithms that can make it possible.7,8 Moreover, the data
required to follow this approach are being generated and digitized in troves. EHRs, bedside
monitors, medication pumps, and ventilators are continuously generating new minable data, and
soon the advancement of modern molecular diagnostics will result in a deluge of “omics” data
derived from the genome, transcriptome, microbiome, and a long list of other “-omes” (Fig 1).

As big data and data science gradually infiltrate most aspects of clinical research and,
ultimately, clinical care in the ICU, it is
ABBREVIATIONS: AUC = area under the receiver-operating characteristic curve; EHR = electronic
health record; MIMIC = Multiparameter Intelligent Monitoring in Intensive Care; NLP = natural
language processing
AFFILIATIONS: From the Department of Pediatrics (Critical Care) (Dr Sanchez-Pinto) and the
Department of Preventive Medicine (Health and Biomedical Informatics) (Drs Luo and
Sanchez-Pinto), Northwestern University Feinberg School of Medicine, Chicago, IL; and the
Department of Medicine (Dr Churpek), The University of Chicago, Chicago, IL.
CORRESPONDENCE TO: Matthew M. Churpek, MD, PhD, Pulmonary and Critical Care, University of
Chicago, 5841 S Maryland Ave, MC 6023, Chicago, IL 60637; e-mail: matthew.churpek@uchospitals.edu
Copyright © 2018 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.
DOI: https://doi.org/10.1016/j.chest.2018.04.037
chestjournal.org                                                                                                                   1239
[Figure 1 panel: Big Data in the ICU. Data sources by category:
  Patient information: demographics; social; family; financial
  Contextual: encounter information; ICU & hospital operations; environmental
  Diagnoses: billing diagnoses; physician-curated diagnoses; data-driven phenotypes
  Interventions: medications; procedures; organ support; other therapies
  Cellular/Molecular: routine laboratory; “omics” data; patient; microbial
  Imaging: radiology; pathology; video recording; photographs
  Natural language: clinician notes; diagnostic reports; patient narratives; speech recording
  Physiologic: clinician-charted; device-generated; monitors; wearables
Legend: sources are marked either as generally available in the electronic health record (EHR)
or as not available or limited information in the EHR.]
Figure 1 – Some of the major sources of big data in the ICU. The term “omics” refers to the data
derived from modern molecular techniques (eg, genomics, transcriptomics, proteomics,
metabolomics, microbiomics). EHR = electronic health record.
increasingly evident that intensivists should be familiar with the promise and perils of these
approaches. The present article reviews the definitions, types of algorithms, applications,
challenges, and future of big data and data science in critical care.

Definitions in Data Science
Big data can be defined as digital data that are generated in high volume and high variety and
that accumulate at high velocity, resulting in datasets too large for traditional
data-processing systems.2,9 In practice, big data in health care depend on both the breadth and
the depth of the data being captured. For example, administrative health-care datasets with few
data elements per patient record (low depth) are usually considered big data problems when they
contain millions of records (wide breadth). Conversely, when applying next-generation sequencing
and other “-omics” approaches (high depth), just a few dozen patients can become a big data
problem (narrow breadth).10

Data science can be defined as “the set of fundamental principles that support and guide the
principled extraction of information and knowledge from data.”9 A closely related term is data
mining, which is the actual extraction of knowledge from data via machine learning algorithms
that incorporate data science principles. Machine learning is the field of study that focuses on
how computers learn from data and the development of algorithms that make this learning
possible.11 Finally, another important concept in data science is domain expertise, which in
health care can be defined as the understanding of real-world clinical problems and the
realities of patient care that help frame and contextualize the application of data science to
health-care problems.8,9,11 These and other pertinent definitions in data science are presented
in Table 1.

The Ecosystem of Data Science in Health Care
The data revolution in health care would not be possible if it were not for several key
developments, including: (1) the data science movement that has transformed other industries;
(2) the extraordinary growth in computational power; (3) the availability of open source tools
and low-cost equipment to perform advanced analyses; and (4) the increasing availability of
educational resources and advanced degrees in data science and related fields.12 Open source
programming and scripting languages, such as R and Python, have extensive libraries of
statistical packages and machine learning algorithms that are relatively easy to use by
researchers with some training and have greatly democratized the access to data science
techniques.

Educational resources, such as graduate programs in data science, are also increasingly
available. Notable among these are massive open online courses, which have led to the increased
popularity of data science and the applications of machine learning techniques to real-world
problems.13 There are now a multitude of courses from prestigious institutions that cover data
science and machine learning, and many of them are free. These courses provide the interested
researcher with educational opportunities from world-class experts at the touch of a button.
Other websites contain user-created code to run machine learning algorithms from scratch (eg,
https://github.com/) or host data science competitions for participants around the world (eg,
https://www.kaggle.com/). This rich online environment is ideal for data scientists to learn and
grow, with the
1240 Contemporary Reviews in Critical Care Medicine                                                                  [   154#5 CHEST NOVEMBER 2018                             ]
TABLE 1. Definitions of Common Terms in Data Science
  Term                                                                   Definition
  Big data                Digital data that are generated in high volume and high variety and that accumulate at high
                            velocity, resulting in datasets too large for traditional data-processing systems
  Data science            The set of fundamental principles that support and guide the principled extraction of information
                            and knowledge from data
  Data mining             The extraction of knowledge from data via machine learning algorithms that incorporate data
                            science principles
  Domain expertise        The understanding of real-world problems in a given domain (eg, critical care medicine) that helps
                            frame and contextualize the application of data science to solve these problems
  Machine learning        The field of study that focuses on how computers learn from data and the development of
                            algorithms that make this learning possible
  Features                The data elements, also known as independent variables, used to train a model. Features can be
                            simple transformations of the raw data (eg, average heart rate in the last 24 h) or complex
                            transformations, such as the ones performed by neural networks (see Table 2)
  Outcomes                The data elements, also known as dependent variables, that represent the target for training in
                            a supervised learning model. Outcomes can be categorical (eg, yes/no) or continuous (eg,
                            length of hospital stay). Categorical binary outcomes are the most common in medicine (eg,
                            dead or alive at 28 days). Binary outcomes are typically represented using Boolean logic (ie,
                            true/false or 1/0) but can also be represented using fuzzy logic (ie, a range of
                            probabilities, or degrees of truth, between 0 and 1)
  Supervised learning     Algorithms that are used to uncover the relationship between a set of features and one or more
                            known outcomes
  Unsupervised learning   Algorithms that are used to uncover naturally occurring patterns or groupings in the data, without
                            targeting a specific outcome
  Model training          The process through which machine learning algorithms develop a model of the data by learning
                            the relationships between features and, in supervised learning, between features and
                            outcomes. This is also referred to as model derivation or data fitting
  Model validation        The process of measuring how well a model fits new, independent data. For example, evaluating
                            the performance of a supervised model at predicting an outcome in new data. This approach is
                            also referred to as model testing.
  Predictive model        A model generally trained to predict the likelihood of a condition, event, or response. The US Food
                            and Drug Administration specifically considers predictive strategies as those geared toward
                            identifying groups of patients more likely to respond to an intervention
  Prognostic model        A model specifically trained to predict the likelihood of a condition-related endpoint or outcome
                            such as mortality. In general, the goal is to estimate a prognosis given a set of baseline features,
                            regardless of what ultimately leads to the outcome
  Overfitting             The phenomenon that occurs when an algorithm learns from idiosyncrasies in the training data,
                            usually referred to as noise. Noisy data are data that are randomly present in the training
                            dataset but do not represent the generalizable truth (usually referred to as signal) that explains
                            the relationships between the features and the outcomes. Overfitting will generally lead to poor
                            performance of the model in an independent validation dataset
  Digitization            The conversion of something analog or physical (eg, paper documents, printed images) into a
                            digital format (ie, bits or 1s and 0s)
  Digitalization          The wide adoption of digital technologies by an organization to leverage their digitized data with
                            the goal of improving operations and performance. The adoption of electronic health records
                            and other digital technologies (eg, picture archiving and communication systems for medical
                            images, pharmacy management systems, billing systems) are examples of digitalization in
                            health care
  Data curation           The process of integrating data from different sources, structuring it, authenticating it, and
                            annotating it to ensure its quality, add value, and facilitate its use and reuse
  Structured data         Data (usually discrete or numeric) that are easy to search, summarize, sort, and quantify.
                           Examples include vital signs (eg, heart rate) or laboratory test results (eg, CBC)
  Unstructured data       Data that do not conform to a prespecified structure, such as a written narrative, images, video, or
                           audio. Unstructured data are generally harder to search, sort, and quantify. Examples include
                           clinician notes, pathology slides, and radiology images
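As an illustration of the feature and outcome definitions in Table 1, the following minimal Python sketch derives one simple feature (average heart rate in the last 24 h) from raw time-stamped readings and pairs it with a binary outcome. The function name, the data layout, and all values are hypothetical and are not taken from any study cited in this article.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_heart_rate_last_24h(readings, now):
    """Return the mean heart rate (beats/min) over the 24 h before `now`.

    `readings` is a list of (timestamp, bpm) pairs. Returns None when no
    reading falls inside the 24 h window.
    """
    window_start = now - timedelta(hours=24)
    recent = [bpm for ts, bpm in readings if window_start < ts <= now]
    return mean(recent) if recent else None


# Hypothetical raw monitor data for one patient (illustrative values only).
now = datetime(2018, 11, 1, 8, 0)
readings = [
    (now - timedelta(hours=30), 120),  # outside the 24 h window, ignored
    (now - timedelta(hours=20), 100),
    (now - timedelta(hours=10), 90),
    (now - timedelta(hours=1), 80),
]

feature = mean_heart_rate_last_24h(readings, now)  # mean of 100, 90, 80
outcome = 1  # binary outcome in Boolean form (hypothetical: 1 = died by day 28)
print(feature, outcome)
```

In a supervised learning problem, many such feature values per patient would be paired with the known outcome for model training.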
field of medicine benefiting from advances shared around the world freely through the Internet.

Types of Algorithms in Data Science
Machine learning algorithms are generally divided into two categories: supervised and
unsupervised.11 Semi-supervised algorithms represent a hybrid of the two but have been used less
often in health-care problems. Finally, deep learning algorithms defy this classification, even
though they derive from artificial neural network algorithms, which are generally classified as
supervised algorithms. The most defining characteristic of deep learning algorithms is their
focus on learning data representations (or features) that can then be used in supervised,
unsupervised, or semi-supervised problems. The following discussions review these types of
algorithms in more detail (as well as in Fig 2 and Table 2).

[Figure, panel A: the clinical features and known outcomes (survived or not survived) of patients
Pt. 1, Pt. 2, Pt. 3, ... are used to LEARN a model, which is then used to PREDICT the unknown
outcome (?) of a new patient. Panel B: a heterogeneous population is separated into clusters,
eg, Cluster 1 (high Oxygenation Index, low Glasgow Coma Scale, high mortality risk) and
Cluster 2 (on inotropes, acute kidney injury, high mortality risk).]

Supervised Learning Algorithms
TABLE 2. Examples of Algorithms Used in Data Science
  Algorithm Class                Examples                                          Description
  Classic regression      Linear regression,      Linear regression is a supervised learning algorithm that models the
                            logistic regression     relationship between one or more features and a continuous outcome
                                                    by fitting a regression line that minimizes the sum of the squared
                                                    residuals, which are the distances between each observed outcome in
                                                    the training data and the value predicted by the fitted line. Logistic
                                                    regression is a generalization of the linear model that uses the
                                                    logistic function to estimate the probability of a binary outcome. To
                                                    do this, the fitted sigmoid-shaped curve of the logistic function maps
                                                    the feature values into a probability between 0 and 1
  Regularized             Lasso, ridge            An extension of the classic regression algorithms in which a penalty is
    regression              regression, elastic     imposed on the fitted model to reduce its complexity and decrease the
                            net                     risk of overfitting (see Table 1).
  Tree-based              Classification and      A class of supervised learning algorithms based on decision trees.
                            regression trees,       Decision trees are a sequence of “if-then-else” splits that are derived
                            random forest,          by iteratively separating the data into groups based on the
                            gradient boosted        relationship of the features with the outcome. Random forest and
                            trees                   gradient boosted trees are examples of ensemble tree models.
                                                    Ensemble models combine the output of many trained models to
                                                    estimate an outcome
  Support vector          Linear, polynomial,     A class of supervised learning algorithms that represents the data in a
    machines                radial basis kernel     multidimensional feature space and then fits a “hyperplane” that best
                                                    separates the data based on the outcomes of interest
  K-nearest neighbor      K-nearest neighbor      A type of supervised learning algorithm that represents data in
                                                    multidimensional feature space and uses local information about
                                                    observations closest to a new example to predict the outcome for that
                                                    example
  Bayesian                Naive Bayes, Bayesian   A class of supervised learning algorithms that use Bayes’ theorem of
                           network                  conditional probability, which is the probability that something will
                                                    happen given that something else has already occurred. In general,
                                                    Bayesian algorithms work by iteratively updating the probability of an
                                                    outcome (or posterior belief) given new data
  Neural network          Artificial neural        A class of nonlinear algorithms built using layers of nodes that extract
                            network, deep           features from the data and perform combinations that best represent
                            neural network          the underlying structure, usually to predict an outcome. Neural
                                                    networks can be shallow (eg, a perceptron with two layers) or deep
                                                    (multiple layers), which form the basis for the field of deep learning
  Dimensionality          Principal component     A class of unsupervised learning algorithms that exploit the inherent
    reduction               analysis, linear        structure in the data to describe data using less information. Principal
    algorithms              discriminant            components, for example, summarize a large set of correlated
                            analysis                features into a smaller number of representative features
  Latent class analysis   Latent class analysis   A type of unsupervised learning algorithm that identifies unseen
                                                    subgroups, or latent classes, in the data. Class membership is
                                                    unknown for each example so the probability of class membership is
                                                    indirectly estimated by measuring the patterns in the data
  Cluster analysis        K-means, hierarchical   A class of unsupervised learning algorithms that use the inherent
                            cluster analysis        structures in the data to best organize the data into subgroups of
                                                    maximum commonality based on some distance measure between
                                                    features
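To make one of the entries in Table 2 concrete, the following minimal Python sketch implements a k-nearest neighbor classifier: it places the training examples in feature space, finds the k examples closest to a new patient, and predicts the outcome by majority vote. The use of Euclidean distance, the function name, and the toy values (heart rate and lactate) are assumptions made for illustration only. In practice, features would usually be rescaled to comparable ranges first, because distances are otherwise dominated by the features with the largest numeric range.

```python
import math
from collections import Counter


def knn_predict(train_features, train_outcomes, new_example, k=3):
    """Predict the outcome of `new_example` by majority vote among its
    k nearest training examples (Euclidean distance in feature space)."""
    distances = sorted(
        (math.dist(features, new_example), outcome)
        for features, outcome in zip(train_features, train_outcomes)
    )
    nearest_outcomes = [outcome for _, outcome in distances[:k]]
    return Counter(nearest_outcomes).most_common(1)[0][0]


# Hypothetical training data: (heart rate, lactate) per patient.
train_features = [
    (80, 1.0), (85, 1.2), (90, 1.5),
    (125, 4.0), (130, 5.0), (135, 6.0),
]
train_outcomes = ["survived"] * 3 + ["not survived"] * 3

print(knn_predict(train_features, train_outcomes, (128, 4.5)))  # not survived
print(knn_predict(train_features, train_outcomes, (88, 1.1)))   # survived
```

The choice of k is a tuning decision: small values follow local noise in the training data more closely, which relates to the overfitting entry in Table 1.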
information in an increasingly higher order of hierarchical complexity in the form of stacked
layers of nodes (or “neurons”).20 For example, if the input is a photo of several people, the
first layer of nodes might simply extract straight lines, curves, and color hues. Deeper layers
may combine some of those lines, curves, and hues to represent eyes, noses, ears, and other more
complex features. After deeper layers of nodes have perceived increasingly more complex features
in an unsupervised way, they can then be used to perform specific tasks, such as matching the
faces in the photo to certain specific people with known features. In medical applications, deep
learning has been used, for example, to detect diabetic retinopathy on funduscopic images,21
detect cancer in skin photographs,22 or predict clinical outcomes by using EHR data.23,24

Data Science Applications in Critical Care

Predictive and Prognostic Models
The most common applications of data science to critical care problems are predictive and
prognostic models using supervised learning algorithms. Although identical from a modeling
perspective, predictive and prognostic models can be distinguished semantically by the fact that
predictive models are generally trained to predict the likelihood of a condition, event, or
response, whereas prognostic models are specifically trained to predict the likelihood of a
condition-related endpoint or outcome, such as mortality (Table 1).16,25 This distinction,
however, is not always clear in the literature and, depending on the use case, might be
irrelevant.

One of the oldest and best-known prognostic models to estimate risk of mortality in ICU patients
is the Acute Physiology and Chronic Health Evaluation score, which was first developed in the
1980s by Knaus and colleagues26,27 using logistic regression. Since then, many other groups have
developed predictive and prognostic models using larger, more granular datasets and applying
modern machine learning methods. For example, Churpek and colleagues28 developed a logistic
regression model in a dataset of > 250,000 hospital admissions that accurately estimated the
risk for ICU transfer, cardiac arrest, or death in ward patients. In a follow-up study, the same
group showed that more modern machine learning methods, such as random forests and gradient
boosted machines, could more accurately predict clinical deterioration compared with classic
logistic regression.15 In another example, Joshi and Szolovits29 used 54 clinical variable time
series to predict 30-day mortality in ICU patients in the publicly available Multiparameter
Intelligent Monitoring in Intensive Care (MIMIC) dataset. They clustered the physiological
measurements into organ-specific patient states and achieved a state-of-the-art 30-day mortality
prediction area under the receiver-operating characteristic curve (AUC) of 0.91.

Predictive models aimed at identifying patients with specific conditions or those more likely to
respond to a specific therapy are more commonly used in the field of oncology, with multiple
examples of biomarker-based models used to diagnose particular subtypes of cancer that respond to
targeted therapy.4,16 However, there are some examples in the critical care literature,
particularly in sepsis and septic shock.30,31 For example, Wong and colleagues18 used a
combination of a classification and regression tree-based biomarker risk model and gene
expression profiles in pediatric patients with sepsis to identify a subgroup of patients who were
more likely to benefit from corticosteroids.

Clustering and Phenotyping
Unsupervised learning algorithms in critical care have mainly been used to uncover naturally
occurring subgroups or clusters of patients who share similar clinical and/or molecular
characteristics. These clusters are oftentimes called phenotypes, subphenotypes, or subtypes,
although there is still little consensus on the terminology.16 For example, Calfee and
colleagues32 applied latent class analysis and identified two subphenotypes of ARDS using
clinical and cytokine data from two randomized controlled trials of ARDS. The subphenotypes
identified had distinct differences in inflammatory profiles, response to ventilator strategies,
and clinical outcomes. Knox and colleagues33 used self-organizing maps and k-means clustering to
identify four distinct clusters of patients with sepsis-associated multiple organ dysfunction
syndrome that were independently associated with outcomes after adjusting for severity of
illness. Luo and colleagues34 analyzed multiple physiological variable trends of patients in the
MIMIC dataset and applied nonnegative matrix factorization to group related trends, which were
shown to effectively predict 30-day mortality while maintaining model interpretability. Finally,
Vranas and colleagues35 applied clustering analysis to discover and validate six clinically
recognizable subgroups of ICU patients who differed significantly in all baseline
characteristics and clinical trajectories despite sharing common diagnoses.

Applications With Nontraditional Data Types
Natural language processing: Much of the data used in critical care studies, such as vital signs
or laboratory test results, are structured data that can be easily entered into a relational
database or spreadsheet and be sorted and summarized. However, there is a significant amount of
clinical information contained in the form of unstructured clinical narratives (eg, progress
notes, discharge summaries, nursing notes, diagnostic reports).36 Methods for analyzing narrative
data, generally known as natural language processing (NLP), are designed to extract features from
texts that can then be used in task-specific algorithms for different purposes (eg, prognostic
modeling). Lehman and colleagues37 used clinical data and unstructured progress notes from the
first 24 h of ICU admissions to estimate the risk of
in-hospital mortality. They inferred topic models from progress notes and achieved an AUC of
0.82, which was superior to severity of illness scores based only on structured clinical
variables. Ghassemi and colleagues38 further investigated the prognostic power of topics as
features from the first 24 h and achieved an AUC of 0.85 for in-hospital mortality when combining
text and structured data. Weissman and colleagues39 applied NLP to analyze discharge documents of
ARDS survivors and found that ARDS itself is rarely mentioned in those documents, as opposed to
more frequent mentions of “mechanical ventilation” and “ICU stay.” Conversely, their NLP-based
document classifier reported 100% accuracy for ARDS identification, suggesting that NLP can be
used to effectively identify patients with certain types of conditions.

Physiological waveform analysis: Physiological waveform data from bedside monitors and wearable
devices are increasingly being used in data science studies in critical care. Many institutions
collect and store physiological monitor data, such as electrocardiography, photoplethysmography,
impedance pneumography, invasive arterial manometry, end-tidal capnography, and
electroencephalography. The publicly available MIMIC databases contain physiological waveform
data for ICU patients at Beth Israel Deaconess Medical Center, which has facilitated the
development of state-of-the-art waveform analysis in the field.40 For example, researchers have
used waveform data to estimate cardiac output using pulse contour analysis techniques,41 detect
hypovolemia using photoplethysmography data,42 and predict hyperlactatemia using combined
physiological data.43

Image analysis: The advancement in the field of deep learning, which is particularly useful for
image analysis, has resulted in a rapid increase in the number of studies in this area in the
last few years.44 However, none of the currently published studies has tested the usefulness of
automated image analysis in an ICU setting. The rapid growth of this field, however, will
undoubtedly result in many uses applicable to critical care situations. Perhaps most pertinent to
critical care clinicians is the advancement in techniques to detect pulmonary pathology in chest
radiographs,45,46 as well as normal and abnormal findings in brain and abdominal imaging.44 These
techniques could be particularly helpful in ICUs with limited availability of specialists who can
accurately interpret radiographic images in a timely fashion, but their effectiveness and safety
should first be thoroughly tested before any clinical implementation is considered.

Challenges and Pitfalls of Data Science in Critical Care

Like most emerging technologies, the products of data science research in critical care will
undoubtedly go through a series of hype and disillusionment cycles before becoming accepted,
proven assets in the study and care of critically ill patients. One of the first challenges that
data science faces in critical care is that, despite the increasing number of studies and
publications in the field, thus far there have been few examples of data science projects that
have resulted in successful implementation of data-driven systems in the ICU.11 This lack of
exposure in the clinical setting inevitably results in a degree of mistrust by clinicians in
these data-driven systems.47,48 Although clinicians are happy to use similar systems to browse
their smart televisions, shop online, or interact with social media apps, they are wary of the
idea of sharing clinical decision-making responsibilities with machine learning algorithms,
particularly if they view them as “black boxes.”48 It is likely that only the implementation of
well-designed, interpretable, and effective data-driven systems in the ICU will make clinicians
start to gain trust in them. Furthermore, the implementation of these data-driven systems must be
performed under the rigorous auspices of well-controlled experimental studies, including (but not
limited to) simulation testing, preintervention and postintervention studies, and randomized
controlled trials. The medical informatics literature has good examples of using scientifically
rigorous approaches to the implementation and testing of digital solutions such as clinical
decision support tools and can serve as a model to follow.49-51

Clinicians and researchers appraising a data-driven system and the literature that supports it
must be aware of common pitfalls that can raise concerns about its value. The effectiveness of a
data-driven system goes beyond a measure of performance, such as an AUC or a P value. To be
effective, a data-driven system must produce actionable outputs for the right patients, at the
right time. For example, the output can be predictive information that can help a clinician
decide the most effective treatment for a particular patient as soon as a diagnosis is made.
Furthermore, when evaluating the clinical implementation of such a system, it is important to
know whether it has been tested in an experimental setting and whether it has shown a meaningful
impact in a population similar to the one for which it is being considered.

Unfortunately, bad data science abounds. We must make a collective effort to ensure that only
good data
science evolves into data-driven systems that can be safely tested and used in critically ill
patients. The ease of access to large amounts of data and computing power can lead to data mining
“fishing expeditions” that can result in low-quality research.3 Poorly framed clinical problems,
bad data, or debatable methods will result in flawed data science, which can create more problems
than it solves.3,6,48 Using epidemiologic best practices to analyze retrospective data, including
thoughtfully adjusting for confounding variables, is just as important in large datasets as it is
in smaller ones. In addition, a model may fit the training data well but generalize poorly to
other data, a phenomenon known as overfitting (Table 1). Overfitting may occur when algorithms
learn from idiosyncrasies, or noise, in the training data; techniques such as cross-validation
and regularization can be used to mitigate this problem.52,53

Poorly implemented digital technologies can harm patients,54 and only a rigorous approach to
their evaluation and implementation can mitigate this risk. Partnerships between data scientists,
clinical domain experts, medical informaticians, and implementation science specialists will
result in more effective and safer data-driven systems. Clinicians with data science skills,
clinical research expertise, and an intimate knowledge of the clinical realities in the ICU can
help data science teams capture the right data, address the right clinical problems, and produce
the right actionable knowledge.10 Furthermore, clinician input can help minimize the number of
unnecessary alerts or prompts these systems might produce, thereby reducing the risk of alert
fatigue, which is another common problem among front-line providers working with novel digital
technologies.55

Another common concern among clinicians is the perceived loss of autonomy in the face of
increasingly sophisticated computational systems. This concern exists despite the fact that
clinicians will readily acknowledge that the complexity of medicine nowadays far exceeds the
capacity of the unaided human mind and that perhaps these novel computational systems can help
manage some of this complexity.10 To put it in perspective, humans typically make decisions using
fewer than six data points, because anything more than that becomes cognitively too expensive.56
However, an ICU patient can generate thousands of data points in a single day, and when you add
fatigue, interruptions, and the clinicians’ own cognitive biases, it is not surprising that many
clinical decisions end up being suboptimal.57 Conversely, computers can sift seamlessly through
tens of thousands of data points, they can easily analyze complex nonlinear interactions between
variables, they never sleep, and they can multitask effortlessly. However, clinical thinking and
medical decision-making are not reproducible by current technologies.10 The qualitative aspect of
clinical decision-making, the so-called “art of medicine,” is impossible to model quantitatively.
Furthermore, many factors influencing clinical decision-making, including clinical, societal, and
personal factors, are not necessarily reflected in the digital records, and thus any output from
a data-driven system will need to be first evaluated, interpreted, and enriched by clinicians
before any action is taken. However, to achieve a successful partnership between clinicians and
computers, we must first improve the skills of bedside clinicians at interpreting and using the
output from these data-driven systems.48

Finally, another challenge faced by data science teams in critical care is balancing the need for
data openness and reproducibility with the demand for data privacy and security. The open data
science movement calls for transparent and reproducible research with seamless data-sharing
across institutions. Indeed, a recent study showed an alarming lack of reproducibility in data
science studies using the same ICU data, which suggests that algorithms, study procedures,
computer code, and even datasets should be openly available to ensure reproducibility.58 However,
this data openness must not result in poor data governance, lack of data security, or loss of
confidentiality, all of which are necessary to perform ethical research and maintain public
trust.59

The Future of Big Data and Data Science in Critical Care

We imagine a future in critical care in which data-driven systems and clinicians work
hand-in-hand. Large quantities of clinical, physiologic, and “omics” data are analyzed by
computational systems and are served to the bedside clinicians in the form of manageable,
interpretable, and actionable knowledge that augments the clinician’s decision-making capacity.
Predictive models make diagnostic and therapeutic recommendations, while clinicians contextualize
these recommendations and coordinate their implementation. False alerts are kept to a minimum and
systems are continuously improved through a collaborative and scientifically rigorous approach.

Data science can be transformative. There is a real opportunity for this scenario to become a
reality in the near future, but there is still a lot of work ahead of
us. Our patients entrust us with their precious data and we (clinicians, researchers, data
scientists, and leaders in critical care) have an obligation to use it in the best possible way.

Acknowledgments
Financial/nonfinancial disclosures: The authors have reported to CHEST the following: M. M. C.
has a patent pending for risk stratification algorithms for hospitalized patients; he is also
supported by a career development award from the National Heart, Lung, and Blood Institute and a
research project grant program award from the National Institute of General Medical Sciences.
None declared (L. N. S.-P., Y. L.).

References
 1. Smith M, Saunders R, Stuckhardt L, McGinnis JM. Best Care at Lower Cost: The Path to
    Continuously Learning Health Care in America. Washington, DC: National Academies Press; 2013.
 2. Bates DW, Saria S, Ohno-Machado L, Shah A, Escobar G. Big data in health care: using
    analytics to identify and manage high-risk and high-cost patients. Health Aff (Millwood).
    2014;33(7):1123-1131.
 3. Badawi O, Brennan T, Celi LA, et al. Making big data useful for health care: a summary of the
    inaugural MIT Critical Data Conference. JMIR Med Inform. 2014;2(2):e22.
 4. Iwashyna TJ, Liu V. What’s so different about big data? A primer for clinicians trained to
    think epidemiologically. Ann Am Thorac Soc. 2014;11(7):1130-1135.
 5. Anthony Celi L, Mark RG, Stone DJ, Montgomery RA. “Big data” in the intensive care unit.
    Closing the data loop. Am J Respir Crit Care Med. 2013;187(11):1157-1160.
 6. Ghassemi M, Celi LA, Stone DJ. State of the art review: the data revolution in critical care.
    Crit Care. 2015;19:118.
 7. Buchman TG, Billiar TR, Elster E, et al. Precision medicine for critical illness and injury.
    Crit Care Med. 2016;44(9):1635-1638.
 8. Johnson AE, Ghassemi MM, Nemati S, Niehaus KE, Clifton DA, Clifford GD. Machine learning and
    decision support in critical care. Proc IEEE. 2016;104(2):444-466.
 9. Provost F, Fawcett T. Data science and its relationship to big data and data-driven decision
    making. Big Data. 2013;1(1):51-59.
10. Obermeyer Z, Lee T. Lost in thought: the limits of the human mind and the future of medicine.
    N Engl J Med. 2017;377(13):1209.
11. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930.
12. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA.
    2013;309(13):1351-1352.
13. Hardin J, Hoerl R, Horton NJ, et al. Data science in statistics curricula: preparing students
    to “think with data.” Am Stat. 2015;69(4):343-353.
14. James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning, Vol. 112.
    New York, NY: Springer; 2013.
15. Churpek MM, Yuen TC, Winslow C, Meltzer DO, Kattan MW, Edelson DP. Multicenter comparison of
    machine learning methods and conventional regression for predicting clinical deterioration on
    the wards. Crit Care Med. 2016;44(2):368-374.
16. Seymour CW, Gomez H, Chang CH, et al. Precision medicine for all? Challenges and
    opportunities for a precision medicine approach to critical illness. Crit Care.
    2017;21(1):257.
17. Mayhew MB, Petersen BK, Sales AP, Greene JD, Liu VX, Wasson TS. Flexible, cluster-based
    analysis of the electronic medical record of sepsis with composite mixture models. J Biomed
    Inform. 2018;78:33-42.
18. Wong HR, Atkinson SJ, Cvijanovich NZ, et al. Combining prognostic and predictive enrichment
    strategies to identify children with septic shock responsive to corticosteroids. Crit Care
    Med. 2016;44(10):e1000-e1003.
19. Luo Y, Ahmad FS, Shah SJ. Tensor factorization for precision medicine in heart failure with
    preserved ejection fraction. J Cardiovasc Transl Res. 2017;10(3):305-312.
20. Goodfellow I, Bengio Y, Courville A. Deep learning. Adapt Comput Mach Le. 2016:1-775.
21. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm
    for detection of diabetic retinopathy in retinal fundus photographs. JAMA.
    2016;316(22):2402-2410.
22. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with
    deep neural networks. Nature. 2017;542(7639):115-118.
23. Miotto R, Li L, Kidd BA, Dudley JT. Deep patient: an unsupervised representation to predict
    the future of patients from the electronic health records. Sci Rep. 2016;6:26094.
24. Aczon M, Ledbetter D, Ho L, et al. Dynamic mortality risk predictions in pediatric critical
    care using recurrent neural networks. arXiv preprint arXiv:1701.06675. 2017 Jan 23.
25. Wong HR. Intensive care medicine in 2050: precision medicine. Intensive Care Med.
    2017;43(10):1507.
26. Knaus WA, Zimmerman JE, Wagner DP, Draper EA, Lawrence DE. APACHE-acute physiology and
    chronic health evaluation: a physiologically based classification system. Crit Care Med.
    1981;9(8):591-597.
27. Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification
    system. Crit Care Med. 1985;13(10):818-829.
28. Churpek MM, Yuen TC, Winslow C, et al. Multicenter development and validation of a risk
    stratification tool for ward patients. Am J Respir Crit Care Med. 2014;190(6):649-655.
29. Joshi R, Szolovits P. Prognostic physiology: modeling patient severity in intensive care
    units using radial domain folding. Paper presented at: American Medical Informatics
    Association Annual Symposium Proceedings; November 3-7, 2012; Chicago, IL.
30. Henry KE, Hager DN, Pronovost PJ, Saria S. A targeted real-time early warning score
    (TREWScore) for septic shock. Sci Transl Med. 2015;7(299):299ra122.
31. Nemati S, Holder A, Razmi F, Stanley MD, Clifford GD, Buchman TG. An interpretable machine
    learning model for accurate prediction of sepsis in the ICU. Crit Care Med.
    2018;46(4):547-553.
32. Calfee CS, Delucchi K, Parsons PE, et al. Subphenotypes in acute respiratory distress
    syndrome: latent class analysis of data from two randomised controlled trials. Lancet Respir
    Med. 2014;2(8):611-620.
33. Knox DB, Lanspa MJ, Kuttler KG, Brewer SC, Brown SM. Phenotypic clusters within
    sepsis-associated multiple organ dysfunction syndrome. Intensive Care Med.
    2015;41(5):814-822.
34. Luo Y, Xin Y, Joshi R, Celi L, Szolovits P. Predicting ICU mortality risk by grouping
    temporal trends from a multivariate panel of physiologic measurements. Paper presented at:
    Proceedings of the 30th AAAI Conference on Artificial Intelligence; 2016.
35. Vranas KC, Jopling JK, Sweeney TE, et al. Identifying distinct subgroups of ICU patients: a
    machine learning approach. Crit Care Med. 2017;45(10):1607-1615.
36. Sjoding MW, Liu VX. Can you read me now? Unlocking narrative data with natural language
    processing. Ann Am Thorac Soc. 2016;13(9):1443-1445.
37. Lehman LW, Saeed M, Long W, Lee J, Mark R. Risk stratification of ICU patients using topic
    models inferred from unstructured progress notes. Paper presented at: AMIA Annual Symposium
    Proceedings; 2012.
38. Ghassemi M, Naumann T, Doshi-Velez F, et al. Unfolding physiological state: mortality
    modelling in intensive care units. Paper presented at: Proceedings of the 20th ACM SIGKDD
    International Conference on Knowledge Discovery and Data Mining; 2014.
39. Weissman GE, Harhay MO, Lugo RM, Fuchs BD, Halpern SD, Mikkelsen ME. Natural language
    processing to assess documentation of features of critical illness in discharge documents of
    acute respiratory distress syndrome survivors. Ann Am Thorac Soc. 2016;13(9):1538-1545.
40. Saeed M, Villarroel M, Reisner AT, et al. Multiparameter Intelligent Monitoring in Intensive
    Care II (MIMIC-II): a public-access intensive care unit database. Crit Care Med.
    2011;39(5):952.
41. Sun JX, Reisner AT, Saeed M, Heldt T, Mark RG. The cardiac output from blood pressure
    algorithms trial. Crit Care Med. 2009;37(1):72.
42. Roederer A, Weimer J, DiMartino J, Gutsche J, Lee I. Robust monitoring of hypovolemia in
    intensive care patients using photoplethysmogram signals. Paper presented at: Engineering in
    Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE;
    2015.
43. Dunitz M, Verghese G, Heldt T. Predicting hyperlactatemia in the MIMIC II database. Paper
    presented at: Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual
    International Conference of the IEEE; 2015.
44. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis.
    Med Image Anal. 2017;42:60-88.
45. Gonzalez G, Ash SY, Vegas-Sanchez-Ferrero G, et al. Disease staging and prognosis in smokers
    using deep learning in chest computed tomography. Am J Respir Crit Care Med.
    2018;197(2):193-203.
46. Bar Y, Diamant I, Wolf L, Greenspan H. Deep learning with non-medical training used for chest
    pathology identification. Paper presented at: Proc. SPIE; 2015.
47. Tangri N, Kent DM. Toward a modern era in clinical prediction: the TRIPOD statement for
    reporting prediction models. Am J Kidney Dis. 2015;65(4):530-533.
48. Verghese A, Shah NH, Harrington RA. What this computer needs is a physician: humanism and
    artificial intelligence. JAMA. 2018;319(1):19-20.
49. Pickering BW, Dong Y, Ahmed A, et al. The implementation of clinician designed,
    human-centered electronic medical record viewer in the intensive care unit: a pilot
    step-wedge cluster randomized trial. Int J Med Inform. 2015;84(5):299-307.
50. Awdishu L, Coates CR, Lyddane A, et al. The impact of real-time alerting on appropriate
    prescribing in kidney disease: a cluster randomized controlled trial. J Am Med Inform Assoc.
    2016;23(3):609-616.
51. Bright TJ, Wong A, Dhurjati R, et al. Effect of clinical decision-support systems: a
    systematic review. Ann Intern Med. 2012;157(1):29-43.
52. Hinton GE, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov RR. Improving neural
    networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580;
    2012.
53. Bishop CM. Pattern Recognition and Machine Learning. New York, NY: Springer Science+Business
    Media LLC; 2006.
54. Han YY, Carcillo JA, Venkataraman ST, et al. Unexpected increased mortality after
    implementation of a commercially sold computerized physician order entry system. Pediatrics.
    2005;116(6):1506-1512.
55. Kizzier-Carnahan V, Artis KA, Mohan V, Gold JA. Frequency of passive EHR alerts in the ICU:
    another form of alert fatigue? [published online ahead of print June 22, 2016]. J Patient
    Saf. https://doi.org/10.1097/PTS.0000000000000270.
56. Kahneman D. Thinking, Fast and Slow. New York, NY: Farrar, Straus and Giroux; 2011.
57. Neuraz A, Guerin C, Payet C, et al. Patient mortality is associated with staff resources and
    workload in the ICU: a multicenter observational study. Crit Care Med. 2015;43(8):1587-1594.
58. Johnson AE, Pollard TJ, Mark RG. Reproducibility in critical care: a mortality prediction
    case study. Paper presented at: Machine Learning for Healthcare Conference; 2017.
59. Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health
    Inf Sci Syst. 2014;2:3.