DataSpeaks responds to the call for Ideas from the Grand Challenges Organization of the Gates Foundation.
Improve Global Health through a Computational Method Enabling Discovery Science with
Time-Ordered Data: Investigating Dynamics and Change for Life Systems
Summary
Dramatic improvements in global health demand dramatic improvements in how we use time-ordered data to measure, discover, explicate and visualize interactions that describe dynamics and change in function, response, and agency for life systems. The Challenge is to develop a fundamentally new computational method, embodied as software, to help transform floods of data into scientific understanding of life, health and disease; to discover and develop new treatments and other health products; and to enable new Internet-based health information services that improve research, practice and health. We are poised for rapid progress. The new method must meet a set of at least ten achievable requirements and enable a new category of software that breaks through the roadblock that stands between where we are now and where we would like to be in science, medicine, and public health.
The Time is Right
The foundations for rapid progress are in place. Data-driven discovery science is just beginning to overcome limitations of hypothesis-driven science for elucidating systems. We have decoded genomes of humans, pathogens (e.g., malaria parasite) and disease vectors (e.g., mosquito). We monitor gene expression, embark on functional proteomics and image brains in action. Developments in microarrays, miniaturized sensors, and Internet-enabled monitoring devices that allow us to monitor the substances, actions and environments of life systems are producing huge and growing backlogs of data. Computational infrastructure is being extended globally. Processor speeds are increasing. There is huge demand for better health. World economies need a boost from healthier people and better technology. The world is waiting for breakthroughs that will come from discovery science for time-ordered data.
The crux of this Challenge is expressed in articles that commemorate the 50th anniversary of the discovery of the double helix and the recent decoding of the human genome (e.g., Special Section, “Building on the DNA Revolution,” Science, 11 April 2003). We need to go from genomes to life by moving beyond relatively static structures and “parts lists.” Discovery science needs to investigate the dynamics of how life systems function, respond to environments including treatments, and act as agents over time. We also need better ways to investigate how dynamics change as organisms develop, adapt, age, become disordered and respond over time. We need to advance from data snapshots to data movies, and from analyses of parts to syntheses of systems. Life systems include whole organisms, cells, molecular machines, complex regulatory pathways and networks, communities of organisms and ecosystems; component systems such as immune, digestive, nervous, endocrine and cardiovascular systems; as well as health, economic and social systems that affect health.
The data processing methods of yesteryear, which include statistical analyses, neural networks, and genetic algorithms, will continue to be important but have been insufficient to explicate complex adaptive systems. As a result, our ability to collect data vastly exceeds our ability to transform data into discoveries, scientific information and knowledge that can improve health-related decisions, policies and actions. This slows progress in creating revolutionary new healthcare products and services. Approvals of new chemical entities are declining. Drugs are excessively expensive for lack of an additional computational method for research and development.
The insufficiency of conventional computational methods is also illustrated by Wolfram, who created Mathematica, a leading scientific software product, and authored “A New Kind of Science.” Whatever the eventual verdict on Wolfram's Principle of Computational Equivalence, which contrasts with both data-driven and hypothesis-driven science, Wolfram recognizes that we still need to remove a critical methodological roadblock to scientific progress.
An additional category of software, embodying a new computational method, is as critical to improving health dramatically as new DNA-sequencing methods and tools were to decoding entire genomes.
Ten Achievable Requirements for Meeting This Challenge
The method and software system that meets this Challenge must address at least ten requirements as a set. Recent developments make this possible.
Measurement – Leroy Hood, Director of the Institute for Systems Biology in Seattle, identified the first requirement. “We don't know how to make measurements [of function and interactions] that are really critical” (Science, 294, p. 84). The foremost thing that we can do now to be more scientific and build on the DNA revolution is to begin measuring such interactions. Values of new measures of dynamic interaction are computed from values of other measures that characterize the composition or actions of systems and their environments. For life systems, the interactions can, as examples, involve DNA, RNA, proteins, lipids, carbohydrates and other endogenous substances; neural, cardiovascular and immune activity; symptoms, behavior and performance; as well as drugs, tasks, pathogens, nutrients and pollutants in system environments. New measures of dynamic interaction between measures of treatment and health quantify benefit/harm over time and across health variables. Then clinical trials could begin testing benefit/harm, as distinct from health variables, to evaluate safety and efficacy. Measures of correlation that work well for cross-sectional data are not well suited to measure dynamic interactions with time-ordered data.
Dynamics, Change and Time-Ordered Data – This Challenge targets a new metric system to investigate dynamics, coordination, connectivity, integration, agency and health impact, either constant or changing. The metric must be suitable for underutilized time-ordered data – the richest source of information about how systems function and become disordered, respond, act as agents and change over time. The metric system must help reveal pathways and cascades of activity and provide a detailed accounting of temporal parameters involving delay, persistence, episodes and pulses of events. It must help evaluate the temporal criterion of causal relationships and facilitate longitudinal study designs such as randomized multiple N-of-1 clinical trials that can improve the ethics and cost-effectiveness of research. Computational methods that work best for time invariant systems are not responsive to this Challenge.
Individuality – The metric must apply directly to data for individuals to account for diversity in genotypes, phenotypes, and histories. Measures of disordered interaction must be suitable for diagnosing functional disorders for individuals. The metric must be well suited for individualizing treatment and identifying genetic and other predictors of differential response. Computational methods that average are not responsive to this Challenge.
Complementarity – The required computational metric system must be complementary to the statistical method. Values of the new measures that are obtained from two or more individuals must facilitate statistical analyses and hypothesis testing. The new metric must account for replication over time within individuals whereas statistics accounts for replication across individuals that form groups. Methods for group analyses that do not accommodate dimensional independent variables for individuals (e.g., multiple planned doses, actual doses, blood levels of drugs) are not responsive to this Challenge.
Complexity – The new metric must be able to address hundreds or thousands of variables simultaneously for high-throughput screening. Furthermore, it must be able to account for complex relationships (e.g., Boolean events), synergy and redundancy involving sets of many variables. Recent evidence suggests that high human complexity is due more to interactions than to gene number.
Nonlinearity and Nonadditivity – The metric must be suitable for systems that exhibit nonlinearity and nonadditivity.
Hierarchies – The metric system must facilitate investigations of systems that are organized as structural hierarchies (e.g., cells, tissues, organs, organ systems, organisms as wholes, communities) and manifest themselves at different functional levels (e.g., laboratory measures, symptoms, mental and physical performance, quality of life). Investigations of hierarchies help elucidate emergence and the clinical relevance of biological measures.
Extension – The metric must extend prodigious and critical human capabilities to perceive patterns of association and contingency in time-ordered data by using objective, transparent, reproducible, computational procedures that can be specified in protocols and automated for application to large, complicated data sets. The method must help make data speak. It must enable new Internet-based information services for more personalized evidence-based medicine.
Efficiency – The metric must improve research efficiency by, for example, reducing the number of subjects required in many clinical trials and targeting drug development toward patients who will benefit and away from patients who will be harmed.
Models – The metric must inform development of realistic mathematical models and simulations.
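The requirements above do not specify how values of the new measures would be computed. Purely as an illustrative sketch, and not as the DataSpeaks method, the Python below assumes binary (0/1) event series for one individual and a simple rate-difference score. It scans candidate delays within an individual (touching the Dynamics and Individuality requirements) and then passes hypothetical per-individual scores to an ordinary one-sample t statistic (touching Complementarity). The function names, the scoring rule and all data are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, stdev


def lagged_association(exposure, outcome, delay):
    """Rate of outcome events `delay` steps after exposure, minus the
    rate `delay` steps after no exposure (assumed score, one individual).

    `exposure` and `outcome` are equal-length, time-ordered lists of 0/1.
    """
    usable = len(exposure) - delay
    pairs = list(zip(exposure[:usable], outcome[delay:]))
    after_exposed = [y for x, y in pairs if x == 1]
    after_unexposed = [y for x, y in pairs if x == 0]
    if not after_exposed or not after_unexposed:
        return 0.0  # no contrast available at this delay
    return mean(after_exposed) - mean(after_unexposed)


def best_delay(exposure, outcome, max_delay):
    """Return (delay, score) for the delay with the largest absolute score."""
    scores = {d: lagged_association(exposure, outcome, d)
              for d in range(max_delay + 1)}
    d = max(scores, key=lambda k: abs(scores[k]))
    return d, scores[d]


def one_sample_t(values):
    """Conventional one-sample t statistic against zero, across individuals."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two individuals")
    s = stdev(values)
    if s == 0:
        raise ValueError("scores have zero variance")
    return mean(values) / (s / sqrt(n))


# Hypothetical individual: each outcome event follows an exposure by 2 steps.
exposure = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]
outcome = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
delay, score = best_delay(exposure, outcome, max_delay=3)

# Hypothetical within-individual scores for five individuals.
individual_scores = [0.6, 0.4, 0.7, 0.5, 0.3]
t_stat = one_sample_t(individual_scores)
```

In this toy series the scan picks a delay of 2 steps, where the score is 1. The point of the design is the division of labor: replication over time is summarized within each individual first, and only then do conventional statistics account for replication across individuals.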
Impact
This Grand Challenge transcends particular diseases, the health problems of particular populations and the distinction between treatment and prevention. It calls for a new metric system that will foster accountability and help unleash the power of science to improve health by measuring, discovering, explicating and visualizing:
Dynamic interactions that describe mechanisms by which life systems function over time,
How these mechanisms become disordered, change and are affected by treatments and other exposures,
Interactions between treatment and health that quantify the benefit/harm of treatments,
Interactions that help make collections of time-ordered data, including medical records, more valuable.
Software embodying this method affords a business opportunity that will drive demand for computational resources and help ensure health impact without continued philanthropy and grants.
Comment
This Challenge requires transcending the status quo. Transcendence demands determination, leadership and resources to realize the promise of the insight and know-how underlying this document. The Grand Challenges Organization and its benefactors can seize this opportunity to advance software that enables dramatic improvements in life science and health.