HOME :: CHAPTER 12 :: Essay 12.2 |
Essay 12.2
Modeling the Virtual Plant: A Systems Approach to Nitrogen-Regulatory Gene Networks
Miriam L. Gifford, Rodrigo A. Gutiérrez, and Gloria M. Coruzzi
August, 2006
In the post-genomic era, biologists are faced with the challenge of analyzing and integrating huge volumes of data within a biological context. These genome-scale data come from many different sources, including genome sequences, microarray experiments, protein:protein interaction data, and large-scale measurements of enzyme and metabolite levels. Such a massive influx of information, and the desire to integrate it and make it biologically coherent, have forced us to think not in terms of single molecules but in terms of "systems." In other words, in order to truly understand biological organisms, we must not only comprehend how genes, proteins, and cellular elements function, but we must also integrate these many entities in order to understand the connections that link them together. This can be achieved using a systems approach. Systems biology is "the exercise of integrating the existing knowledge about biological components, building a model of the system as a whole, and extracting the unifying organizational principles that explain the form and function of living organisms" (concept reviewed in von Bertalanffy 1968). In practical terms, this involves an iterative process that includes (i) data collection and integration of all available information (ideally all components and their relationships in the organism), (ii) system modeling, (iii) experimentation at a global level, and (iv) generation of new hypotheses that start a new round of the cycle.
Although ecologists and physiologists have been using a systems approach to study plants for many years, a systems biology approach that can be applied to the study of molecules has only now become feasible, with the advent of genomic technologies that can supply us with a sufficient volume of information at many levels of organization. Thus, the exciting prospect of the post-genomic era is that, for the first time, we can integrate knowledge across different levels of biological organization and anchor it at the molecular level.
The ultimate goal of systems biology as applied to plant research is to generate a working, dynamic model of the plant as a whole that describes processes across all layers of biological organization: molecular, cellular, physiological, organismal, and ecological. Succeeding in this endeavor would be a major step towards understanding the genetic blueprint of Arabidopsis, as well as obtaining a holistic view of the form and function of plants. For example, a systems biology approach to modeling gene interactions in plants would help us identify the gene networks that are important in regulating development, or that are implicated in plant responses to abiotic treatments and stress. Furthermore, it would help us find the connections between these different gene networks. Finally, it would help us to make predictions about the effect of a perturbation (e.g., gene mutation) on complex traits of interest, such as seed yield or plant growth, or to determine the conditions that would optimize the growth of the plant.
Nitrogen-Regulated Gene Networks in Plants: A Case Study
We are using a systems approach to model gene network responses to nitrogen treatments in plants. Modeling these gene networks would have significant agricultural implications, as nitrogen (N), which is commonly taken up from the soil as nitrate (NO3−) or ammonium (NH4+), is a key limiting nutrient for the synthesis of amino acids, nucleotides, and vitamins. Nitrogen also appears to be associated with the regulation of developmental pathways, as aspects of development are controlled by a number of signaling networks that integrate responses to environmental, nutrient, and hormonal signals. For example, nitrate, ammonium, and nitrogen metabolites (such as glutamate) are all implicated as N-signaling molecules that regulate root growth. However, little is known regarding the master regulators affecting the N-regulated gene networks that control growth and development in relation to sensing such nitrogen signals.
We are addressing these questions as a working example to test how a systems approach can be used to decipher how N-treatment regulates and integrates the expression of genome-wide networks in Arabidopsis. To do this, we investigate genomic data generated from nitrogen-treated plants by going through the iterative "systems biology" cycle. We first use gene-chip technology to examine changes in gene expression on a genome-wide scale according to changing nitrogen levels; we then assay these alterations in different organs or cell types of the plant. Next, we integrate these microarray data with data concerning interactions of genes involved in metabolic and developmental signaling pathways. We can then use these gene network models in a predictive fashion to identify putative regulatory hubs that link and affect the N-response of interconnected gene networks affecting growth and development. Finally, these models may be tested in the lab by further N-treatments and by the analysis of Arabidopsis mutants in these putative network hubs. At the core of our systems approach lies the construction of an Arabidopsis molecular "multinetwork," which we use to map and understand connections between diverse genes and molecules within the plant cell. We are building our multinetwork in an interactive framework, the "VirtualPlant," which houses a multitude of genomic data types about genes and their interactions from many different treatments, allowing us to globally analyze and ultimately model gene network responses in Arabidopsis.
Data Collection and Integration
A multinetwork displays information about the many different ways that genes, proteins, and molecules (most generically referred to as "nodes") can be connected to one another (Figure 1A). The "edges" that connect the nodes are drawn based on evidence gained either from experiments in the lab or from predictive algorithms (Figure 1B). For example, a protein:protein interaction in which two proteins form a complex could constitute such an edge (a connection between two proteins, or nodes). Protein:protein and protein:DNA edges could be experimentally determined, but they might also be predicted: the former based on two homologous proteins interacting in a different species, the latter based on the presence of a transcription factor binding site in the promoter of a gene. Another experimentally derived edge could originate from determining that a gene encoding a certain enzyme utilizes a particular metabolite in a nonreversible catalytic reaction (the interaction would thus be a gene-encoding enzyme:metabolite edge). An edge between genes could also be drawn based on transcriptional activation of a target gene by a transcription factor, depicted as a protein:DNA interaction edge. Both of the latter two examples include nodes that are connected by "directed edges" (i.e., the transcription factor regulates the target gene, not vice versa). Alternatively, an edge might be nondirected, as is the case for edges representing a protein:protein interaction.
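The node and edge vocabulary above can be illustrated with a small data structure. The following is a minimal Python sketch, not the VirtualPlant implementation: the class names `Edge` and `Multinetwork`, the field names, and the node identifiers (`TF1`, `geneA`, `protX`, `protY`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str      # node where the edge starts
    target: str      # node where the edge ends
    kind: str        # e.g. "protein:protein", "protein:DNA", "enzyme:metabolite"
    directed: bool   # True if the edge only runs from source to target
    evidence: str    # "experimental" or "predicted"


class Multinetwork:
    """Toy multinetwork: several typed edges may link the same pair of nodes."""

    def __init__(self):
        self.edges = []
        self._by_node = {}   # node -> edges that can be followed from that node

    def add_edge(self, edge):
        self.edges.append(edge)
        self._by_node.setdefault(edge.source, []).append(edge)
        if not edge.directed:
            # A nondirected edge can be followed from either end.
            self._by_node.setdefault(edge.target, []).append(edge)

    def neighbors(self, node):
        """Nodes reachable from `node` in one step, respecting edge direction."""
        found = set()
        for edge in self._by_node.get(node, []):
            found.add(edge.target if edge.source == node else edge.source)
        return found


# Hypothetical example: one directed regulatory edge, one nondirected complex.
net = Multinetwork()
net.add_edge(Edge("TF1", "geneA", "protein:DNA", True, "predicted"))
net.add_edge(Edge("protX", "protY", "protein:protein", False, "experimental"))
```

In this sketch `net.neighbors("TF1")` contains `geneA`, while `net.neighbors("geneA")` is empty, which mirrors the statement that a transcription factor regulates its target and not vice versa.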
In this Arabidopsis multinetwork, we have integrated what is known about the molecular interactions in the context of a plant cell from various data sources (e.g., enzyme:metabolite, protein:protein, microRNA:target) (Figure 1B) into a single coherent multinetwork model (Figure 1C). At present the multinetwork has 6176 gene nodes, 1459 metabolite nodes, and 230,900 interaction "edges" connecting the nodes. Two nodes can have multiple edge connections. Querying this multinetwork with a gene list (e.g., from a microarray experiment) displays all connections between the genes of interest and other biological molecules (Figure 1C,[iv]). The query could consist, for example, of a list of genes that microarray analysis has shown to be regulated by an N-treatment in roots, as compared to a mock induction. The user can then examine how these N-regulated genes are connected to each other to determine which N-regulated genes and biological processes are linked in a network.
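Querying a network with a gene list amounts to filtering its edges by their end nodes. The sketch below assumes, purely for illustration, that edges are stored as `(node_a, node_b, edge_type)` tuples; the function name `query_multinetwork`, the `within_only` switch, and the gene and metabolite identifiers are hypothetical and do not describe the actual VirtualPlant interface.

```python
def query_multinetwork(edges, gene_list, within_only=False):
    """Return the edges that touch the query genes.

    edges: iterable of (node_a, node_b, edge_type) tuples.
    within_only: if True, keep only edges whose two ends are both query genes.
    """
    query = set(gene_list)
    hits = []
    for node_a, node_b, edge_type in edges:
        ends_in_query = (node_a in query) + (node_b in query)
        if ends_in_query == 2 or (ends_in_query == 1 and not within_only):
            hits.append((node_a, node_b, edge_type))
    return hits


# Toy edge list; note that geneA and geneB are linked by two edge types.
toy_edges = [
    ("geneA", "geneB", "protein:protein"),
    ("geneA", "geneB", "protein:DNA"),
    ("geneB", "metabolite1", "enzyme:metabolite"),
    ("geneC", "geneD", "protein:protein"),
]
n_regulated = ["geneA", "geneB"]   # hypothetical N-regulated gene list

all_hits = query_multinetwork(toy_edges, n_regulated)
internal_hits = query_multinetwork(toy_edges, n_regulated, within_only=True)
```

Here `all_hits` includes the edge to `metabolite1` (a connection to another biological molecule), whereas `internal_hits` keeps only the two edges that link the query genes to each other.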
System Modeling
Our Arabidopsis multinetwork provides a framework within which one can analyze and integrate experimental measurements, such as the levels of gene expression in roots resulting from a nutrient treatment. This facilitates examination of multiple mechanistic connections between both genes and proteins. For example, our lab has used the multinetwork model to identify the subnetworks of genes whose expression is controlled by nitrate, carbon, and light treatments.
Experimentation at a Global Level
In the working example shown in Figure 2, we first used microarray analysis of treated Arabidopsis plants to determine lists of genes that were regulated by either carbon (C), light (L), or nitrogen (N) treatments. Next, we intersected these three lists, yielding a set of 414 genes that were regulated by all three input signals. When we queried the multinetwork with this list of 414 genes, we found that 251 of these C/L/N-regulated genes were associated with each other in the multinetwork. In order to understand the connections between these genes, we first focused on the 44 metabolic genes within the network (blue hexagons in Figure 2). One group of these metabolic genes within the network is involved in energy and carbon metabolism (purple cloud in Figure 2), while another group is involved in amino acid biosynthesis (yellow cloud in Figure 2). These metabolic processes are associated with the types of signals that the genes were found to respond to, which are involved in harvesting energy from light, building carbon skeletons, and using the C-skeletons generated to assimilate nitrate to build amino acids. In addition, many of the C/L/N-regulated genes in this network are known to be localized to the membrane/cytoplasm (orange cloud in Figure 2), which is indicative of the transport of nutrients within the cell.
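The two steps described here, intersecting the treatment lists and then asking which of the shared genes are linked to one another, can be sketched with Python sets. The gene identifiers and edges below are toy values invented for illustration; they are not the data behind Figure 2.

```python
# Hypothetical lists of genes regulated by each treatment.
carbon_regulated = {"g1", "g2", "g3", "g4"}
light_regulated = {"g2", "g3", "g4", "g5"}
nitrogen_regulated = {"g2", "g3", "g4", "g6"}

# Genes regulated by all three input signals.
shared = carbon_regulated & light_regulated & nitrogen_regulated

# Toy network edges as (node_a, node_b) pairs.
edges = [("g2", "g3"), ("g3", "g7"), ("g5", "g6")]

# Shared genes that have at least one edge to another shared gene.
connected = set()
for node_a, node_b in edges:
    if node_a in shared and node_b in shared:
        connected.update((node_a, node_b))

print(sorted(shared))      # genes responding to C, L, and N
print(sorted(connected))   # the subset associated with each other
```

In this toy case `g4` responds to all three signals but has no edge to another shared gene, so it drops out of the connected subset, in the same way that only part of the intersected gene list was found to be associated in the multinetwork.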
Generation of New Hypotheses
One novel finding from our network analysis of this set of C/L/N-regulated genes is that the genes involved in these processes appear to be interconnected by several transcription factors (see green diamonds in Figure 2). These transcription factors may act as regulatory "hubs" to coordinate energy metabolism and amino acid biosynthesis in response to C-, L-, or N-treatments, possibly as a means of resource partitioning to ensure that amino acids are only made when sufficient energy from carbon is available.
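One simple way to look for candidate hubs is to count how many distinct targets each transcription factor is connected to. The sketch below uses that count as a stand-in criterion; the essay does not state how hubs were scored, so the ranking rule, the function name `rank_hubs`, and the identifiers are assumptions made for illustration.

```python
def rank_hubs(regulatory_edges, transcription_factors):
    """Rank transcription factors by their number of distinct targets.

    regulatory_edges: iterable of (regulator, target) pairs (directed edges).
    transcription_factors: set of node names known to be transcription factors.
    """
    targets = {}
    for regulator, target in regulatory_edges:
        if regulator in transcription_factors:
            targets.setdefault(regulator, set()).add(target)
    # Most targets first; ties are broken alphabetically for stable output.
    return sorted(
        ((tf, len(found)) for tf, found in targets.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical directed edges; the repeated TF1 -> geneA edge counts once.
toy_regulatory_edges = [
    ("TF1", "geneA"),
    ("TF1", "geneB"),
    ("TF2", "geneA"),
    ("TF1", "geneA"),
    ("geneC", "geneD"),
]
hub_ranking = rank_hubs(toy_regulatory_edges, {"TF1", "TF2"})
```

A transcription factor whose targets fall in both energy metabolism and amino acid biosynthesis would be of particular interest, which could be checked by grouping the target sets by annotation rather than only counting them.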
Finally, we extended the analysis of the C/L/N-network to include a survey of the nonmetabolic genes of cellular function, which are also predicted to be regulated by the putative transcriptional hubs. Cellular processes with associated genes connected to the transcription factor "hubs" are shown in colored ovals in Figure 2; the number of genes associated with each term is shown in parentheses. These processes include cellular communication, defense, signal transduction, and transport facilitation. Therefore, in addition to coordinating metabolic processes, these transcription factor "hubs" also seem to integrate other cellular processes in response to the light, carbon, and nitrogen signals. Intriguingly, there are many predicted targets for which the gene is not annotated. Thus, the network analysis helps us to connect genes of unknown function with sets of genes of known function, enhancing our ability to begin to learn the function of unknown genes ("guilt by association"). Furthermore, these potential regulatory hubs in the gene network are now targets for mutant analysis and further microarray analysis in the lab. This analysis will determine which of these putative regulatory hubs are functional in vivo.
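The "guilt by association" idea can be illustrated by tallying the annotations of an unannotated gene's network neighbors. This is a minimal sketch of one possible heuristic, not the authors' procedure; the function name `suggest_functions`, the gene identifiers, and the annotation labels are hypothetical.

```python
from collections import Counter


def suggest_functions(gene, edges, annotations):
    """Tally the known functions of a gene's direct network neighbors.

    edges: iterable of (node_a, node_b) pairs, treated as nondirected here.
    annotations: dict mapping annotated genes to a functional category.
    """
    neighbors = set()
    for node_a, node_b in edges:
        if node_a == gene:
            neighbors.add(node_b)
        elif node_b == gene:
            neighbors.add(node_a)
    counts = Counter(annotations[n] for n in neighbors if n in annotations)
    # Most frequent category first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical network around an unannotated gene called "unknown1".
toy_edges = [
    ("unknown1", "geneA"),
    ("unknown1", "geneB"),
    ("geneC", "unknown1"),
    ("geneA", "geneB"),
]
toy_annotations = {
    "geneA": "transport facilitation",
    "geneB": "transport facilitation",
    "geneC": "defense",
}
suggestions = suggest_functions("unknown1", toy_edges, toy_annotations)
```

The tally is only a starting hypothesis about the unknown gene's role; as the text notes, confirmation would come from mutant and microarray analysis in the lab.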
Future Prospects
In order to predict how these gene networks respond under untested conditions, we will begin to use statistical methods that extrapolate from existing data to describe the behavior of the genes in the multinetwork in response to various nutrient regimes. As additional molecular details become available, our multinetwork models will incorporate dynamic patterns of gene expression using data from treatment and developmental time courses (such as from Schmid et al. 2005) and networks in the context of spatial resolution (for example at the single-cell level, Birnbaum et al. 2003). The most valuable aspect of the systems approach is not simply to model what we know but, more importantly, to build predictive models that show how a system will react under untested conditions. Ultimately, such models will enable us to determine how a whole system will respond to perturbations of a regulatory "hub." For example, if we alter a regulatory gene connected to a network of carbon metabolism genes, how will perturbing that regulatory hub affect the connected genes in nitrogen metabolism? The multinetwork analysis should allow us to make these predictions in silico, before we test them in transgenic plants. Thus, the power of the systems approach to predict how changes in one gene affect the plant system or network has enormous value for practical applications in agriculture and plant biotechnology.
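As a first approximation, the question of which genes might be affected by perturbing a hub can be framed as a reachability search over directed regulatory edges. The sketch below uses a breadth-first search for that purpose. It is a deliberately simple stand-in, not the statistical modeling mentioned above: it lists candidate downstream genes only and says nothing about the direction or size of any expression change. The function name `downstream_of` and all identifiers are hypothetical.

```python
from collections import deque


def downstream_of(regulatory_edges, perturbed):
    """Return every node reachable from `perturbed` along directed edges."""
    targets = {}
    for regulator, target in regulatory_edges:
        targets.setdefault(regulator, []).append(target)

    reached = set()
    queue = deque([perturbed])
    while queue:
        node = queue.popleft()
        for nxt in targets.get(node, []):
            # Skip nodes already seen, and the perturbed node itself,
            # so that feedback loops do not cause endless traversal.
            if nxt not in reached and nxt != perturbed:
                reached.add(nxt)
                queue.append(nxt)
    return reached


# Hypothetical chain: a hub regulates a carbon gene, which in turn
# regulates a nitrogen gene that feeds back onto the hub.
toy_regulatory_edges = [
    ("hub", "carbon_gene"),
    ("carbon_gene", "nitrogen_gene"),
    ("nitrogen_gene", "hub"),
    ("other_tf", "unrelated_gene"),
]
affected = downstream_of(toy_regulatory_edges, "hub")
```

In this toy network the nitrogen gene is reached only through the carbon gene, which is the kind of indirect link between carbon and nitrogen metabolism that the question in the text asks about.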
References
Birnbaum, K., Shasha, D. E., Wang, J. Y., Jung, J. W., Lambert, G. M., Galbraith, D. W., and Benfey, P. N. (2003) A gene expression map of the Arabidopsis root. Science 302: 1956–1960.
Gutierrez, R. A., Lejay, L. V., Chiaromonte, C., Shasha, D. E., and Coruzzi, G. M. (2006) A systematic test of C/N interactions in Arabidopsis. In preparation.
Schmid, M., Davison, T. S., Henz, S. R., Pape, U. J., Demar, M., Vingron, M., Scholkopf, B., Weigel, D., and Lohmann, J. U. (2005) A gene expression map of Arabidopsis thaliana development. Nature Genet. 37: 501–506.
Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003) Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 13: 2498–2504.
von Bertalanffy, L. (1968) General System Theory: Foundations, Development, Applications. George Braziller, New York.