Session 1: Reconstruction and visualisation of metabolic networks
Ines Thiele, Iceland: Expanding genome-scale metabolic models
Expanding knowledge and expanding scope. 67 non-unique metabolic models available, due to increased knowledge of *how* to generate models, automated bacterial model generation (SEED) and increased numbers of analysis tools, such as the COBRA Toolbox. Increasingly a community effort.
Gap-filling for "dead-end" metabolites, allowing flux to run through these reactions. SMILEY algorithm can be used to assist in this, which relies on KEGG for metabolic reactions, and a proprietary collection of transport reactions. Optimisation problem: which reactions need to be added to the network to connect dead-ends, minimising addition of reactions. Three cases: add reaction, add reversibility, add transport reactions. SMILEY predictions will be experimentally validated. SMILEY is similar to GapFill and GapFind.
~300 blocked reactions, ~100 dead-end reactions in human (Recon 1). These are largely in the cytosol, as this acts as a kind of "default" compartment and in any case contains the majority of the network. BRENDA also used to confirm reaction directionality. Experimental data from biofluids also used to identify missing metabolites (or missing transport reactions). Note: can missing reactions be inferred cheminformatically from the chemical structures? Manual curation is *crucial*, experimental data can help.
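To make the "dead-end" idea concrete for myself, here is a minimal sketch of how such metabolites could be flagged. It is not SMILEY, GapFind or GapFill: the toy network, the reaction names and the simplifying assumption that every reaction is irreversible are all mine.

```python
# Toy network: reaction name -> {metabolite: stoichiometric coefficient}.
# Negative = consumed, positive = produced. All names are hypothetical,
# and every reaction is assumed irreversible for simplicity.
reactions = {
    "R1": {"A": -1, "B": 1},
    "R2": {"B": -1, "C": 1},
    "R3": {"B": -1, "D": 1},
    "EX_A": {"A": 1},    # uptake of A
    "EX_C": {"C": -1},   # excretion of C
}

def dead_end_metabolites(reactions):
    """Return metabolites that are only produced or only consumed."""
    produced, consumed = set(), set()
    for stoich in reactions.values():
        for met, coeff in stoich.items():
            if coeff > 0:
                produced.add(met)
            elif coeff < 0:
                consumed.add(met)
    # Symmetric difference: in one set but not the other.
    return sorted(produced ^ consumed)

dead_ends = dead_end_metabolites(reactions)
print(dead_ends)
```

In this toy case D is made by R3 but never consumed, so no steady-state flux can pass through R3. Adding a hypothetical transport reaction such as `"EX_D": {"D": -1}` removes the dead end, which corresponds to the third of the three gap-filling cases listed above.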
Expanding the scope beyond metabolism: Reconstruction of transcriptional and translational machinery of E. coli, PLoS Comp. Biol., 2009. Goal of the model: to produce ribosomes. Attempting to infer how transcriptional levels relate to flux levels. Constrained by transcriptomics data, would like to constrain further with quantitative proteomics data. Uses Flux Variability Analysis with proprietary algorithm (FASTFVA) to predict feasible flux ranges to constrain further.
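For reference, the textbook form of Flux Variability Analysis, as I understand it, solves two linear programmes per reaction. This is the standard formulation rather than anything specific to the talk; the symbols (stoichiometric matrix S, bounds lb and ub, objective vector c, optimal objective value Z_0, and tolerance fraction gamma) are my own labelling.

```latex
\begin{aligned}
v_i^{\min} = \min_{v}\; v_i
\quad\text{and}\quad
v_i^{\max} = \max_{v}\; v_i
\qquad \text{s.t.}\quad & S v = 0, \\
& lb \le v \le ub, \\
& c^{\top} v \ge \gamma Z_0, \qquad 0 \le \gamma \le 1 .
\end{aligned}
```

The interval between the minimum and maximum is the feasible range of reaction i; a reaction whose range collapses to zero is blocked under the given constraints.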
Note: this next session is a little flaky as my laptop randomly switched off...
Coupling of metabolic reactions with their enzymes (i.e. with the transcriptional network) further constrains the metabolic network: the reaction is blocked if the enzyme is not present. Analysis of predicted codon usage appears to match experimental data.
V.A.P. Martins dos Santos, Wageningen: Genome-scale comparison of the metabolic spaces of the pathogen Pseudomonas aeruginosa and the industrially relevant, non-pathogenic Pseudomonas putida
Comparison at the level of genomic sequence doesn't indicate pathogenicity. Must look at the metabolic network.
Initial comparison by sequence analysis at genomic level shows little overlap. Reactions can be the same, but with different GPR associations. Use of similar gene association profiles to determine potential equivalent functions across the two organisms. Reconciled models show increased similarity, but still many differences. These differences are mapped to pathways: major difference in transport reactions.
Gene essentiality studies to categorise genes as essential, optional, or necessary under a given growth condition. Elementary mode analysis for ATP-production shows similarity across the two organisms. Differences in elementary modes can be accounted for by the presence / absence of unique reactions.
F. Moran, Complutense University of Madrid: Comparison and graph representation of metabolic networks using Tool-4-Metatool
Ref: METATOOL: for studying metabolic networks, Bioinformatics, 1999, 15, 251-257.
Input text file representing model / metabolic network. Note: why use proprietary format? Stoichiometric analyses performed. Question: is this another COBRA Toolbox? http://bbm1.ucm.es/t4m Supports KEGG files (for graphical display, presumably network maps) and COBRA-formatted SBML.
Web-based tool. Metabolic pathways rendered. Allows large models to be simplified into subsets. GUI linked to KEGG.
Application to comparison of production pathways for valine: generation and comparison of elementary modes.
May be size-limited: can genome-scale models be loaded / rendered / analysed? Question: the bottleneck is in the rendering?
Session 2: Methods in metabolic pathway analysis
Francisco Planes: Novel methods to compute elementary flux modes in genome-scale models
Note: health warning. These notes are a little threadbare due to the writer's lack of expertise on the subject...
Convex basis: minimum set of EFMs able to generate the flux cone. Non-unique. k-shortest EFMs: minimise the number of reactions in the EFM. Reactions can be forced to be present once. Question: Is this a constraint applied from experimental data? Can constrain to ensure only (or to minimise) irreversible reactions are used: irreversible EFMs. This approach forces a single reaction to appear, reducing the set of EFMs.
Applied to lysine synthesis. Future developments attempting to include isotope labelling (13C) fluxomics data and to develop the concept of elementary carbon modes. Publications in preparation.
Stefan Klamt, Magdeburg: New techniques for computing intervention strategies in metabolic networks
Techniques considered: FBA-based, and EFM-based optimisations. With FBA, search for minimal set of knock-outs that lead to coupling between biomass and product synthesis. With EFM, identify a knock-out that retains a minimal set of EFMs.
Minimal Cut Sets: minimise minimal functional units that can operate in steady-state and achieve objective function. Goal: repress non-optimal production routes for desired product, P. Consider also desired modes, which must be preserved after cutting.
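A minimal sketch of the idea as I understand it: if the unwanted routes are given as elementary modes, a cut set must knock out at least one reaction in each of them, so minimal cut sets are minimal hitting sets of those modes. The brute-force enumeration below only works for toy examples and is not the algorithm presented in the talk; the modes and reaction names are hypothetical, and the "desired modes" constraint is left out.

```python
from itertools import combinations

# Toy target modes (routes to be disabled), each given as the set of
# reactions it uses. Reaction names are hypothetical.
target_modes = [
    {"R1", "R2"},
    {"R1", "R3"},
    {"R4"},
]

def minimal_cut_sets(modes):
    """Enumerate minimal reaction sets that hit every mode (brute force)."""
    reactions = sorted(set().union(*modes))
    found = []
    # Increasing size guarantees smaller cut sets are found first.
    for size in range(1, len(reactions) + 1):
        for candidate in combinations(reactions, size):
            cut = set(candidate)
            # A superset of an existing cut set cannot be minimal.
            if any(prev <= cut for prev in found):
                continue
            # Valid only if every mode loses at least one reaction.
            if all(cut & mode for mode in modes):
                found.append(cut)
    return found

cut_sets = minimal_cut_sets(target_modes)
print([sorted(c) for c in cut_sets])
```

Here R4 has to be in every cut set because it forms a mode on its own, and the remaining two modes can be disabled either through their shared reaction R1 or through R2 and R3 together.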
Next approach: CASOP. Computational Approach for Strain Optimisation Aiming at High Productivity. Considers not just knock-outs (KOs) but also overexpression. Considers not only product production rate (yield optimality), but also substrate uptake rate. Productivity is proportional to the ratio of these terms. The latter is related to the capacity of the system. Applies weighting of EFMs, reaction importance measures (in how many EFMs does a reaction participate) and reaction ranking, based on the above terms.
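To fix the "reaction importance" idea in my head, here is a rough sketch of a yield-weighted participation score. I am assuming that each EFM is weighted by its product yield raised to some power k and that a reaction's importance is the summed, normalised weight of the EFMs it takes part in. That is my reading of the description above, not a verified transcription of the CASOP formulae, and the EFMs and yields are made up.

```python
# Toy EFMs: (reactions used, product yield). All values are hypothetical.
efms = [
    ({"R1", "R2"}, 0.9),
    ({"R1", "R3"}, 0.5),
    ({"R4"}, 0.1),
]

def reaction_importance(efms, k=1.0):
    """Yield-weighted fraction of EFMs in which each reaction takes part.

    Assumed weighting: weight of an EFM = yield ** k, normalised to sum to 1.
    With k = 0 this reduces to the plain fraction of EFMs using the reaction.
    """
    weights = [y ** k for _, y in efms]
    total = sum(weights)
    scores = {}
    for (rxns, _), w in zip(efms, weights):
        for r in rxns:
            scores[r] = scores.get(r, 0.0) + w / total
    return scores

scores = reaction_importance(efms)
# Rank reactions from most to least important.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Reactions that score highly under a strong yield weighting but poorly without it would be natural overexpression candidates, and the reverse pattern would suggest knock-out candidates, though how CASOP actually combines these terms is something I need to check in the paper.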
Applied to succinate production in E. coli. KOs and overexpression candidates were ranked.
Identification of potential excess or undersupply of co-factors and small molecules. Excretion of small molecules is performed through the addition of artificial reactions and electron and energy sink reactions.
Applicable to medium-scale networks, but in most cases, the most suitable interventions lie in central metabolism (which is covered by these medium-scale networks). Implemented in CellNetAnalyzer. To be followed up by experimental work to validate intervention predictions.
Stefan Schuster, Jena: Combining pathway analysis with evolutionary game theory
Note: Lots of notes lost due to dodgy laptop...
Evolution is co-evolution, a competition that can be explained with game-theory. Players in the game attempt to optimise their position, sometimes at the expense of other players. Usage of different metabolic pathways can be considered strategies of the players, and these can be induced through mutation or epigenetic effects.
Game: assume that organisms want to maximise ATP production. Can use fermentation and respiration as strategies. FBA suggests pathways, which can be constrained with individual maximum reaction rates.
Note to self: immensely interesting. Read Stefan's papers!
Ina Koch: Automatic modularisation of biochemical networks based on elementary modes and transition invariants
Note: interesting talk on Petri Net based approaches, with note-taking unfortunately suffering from my laptop's malaise.
Karoline Faust: Predicting metabolic pathways from functionally linked genes.
"Topological, no FBA, sorry!" Looks at co-expression and co-regulation to project pathways using KEGG Mapper tool. Problems: cannot detect organism-specific variants of known pathways, or novel pathways (obviously).
So, the approach is a de novo one, mapping to global network from KEGG and MetaCyc. Specific subnetworks are then extracted from the global networks. Hub compounds are a problem, but hard to determine what is a hub node. Use atom-following strategy (based on chemical structure) to determine hubs. The need for this has been mitigated by the introduction of KEGG reaction pairs. Question: are hub nodes identified then removed / ignored subsequently? Answer: it seems "yes". Global network is based on KEGG RPAIRs.
Weaknesses: difficult to predict cycles or spirals, and highly interconnected central metabolism. Also difficult to map enzymes to reactions, with current state of EC systems and existing databases.
Subgraphs are extracted by kWalks algorithm.
Session 3: Data integration into metabolic pathway analysis
Nathan Price, Illinois: Probabilistic integrative modeling of metabolic and regulatory networks
Consider interaction networks (based on statistical inference) and mechanistic, biological reaction networks. Integrating the analysis of the two approaches is rarely done due to the complexity of doing so. Attempt to coordinate the integration of these approaches.
Need exists for automated metabolic reconstructions, due to exponential increase in published genome sequences. Current reconstruction approach takes ~1 year. An automated tool (SEED) exists, and allows first drafts to be generated, which still require manual curation.
Goal is to integrate automation of transcriptional, regulatory networks and metabolic networks. PROM: Probabilistic Regulation of Metabolism. Integration: transcription encodes enzymes regulating metabolites, metabolites regulate transcription factors - feedback between the networks.
rFBA: regulatory network represented by Boolean rules, encode the GPR. Limitations: Boolean rules themselves require manual generation from literature, knock-downs are not considered: enzymes are either present or absent.
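A minimal sketch of what strict on/off rules look like in practice: each reaction carries a Boolean gene-protein-reaction (GPR) rule, and a reaction is blocked as soon as its rule evaluates to false. The rule encoding, gene names and reactions are hypothetical and only illustrate the limitation noted above, namely that a gene is either present or absent with nothing in between.

```python
# Hypothetical GPR rules. A rule is either a gene name (string) or a tuple
# ("and" | "or", operand, operand, ...) whose operands are rules themselves.
gpr = {
    "R1": ("or", "g1", "g2"),    # isozymes: either gene suffices
    "R2": ("and", "g3", "g4"),   # enzyme complex: both genes required
    "R3": "g5",                  # single gene
}

def is_active(rule, present):
    """Evaluate a Boolean GPR rule against the set of genes present."""
    if isinstance(rule, str):
        return rule in present
    op, *operands = rule
    results = [is_active(operand, present) for operand in operands]
    return all(results) if op == "and" else any(results)

def blocked_reactions(gpr, present):
    """Reactions whose GPR rule is false for the given gene set."""
    return sorted(r for r, rule in gpr.items() if not is_active(rule, present))

all_genes = {"g1", "g2", "g3", "g4", "g5"}
# Knock out g1 and g3: R1 survives via g2, R2 loses its complex.
blocked = blocked_reactions(gpr, all_genes - {"g1", "g3"})
print(blocked)
```

In an rFBA-style analysis the blocked reactions would then have their flux bounds set to zero before running FBA.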
PROM: integrates networks, predicts flux changes upon perturbation at the transcriptional level, more quantitative than strict, on-off rules describing enzymes. Probabilistic Boolean rules are applied, not Boolean rules. Rather than stating...
IF B THEN A
...a probability is applied...
P(A|B) = 0.95 (i.e. p(mRNA|TF))
Probabilities added as a restraint to the FBA analysis of network. These can be violated, but doing so applies a penalty. Acts as a bias, rather than a hard-and-fast constraint. Adds a penalty term to be minimised.
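One way to write such a soft constraint, as far as I could reconstruct it, is to scale each regulated reaction's bound by the probability and let slack variables absorb violations at a cost. The notation is my own assumption (S the stoichiometric matrix, c the biomass objective, kappa the penalty weight, alpha and beta the slacks, v^max the unregulated maximum flux) and may differ in detail from the published PROM formulation.

```latex
\begin{aligned}
\max_{v,\,\alpha,\,\beta}\quad & c^{\top} v \;-\; \kappa \sum_{j} (\alpha_j + \beta_j) \\
\text{s.t.}\quad & S v = 0, \\
& lb'_j - \alpha_j \;\le\; v_j \;\le\; ub'_j + \beta_j, \\
& ub'_j = P(A_j \mid B)\, v_j^{\max}, \\
& \alpha_j,\ \beta_j \ge 0 .
\end{aligned}
```

With P(A|B) = 0.95 the regulated reaction keeps 95% of its capacity by default; exceeding that is allowed, but each unit of excess costs kappa in the objective.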
Benchmarked against E. coli. Palsson model plus regulatory interaction data from RegulonDB, validated against knock-out data. Increased comprehensiveness due to inference of high-throughput experimental data.
Going beyond binary study of lethality / non-lethality to look at probabilistic predictions of lethality.
Same approach also applied to M. tuberculosis, with predictions of gene essentialities, with 95% accuracy.
Opinion: talk of the day so far. Excellent.
Jean-Marc Schwartz, Manchester: Functional enrichment analysis by elementary modes
Integration of transcription and metabolism, inferring activity of metabolism based on gene expression data. Initial work looked at up/downregulated genes under a given perturbation, then linking these genes based on metabolic map.
Then mapped upregulated genes onto elementary modes, generated from individual KEGG pathways. Question: why limit elementary modes to a single, artificial, arbitrary pathway? A second collection of elementary modes was built from pairwise combinations of connected elementary modes across pathways.
Up/downregulated genes were then mapped to these pairwise combinations of elementary modes. Use of novel software: BLASTsets. Find elementary modes "significant" according to the transcriptomics data. This approach is more deterministic than looking at pathways.
Limitation: elementary modes give a "linear" view of metabolism, i.e. A --> --> B. However, much of metabolism is cyclic, recycling metabolites. How is cyclic vs. acyclic activity quantified? Represent molecular map as mass flux, rather than stoichiometric flux. Sorry, getting a little lost... Need to speak to Jean-Marc about this when I get home.
S.A. Wahl, Delft: Identification of enzyme kinetic equations from dynamic data using piecewise affine approximations.
Attempt to include kinetic data in metabolic networks. Apply perturbation experiments, metabolomics and 13C labelling. Flux + metabolic concentrations -> kinetic parameter estimates.
Apply pulse to growth media, intracellular fluxes and metabolite concentrations fluctuate. Apply labelled pulse and follow isotopic uptake. Follows the BioScope pulse experimental protocol, allowing experimental data to be determined from samples that are periodically pulsed from the chemostat. Reproducible for labelled and non-labelled metabolites, and across experimental techniques (LCMS and GCMS). Apply linear fit to data, then apply "switches" to produce a model that reproduces both fluxomics and metabolomics data. Ultimately attempt to generate a kinetic model to model the dataset.
Session 4: Data integration into metabolic and regulatory models
Jason Papin, Virginia: Integrating differential expression with genome-scale metabolic networks
Data to integrate: differential expression (transcriptomic, proteomic), transcript verification (RT-PCR, next-generation sequencing).
Differential expression mapping with the GIMME algorithm and the Shlomi approach. Limitation is that data is only integrated if it passes an arbitrary threshold. MADE formalisation (submitted to Bioinformatics) developed in Jason's lab. Applies a weighting based on significance of differential expression between two states. Objective function based on increasing and decreasing expression changes, not a biomass function. BUT the biomass function is retained as a constraint: the model solution must result in growth.
Application to yeast and the alga Chlamydomonas reinhardtii, the latter of which incorporates photosynthesis, which is often overlooked in reconstruction analysis. Transcript variation: network reconstruction refined by applying experimental data to unknown / putatively-identified metabolic ORFs. Allows reconstruction to be validated by the presence of expressed genes.
Incorporation of photosynthesis: prism reaction derivation. Map maximum reaction rate under a range of wavelengths, giving "bandwidths" to reactions. Coefficients can be generated (didn't get how) that will allow light-sources to be integrated with metabolic reconstruction analysis.
Interest in applying differing light sources during the growth phase to study effect on network analysis itself and biomass production, etc.
Peter Droste, Juelich: OMIX - A software solution for customizable visualization in the context of metabolic networks
Graphical software tool for mapping omics data on networks. Displays reactions, metabolites, flux indicating edges, and effector edges. Data visualised by mapping to network components. Programmable by a script language (OVL, Omix Visualization Language): apparently can be done by the user. Can be animated to visualise time-series data. Third-party plugins can be written. Supports import / export of many formats, including SBML, Matlab, Excel. Supports network analysis tools. Customisable, which is neat, but not sure why SBGN is not being used.
Andreas Hoppe, Berlin: Integrating metabolome and expression data into optimization-based network analysis
Metabolite profiles. Ideally, kinetic model, but large-scale models are generally just stoichiometric. Integration based on thermodynamic feasibility, added as constraint for FBA. Relies on knowing Gibbs free energies. Can be used in prediction of concentrations (feasible ranges), critical metabolites (with narrow range). Metabolite profile reduces flux space.
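The thermodynamic feasibility constraint can be written down in its standard textbook form. The notation is my own (s_ij the stoichiometric coefficient of metabolite i in reaction j, c_i its concentration, R the gas constant, T the temperature) and is not copied from the slides.

```latex
\begin{aligned}
\Delta G_j &= \Delta G_j^{\circ\prime} + R T \sum_i s_{ij} \ln c_i, \\
v_j > 0 &\;\Rightarrow\; \Delta G_j < 0, \qquad v_j < 0 \;\Rightarrow\; \Delta G_j > 0, \\
c_i^{\min} &\le c_i \le c_i^{\max} .
\end{aligned}
```

A flux direction is only allowed if some concentration vector within the bounds gives the Gibbs energy change the right sign. Measured metabolite profiles narrow the concentration bounds and therefore shrink the flux space, and metabolites whose feasible range is narrow are the "critical" ones.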
Expression data in FBA, uses threshold-based activity prediction. Publication: Expression Match FBA, Huthmacher et al, BMC Sys Biol, 2010.
Implication of cellular objectives from expression changes. MFM: single objective flux distribution. Publication: Hoffmann et al, Genome Informatics, 2006.
Software available: http://www.bioinformatics.org/fasimu/