All cells in a multicellular organism are derived from the zygote via cell division and share the same DNA sequence, yet many different types of specialized cells carry out distinct functions in the organism. Stem cells are a special class of cells because they have the potential to differentiate into either all or a subset of the cell types in the body, e.g., blood cells, neurons or liver cells. Research on the differentiation of stem cells and the re-programming of non-stem cells into stem cells is an initial step towards the development of regenerative medicine. Bone marrow transplantation is the only established stem-cell-based therapy so far.
A recent study by Takebe and co-workers (summary for general readers) has demonstrated that a mix of three types of human stem cells can be directed to differentiate into a liver bud with sequential expression of 83 genes. When transplanted into mice, the artificial liver buds produced enzymes and metabolites, mimicking a human adult liver. This study suggests that re-programming and directed differentiation of cells have the potential to allow synthesis of artificial organs in the near future.
From a mathematical modelling perspective, the data collected from such studies can be used to build and train a whole-liver model, which can then be used to simulate various scenarios and improve the artificial organs being designed. Genome-wide metabolic models of human cells are already available (Recon 2 by Thiele and co-workers, summary for general readers), and vast amounts of data are being generated on gene expression and epigenetics in various cell types (e.g. the ENCODE Project); therefore, building a eukaryotic whole-cell model is becoming plausible. A recent example is a prokaryotic whole-cell model by Karr and co-workers (summary for general readers). 3-D models of cell formation and propagation are also becoming more accessible with the availability of the required software and computational power. Hence, a whole-cell model can be used as the basic module of a system-wide model (which may be a cell population, a whole organ or a whole organism) to simulate the complicated process of building artificial organs.
Modular Modelling of the Stem Cells
Differentiation and re-programming of cells rely on a number of processes:
- Signal transduction: Sensing of extracellular signals (e.g. growth factors) on the cell membrane triggers the activation of signalling pathways. The propagation of the signal within the cell is mediated by protein-protein interactions.
- Gene transcription and protein synthesis: The signals activate or inhibit a set of transcription factors (DNA-binding proteins), which in turn activate the transcription of their target genes in the nucleus. Transcription of target genes leads to the synthesis of their corresponding proteins, each of which is specialized for a set of functions within the cell.
- Epigenetics: The structure of the chromatin in the cell is remodelled and marked with small molecules to assist the activation or repression of gene transcription. Chromatin modifications are usually slower than processes such as signal transduction, gene transcription and protein synthesis; hence, epigenetics is likely to be the rate-limiting step of differentiation and re-programming of cells.
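The time-scale argument in the list above can be illustrated with a toy cascade. The following is a minimal sketch, not a model of any real pathway: the four variables, the rate laws and every rate constant (`k_sig`, `k_epi`, `k_tx`, `k_tl`, `d_m`, `d_p`) are illustrative assumptions. It only shows that when the chromatin step is much slower than the others, it dominates the time the protein level needs to respond to a signal.

```python
def simulate(k_epi, t_end=200.0, dt=0.01):
    """Forward-Euler integration of a toy signal -> TF -> chromatin -> mRNA -> protein cascade.

    All rate constants are hypothetical, chosen only to separate the time scales.
    """
    signal = 1.0            # constant extracellular signal (arbitrary units)
    k_sig = 5.0             # fast signalling / TF activation rate
    k_tx, d_m = 1.0, 1.0    # transcription rate and mRNA decay
    k_tl, d_p = 1.0, 0.5    # translation rate and protein decay
    tf = chrom = mrna = prot = 0.0
    times, proteins = [], []
    for i in range(int(round(t_end / dt)) + 1):
        times.append(i * dt)
        proteins.append(prot)
        # Derivatives are evaluated from the current state before any update.
        d_tf = k_sig * (signal - tf)
        d_chrom = k_epi * (tf - chrom)          # chromatin opening follows TF activity
        d_mrna = k_tx * tf * chrom - d_m * mrna # transcription needs TF and open chromatin
        d_prot = k_tl * mrna - d_p * prot
        tf += dt * d_tf
        chrom += dt * d_chrom
        mrna += dt * d_mrna
        prot += dt * d_prot
    return times, proteins


def half_time(times, proteins):
    """Return the first time at which the protein reaches half of its final level."""
    target = 0.5 * proteins[-1]
    for t, p in zip(times, proteins):
        if p >= target:
            return t
    return None


if __name__ == "__main__":
    for k_epi in (5.0, 0.05):  # fast versus slow chromatin remodelling
        t, p = simulate(k_epi)
        print(f"k_epi={k_epi}: protein half-time = {half_time(t, p):.2f}")
```

Lowering `k_epi` by two orders of magnitude leaves the final protein level unchanged but delays the response markedly, which is the sense in which the slowest module is rate-limiting.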
Our aim is to build dynamic mathematical models of stem cells to describe their differentiation and re-programming. Our strategy is to build modular models at all levels (i.e., signalling, epigenetics, transcription, translation, cell division) and to merge these modules, with the ultimate aim of building a whole-cell model of stem cells.
The first module in progress is a description of the initial steps of cell differentiation in the mouse embryo (E3.0-E5.0), including the 16-cell morula, ICM (inner cell mass), EPI (epiblast, which forms the embryo proper) and PE (primitive endoderm). The processes represented in the model are the activities of transcription factor (TF) networks in response to signalling through the LIF, FGF and HIPPO pathways. Detailed mathematical models of the LIF and FGF pathways are available in the literature. The activity of the signalling pathways will be simulated using previously built and validated models and then synchronised with the activity of TF networks, transcription, translation and epigenetics.
Our secondary aim is to develop an algorithm for efficient parallel simulation and synchronisation of modular mathematical models. Development of a “simulation manager”, which is capable of interacting with multiple simulators and synchronising their simulation results, is in progress. The simulation manager is being written in Python and makes use of the Python interfaces of simulation software. The models are developed in SBML and will be available in the biological model repository BioModels.
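As a rough illustration of what such a manager might look like, the sketch below defines a common adapter interface and a manager that advances every module for one synchronisation window and then copies linked variables between them. It is not the project's actual code: `SimulatorAdapter`, `ToyOdeModule`, `SimulationManager`, `build_demo` and the two toy modules are hypothetical names, and the toy Euler stepper merely stands in for a real simulator's Python interface.

```python
from abc import ABC, abstractmethod


class SimulatorAdapter(ABC):
    """Hypothetical common interface wrapping one simulator back end."""

    @abstractmethod
    def advance(self, dt: float) -> None: ...

    @abstractmethod
    def get(self, name: str) -> float: ...

    @abstractmethod
    def set(self, name: str, value: float) -> None: ...


class ToyOdeModule(SimulatorAdapter):
    """Stand-in for a real simulator: forward-Euler on a small ODE system."""

    def __init__(self, state, rhs, step=0.01):
        self.state = dict(state)
        self.rhs = rhs      # function: state dict -> dict of derivatives
        self.step = step    # internal step size of this module

    def advance(self, dt):
        n = max(1, round(dt / self.step))
        h = dt / n
        for _ in range(n):
            for name, deriv in self.rhs(self.state).items():
                self.state[name] += h * deriv

    def get(self, name):
        return self.state[name]

    def set(self, name, value):
        self.state[name] = value


class SimulationManager:
    """Advances all modules window by window and synchronises linked variables."""

    def __init__(self, modules, links):
        self.modules = modules  # dict: module name -> SimulatorAdapter
        self.links = links      # list of (source module, target module, variable)
        self.time = 0.0

    def run(self, t_end, sync_dt):
        for _ in range(round(t_end / sync_dt)):
            for module in self.modules.values():
                module.advance(sync_dt)
            # Copy each shared variable from the module that owns it.
            for src, dst, var in self.links:
                self.modules[dst].set(var, self.modules[src].get(var))
            self.time += sync_dt


def build_demo():
    """Two toy modules sharing `active_tf` (all rate laws are assumptions)."""
    signalling = ToyOdeModule(
        {"active_tf": 0.0},
        lambda s: {"active_tf": 2.0 * (1.0 - s["active_tf"])})
    transcription = ToyOdeModule(
        {"active_tf": 0.0, "mrna": 0.0},
        lambda s: {"mrna": s["active_tf"] - s["mrna"]})  # active_tf is input only
    return SimulationManager(
        {"signalling": signalling, "transcription": transcription},
        [("signalling", "transcription", "active_tf")])
```

In this design, each real simulator would need only a thin adapter implementing `advance`, `get` and `set`; the manager itself stays independent of the modelling formalism behind each module.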
Modular modelling of cellular processes has several practical advantages over building large, unified models:
- Natural modularity of cellular networks: Functions carried out by cells are often modular in terms of the type of interactions involved. For example, signalling cascades rely on protein-protein interactions, transcription relies on the formation of protein complexes and DNA binding, and translation involves the interaction of protein complexes with mRNA. Experimental studies are also likely to focus on a single module of the network at a time and to provide data only on that module. For some modules, spatial modelling may be desired, e.g. in cases where 3-D diffusion of a species has a significant impact on the system dynamics. Similarly, logical or stochastic modelling may be necessary when dealing with low-copy-number proteins or DNA binding, where the “homogeneous mixture of molecules” assumption is not valid. As a result of this natural modularity, models of small modules are constructed and validated using a case-specific programming language and simulator.
- Challenge of re-use of models and need for multiple simulators: Building a larger model composed of many modular models from the literature can save time and resources. However, as explained above, the models are usually represented in different programming languages and designed to be simulated in a particular simulator (e.g. COPASI, MATLAB, CellDesigner, Smoldyn). Although efforts like SBML try to standardize the language and simulation medium, use of these tools may not be practical in some cases.
- Need for parallel simulation algorithms: As described above, modules of the cellular networks may be modelled using different formalisms: spatial models, stochastic models, kinetic models and logical models. Using more than one type of simulator may be necessary to bring these models together and simulate them simultaneously. The modules may have overlapping variables; e.g., a TF is synthesised in the “translation module” and appears as a variable in that module, while the level of the same TF is a variable in the “transcription module”, where it acts as a regulator. The quantities of such variables have to be synchronised across the modules to ensure the consistency of the model. Simultaneous simulation and synchronisation of models is also necessary in cases where the dynamics of one module are much slower than those of the others; here, parallel distributed computing aims at optimal use of computational resources. Hence, a common interface to manage multiple simulators in a standard language (e.g. Python) may facilitate the simultaneous simulation of the modules and the synchronisation of their time-dependent states. Efficient algorithms to manage the simulators and to synchronise the overlapping variables are needed for the simulation of consistent and feasible system-wide models.
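The TF example in the last item can be sketched as two modules that run concurrently between synchronisation points and exchange their overlapping variables at each point. This is a minimal sketch under stated assumptions: the negative-autoregulation rate laws, the unit rate constants, the internal step sizes and the function names are all hypothetical, and a thread pool from the Python standard library stands in for whatever parallel back end would actually be used.

```python
from concurrent.futures import ThreadPoolExecutor


def advance_transcription(state, dt, h=0.001):
    """mRNA synthesis repressed by the TF; the imported `tf` is fixed within a window."""
    for _ in range(round(dt / h)):
        state["mrna"] += h * (1.0 / (1.0 + state["tf"]) - state["mrna"])


def advance_translation(state, dt, h=0.01):
    """TF synthesis from mRNA; the imported `mrna` is fixed within a window."""
    for _ in range(round(dt / h)):
        state["tf"] += h * (state["mrna"] - state["tf"])


def run_parallel(t_end=30.0, sync_dt=0.1):
    """Advance both modules concurrently and synchronise after every window."""
    transcription = {"mrna": 0.0, "tf": 0.0}  # owns mrna, imports tf
    translation = {"mrna": 0.0, "tf": 0.0}    # owns tf, imports mrna
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(round(t_end / sync_dt)):
            jobs = [
                pool.submit(advance_transcription, transcription, sync_dt),
                pool.submit(advance_translation, translation, sync_dt),
            ]
            for job in jobs:
                job.result()  # wait for both modules; re-raises worker errors
            # Synchronise overlapping variables from their owning module.
            transcription["tf"] = translation["tf"]
            translation["mrna"] = transcription["mrna"]
    return transcription, translation


if __name__ == "__main__":
    transcription, translation = run_parallel()
    print("TF level:", round(translation["tf"], 3))
    print("mRNA level:", round(transcription["mrna"], 3))
```

Each module uses its own internal step size, and consistency is enforced only at the synchronisation points, so the choice of `sync_dt` trades accuracy against communication overhead. Pure-Python threads do not run CPU-bound code in parallel; the pattern pays off when each module delegates to an external simulator or a separate process.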