In the simplest case in these model networks, there's a little central clock that ticks; that's not true for real cells, but it will do for right now. Every gene looks at the state of its inputs and it does the right thing. So let me define the state of the network as the current on and off values of all 100,000 genes. So how many states are there? Well, there are two possibilities for gene 1 and two possibilities for gene 2 and so on, so there's 2^100,000, which is about 10^30,000. So we're talking about a system in the human genome that, even if we treat genes as idealized as on or off -- which is false because they show graded levels of activity -- has got 10^30,000 possible states. It is mind-boggling because the number of particles in the known universe is 10^80.
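The arithmetic behind that count can be checked in a couple of lines. This is only a sketch of the calculation described above; the variable names are illustrative and not from the interview.

```python
import math

N_GENES = 100_000          # gene count used in the interview
PARTICLES_EXPONENT = 80    # "number of particles in the known universe is 10^80"

# Each gene is on or off, so there are 2**N_GENES states.
# Work with the exponent directly: log10(2**N) = N * log10(2).
log10_states = N_GENES * math.log10(2)

print(f"2^{N_GENES} is about 10^{log10_states:.0f}")
print(f"that exceeds 10^{PARTICLES_EXPONENT} by a factor of about "
      f"10^{log10_states - PARTICLES_EXPONENT:.0f}")
```

The exponent comes out slightly above 30,000, which is why the figure is quoted as roughly 10^30,000.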
Here's what happens in the ordered regime. At any moment in time, the system is in a state, and there's a little clock; when the clock ticks, all the genes look at the states of their inputs and they do the right thing, so the whole system goes from some state to some state, and then it goes from state to state along a trajectory. There's a finite number of states; it's large, but it's finite. So eventually the system has to hit a state it's been in before, and because it's a deterministic system, it'll do the same thing it did the last time. So it will now go around a cycle of states. So the generic behavior is a transient that flows into a cycle.
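A toy version of this clocked network can make the "transient that flows into a cycle" concrete. The following is a minimal sketch, not the model actually used in the work being discussed: the network size (12 genes), the choice of two inputs per gene, the random wiring and rules, and the function names make_network, step and find_attractor are all assumptions made for illustration.

```python
import random

def make_network(n_genes, k_inputs, rng):
    """Random wiring: each gene gets k input genes and a random Boolean rule."""
    inputs = [rng.sample(range(n_genes), k_inputs) for _ in range(n_genes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)]
              for _ in range(n_genes)]
    return inputs, tables

def step(state, inputs, tables):
    """One clock tick: every gene reads its inputs and all update together."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for g in gene_inputs:
            index = (index << 1) | state[g]   # pack input values into a table index
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, inputs, tables):
    """Follow the trajectory until a state repeats.

    Returns (transient_length, list_of_states_on_the_cycle).
    """
    first_seen = {}
    trajectory = []
    while state not in first_seen:
        first_seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, inputs, tables)
    cycle_start = first_seen[state]
    return cycle_start, trajectory[cycle_start:]

rng = random.Random(0)            # fixed seed so the run is repeatable
n_genes = 12                      # toy size; 2**12 = 4096 possible states
inputs, tables = make_network(n_genes, 2, rng)
initial = tuple(rng.randint(0, 1) for _ in range(n_genes))

transient, cycle = find_attractor(initial, inputs, tables)
print(f"transient length: {transient}, cycle length: {len(cycle)}")
```

Because the state space is finite and the update is deterministic, the loop in find_attractor is guaranteed to terminate, which is exactly the argument made above.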
The cycle is called a state cycle or an attractor. What I did 30 years ago was ask, "What's a cell type?" And I guessed that cell types were attractors, because otherwise we'd have 10^30,000 different cell types, and we have something like 260. So here's what happens in the ordered regime. The number of states on the state cycle tends to be about the square root of the number of genes. The square root of 100,000 is around 316. So this system with 10^30,000 states settles down to a little cycle with about 300 states on it. That's enormous order. The system has squeezed itself down to a tiny black hole in its state space.
If you're on one of these attractor state cycles and you perturb the activity of a single gene, like if a hormone comes in, most of the time you come back to the same attractor, so you have homeostasis. Sometimes, however, you leave one state cycle and you jump onto a transient that goes onto another state cycle, so that's differentiation.
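The perturbation experiment described here can be sketched on a toy network: sit on an attractor, flip one gene, and see which attractor the system ends up on. This is an illustrative sketch only. The 12-gene size, two inputs per gene, random rules, the seed, and the helper names step and attractor_of are assumptions, and the counts it prints say nothing about real cells.

```python
import random

def step(state, inputs, tables):
    """One synchronous clock tick of the Boolean network."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for g in gene_inputs:
            index = (index << 1) | state[g]
        new_state.append(table[index])
    return tuple(new_state)

def attractor_of(state, inputs, tables):
    """Run until a state repeats, then return the set of states on that cycle."""
    seen = set()
    while state not in seen:
        seen.add(state)
        state = step(state, inputs, tables)
    # The first repeated state lies on the cycle; walk once around it.
    cycle = [state]
    nxt = step(state, inputs, tables)
    while nxt != state:
        cycle.append(nxt)
        nxt = step(nxt, inputs, tables)
    return frozenset(cycle)

rng = random.Random(1)
n_genes, k_inputs = 12, 2         # toy sizes, chosen for illustration
inputs = [rng.sample(range(n_genes), k_inputs) for _ in range(n_genes)]
tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)]
          for _ in range(n_genes)]

start = tuple(rng.randint(0, 1) for _ in range(n_genes))
home = attractor_of(start, inputs, tables)
base = min(home)                  # a fixed, reproducible state on the home attractor

returned = switched = 0
for gene in range(n_genes):
    perturbed = list(base)
    perturbed[gene] ^= 1          # flip the activity of a single gene
    if attractor_of(tuple(perturbed), inputs, tables) == home:
        returned += 1             # back to the same attractor ("homeostasis")
    else:
        switched += 1             # landed on a different attractor ("differentiation")

print(f"returned to same attractor: {returned}, switched attractor: {switched}")
```

Repeating this over many random networks and many starting attractors would give the kind of statistical distribution of returns versus switches that the ensemble approach is meant to predict.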
All the things I'm telling you are testable things about the human genome. And they are predictions about the integrated behavior of the whole genome, and there's no way of getting to that right now without using the ensemble approach. They're very powerful predictions. That's the strength of it. The weakness is it doesn't tell you that gene A regulates gene F, which of course is exactly one of the things that we want to know.
SA: What does it buy you?
There are all sorts of questions you can answer using the ensemble approach that we will not otherwise be able to answer until we have a complete theory of the genome.
SA: What then is the next step to take? Take the data from the genome and plug it into these models?
Yes, in the sense that you can do experiments to test all of the predictions of these kinds of ensemble models. Nothing prevents me from cloning in a controllable promoter upstream from 100 different randomly chosen genes in different cell lines, perturbing the activity of the adjacent gene and using Affymetrix chips to look at the avalanches of changes of gene activity. All of that is open to be tested.
We should be able to predict not only that it happens but the statistical distribution of how often, when you do it, cell type A goes back to being cell type A and how often cell type A becomes cell type B. Everything here is testable. In the actual testing of it for real cells, we'll begin to discover which perturbation to which gene actually causes which pathway of differentiation to happen. You can use molecular diversity or combinatorial chemistry to make the molecules with which you do the perturbation of cells and then test the hypotheses we've talked about.