Next: Obtaining QSMM, Previous: Building Blocks for Intelligent Machines, Up: Introduction [Contents][Index]

How to implement a probabilistic mapping? For its particular argument, a machine could generate probabilities of all possible outcomes. Then, the machine would randomly select a particular outcome according to those probabilities, for example, using a random number generator.

How can the machine perform such selection? Let us assume that the random number generator can return two numbers, 0 or 1, with equal probabilities. A straightforward approach to utilizing that random number generator is performing a fixed number of calls to the generator to obtain a binary fractional number in the range 0 to 1. All possible outcomes are non-overlapping segments on a line of length 1, where the length of every segment is equal to the probability of a corresponding outcome. To determine an outcome, the machine finds a segment containing the point at a distance equal to the binary fractional number from the beginning of the line.

For example, if three calls to the random number generator returned numbers 1, 0, and 1, then the binary fractional number is 0.101.
This binary number is equal to 1*2^{-1}+0*2^{-2}+1*2^{-3}=2^{-1}+2^{-3}=0.5+0.125=0.625.
If four possible outcomes have probabilities 0.125, 0.5, 0.125, and 0.25, then the segments end at distances 0.125, 0.625, 0.75, and 1 from the beginning of the line.
Assuming points at the ends of the segments are not parts of those segments, number 0.625 falls in the third segment, that is, the probabilistic mapping returns the third outcome.

At present, QSMM uses this approach to select an outcome of a probabilistic mapping according to probabilities of all possible outcomes. However, this approach has a drawback: it is hard to evaluate how much information from the random number generator the machine uses to select a particular outcome.

A different approach is putting more load on the random number generator to select a less probable outcome and putting less load on the random number generator to select a more probable outcome. In the extreme case, when an outcome has probability 1, and all other outcomes have probability 0, the machine selects the outcome without calling the random number generator at all.

In this approach, the machine builds a binary Huffman tree for a list of outcome probabilities and then traverses nodes of the Huffman tree from the root node to a leaf node representing a selected outcome by calling the random number generator at each traversed non-leaf node to select one of its two child nodes.
This approach, however, implies rounding outcome probabilities to 2^{-k}, where *k* is the depth of a leaf node, and works better for a sufficiently large number of possible outcomes.

Generally, the greater is the probability of a leaf node, the shorter is a path to it from the root node, and, therefore, a lesser number of calls to the random number generator is necessary to select the leaf node.
Thus, it is *easier* to select a more probable outcome, because this selection requires a lesser number of actions to perform.

For our example with outcome probabilities 0.125, 0.5, 0.125, and 0.25, the Huffman tree is:

For the above Huffman tree, the machine selects the most probable second outcome with probability 0.5 if a call to the random number generator returns 1. The machine selects the least probable first or third outcome with probability 0.125 if three calls to the random number generator return the sequence 0, 1, 0 or 0, 1, 1 respectively. The average number of calls to the random number generator for the Huffman tree is equal to 0.125*3+0.5*1+0.125*3+0.25*2=1.75.

For four equal outcome probabilities 0.25, the Huffman tree is:

For the above Huffman tree, the number of calls to the random number generator to select any outcome is 2. On average, this Huffman tree puts more load on the random number generator to select an outcome compared to the Huffman tree in Figure 1.1.

Thus, in this approach, the average number of calls to the random number generator depends on a degree of difference between outcome probabilities—if some probabilities are much greater than others, the number of calls is less, and, if all probabilities are more or less equal, the number of calls is greater. Looks like this principle is relevant to the physical environment—for a less probable event to occur, more random events have to happen.

Supposing a call to the random number generator has constant *choice complexity*, selecting a particular outcome of a probabilistic mapping has choice complexity equal to this constant choice complexity multiplied by the number of calls to the random number generator required to perform the selection.
Average choice complexity is equal to the constant choice complexity multiplied by the average number of calls to the random number generator required to select an outcome.

However, a call to the random number generator might not have constant choice complexity. Choice complexity for a stochastic act of selecting number 0 or 1 by the random number generator might depend on the number of possible distinguished outcomes the act has. If the outcomes are similar, the number of distinguished outcomes will be less than 2.

The distinguishability of outcomes might affect how the stochastic act changes the entropy of nature. For example, when water is boiling in a kettle, water molecules move rapidly, and a possible location of a water molecule after a fixed period of time varies greatly. However, locations of water molecules in the kettle probably do not affect much—a usable outcome would be hot water in the kettle to make tea. In this case, the complexity of choice lying with a water molecule may be low. On the other hand, if the next location of a molecule affects a resolution of a stochastic optimization task where to search planets in space for colonization, the complexity of choice lying with the molecule may be high.

If nature is a self-sustaining environment, one can interpret the complexity of choice as the amount of change necessary to perform to sustain the environment as a result of various outcomes of a stochastic act. Supposing time is closed like space might be, sustaining the environment requires continuous effort throughout all time-space continuum.

Let us consider an adaptive probabilistic mapping. If an outcome correlates with a desired change in spur, and other outcomes have a lesser correlation, then the outcome will have a greater probability compared to other outcomes. Supposing an outcome always leads to a desired change in spur, and all other outcomes never lead to such change, the outcome would have probability 1 and all other outcomes would have probability 0—the choice of a result of the probabilistic mapping for the argument would be deterministic. On the other hand, if all outcomes equally lead to a desired change in spur, all those outcomes will have equal probabilities (that sum up to 1).

Suppose the machine solves an optimization task consisting in the maximization of spur increment velocity and the machine has learned how to achieve a constant increase in this velocity. The adaptive probabilistic mapping would have an argument with one of its corresponding outcomes with a significantly greater probability than all other outcomes for the argument. This outcome takes part in reinforcing a behavior to constantly increase the spur increment velocity. The machine puts a little load on the random number generator to select an outcome for this argument, because one outcome has a significantly greater probability, and this outcome becomes a result most of the times.

Consider a situation that, at a particular point of time, the increment velocity has gone down, so the machine needs to change its own behavior to increase the increment velocity even more. The outcome with a significantly greater probability has now a lesser probability approximately equal to probabilities of some other outcomes for that argument. The probabilities of outcomes for other arguments may also change in a similar way. The machine engages the random number generator in a greater extent for selecting outcomes according to those new lists of probabilities.

In other words, the machine is now in a difficult situation that means an increase of average complexity of choices the machine has to perform to increase the increment velocity.
Would the machine *feel* that the situation is difficult?
(To answer this question, one can consider the amount of energy supplied to the random number generator.)

A computer running a pseudo-random number generator can simulate the behavior of a probabilistic mapping. Running the pseudo-random number generator is a deterministic process—the way today’s computers operate. The computer might use a random number generator implemented as a physical unit running a stochastic physical process. We can consider this unit as an interface to a (probabilistic) mapping provided by nature. A stochastic act will then be calling such mapping to return a result for a particular argument. If a probabilistic mapping is an electric circuit, it is more straightforward to use stochastic physical processes as a source of randomness.

One could relate the abstract idea of good and bad to the concept of choice complexity. The good would support a certain level of choice complexity. The survival of an animate being would be preserving its ability to perform choices with a sufficient degree of complexity. This has something in common with the definition of life as a characteristic which distinguishes objects that have signaling and self-sustaining processes from those that do not. In our case, the signaling is the way of exchanging information within superpositions of probabilistic mappings, and the basic self-sustaining process is supporting certain level of choice complexity.