1.4 Animate Machines

Can an intelligent machine be conscious? This section presents a point of view according to which an adaptive probabilistic mapping within an intelligent machine can give rise to a rudiment of consciousness.

How can a machine implement a probabilistic mapping? For a particular argument of the mapping, the machine could generate the probabilities of all possible outcomes. Then the machine would randomly select an outcome according to those probabilities, for example, using a random number generator.

How can the machine perform this selection? Suppose the random number generator returns one of two numbers, 0 or 1, with equal probability. A straightforward approach to utilizing that random number generator is to perform a fixed number of calls to it to obtain a binary fractional number in the range 0 to 1. Non-overlapping segments on a line of length 1 can represent all possible outcomes, where the length of every segment is equal to the probability of the corresponding outcome. To determine an outcome, the machine finds the segment containing the point on the line whose distance from the beginning of the line is equal to the binary fractional number.

For example, if three calls to the random number generator returned numbers 1, 0, and 1, the binary fractional number is 0.101. In decimal notation, it is equal to 1*2^-1 + 0*2^-2 + 1*2^-3 = 2^-1 + 2^-3 = 0.5 + 0.125 = 0.625. If four possible outcomes have probabilities 0.125, 0.5, 0.125, and 0.25, then the segments end at distances 0.125, 0.625, 0.75, and 1 from the beginning of the line. Assuming that points at the ends of the segments do not belong to them, the number 0.625 falls in the third segment, that is, the probabilistic mapping returns the third outcome.
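The selection procedure above can be expressed as a minimal Python sketch. The function name and interface are illustrative, not part of QSMM; the bit sequence stands for a fixed number of calls to the binary random number generator.

```python
def select_outcome(probabilities, bits):
    """Select an outcome index from a binary fraction built from bits.

    bits is a fixed-length sequence of 0/1 draws from a binary random
    number generator; probabilities lists the outcome probabilities.
    """
    # Binary fraction 0.b1 b2 ... b_k in the range [0, 1).
    x = sum(b * 2 ** -(i + 1) for i, b in enumerate(bits))
    # Segment ends are cumulative sums of the probabilities; a point
    # at the end of a segment does not belong to that segment.
    end = 0.0
    for index, p in enumerate(probabilities):
        end += p
        if x < end:
            return index
    return len(probabilities) - 1

# The worked example: bits 1, 0, 1 give 0.101 binary = 0.625 decimal,
# which falls in the third segment (index 2).
select_outcome([0.125, 0.5, 0.125, 0.25], [1, 0, 1])  # -> 2
```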

QSMM can use this approach to select an outcome of a probabilistic mapping according to the probabilities of all possible outcomes. However, the approach has a drawback: it is hard to evaluate how much information from the random number generator the machine actually uses to select a particular outcome, because some information received from the random number generator can be lost.

The following example demonstrates the loss of information. If the random number generator returned numbers 1, 1, and 0, the decimal fractional number is 2^-1 + 2^-2 = 0.75. If the random number generator returned numbers 1, 1, and 1, the decimal fractional number is 2^-1 + 2^-2 + 2^-3 = 0.875. Both fractional numbers fall in the fourth segment, so the third number produced by the random number generator makes no difference in either case.
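This loss of information can be checked directly with a small sketch (the function names are illustrative):

```python
def binary_fraction(bits):
    """Interpret a sequence of 0/1 draws as a binary fraction in [0, 1)."""
    return sum(b * 2 ** -(i + 1) for i, b in enumerate(bits))

def segment_index(probabilities, x):
    """Index of the segment containing point x (segment ends excluded)."""
    end = 0.0
    for index, p in enumerate(probabilities):
        end += p
        if x < end:
            return index
    return len(probabilities) - 1

probs = [0.125, 0.5, 0.125, 0.25]
# The sequences 1, 1, 0 and 1, 1, 1 differ only in the third bit, yet
# both fractions (0.75 and 0.875) fall in the fourth segment (index 3),
# so the third call to the generator carries no usable information here.
segment_index(probs, binary_fraction([1, 1, 0]))  # -> 3
segment_index(probs, binary_fraction([1, 1, 1]))  # -> 3
```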

Another approach is putting more load on the random number generator to select a less probable outcome and putting less load on the random number generator to select a more probable outcome. In the extreme case, when an outcome has probability 1, and all other outcomes have probability 0, the machine selects the outcome without calling the random number generator at all.

In this approach, the machine builds a binary Huffman tree for a list of probabilities of all outcomes and then traverses the nodes of the Huffman tree from the root node to a leaf node representing a selected outcome by calling the random number generator at each traversed non-leaf node to select one of its two child nodes. This approach, however, implies rounding outcome probabilities to 2^-k, where k is the depth of a leaf node, and works better for a sufficiently large number of possible outcomes.
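A minimal Python sketch of this approach (the names are illustrative): build the tree with a priority queue, then walk it with one generator call per non-leaf node.

```python
import heapq
import itertools

def build_huffman(probabilities):
    """Build a binary Huffman tree for a list of outcome probabilities.

    A leaf is an outcome index; a non-leaf node is a (left, right) pair.
    """
    counter = itertools.count()  # tie-breaker so nodes are never compared
    heap = [(p, next(counter), i) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(counter), (left, right)))
    return heap[0][2]

def select_outcome(tree, next_bit):
    """Walk from the root to a leaf, calling next_bit() at each non-leaf node.

    Returns the selected outcome index and the number of generator calls.
    """
    calls = 0
    while isinstance(tree, tuple):
        tree = tree[next_bit()]  # 0 selects the left child, 1 the right
        calls += 1
    return tree, calls
```

Note that the exact tree shape, and hence which bit sequence selects which outcome, depends on tie-breaking during construction, so it need not match the figures below; the leaf depths, and therefore the average number of generator calls, come out the same.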

Generally, the greater the probability of a leaf node, the shorter the path to it from the root node, and therefore the fewer calls to the random number generator are necessary to select the leaf node. Thus, it is easier to select a more probable outcome, because this selection requires fewer actions to perform.

Conversely, the lower the probability of a leaf node, the more calls to the random number generator are necessary to select the leaf node. This principle seems relevant to the physical environment: for a less probable event to occur, more random events have to happen.

For our example with outcome probabilities 0.125, 0.5, 0.125, and 0.25, the Huffman tree is:

Figure 1.1: Huffman tree for probabilities 0.125, 0.5, 0.125, and 0.25

For the above Huffman tree, the machine selects the most probable second outcome, with probability 0.5, if a call to the random number generator returns 1. The machine selects the least probable first or third outcome, with probability 0.125, if three calls to the random number generator return the sequence 0, 1, 0 or 0, 1, 1 respectively. The average number of calls to the random number generator for the Huffman tree is equal to 0.125*3 + 0.5*1 + 0.125*3 + 0.25*2 = 1.75.

For four equal outcome probabilities 0.25, the Huffman tree is:

Figure 1.2: Huffman tree for four probabilities 0.25

For the above Huffman tree, the number of calls to the random number generator to select any outcome is 2. On average, this Huffman tree puts more load on the random number generator to select an outcome compared to the Huffman tree in Figure 1.1.

To conclude, the average number of calls to the random number generator depends on the degree of difference between outcome probabilities: if some probabilities are much greater than others, the number of calls is smaller, and if all probabilities are roughly equal, the number of calls is greater.
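This dependence can be checked numerically. The average number of generator calls equals the Huffman average code length, which is the sum of the weights of all merged nodes; a minimal sketch (the function name is illustrative):

```python
import heapq

def average_calls(probabilities):
    """Average number of binary-RNG calls for Huffman-tree selection.

    Equals the Huffman average code length: the sum of the weights of
    all merged (non-leaf) nodes created during tree construction.
    """
    heap = list(probabilities)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

average_calls([0.125, 0.5, 0.125, 0.25])  # -> 1.75 (Figure 1.1)
average_calls([0.25, 0.25, 0.25, 0.25])   # -> 2.0  (Figure 1.2)
average_calls([1.0])                      # -> 0.0  (no calls needed)
```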

Supposing all calls to the random number generator have the same choice complexity, selecting a particular outcome of a probabilistic mapping would have a choice complexity equal to a constant choice complexity value multiplied by the number of calls to the random number generator required to perform the selection. The average choice complexity would then be equal to that constant value multiplied by the average number of calls to the random number generator required to select an outcome.

However, calls to the random number generator might not have equal choice complexity. The choice complexity of a stochastic act in which the random number generator selects the number 0 or 1 might depend on the number of distinguished outcomes the act has. If the outcomes are similar, the number of distinguished outcomes may be less than 2.

The distinguishability of outcomes might depend on the way the stochastic act affects the entropy of nature. For example, when water is boiling in a kettle, water molecules move rapidly, and the possible location of a water molecule after a fixed period of time varies greatly. However, the exact locations of water molecules in the kettle probably do not matter much: the usable outcome is hot water in the kettle for making tea. In this case, the complexity of choice associated with a water molecule might be low. On the other hand, if the next location of a molecule affected the resolution of a stochastic optimization task of where in space to search for planets to colonize, the complexity of choice associated with the molecule could be high.

If nature is a self-sustaining environment, one can interpret the complexity of choice as the amount of change necessary to sustain the environment as a result of various outcomes of a stochastic act. Supposing time is closed, as space might be, sustaining the environment requires continuous effort throughout the entire space-time continuum.

Let us consider an adaptive probabilistic mapping. If an outcome positively correlates with a desired change in spur, and all other outcomes have a lesser correlation, the outcome has a greater probability compared to all other outcomes. If an outcome always led to a desired change in spur, and all other outcomes never led to the change, the outcome would have probability 1, and all other outcomes would have probability 0—the choice of a result of the probabilistic mapping for a specific argument would be deterministic. On the other hand, if all outcomes equally lead to a desired change in spur, they all have equal probabilities (that sum up to 1).

Suppose a machine is solving an optimization task consisting in maximizing spur increment velocity, and the machine has learned how to achieve a constant increase in that velocity. The adaptive probabilistic mapping would have an argument for which one of the corresponding outcomes has a significantly greater probability than all other outcomes for the argument. This outcome would take part in reinforcing a behavior that constantly increases spur increment velocity. The machine puts little load on the random number generator to produce a result of the adaptive probabilistic mapping for the argument, because the outcome with the significantly greater probability becomes the result most of the time.

Consider a situation in which, at a particular point in time, the increment velocity has gone down, so the machine needs to change its own behavior to raise the increment velocity again. The outcome with a significantly greater probability now has a lower probability, approximately equal to the probabilities of some other outcomes for the argument. The probabilities of outcomes for other arguments may also change in a similar way. The machine engages the random number generator to a greater extent to select among outcomes with less differing probabilities.

In other words, the machine is now in a difficult situation, which means an increase in the average complexity of the choices the machine has to perform to increase the increment velocity. Would the machine feel the difficulty? (To answer this question, one can consider the amount of energy supplied to the random number generator.) The author considers the awareness of difficulty a rudiment of consciousness.

A computer running a pseudo-random number generator can simulate the behavior of a probabilistic mapping. Running a pseudo-random number generator is a deterministic process, which is the way today's computers operate. Alternatively, a computer could use a random number generator implemented as a physical unit running a stochastic physical process. We could consider the unit an interface to a (probabilistic) mapping provided by nature. A stochastic act would then be calling the mapping to obtain a result for a particular argument.

One could relate the abstract idea of good and bad to the concept of choice complexity. The good would support a certain level of choice complexity. The survival of an animate being would consist in preserving its ability to perform choices with a sufficient degree of complexity.