Previous: , Up: Adaptive Probabilistic Mapping   [Contents][Index]


2.13 Example of Using the Actor API

In the example of using the Actor API, an agent controlled by the sample program has to find a short path to the gold in a labyrinth and then to an exit from the labyrinth. The picture of this labyrinth is encoded in the sample program using a subset of ASCII characters resembling pseudographics. You can change that picture to test the behavior of the agent for various configurations of the labyrinth.

To find a resulting path, the agent visits the labyrinth the number of times defined by the macro NVISIT. A visit is considered finished when the agent moves to a cell where the labyrinth exit is located. A simulated annealing approach is used, so the temperature of the actor gradually decreases with every subsequent visit of the labyrinth. The macro KT_0 defines the initial temperature of the actor, and the macro KT_1 defines the final temperature.

At the last visit of the labyrinth, the sample program shows the image of the labyrinth and movements of the agent in it, prints an indication whether the agent took the gold, and displays the current length of visiting path. After the agent finishes the last visit of the labyrinth, a user can press Q to exit the sample program, small G to print the map of learned directions of movements from every cell of the labyrinth (reachable from the labyrinth entry) to a cell with the gold, or capital G to print the map of learned directions of movements from every cell of the labyrinth to the labyrinth exit after taking the gold. When the user presses Q, the program prints the number of visits of the labyrinth, when the agent took the gold, and exits.

A random seed can be specified by a program argument. If the random seed is non-negative, then the actor will operate normally. If the random seed is negative, then the actor will generate output signals completely randomly. You could compare the behavior of the agent for these two modes of program execution.

Each labyrinth cell is a rectangle of 4 characters in width and 3 characters in height. The picture of a labyrinth cell is represented in Figure 2.5. Adjacent cells have common border columns or rows. For adjacent cells located vertically, row 2 of an upper cell and row 0 of a lower cell are the same rows. For adjacent cells located horizontally, column 3 of a left cell and column 0 of a right cell are the same columns.

Labyrinth Cell

Figure 2.5: Labyrinth Cell

The macros MAX_X and MAX_Y define maximum zero-based coordinates of a labyrinth cell. Thus, the labyrinth has (MAX_X+1)*3+1 character columns in width and (MAX_Y+1)*2+1 character rows in height. The macros ENTRY_X and ENTRY_Y define zero-based coordinates of the entry cell of the labyrinth relative to its upper left corner.

Within a labyrinth cell, space characters denote allowed movement paths, and other characters denote disallowed movement paths. Let us denote a character in column x and row y of a labyrinth cell by <x,y>. If a cell is reachable from the labyrinth entry, then characters <1,1> and <2,1> of the cell must be spaces; small circles denote them in the figure. If movement to a left and/or right adjacent cell is allowed, then characters <0,1> and/or <3,1> respectively must be spaces; left and right arrows denote them in the figure. If movement to an upper adjacent cell is allowed, then characters <1,0> and <2,0> must be spaces; up arrows denote them in the figure. If movement to a lower adjacent cell is allowed, then characters <1,2> and <2,2> must be spaces; down arrows denote them in the figure.

If characters <0,0>, <3,0>, <0,2>, and <3,2> of a cell are the characters ‘*’, then the cell contains gold. If there are several such cells, then visiting any of them will be considered taking the gold. It is provided in a single copy per visit of the labyrinth, i.e. if there are several cells with the gold, then visiting any of them removes it from the others until the next visit of the labyrinth. If those characters at the corners of a cell are the characters ‘#’, then the cell is a labyrinth exit. A labyrinth can have more than one exit. When modifying the picture of the labyrinth, make sure that cells are properly aligned and connected, and there are no paths outside of the labyrinth, otherwise program behavior is undefined.

The file samples/labyr2.c in the package distribution provides the source code of this example. Below is a copy of that source code. The command make builds this sample program when the configure script configured the QSMM package to use the Curses library. See the file INSTALL in the root of the package distribution for information on the configure script.

#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#if defined(HAVE_CURSES_H)
#   include <curses.h>
#elif defined(HAVE_NCURSES_CURSES_H)
#   include <ncurses/curses.h>
#endif

#include <qsmm/qsmm.h>

#define NVISIT  200
#define MAX_X   14
#define MAX_Y   14
#define ENTRY_X 0
#define ENTRY_Y 14
#define NSIG_IN (MAX_X+1>MAX_Y+1?MAX_X+1:MAX_Y+1)
#define KT_0    NVISIT
#define KT_1    1.0


#define CHK_FAIL(func, ...)                                               \
    do {                                                                  \
        int rc=func(__VA_ARGS__);                                         \
        if (rc<0) {                                                       \
            fprintf(stderr, #func ": %s\n", qsmm_err_str(rc));            \
            goto Exit;                                                    \
        }                                                                 \
    }                                                                     \
    while (0)


enum direct_e {
    DIRECT_NORTH=0,
    DIRECT_EAST =1,
    DIRECT_SOUTH=2,
    DIRECT_WEST =3
};


static long state_fq[3][MAX_Y+1][MAX_X+1];
static qsmm_actor_t actor=0;


static int show_gradient(
    char is_gf
);


static int
opaque_maze(
    enum direct_e direct
) {
    static const char *picture[]={
        // 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
        "+--------+-----+--------------------------#..#",
        "|        |     |                             |",  // 0
        "|  *  *  |     |                          #  #",
        "|        |     |                             |",  // 1
        "|  *  *  |     |     +-----+                 |",
        "|        |     |     |     |                 |",  // 2
        "|        |     |     |     |     +-----+     |",
        "|        |     |     |     |     |     |     |",  // 3
        "|        |     |     |     |     |     |     |",
        "|        |     |     |     |     |     |     |",  // 4
        "|        +-----+     |     |     |     |     |",
        "|                    |     |     |     |     |",  // 5
        "|                    |     |     |     |     |",
        "|                    |     |     |     |     |",  // 6
        "|  +-----------+     +-----+     |     |     |",
        "|  |           |                 |     |     |",  // 7
        "|  |           |                 |     |     |",
        "|  |           |                 |     |     |",  // 8
        "|  +-----------+     +-----+     |     |     |",
        "|                    |     |     |     |     |",  // 9
        "|                    |     |     |     |     |",
        "|                    |     |     |     |     |",  // 10
        "+--------------+     |     |     +-----+     |",
        "|              |     |     |                 |",  // 11
        "|              |     |     |                 |",
        "|              |     |     |                 |",  // 12
        "+--------------+     +-----+     +-----+     |",
        "|                                |     |     |",  // 13
        "|                                |     |     |",
        "|                                |     |     |",  // 14
        "+..+-----------------------------+-----+-----+"
    };
    static char is_gf=0;
    static int path_len, visit=0, xx=-1, yy=-1;
    int col, row;
    if (xx<0 || yy<0) {
        if (visit==NVISIT-1) {
            if (!initscr()) exit(2);
            noecho();
            for (row=0; row<(MAX_Y+1)*2+1; row++)
                mvaddstr(row,0,picture[row]);
        }
        xx=ENTRY_X;
        yy=ENTRY_Y;
        path_len=0;
    }
    assert(xx>=0 && xx<=MAX_X);
    assert(yy>=0 && yy<=MAX_Y);
    col=xx*3+1;
    row=yy*2+1;
    assert(picture[row][col]==' ' && picture[row][col+1]==' ');
    assert((picture[row-1][col]==' ' && picture[row-1][col+1]==' ') ||
           (picture[row-1][col]!=' ' && picture[row-1][col+1]!=' '));
    assert((picture[row+1][col]==' ' && picture[row+1][col+1]==' ') ||
           (picture[row+1][col]!=' ' && picture[row+1][col+1]!=' '));
    switch (direct) {
        case DIRECT_NORTH:
            if (picture[row-1][col]!=' ' || picture[row-1][col+1]!=' ')
                return 3;
            yy--;
            break;
        case DIRECT_EAST:
            if (picture[row][col+2]!=' ') return 3;
            xx++;
            break;
        case DIRECT_SOUTH:
            if (picture[row+1][col]!=' ' || picture[row+1][col+1]!=' ')
                return 3;
            yy++;
            break;
        case DIRECT_WEST:
            if (picture[row][col-1]!=' ') return 3;
            xx--;
            break;
        default:
            assert(0);
    }
    path_len++;
    if (visit==NVISIT-1) mvaddstr(row,col,"  ");
    col=xx*3+1;
    row=yy*2+1;
    state_fq[2][yy][xx]++;
    state_fq[(int) is_gf][yy][xx]++;
    if (visit==NVISIT-1) {
        char ss[128];
        const int picture_w=strlen(picture[0]);
        mvaddstr(row,col,"[]");
        sprintf(ss," Gold found: %d",is_gf);
        mvaddstr(1,picture_w+2,ss);
        sprintf(ss,"Path length: %d",path_len);
        mvaddstr(3,picture_w+2,ss);
        move((MAX_Y+1)*2+3,0);
        refresh();
        usleep(125000);
    }
    int result=0;
    if (picture[row-1][col-1]=='*' && picture[row-1][col+2]=='*' &&
        picture[row+1][col-1]=='*' && picture[row+1][col+2]=='*') {
        if (!is_gf) {
            is_gf=1;
            result=1;
        }
    }
    else if (picture[row-1][col-1]=='#' && picture[row-1][col+2]=='#' &&
             picture[row+1][col-1]=='#' && picture[row+1][col+2]=='#') {
        is_gf=0;
        result=2;
        xx=-1;
        yy=-1;
        if (visit==NVISIT-1) {
            int sel=-1;
            while (1) {
                row=(MAX_Y+1)*2+3;
                mvaddstr(row,2,"[g] - gradient before taking the gold");
                mvaddstr(row+1,2,"[G] - gradient after taking the gold");
                mvaddstr(row+2,2,"[Q] - quit");
                const int rc=getch();
                if (sel>=0) mvaddstr(row+sel,0," ");
                if (rc=='q' || rc=='Q') break;
                switch (rc) {
                    case 'g': sel=0; show_gradient(0); break;
                    case 'G': sel=1; show_gradient(1); break;
                }
                mvaddstr(row+sel,0,">");
            }
            endwin();
        }
        else visit++;
    }
    return result;
}


static int
show_gradient(
    char is_gf
) {
    int result=-1;
    assert(!is_gf || is_gf==1);
    qsmm_sig_t *const sig_ngram_p=qsmm_get_actor_sig_ngram(actor);
    sig_ngram_p[0]=is_gf;
    for (int yy=0; yy<=MAX_Y; yy++) {
        sig_ngram_p[2]=yy;
        for (int xx=0; xx<=MAX_X; xx++) {
            const char *ccp;
            if (state_fq[(int) is_gf][yy][xx]<1) {
                if (state_fq[!is_gf][yy][xx]<1) continue;
                else ccp="  ";
            }
            else {
                sig_ngram_p[1]=xx;
                CHK_FAIL(qsmm_actor_calc_action_prob, actor,
                         0, NSIG_IN, NSIG_IN+4, QSMM_PROB_LEARNT);
                const double *const prob_p=
                    qsmm_get_actor_choice_sig_prob(actor);
                double prob_max=-1;
                enum direct_e direct=DIRECT_NORTH;
                for (int ii=0; ii<4; ii++) {
                    const double prob=prob_p[NSIG_IN+ii];
                    if (prob_max>=prob) continue;
                    prob_max=prob;
                    direct=ii;
                }
                qsmm_actor_choice_sig_prob_release(actor);
                switch (direct) {
                    case DIRECT_NORTH: ccp="^ "; break;
                    case DIRECT_EAST:  ccp="> "; break;
                    case DIRECT_SOUTH: ccp="v "; break;
                    case DIRECT_WEST:  ccp="< "; break;
                    default: assert(0);
                }
            }
            mvaddstr(yy*2+1,xx*3+1,ccp);
        }
    }
    result=0;

Exit:
    return result;
}


int
main(
    int argc,
    char **argv
) {
    int path_len=0, n_gold_found=0, exit_code=1;
    const double mut=pow(KT_1/KT_0,1.0/NVISIT);
    double ktemperature=KT_0;
    struct qsmm_pair_sig_s range_sig[3];
    memset(range_sig,0,sizeof(range_sig));
    range_sig[0].second=1;
    range_sig[1].second=MAX_X;
    range_sig[2].second=MAX_Y;
    struct qsmm_actor_desc_s actor_desc;
    memset(&actor_desc,0,sizeof(actor_desc));
    actor_desc.nspur=1;
    actor_desc.ngram_sz=3;
    actor_desc.compat=1;
    actor_desc.sig_spec_type=QSMM_ACTOR_SIG_SPEC_IN_OUT;
    actor_desc.range_sig_p=range_sig;
    struct qsmm_actor_sig_spec_in_out_s *const iop=&actor_desc.sig_spec.in_out;
    iop->nsig_in=NSIG_IN;
    iop->nsig_out=4;
    CHK_FAIL(qsmm_actor_create,&actor_desc,&actor);
    CHK_FAIL(qsmm_set_actor_naction_per_evt,actor,0.25);
    const int seed=(argc>1?atoi(argv[1]):0);
    if (seed<0) qsmm_set_actor_random(actor,1);
    qsmm_rng_seed(qsmm_get_actor_rng(actor),abs(seed));
    for (int visit=0; visit<NVISIT; visit++) {
        char is_gf=0;
        int xx=ENTRY_X, yy=ENTRY_Y;
        qsmm_sig_t sig_action=QSMM_SIG_INVALID;
        CHK_FAIL(qsmm_set_actor_ktemperature,actor,ktemperature);
        while (1) {
            CHK_FAIL(qsmm_actor_shl_sig,actor,is_gf,0);
            CHK_FAIL(qsmm_actor_shl_sig,actor,xx,0);
            CHK_FAIL(qsmm_actor_reg_sig_in,actor,yy);
            CHK_FAIL(qsmm_actor_calc_action_prob, actor,
                     0, NSIG_IN, NSIG_IN+4, QSMM_PROB_LEARNED);
            CHK_FAIL(qsmm_get_actor_sig_action, actor,
                     0, NSIG_IN, NSIG_IN+4, &sig_action);
            CHK_FAIL(qsmm_actor_reg_sig_action,actor,sig_action);
            CHK_FAIL(qsmm_actor_time_delta,actor,1);
            const enum direct_e direct=sig_action-NSIG_IN;
            const int rc=opaque_maze(direct);
            switch (rc) {
                case 0:
                    break;
                case 1:
                    is_gf=1;
                    n_gold_found++;
                    break;
                case 2:
                    if (is_gf) CHK_FAIL(qsmm_actor_spur_delta,actor,0,1);
                    break;
                case 3:
                    continue;
                default:
                    assert(0);
            }
            if (rc==2) break;
            switch (direct) {
                case DIRECT_NORTH: yy--; break;
                case DIRECT_EAST:  xx++; break;
                case DIRECT_SOUTH: yy++; break;
                case DIRECT_WEST:  xx--; break;
                default: assert(0);
            }
            path_len++;
        }
        ktemperature*=mut;
    }
    printf("\nn_gold_found=%d\n",n_gold_found);
    exit_code=0;

Exit:
    qsmm_actor_destroy(actor);
    return exit_code;
}

Previous: , Up: Adaptive Probabilistic Mapping   [Contents][Index]