Previous: , Up: Adaptive Probabilistic Mapping   [Contents][Index]


2.11 Example of Using the Actor API

In the example of using the Actor API, an agent controlled by a sample program has to find a short path to gold located in a labyrinth and then to an exit from the labyrinth. The source code of the sample program contains the picture of that labyrinth encoded using a subset of ASCII characters resembling pseudographics. You can change that picture to test agent behavior for various labyrinth configurations.

To find a resulting path, the agent visits the labyrinth the number of times defined by the macro NVISIT. Moving the agent to a cell designated as a labyrinth exit finishes a labyrinth visit. A custom function returning the relative probability of an output signal is the exponent of an expression equal to mean spur increment for a cycle type divided by actor temperature. The sample program uses a simulated annealing approach, where actor temperature gradually decreases with every subsequent labyrinth visit. The macro KT_0 defines initial actor temperature, and the macro KT_1 defines final temperature.

At last labyrinth visit, the sample program shows the picture of the labyrinth and agent movements in it, prints an indication whether the agent took the gold, and displays the current length of a visiting path. After finishing the last visit, a user can press Q to create the files labyr_0.asy and labyr_1.asy, print the number of labyrinth visits when the agent took the gold, and exit the sample program. Those two files correspond to the case before taking the gold by the agent and the case after taking the gold respectively. Each file includes the file labyr.asy and represents a labyrinth picture containing the following information:

To process both files by Asymptote and produce the image files labyr_0.png and labyr_1.png, use the commands:

asy -f png labyr_0.asy
asy -f png labyr_1.asy

You can specify a random seed by a program argument. If the random seed is non-negative, the actor operates adaptively (normally). If the random seed is negative, the actor operates randomly. You can compare agent behavior and resulting files labyr_0.png and labyr_1.png for these two modes of program execution.

For example, after executing the command

./labyr2 1

the file labyr_0.png produced from labyr_0.asy contains the image

Image labyr_0.png produced from labyr_0.asy

Figure 2.6: Image labyr_0.png produced from labyr_0.asy

and the file labyr_1.png produced from labyr_1.asy contains the image

Image labyr_1.png produced from labyr_1.asy

Figure 2.7: Image labyr_1.png produced from labyr_1.asy

In the source code of the sample program, each labyrinth cell is a rectangle 4x3 (WxH). The picture of a labyrinth cell is represented in Figure 2.8. Adjacent cells have common border columns or rows. For adjacent cells located vertically, row 2 of an upper cell and row 0 of a lower cell are the same rows. For adjacent cells located horizontally, column 3 of a left cell and column 0 of a right cell are the same columns.

Labyrinth cell

Figure 2.8: Labyrinth cell

The macros MAX_X and MAX_Y define maximum zero-based coordinates of a labyrinth cell. Thus, the labyrinth has (MAX_X+1)*3+1 character columns in width and (MAX_Y+1)*2+1 character rows in height. The macros ENTRY_X and ENTRY_Y define zero-based coordinates of the entry cell of the labyrinth relative to its upper left corner.

Within a labyrinth cell, space characters denote allowed movement paths, and other characters denote disallowed movement paths. Let us denote a character in column x and row y of a labyrinth cell by <x,y>. If a cell is reachable from the labyrinth entry, then characters <1,1> and <2,1> of the cell must be spaces; small circles denote them in the figure. If the cell allows the agent to move to a left and/or right adjacent cell, then characters <0,1> and/or <3,1> respectively must be spaces; left and right arrows denote them in the figure. If the cell allows the agent to move to an upper adjacent cell, then characters <1,0> and <2,0> must be spaces; up arrows denote them in the figure. If the cell allows the agent to move to a lower adjacent cell, then characters <1,2> and <2,2> must be spaces; down arrows denote them in the figure.

If characters <0,0>, <3,0>, <0,2>, and <3,2> of a cell are the characters ‘*’, the cell contains gold. If there are multiple such cells, visiting any of them means taking the gold. It is in a single copy per labyrinth visit—if there are multiple cells with the gold, visiting any of them implicitly removes it from the others until the next labyrinth visit. If characters at the cell corners are the characters ‘#’, the cell is a labyrinth exit. A labyrinth can have more than one exit. When modifying the picture of a labyrinth, make sure that its cells are properly aligned and connected, and there are no paths outside of the labyrinth—violating these rules leads to undefined program behavior.

The file samples/labyr2.c in the package distribution provides the source code of this sample program. Below is a copy of that source code. The command make builds the sample program if the configure script has configured QSMM to use the ncurses library. See the file INSTALL in the root of the package distribution for information on the configure script.

#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#if defined(HAVE_CURSES_H)
#   include <curses.h>
#elif defined(HAVE_NCURSES_CURSES_H)
#   include <ncurses/curses.h>
#endif

#include <qsmm/qsmm.h>

#define NVISIT  200
#define MAX_X   14
#define MAX_Y   14
#define ENTRY_X 0
#define ENTRY_Y 14
#define NSIG_IN (MAX_X+1>MAX_Y+1?MAX_X+1:MAX_Y+1)
#define KT_0    (NVISIT*2)
#define KT_1    (1.0/KT_0)


#define CHK_FAIL(func, ...)                                               \
    do {                                                                  \
        const int rc=func(__VA_ARGS__);                                   \
        if (rc<0) {                                                       \
            fprintf(stderr, #func ": %s\n", qsmm_err_str(rc));            \
            goto Exit;                                                    \
        }                                                                 \
    }                                                                     \
    while (0)


#define ERREXIT(fmt, ...)                                                 \
    do {                                                                  \
        fprintf(stderr,fmt "\n", ## __VA_ARGS__);                         \
        goto Exit;                                                        \
    }                                                                     \
    while (0)


enum direct_e {
    DIRECT_NORTH,  // move one step up
    DIRECT_EAST,   // move one step right
    DIRECT_SOUTH,  // move one step down
    DIRECT_WEST,   // move one step left
    DIRECT_COUNT   // number of movement directions
};


static char last_path[2][MAX_Y+1][MAX_X+1];
static long fq_max[3], state_fq[3][MAX_Y+1][MAX_X+1];
static qsmm_actor_t actor=0;


// Generate an Asymptote file containing a picture of the labyrinth with
// learned movement gradients for the cases when the gold was or was not
// taken.
// Returns: 0 = success;
//         -1 = failure.

static int
create_pic(
    const char **pic_pp,
    char is_gf
) {
    char fln[]="labyr_?.asy";
    int ii, result=-1;
    fln[6]=is_gf?'1':'0';
    FILE *const filep=fopen(fln,"w");
    if (!filep) ERREXIT("%s: failed to open the file for writing",fln);
    fprintf(filep,"import labyr;\n\n");
    fprintf(filep,"set_dims(%d,%d);\n",MAX_X,MAX_Y);
    qsmm_sig_t *const sig_ngram_p=qsmm_get_actor_sig_ngram(actor);
    sig_ngram_p[0]=is_gf;
    for (int yy=0; yy<=MAX_Y; yy++)
        for (int xx=0; xx<=MAX_X; xx++) {
            if (state_fq[2][yy][xx]<1) {
                fprintf(filep,"black_box(%d,%d);\n",xx,yy);
                continue;
            }
            fprintf(filep, "fill_box(%d,%d,%.2f);\n",
                    xx, yy, (double) state_fq[(int) is_gf][yy][xx]/
                            fq_max[(int) is_gf]);
            const int row=yy*2, col=xx*3;
            for (const char *ccp="#*"; *ccp; ccp++)
                if (pic_pp[row][col]==*ccp && pic_pp[row][col+3]==*ccp &&
                    pic_pp[row+2][col+3]==*ccp && pic_pp[row+2][col]==*ccp)
                    fprintf(filep, "%s(%d,%d);\n",
                            *ccp=='#'?"exit":"gold", xx, yy);
            sig_ngram_p[1]=xx;
            sig_ngram_p[2]=yy;
            CHK_FAIL(qsmm_actor_calc_action_prob, actor,
                     0, NSIG_IN, NSIG_IN+DIRECT_COUNT, QSMM_PROB_LEARNED);
            const double *const prob_p=
                qsmm_get_actor_choice_sig_prob(actor);
            double prob_max=0, ww[DIRECT_COUNT];
            for (ii=0; ii<DIRECT_COUNT; ii++) {
                const double prob=prob_p[NSIG_IN+ii];
                if (prob_max<prob) prob_max=prob;
                ww[ii]=prob;
            }
            for (ii=0; ii<DIRECT_COUNT; ii++) ww[ii]/=prob_max;
            fprintf(filep, "arrows(%s,%d,%d,%.2f,%.2f,%.2f,%.2f);\n",
                    last_path[(int) is_gf][yy][xx]?"true":"false",
                    xx, yy, ww[0], ww[1], ww[2], ww[3]);
            qsmm_actor_choice_sig_prob_release(actor);
        }
    fprintf(filep,"entry(%d,%d);\n",ENTRY_X,ENTRY_Y);
    if (ferror(filep)) ERREXIT("%s: failed to write to the file",fln);
    result=0;

Exit:
    if (filep) fclose(filep);
    return result;
}


// Make a movement in the labyrinth in a specified direction possibly
// showing the movement on the screen.
// Returns: 3 = movement resulted in exiting the labyrinth;
//          2 = movement resulted in taking the gold;
//          1 = movement made successfully;
//          0 = cannot make the movement because a new location cell of the
//              agent is not empty.

static int
opaque_maze(
    enum direct_e direct
) {
    static const char *picture[]={
        // 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
        "+--------+-----+--------------------------#..#",
        "|        |     |                             |",  // 0
        "|  *  *  |     |                          #  #",
        "|        |     |                             |",  // 1
        "|  *  *  |     |     +-----+                 |",
        "|        |     |     |     |                 |",  // 2
        "|        |     |     |     |     +-----+     |",
        "|        |     |     |     |     |     |     |",  // 3
        "|        |     |     |     |     |     |     |",
        "|        |     |     |     |     |     |     |",  // 4
        "|        +-----+     |     |     |     |     |",
        "|                    |     |     |     |     |",  // 5
        "|                    |     |     |     |     |",
        "|                    |     |     |     |     |",  // 6
        "|  +-----------+     +-----+     |     |     |",
        "|  |           |                 |     |     |",  // 7
        "|  |           |                 |     |     |",
        "|  |           |                 |     |     |",  // 8
        "|  +-----------+     +-----+     |     |     |",
        "|                    |     |     |     |     |",  // 9
        "|                    |     |     |     |     |",
        "|                    |     |     |     |     |",  // 10
        "+--------------+     |     |     +-----+     |",
        "|              |     |     |                 |",  // 11
        "|              |     |     |                 |",
        "|              |     |     |                 |",  // 12
        "+--------------+     +-----+     +-----+     |",
        "|                                |     |     |",  // 13
        "|                                |     |     |",
        "|                                |     |     |",  // 14
        "+..+-----------------------------+-----+-----+"
    };
    static char is_gf=0;
    static int path_len, visit=0, xx=-1, yy=-1;
    int ii, col, row;
    if (xx<0 || yy<0) {
        if (visit==NVISIT-1) {
            if (!initscr()) exit(2);
            noecho();
            for (row=0; row<(MAX_Y+1)*2+1; row++)
                mvaddstr(row,0,picture[row]);
        }
        xx=ENTRY_X;
        yy=ENTRY_Y;
        path_len=0;
    }
    assert(xx>=0 && xx<=MAX_X);
    assert(yy>=0 && yy<=MAX_Y);
    if (visit==NVISIT-1) last_path[(int) is_gf][yy][xx]=1;
    col=xx*3+1;
    row=yy*2+1;
    assert(picture[row][col]==' ' && picture[row][col+1]==' ');
    assert((picture[row-1][col]==' ' && picture[row-1][col+1]==' ') ||
           (picture[row-1][col]!=' ' && picture[row-1][col+1]!=' '));
    assert((picture[row+1][col]==' ' && picture[row+1][col+1]==' ') ||
           (picture[row+1][col]!=' ' && picture[row+1][col+1]!=' '));
    switch (direct) {
        case DIRECT_NORTH:
            if (picture[row-1][col]!=' ' || picture[row-1][col+1]!=' ')
                return 0;
            yy--;
            break;
        case DIRECT_EAST:
            if (picture[row][col+2]!=' ') return 0;
            xx++;
            break;
        case DIRECT_SOUTH:
            if (picture[row+1][col]!=' ' || picture[row+1][col+1]!=' ')
                return 0;
            yy++;
            break;
        case DIRECT_WEST:
            if (picture[row][col-1]!=' ') return 0;
            xx--;
            break;
        default:
            assert(0);
    }
    path_len++;
    if (visit==NVISIT-1) mvaddstr(row,col,"  ");
    col=xx*3+1;
    row=yy*2+1;
    state_fq[2][yy][xx]++;
    state_fq[(int) is_gf][yy][xx]++;
    for (ii=0; ii<3; ii++)
        if (fq_max[ii]<state_fq[ii][yy][xx])
            fq_max[ii]=state_fq[ii][yy][xx];
    if (visit==NVISIT-1) {
        char ss[64];
        const int picture_w=strlen(picture[0]);
        mvaddstr(row,col,"[]");
        sprintf(ss," Gold found: %d",is_gf);
        mvaddstr(1,picture_w+2,ss);
        sprintf(ss,"Path length: %d",path_len);
        mvaddstr(3,picture_w+2,ss);
        move((MAX_Y+1)*2+3,0);
        refresh();
        usleep(125000);
    }
    int result=1;
    for (const char *ccp="#*"; *ccp; ccp++)
        if (picture[row-1][col-1]==*ccp && picture[row-1][col+2]==*ccp &&
            picture[row+1][col-1]==*ccp && picture[row+1][col+2]==*ccp) {
            if (*ccp=='#') {
                is_gf=0;
                result=3;
                xx=-1;
                yy=-1;
                if (visit==NVISIT-1) {
                    row=(MAX_Y+1)*2+3;
                    mvaddstr(row,0,"Press [Q] to exit");
                    while (1) {
                        int key=getch();
                        if (key=='q' || key=='Q') break;
                    }
                    for (ii=0; ii<2; ii++)
                        if (create_pic(picture,ii)<0) exit(2);
                    endwin();
                }
                else visit++;
            }
            else if (!is_gf) {
                is_gf=1;
                result=2;
            }
            break;
        }
    return result;
}


// Helper function for computing the relative probability of an
// output signal.

static double
relprob_user2(
    qsmm_actor_t actor,
    qsmm_sig_t sig_cycle,
    const struct qsmm_state_s *state_p,
    const struct qsmm_cycle_s *cycle_p,
    const struct qsmm_sspur_s *sspur_p,
    const struct qsmm_cspur_s *cspur_p,
    void *paramp
) {
    const double period_sum_c=cycle_p->period_sum_c;
    return (period_sum_c?cspur_p[0].delta_sum/period_sum_c:0)/
        qsmm_get_actor_ktemperature(actor);
}


// Find a path to gold in the labyrinth and then to its exit.

int
main(
    int argc,
    char **argv
) {
    int exit_code=1;
    struct qsmm_pair_sig_s range_sig[3];
    memset(range_sig,0,sizeof(range_sig));
    range_sig[0].second=1;
    range_sig[1].second=MAX_X;
    range_sig[2].second=MAX_Y;
    struct qsmm_actor_desc_s actor_desc;
    memset(&actor_desc,0,sizeof(actor_desc));
    actor_desc.nspur=1;
    actor_desc.ngram_sz=3;
    actor_desc.compat=1;
    actor_desc.sig_spec_type=QSMM_ACTOR_SIG_SPEC_IN_OUT;
    actor_desc.range_sig_p=range_sig;
    struct qsmm_actor_sig_spec_in_out_s *const iop=&actor_desc.sig_spec.in_out;
    iop->nsig_in=NSIG_IN;
    iop->nsig_out=DIRECT_COUNT;
    CHK_FAIL(qsmm_actor_create,&actor_desc,&actor);
    qsmm_set_actor_relprob_type(actor,QSMM_RELPROB_USER2);
    qsmm_set_actor_relprob_helper(actor,&relprob_user2,0);
    const int seed=(argc>1?atoi(argv[1]):0);
    if (seed<0) qsmm_set_actor_random(actor,1);
    qsmm_rng_seed(qsmm_get_actor_rng(actor),abs(seed));
    int n_gold_found=0;
    const double mut=pow(KT_1/KT_0,1.0/NVISIT);
    double ktemperature=KT_0;
    for (int visit=0; visit<NVISIT; visit++) {
        char is_gf=0;
        int xx=ENTRY_X, yy=ENTRY_Y;
        qsmm_sig_t sig_action=QSMM_SIG_INVALID;
        CHK_FAIL(qsmm_set_actor_ktemperature,actor,ktemperature);
        while (1) {
            CHK_FAIL(qsmm_actor_shl_sig,actor,is_gf,0);
            CHK_FAIL(qsmm_actor_shl_sig,actor,xx,0);
            CHK_FAIL(qsmm_actor_reg_sig_in,actor,yy);
            CHK_FAIL(qsmm_actor_calc_action_prob, actor,
                     0, NSIG_IN, NSIG_IN+DIRECT_COUNT, QSMM_PROB_LEARNED);
            CHK_FAIL(qsmm_get_actor_sig_action, actor,
                     0, NSIG_IN, NSIG_IN+DIRECT_COUNT, &sig_action);
            CHK_FAIL(qsmm_actor_reg_sig_action,actor,sig_action);
            CHK_FAIL(qsmm_actor_time_delta,actor,1);
            const enum direct_e direct=sig_action-NSIG_IN;
            const int rc=opaque_maze(direct);
            switch (rc) {
                case 0:
                    continue;
                case 1:
                    break;
                case 2:
                    is_gf=1;
                    n_gold_found++;
                    break;
                case 3:
                    if (is_gf) CHK_FAIL(qsmm_actor_spur_delta,actor,0,1);
                    break;
                default:
                    assert(0);
            }
            if (rc==3) break;
            switch (direct) {
                case DIRECT_NORTH: yy--; break;
                case DIRECT_EAST:  xx++; break;
                case DIRECT_SOUTH: yy++; break;
                case DIRECT_WEST:  xx--; break;
                default: assert(0);
            }
        }
        ktemperature*=mut;
    }
    printf("\nn_gold_found=%d\n",n_gold_found);
    exit_code=0;

Exit:
    qsmm_actor_destroy(actor);
    return exit_code;
}

Previous: , Up: Adaptive Probabilistic Mapping   [Contents][Index]