activations_1

What Is an Activation Function?

An activation function is the function in an artificial neural network that takes a unit's processed input signal and produces its output signal. In other words, it determines the output value for a given input value.

Activation functions in hidden layers need to be non-linear so that the network can handle complex data and learn a variety of patterns. If every activation were a linear function, a stack of layers would compose into a single linear map of the input, so adding depth would add no expressive power and the network could only learn linear relationships.

Common activation functions include the sigmoid, ReLU, and tanh functions. Each one transforms its input in a different way, and that non-linearity is what lets the network learn complex patterns.

// darknet.h

typedef enum{
    LOGISTIC, RELU, RELIE, LINEAR, RAMP, TANH, PLSE, LEAKY, ELU, LOGGY, STAIR, HARDTAN, LHTAN, SELU
} ACTIVATION;

This enum defines the kinds of activation functions that are available.

Each activation function is chosen according to its characteristics. Apart from LINEAR, they all introduce some form of non-linearity, which helps a neural network model learn complex patterns.

The activation functions are as follows.

  • LOGISTIC : logistic (sigmoid) function

  • RELU : ReLU function

  • RELIE : variant of the ReLU function

  • LINEAR : linear function

  • RAMP : ramp function

  • TANH : hyperbolic tangent function

  • PLSE : piecewise-linear function

  • LEAKY : Leaky ReLU function

  • ELU : Exponential Linear Unit function

  • LOGGY : variant of the logistic function

  • STAIR : stair (step) function

  • HARDTAN : hard tanh function

  • LHTAN : leaky variant of the hard tanh function

  • SELU : Scaled Exponential Linear Unit function

These activation functions apply a suitable non-linear transformation to the values they receive, which lets the model capture patterns in the input data and make predictions.
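To make a few of these entries concrete, here is a minimal sketch of four of the piecewise functions. The function names are hypothetical, and the constants (the 0.1 slope for LEAKY, the breakpoints and slopes for PLSE) are assumptions about how darknet defines them, so check them against activations.h before relying on them.

```c
/* RELU: passes positive values through and clamps negatives to 0. */
float relu_sketch(float x) { return x > 0.0f ? x : 0.0f; }

/* LEAKY: negatives keep a small slope (0.1 is an assumed constant). */
float leaky_sketch(float x) { return x > 0.0f ? x : 0.1f * x; }

/* HARDTAN: identity on [-1, 1], clamped outside that range. */
float hardtan_sketch(float x)
{
    if (x < -1.0f) return -1.0f;
    if (x > 1.0f) return 1.0f;
    return x;
}

/* PLSE: piecewise-linear, sigmoid-like curve (breakpoints are assumed). */
float plse_sketch(float x)
{
    if (x < -4.0f) return 0.01f * (x + 4.0f);
    if (x > 4.0f) return 0.01f * (x - 4.0f) + 1.0f;
    return 0.125f * x + 0.5f;
}
```

The smooth entries (LOGISTIC, TANH, ELU, and so on) follow the same pattern but call into math.h, which is why they are left out of this sketch.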

get_activation_string

char *get_activation_string(ACTIVATION a)
{
    switch(a){
        case LOGISTIC:
            return "logistic";
        case LOGGY:
            return "loggy";
        case RELU:
            return "relu";
        case ELU:
            return "elu";
        case SELU:
            return "selu";
        case RELIE:
            return "relie";
        case RAMP:
            return "ramp";
        case LINEAR:
            return "linear";
        case TANH:
            return "tanh";
        case PLSE:
            return "plse";
        case LEAKY:
            return "leaky";
        case STAIR:
            return "stair";
        case HARDTAN:
            return "hardtan";
        case LHTAN:
            return "lhtan";
        default:
            break;
    }
    return "relu";
}

Function name: get_activation_string

Input:

  • a: activation function (enum value)

Behavior:

  • Returns the string that corresponds to the given activation function (enum value).

Description:

  • Given an activation function, this function returns the string associated with it.

  • The string is the lowercase name of the enum constant, for example "leaky" for LEAKY.

  • If no case matches the given value, the function falls back to returning "relu".
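The same enum-to-name mapping could also be written as a lookup table. The sketch below is an alternative design for illustration only, not how darknet implements it; `activation_names` and `activation_name_sketch` are hypothetical names.

```c
#include <stddef.h>

typedef enum {
    LOGISTIC, RELU, RELIE, LINEAR, RAMP, TANH, PLSE, LEAKY, ELU, LOGGY, STAIR, HARDTAN, LHTAN, SELU
} ACTIVATION;

/* Names indexed by enum value; designated initializers (C99) make the
   pairing explicit even if the enum order changes later. */
static const char *const activation_names[] = {
    [LOGISTIC] = "logistic", [RELU] = "relu",       [RELIE] = "relie",
    [LINEAR] = "linear",     [RAMP] = "ramp",       [TANH] = "tanh",
    [PLSE] = "plse",         [LEAKY] = "leaky",     [ELU] = "elu",
    [LOGGY] = "loggy",       [STAIR] = "stair",     [HARDTAN] = "hardtan",
    [LHTAN] = "lhtan",       [SELU] = "selu",
};

/* Hypothetical table-driven variant: out-of-range values fall back to
   "relu", matching the default described above. */
const char *activation_name_sketch(ACTIVATION a)
{
    size_t count = sizeof(activation_names) / sizeof(activation_names[0]);
    if ((size_t)a >= count || activation_names[a] == NULL) return "relu";
    return activation_names[a];
}
```

A table keeps each name next to its constant in one place; the switch in the original gets the same result and lets the compiler warn about unhandled enum values.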

get_activation

ACTIVATION get_activation(char *s)
{
    if (strcmp(s, "logistic")==0) return LOGISTIC;
    if (strcmp(s, "loggy")==0) return LOGGY;
    if (strcmp(s, "relu")==0) return RELU;
    if (strcmp(s, "elu")==0) return ELU;
    if (strcmp(s, "selu")==0) return SELU;
    if (strcmp(s, "relie")==0) return RELIE;
    if (strcmp(s, "plse")==0) return PLSE;
    if (strcmp(s, "hardtan")==0) return HARDTAN;
    if (strcmp(s, "lhtan")==0) return LHTAN;
    if (strcmp(s, "linear")==0) return LINEAR;
    if (strcmp(s, "ramp")==0) return RAMP;
    if (strcmp(s, "leaky")==0) return LEAKY;
    if (strcmp(s, "tanh")==0) return TANH;
    if (strcmp(s, "stair")==0) return STAIR;
    fprintf(stderr, "Couldn't find activation function %s, going with ReLU\n", s);
    return RELU;
}

Function name: get_activation

Input:

  • s: string naming the activation function

Behavior:

  • Returns the activation function (enum value) that corresponds to the given string.

Description:

  • Given the name of an activation function as a string, this function returns the matching enum value (an integer constant).

  • Internally, the string is compared against each known name with strcmp, so the match is exact and case-sensitive. If the string matches none of the names, the function returns RELU as the default.

  • In that fallback case, the function also prints a warning message to stderr.
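Because the fallback returns a valid enum value, a caller cannot tell a real "relu" from a misspelled name without watching stderr. The following sketch shows one way a caller-visible status could be added; `parse_activation_sketch` and its table are hypothetical and not part of darknet.

```c
#include <stdio.h>
#include <string.h>

typedef enum {
    LOGISTIC, RELU, RELIE, LINEAR, RAMP, TANH, PLSE, LEAKY, ELU, LOGGY, STAIR, HARDTAN, LHTAN, SELU
} ACTIVATION;

/* One row per accepted name. */
struct activation_entry { const char *name; ACTIVATION value; };

static const struct activation_entry activation_table[] = {
    {"logistic", LOGISTIC}, {"loggy", LOGGY},     {"relu", RELU},
    {"elu", ELU},           {"selu", SELU},       {"relie", RELIE},
    {"plse", PLSE},         {"hardtan", HARDTAN}, {"lhtan", LHTAN},
    {"linear", LINEAR},     {"ramp", RAMP},       {"leaky", LEAKY},
    {"tanh", TANH},         {"stair", STAIR},
};

/* Hypothetical helper: returns 1 on an exact match, 0 when it had to
   fall back to RELU. The result is written to *out in both cases. */
int parse_activation_sketch(const char *s, ACTIVATION *out)
{
    size_t count = sizeof(activation_table) / sizeof(activation_table[0]);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(s, activation_table[i].name) == 0) {
            *out = activation_table[i].value;
            return 1;
        }
    }
    fprintf(stderr, "Couldn't find activation function %s, going with ReLU\n", s);
    *out = RELU;
    return 0;
}
```

With this variant, "leaky" yields LEAKY with status 1, while "Leaky" (wrong case) yields RELU with status 0, so a configuration loader could decide to stop instead of silently training with a different activation.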

activate

float activate(float x, ACTIVATION a)
{
    switch(a){
        case LINEAR:
            return linear_activate(x);
        case LOGISTIC:
            return logistic_activate(x);
        case LOGGY:
            return loggy_activate(x);
        case RELU:
            return relu_activate(x);
        case ELU:
            return elu_activate(x);
        case SELU:
            return selu_activate(x);
        case RELIE:
            return relie_activate(x);
        case RAMP:
            return ramp_activate(x);
        case LEAKY:
            return leaky_activate(x);
        case TANH:
            return tanh_activate(x);
        case PLSE:
            return plse_activate(x);
        case STAIR:
            return stair_activate(x);
        case HARDTAN:
            return hardtan_activate(x);
        case LHTAN:
            return lhtan_activate(x);
    }
    return 0;
}

Function name: activate

Input:

  • x: input value for the activation function

  • a: activation function to apply

Behavior:

  • Applies the activation function selected by a (enum value) to the value x.

Description:

  • This function takes a real value x and an activation function (enum value), applies that activation to x, and returns the result as a float.

  • The supported activations are linear (linear), logistic (logistic), the logistic variant (loggy), ReLU (relu), ELU (elu), SELU (selu), RELIE (relie), RAMP (ramp), Leaky ReLU (leaky), hyperbolic tangent (tanh), PLSE (plse), STAIR (stair), HardTanh (hardtan), and LHTan (lhtan).

  • The function dispatches to the matching *_activate helper, passes it x, and returns the result. If a matches none of the cases, it returns 0.
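The dispatch pattern can be seen in isolation with a cut-down version that has only three activations. All names below carry a `_SKETCH` or `_sketch` suffix to mark them as hypothetical, and the 0.1 leaky slope is an assumed constant.

```c
/* Reduced enum with three members, for illustration only. */
typedef enum { LINEAR_SKETCH, RELU_SKETCH, LEAKY_SKETCH } ACTIVATION_SKETCH;

float linear_activate_sketch(float x) { return x; }
float relu_activate_sketch(float x) { return x > 0.0f ? x : 0.0f; }
float leaky_activate_sketch(float x) { return x > 0.0f ? x : 0.1f * x; }

/* Same shape as activate(): one switch, one helper call per case. */
float activate_sketch(float x, ACTIVATION_SKETCH a)
{
    switch (a) {
        case LINEAR_SKETCH: return linear_activate_sketch(x);
        case RELU_SKETCH:   return relu_activate_sketch(x);
        case LEAKY_SKETCH:  return leaky_activate_sketch(x);
    }
    return 0; /* reached only when a is not a listed enum value */
}
```

For x = -2, the linear case returns -2 and the ReLU case returns 0. Passing a value outside the enum falls through the switch and returns 0, the same silent fallback as in the full function.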

activate_array

void activate_array(float *x, const int n, const ACTIVATION a)
{
    int i;
    for(i = 0; i < n; ++i){
        x[i] = activate(x[i], a);
    }
}

Function name: activate_array

Input:

  • x: array of input values (overwritten with the results)

  • n: number of elements in the array

  • a: activation function

Behavior:

  • Applies the activation function a to every element of the array x.

Description:

  • This function takes the array x, its size n, and the activation function a as input.

  • For each element of x, it calls activate and stores the activated value back into the same element of x.

  • Once the loop has covered all n elements, x holds the activated values. The operation is in place, so the original inputs are not preserved.

gradient

float gradient(float x, ACTIVATION a)
{
    switch(a){
        case LINEAR:
            return linear_gradient(x);
        case LOGISTIC:
            return logistic_gradient(x);
        case LOGGY:
            return loggy_gradient(x);
        case RELU:
            return relu_gradient(x);
        case ELU:
            return elu_gradient(x);
        case SELU:
            return selu_gradient(x);
        case RELIE:
            return relie_gradient(x);
        case RAMP:
            return ramp_gradient(x);
        case LEAKY:
            return leaky_gradient(x);
        case TANH:
            return tanh_gradient(x);
        case PLSE:
            return plse_gradient(x);
        case STAIR:
            return stair_gradient(x);
        case HARDTAN:
            return hardtan_gradient(x);
        case LHTAN:
            return lhtan_gradient(x);
    }
    return 0;
}

Function name: gradient

Input:

  • x: value at which the derivative of the activation function is evaluated

  • a: activation function whose derivative is needed

Behavior:

  • Computes and returns the derivative of the activation function a for the value x.

Description:

  • When backpropagation is applied to a neural network, the weights must be adjusted to minimize the error. Doing so requires the derivative (gradient) of the activation function that turns each node's input value into its output value.

  • The gradient function uses a switch statement to dispatch on a to the matching *_gradient helper and returns that helper's result. If a matches none of the cases, it returns 0.
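One detail worth illustrating is what value the derivative helpers expect. The sketch below assumes the convention, which appears to be the one darknet's helpers follow, that the derivative is written in terms of the activation's output y rather than its raw input. That convention, the function names, and the 0.1 leaky slope are all assumptions to verify against activations.h.

```c
/* Logistic: if y = 1 / (1 + exp(-x)), then dy/dx = y * (1 - y). */
float logistic_gradient_sketch(float y) { return (1.0f - y) * y; }

/* Tanh: if y = tanh(x), then dy/dx = 1 - y * y. */
float tanh_gradient_sketch(float y) { return 1.0f - y * y; }

/* ReLU: the output is positive exactly when the input was positive. */
float relu_gradient_sketch(float y) { return y > 0.0f ? 1.0f : 0.0f; }

/* Leaky ReLU with an assumed negative-side slope of 0.1. */
float leaky_gradient_sketch(float y) { return y > 0.0f ? 1.0f : 0.1f; }
```

Under this convention no exponential has to be recomputed in the backward pass: a logistic output of 0.5 gives a derivative of 0.25 directly.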

gradient_array

void gradient_array(const float *x, const int n, const ACTIVATION a, float *delta)
{
    int i;
    for(i = 0; i < n; ++i){
        delta[i] *= gradient(x[i], a);
    }
}

Function name: gradient_array

Input:

  • x: pointer to the input array

  • n: size of the input array

  • a: activation function

  • delta: pointer to the array that is updated in place (both input and output)

Behavior:

  • Multiplies each element of delta by the derivative of the activation function evaluated at the corresponding element of x.

  • This function is used in the backpropagation algorithm. On entry, delta holds the gradient passed back from the following layer; on return, it holds that gradient scaled by the activation derivative, ready to be propagated toward the preceding layer.

Description:

  • Backpropagation is the technique used in deep-learning training to update weights and biases in the direction that minimizes the error between the output value and the actual value.

  • Within that process, gradient_array performs the chain-rule step for the activation: delta[i] is multiplied by the activation derivative for x[i]. The resulting values are then used when the weights and biases are updated.
