Ronald CHRISLEY, Erik MCDERMOTT and Shigeru KATAGIRI
Objective Functions for Improved Pattern
Classification with Back-propagation Networks
Abstract: A discrepancy is noted between the error measure implied by standard
objective functions used for the training of back-propagation networks
and their actual error in performance. Specifically, if one uses such a
network for pattern classification, with one output node per class, and
the most active output node indicating the network's classification of the
input, then standard objective functions will 1) ascribe non-zero error to
network states that are classifying correctly and 2) modify the network
more than is necessary to account for incorrectly classified input, thus
violating the "minimal disturbance principle." It is hypothesized that
objective functions that lack these two characteristics will more closely
reflect the actual recognition error and thus their use will result in better
performance (i.e., fewer classification errors). Several such functions are
presented, and a few are benchmarked against standard error functions on
phoneme recognition tasks. Two of the methods show a consistent
improvement in performance on a small (BDG) task, but result in worse
performance for a large (all consonants) task.
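To make the discrepancy concrete, the sketch below (a hypothetical illustration, not code from the paper; the activation values and the simple 0/1 classification error are assumed for the example) shows a one-output-node-per-class network whose most active node already matches the one-hot target, yet whose squared-error objective remains non-zero:

```python
import numpy as np

# Hypothetical example: one output node per class, one-hot target,
# most active output node taken as the network's classification.
target = np.array([1.0, 0.0, 0.0])    # correct class is class 0
output = np.array([0.6, 0.3, 0.1])    # network output activations

# The classification is already correct: argmax of output matches the target class.
classified_correctly = np.argmax(output) == np.argmax(target)   # True

# Yet the standard squared-error objective still reports non-zero error,
# so back-propagation keeps adjusting weights for this correctly handled input.
mse = np.mean((output - target) ** 2)    # approx. 0.087, not 0

# An error measure tied to recognition performance would report zero here.
classification_error = 0.0 if classified_correctly else 1.0

print(classified_correctly, mse, classification_error)
```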