Part of Advances in Neural Information Processing Systems 3 (NIPS 1990)
John Bridle, Stephen J. Cox
A particular form of neural network is described, which has terminals for acoustic patterns, class labels and speaker parameters. A method of training this network to "tune in" the speaker parameters to a particular speaker is outlined, based on a trick for converting a supervised network to an unsupervised mode. We describe experiments using this approach in isolated word recognition based on whole-word hidden Markov models. The results indicate an improvement over speaker-independent perfor(cid:173) mance and, for unlabelled data, a performance close to that achieved on labelled data.