𝔖 Bobbio Scriptorium
✦   LIBER   ✦

The effects of learning, parameter tying and model refinement for improving probabilistic tagging

✍ Scribed by Y.-C. Lin; T.-H. Chiang; K.-Y. Su


Publisher: Elsevier Science
Year: 1995
Tongue: English
Weight: 184 KB
Volume: 9
Category: Article
ISSN: 0885-2308


✦ Synopsis


To reduce the estimation error introduced by insufficient training data, the parameters of probabilistic models are usually smoothed by techniques such as Good-Turing smoothing and back-off smoothing. However, smoothing alone cannot significantly enhance the discriminative power of the model. Therefore, in this paper an adaptive learning method is adopted to enhance the discriminative power of a probabilistic model. Also, a novel tying scheme is proposed to tie the unreliable parameters that never or rarely occurred in the training data, so that those parameters have more chance of being adjusted by the learning procedure. In the task of tagging the Brown Corpus, this approach greatly reduces the number of parameters from 578 759 to 27 947 and reduces the error rate of the ambiguous words (i.e. the words with more than one possible part of speech) from 5.48% to 4.93%, corresponding to a 10.4% error reduction rate. Furthermore, a probabilistic model is usually simplified to enable reliable estimates of its parameters from the limited amount of training data. As a consequence, the modelling error is increased, because some discriminative features are sacrificed in simplifying the model. Therefore, a probabilistic classification model is proposed to reduce the modelling error by making better use of the discriminative features selected by the Classification and Regression Tree method. This proposed model achieves a 19.16% error reduction rate for the top 30 error-contributing words, which contribute 31.64% of the overall tagging errors.
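For readers unfamiliar with the smoothing baseline named in the synopsis, the following is a minimal sketch of the basic Good-Turing count adjustment, r* = (r + 1) · N_{r+1} / N_r, where N_r is the number of events seen exactly r times. It is not the authors' implementation. The function names, the toy tag-bigram counts, and the fallback to the raw count when N_{r+1} is zero are all assumptions made for illustration.

```python
from collections import Counter


def good_turing_adjusted_counts(counts):
    """Return {event: r*} using r* = (r + 1) * N_{r+1} / N_r.

    Assumption for this sketch: when no event was seen r + 1 times
    (N_{r+1} == 0), the raw count r is kept unchanged.
    """
    freq_of_freq = Counter(counts.values())  # N_r: how many events occur r times
    adjusted = {}
    for event, r in counts.items():
        n_r = freq_of_freq[r]
        n_r_plus_1 = freq_of_freq.get(r + 1, 0)
        if n_r_plus_1 > 0:
            adjusted[event] = (r + 1) * n_r_plus_1 / n_r
        else:
            adjusted[event] = float(r)
    return adjusted


def unseen_mass(counts):
    """Probability mass Good-Turing reserves for unseen events: N_1 / N."""
    total = sum(counts.values())
    n_1 = sum(1 for r in counts.values() if r == 1)
    return n_1 / total if total else 0.0


# Hypothetical tag-bigram counts, invented purely for illustration.
toy_counts = {("DT", "NN"): 3, ("NN", "VB"): 2, ("VB", "DT"): 1, ("JJ", "NN"): 1}
adjusted = good_turing_adjusted_counts(toy_counts)
reserved = unseen_mass(toy_counts)
```

Smoothing of this kind only redistributes probability mass toward rare and unseen events. As the synopsis notes, it does not by itself make the model more discriminative, which is the motivation for the adaptive learning and parameter tying steps.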


📜 SIMILAR VOLUMES


Covariate measurement error and the esti
✍ Tor D. Tosteson; John P. Buonaccorsi; Eugene Demidenko πŸ“‚ Article πŸ“… 1998 πŸ› John Wiley and Sons 🌐 English βš– 163 KB πŸ‘ 3 views

We explore the effects of measurement error in a time-varying covariate for a mixed model applied to a longitudinal study of plasma levels and dietary intake of beta-carotene. We derive a simple expression for the bias of large sample estimates of the variance of random effects in a longitudinal mod