arXiv:2407.05116v1 Announce Type: new
Abstract: We present a new parser performance prediction (PPP) model using machine translation performance prediction system (MTPPS), statistically independent of any language or parser, relying only on extrinsic and novel features based on textual, link structural, and bracketing tree structural information. This new system, MTPPS-PPP, can predict the performance of any parser in any language and can be useful for estimating the grammatical difficulty when understanding a given text, for setting expectations from parsing output, for parser selection for a specific domain, and for parser combination systems. We obtain SoA results in PPP of bracketing $F_1$ with better results over textual features and similar performance with previous results that use parser and linguistic label specific information. Our results show the contribution of different types of features as well as rankings of individual features in different experimental settings (cased vs. uncased), in different learning tasks (in-domain vs. out-of-domain), with different training sets, with different learning algorithms, and with different dimensionality reduction techniques. We achieve $0.0678$ MAE and $0.85$ RAE in setting +Link, which corresponds to about $7.4%$ error when predicting the bracketing $F_1$ score for the Charniak and Johnson parser on the WSJ23 test set. MTPPS-PPP system can predict without parsing using only the text, without a supervised parser using only an unsupervised parser, without any parser or language dependent information, without using a reference parser output, and can be used to predict the performance of any parser in any language.
Source link
lol