Christian Boitet
On the automatic transformation of a
set of EBMT Constituent Boundary
Patterns into a Context-Free Grammar,
and associated bottom-up algorithms
Abstract:EBMT as pursued at ATR uses a set of “Constituent Boundary Patterns" to describe the input language.
We show how to automatically convert such a set P into an equivalent CFG G=T(P), which is not only
weakly (generatively) equivalent, but quasi-structurally equivalent: there is a very simple homomorphism
transforming a G-tree into the corresponding P-tree.
As a result, any classical CFG-based analysis algorithm may be used with G=T(P), and the parse trees
converted to P-trees. In particular, all bottom-up algorithms are applicable. However, due to the layered
character of the set of non-terminals of any such G, it is possible to propose a simple and efficient (non-
deterministic) bottom-up analysis algorithm.
Because special "constituent boundary" tags are introduced by a pre-processing phase into the strings
described by P (and hence by G), it is also possible to adapt the idea of "operator precedence parsing" to
G, and perhaps even directly to P.