Extended Tree Transducers in Natural Language Processing
Andreas Maletti
Institute for Natural Language Processing, Universität Stuttgart
Grenoble, May 28, 2015

Machine Translation
Original
- Die Adressaten dieses Vortrags sind Studierende und im Publikum werden sich Studierende befinden. (The addressees of this talk are students, and students will be in the audience.)
- An den wissenschaftlichen Vortrag schließt sich eine öffentliche Diskussion an. (The scientific lecture is followed by a public discussion.)
Translation (Google Translate)
- The addressees of this paper are students and students will be in the audience are.
- To scientific lecture, a public discussion follows on.

Machine Translation
Vauquois triangle
[Figure: triangle between German and the foreign language with the levels phrase, syntax, and semantics]
Translation models: string-to-tree, tree-to-tree

Machine Translation
Training data
- parallel corpus
- word alignments
- parse trees for the target sentences
Parallel corpus: linguistic resource containing example translations (sentence level)

Machine Translation: parallel corpus, word alignments, parse tree
[Figure: the sentence pair "I would like your advice about Rule 143 concerning inadmissibility" / "Könnten Sie mir eine Auskunft zu Artikel 143 im Zusammenhang mit der Unzulässigkeit geben" with word alignments and a parse tree (root S) over the German sentence]
- word alignments via GIZA++ [Och, Ney, 2003]
- parse tree via Berkeley parser [Petrov et al., 2006]

Extended Tree Transducer
Extended top-down tree transducer (STSG)
- variant of [M., Graehl, Hopkins, Knight, 2009]
- rules of the form NT → (r, r1)
  - nonterminal NT
  - right-hand side r of context-free grammar rule
  - right-hand side r1 of regular tree grammar rule
  - (bijective) synchronization of nonterminals
[Figure: example rule with left-hand side S; its string side contains "advice", its tree side is a German tree fragment containing "eine Auskunft" and "geben", and the nonterminals KOUS, PPER, and PP occur on both sides]

Extended Tree Transducer
Rule application
1. Selection of synchronous nonterminals
2. Selection of suitable rule
3. Replacement on both sides
[Figure: two worked examples on the rule for S. In the first, the synchronized nonterminal KOUS is rewritten with the rule KOUS → (would like, KOUS(Könnten)). In the second, a synchronized nonterminal PP is rewritten with a rule for PP.]

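The three steps of rule application can be sketched in Python. This is a minimal illustration, not the speaker's implementation: the class `NT`, the functions `relabel`, `substitute`, and `apply_rule`, the tuple encoding of trees, and the exact shape of the rule for S are assumptions reconstructed loosely from the figure.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class NT:
    """Nonterminal occurrence; equal `link` values mark synchronized occurrences."""
    label: str
    link: int


def relabel(item, mapping):
    """Rename the links inside one side of a rule (trees are tuples: label, children...)."""
    if isinstance(item, NT):
        return NT(item.label, mapping[item.link])
    if isinstance(item, tuple):
        return (item[0],) + tuple(relabel(child, mapping) for child in item[1:])
    return item  # a terminal word


def substitute(tree, link, replacement):
    """Replace the nonterminal leaf carrying `link` by `replacement`."""
    if isinstance(tree, NT):
        return replacement if tree.link == link else tree
    if isinstance(tree, tuple):
        return (tree[0],) + tuple(substitute(child, link, replacement) for child in tree[1:])
    return tree


def apply_rule(form, link, rule, fresh):
    string_side, tree_side = form
    lhs, rule_string, rule_tree = rule
    # Step 1: the synchronized nonterminals are selected via their shared link.
    position = next(i for i, token in enumerate(string_side)
                    if isinstance(token, NT) and token.link == link)
    # Step 2: the rule is suitable only if its left-hand side matches the label.
    if string_side[position].label != lhs:
        raise ValueError("rule does not match the selected nonterminal")
    # Give the rule's own nonterminals fresh links (assumes bijective synchronization).
    old_links = sorted({t.link for t in rule_string if isinstance(t, NT)})
    mapping = {old: next(fresh) for old in old_links}
    new_string = [relabel(token, mapping) for token in rule_string]
    new_tree = relabel(rule_tree, mapping)
    # Step 3: replacement on both sides.
    return (string_side[:position] + new_string + string_side[position + 1:],
            substitute(tree_side, link, new_tree))


# Hypothetical sentential form obtained from a rule for S (shape assumed).
form = ([NT("PPER", 1), NT("KOUS", 2), NT("PPER", 3), "advice", NT("PP", 4)],
        ("S", NT("KOUS", 2), NT("PPER", 1), NT("PPER", 3),
         ("NP", ("ART", "eine"), ("NN", "Auskunft"), NT("PP", 4)),
         ("VV", "geben")))
kous_rule = ("KOUS", ["would", "like"], ("KOUS", "Könnten"))
new_string, new_tree = apply_rule(form, 2, kous_rule, count(100))
```

After the call, the string side contains "would like" in place of KOUS, and the tree side contains the subtree KOUS(Könnten) at the synchronized position.
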
Rule extraction
following [Galley, Hopkins, Knight, Marcu, 2004]
[Figure: the word-aligned sentence pair with the German parse tree; extractable rules marked in red]

Rule extraction
Removal of extractable rule:
[Figure: the sentence pair after removal; the extracted fragments are replaced by nonterminals on both sides, leaving "PPER would like your advice about Rule 143 PP" on the English side]
Repeated rule extraction:
[Figure: the reduced sentence pair with its parse tree; extractable rules marked in red]

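Whether a tree node yields an extractable rule depends on the word alignments. The following Python sketch shows a simplified consistency test in the spirit of [Galley, Hopkins, Knight, Marcu, 2004]. It is not the exact procedure of that paper, and the function name `is_extractable`, the position numbering, and the alignment links are hypothetical, since the alignment in the figure cannot be recovered here.

```python
def is_extractable(target_span, alignment):
    """Simplified test: the node covering `target_span` (inclusive word positions
    on the parsed side) is extractable if the source words linked into the span
    form a block that is not linked to any word outside the span."""
    lo, hi = target_span
    inside = [s for (s, t) in alignment if lo <= t <= hi]
    if not inside:
        return False  # unaligned nodes are ignored in this sketch
    s_lo, s_hi = min(inside), max(inside)
    return all(lo <= t <= hi for (s, t) in alignment if s_lo <= s <= s_hi)


# Hypothetical links (English position, German position) for
# "... advice(4) about(5) Rule(6) 143(7) ..." and
# "... eine(3) Auskunft(4) zu(5) Artikel(6) 143(7) ... geben(13)".
alignment = {(4, 3), (4, 4), (4, 13), (5, 5), (6, 6), (7, 7)}

pp_ok = is_extractable((5, 7), alignment)    # PP "zu Artikel 143"
np_ok = is_extractable((3, 7), alignment)    # NP "eine Auskunft zu Artikel 143"
```

Under these assumed links, the PP is extractable, whereas the NP is not, because "advice" is also linked to "geben" outside the NP.
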
Extended Tree Transducer
Advantages
- very simple
- implemented in Moses [Koehn et al., 2007]
- "context-free"
Disadvantages
- problems with discontinuities
- composition and binarization not possible [M. et al., 2009] and [Zhang et al., 2006]
- "context-free"

Extended Tree Transducer
Remarks
- synchronization breaks almost all existing constructions (e.g., the normalization construction)
  → the basic grammar model is very important
- tree-to-tree models use trees on both sides

Extended Tree Transducer
Major (tree-to-tree) models
1. linear top-down tree transducer (with look-ahead)
   - input side: tree automaton
   - output side: regular tree grammar
   - synchronization: mapping output NT to input NT
2. linear extended top-down tree transducer (with look-ahead)
   - input side: regular tree grammar
   - output side: regular tree grammar
   - synchronization: mapping output NT to input NT

Extended Tree Transducer
Synchronous grammar rule:
  q → (VP(q1, q2, q3), VP(q2, VP(q1, q3)))
"Classical" top-down tree transducer rule:
  q(VP(x1, x2, x3)) → VP(q2(x2), VP(q1(x1), q3(x3)))

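The correspondence between the two notations can be made concrete with a small Python sketch. The function `to_classical` and the tuple encoding of trees are assumptions for illustration. The sketch also assumes that every state occurs exactly once on the input side.

```python
def to_classical(state, input_side, output_side):
    """Rewrite a synchronous rule q -> (input, output) as a classical rule.

    Trees are tuples (label, child, ...); a state leaf is a plain string.
    Each state on the input side becomes a variable x_i, and each state on
    the output side becomes the call q_i(x_i) on the linked variable.
    """
    variables = {}

    def name_inputs(tree):
        if isinstance(tree, str):
            variables[tree] = f"x{len(variables) + 1}"
            return variables[tree]
        return (tree[0],) + tuple(name_inputs(child) for child in tree[1:])

    def call_states(tree):
        if isinstance(tree, str):
            return (tree, variables[tree])
        return (tree[0],) + tuple(call_states(child) for child in tree[1:])

    lhs = (state, name_inputs(input_side))   # must run before call_states
    rhs = call_states(output_side)
    return lhs, rhs


lhs, rhs = to_classical("q",
                        ("VP", "q1", "q2", "q3"),
                        ("VP", "q2", ("VP", "q1", "q3")))
```

Here `lhs` encodes q(VP(x1, x2, x3)) and `rhs` encodes VP(q2(x2), VP(q1(x1), q3(x3))), matching the two rules on the slide.
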
Extended Tree Transducer
Syntactic restrictions
- nondeleting if synchronization bijective (in all rules)
- strict if r1 not a nonterminal (for all rules q → (r, r1))
- ε-free if r not a nonterminal (for all rules q → (r, r1))
Composition (COMP)
executing transformations τ ⊆ TΣ × T∆ and τ′ ⊆ T∆ × TΓ one after the other:
  τ ; τ′ = {(s, u) | ∃t ∈ T∆ : (s, t) ∈ τ, (t, u) ∈ τ′}

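On finite relations, the definition of τ ; τ′ can be evaluated directly. The Python sketch below uses a hypothetical function `compose` and toy relations whose trees are abbreviated by strings. Real tree transformations are usually infinite, so this only illustrates the definition.

```python
def compose(tau, tau_prime):
    """Return tau ; tau_prime = {(s, u) | exists t: (s, t) in tau and (t, u) in tau_prime}."""
    return {(s, u) for (s, t) in tau for (t2, u) in tau_prime if t == t2}


# Toy relations (assumed for illustration); each string stands for a tree.
tau = {("s1", "t1"), ("s2", "t2"), ("s3", "t2")}
tau_prime = {("t1", "u1"), ("t2", "u2"), ("t4", "u3")}
composed = compose(tau, tau_prime)
```
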
Extended Tree Transducer
Rotations (ROT)
  {⟨σ(σ(t1, t2), t3), σ(t1, σ(t2, t3))⟩ | t1, t2, t3 ∈ TΣ}
[Figure: the tree σ(σ(t1, t2), t3) is mapped to σ(t1, σ(t2, t3))]
Preservation of regularity (PRES)
Given τ ⊆ TΣ × T∆ and L ⊆ TΣ regular, is τ(L) regular?
  τ(L) = {u | ∃t ∈ L : (t, u) ∈ τ}

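Both definitions can be tried out on small examples. In the Python sketch below, the functions `rotate` and `image` and the tuple encoding of trees are assumptions for illustration. Regularity itself concerns infinite tree languages, so the finite sample only shows how τ(L) is formed.

```python
def rotate(tree):
    """Map sigma(sigma(t1, t2), t3) to sigma(t1, sigma(t2, t3)).

    Trees are tuples ("σ", left, right) or leaf strings; the result is None
    if the input does not have the required shape.
    """
    if (isinstance(tree, tuple) and len(tree) == 3 and tree[0] == "σ"
            and isinstance(tree[1], tuple) and len(tree[1]) == 3
            and tree[1][0] == "σ"):
        _, (_, t1, t2), t3 = tree
        return ("σ", t1, ("σ", t2, t3))
    return None


def image(tau, language):
    """Return tau(L) = {u | exists t in L: (t, u) in tau} for finite tau and L."""
    return {u for (t, u) in tau if t in language}


left_heavy = ("σ", ("σ", "a", "b"), "c")
rotated = rotate(left_heavy)
tau = {(left_heavy, rotated), ("a", "a")}
result = image(tau, {left_heavy})
```
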
Extended Tree Transducer
Notation
- (X)TOP = class of tree transformations computable by (extended) top-down tree transducers
- (X)TOP^R = class of ... transducers with regular look-ahead
- x-(X)TOP^(R) = class of ... transducers with properties x
Example
ln-TOP = class of tree transformations computable by linear and nondeleting top-down tree transducers

Top-down Tree Transducer
[Figure: diagram of the classes TOP^R_∞, TOP_∞, l-TOP^R_1, l-TOP_2, ls-TOP^R_1, ls-TOP_2, ln-TOP_1, lns-TOP_1]
composition closure indicated in subscript

Top-down Tree Transducer
Model \ Criterion   ROT   SYM   PRES   PRES⁻¹   COMP
lns-TOP             ✗     ✗     ✓      ✓        ✓
ln-TOP              ✗     ✗     ✓      ✓        ✓
ls-TOP              ✗     ✗     ✓      ✓        ✗_2
l-TOP               ✗     ✗     ✓      ✓        ✗_2
ls-TOP^R            ✗     ✗     ✓      ✓        ✓
l-TOP^R             ✗     ✗     ✓      ✓        ✓
TOP                 ✓     ✗     ✗      ✓        ✗_∞
TOP^R               ✓     ✗     ✗      ✓        ✗_∞
(SYM = symmetric)

Extended Top-down Tree Transducer
[Figure: diagram of the classes XTOP^R_∞, XTOP_∞, l-XTOP^R_∞, l-XTOP_∞, ln-XTOP_∞, lε-XTOP^R_3, lε-XTOP_4, lns-XTOP_∞, lsε-XTOP^R_2, lnε-XTOP_∞, lnsε-XTOP_2, ε-XTOP_∞, lsε-XTOP_2, TOP^R_∞, TOP_∞, l-TOP^F_2, l-TOP^R_1, ls-TOP_2, l-TOP_2, ln-TOP_1, lns-TOP_1]
composition closure indicated in subscript

Extended Top-down Tree Transducer
Model \ Criterion   ROT   SYM   PRES   PRES⁻¹   COMP
ln-TOP              ✗     ✗     ✓      ✓        ✓
l-TOP               ✗     ✗     ✓      ✓        ✗_2
l-TOP^R             ✗     ✗     ✓      ✓        ✓
TOP^R               ✓     ✗     ✗      ✓        ✗_∞
lnsε-XTOP           ✓     ✓     ✓      ✓        ✗_2
lns-XTOP            ✓     ✗     ✓      ✓        ✗_∞
lsε-XTOP^(R)        ✓     ✗     ✓      ✓        ✗_2
lε-XTOP             ✓     ✗     ✓      ✓        ✗_4
lε-XTOP^R           ✓     ✗     ✓      ✓        ✗_3
(s)l-XTOP^(R)       ✓     ✗     ✓      ✓        ✗_∞
XTOP                ✓     ✗     ✗      ✓        ✗_∞
XTOP^R              ✓     ✗     ✗      ✓        ✗_∞

Rule extraction
[Figure: the rule that remains after the removals; its English side is "PPER would like your advice about Rule 143 PP", and its German side is a tree with root S over "Könnten Sie ... eine Auskunft zu Artikel 143 ... geben"]
- very specific rule
- every rule for "advice" contains sentence structure
- (syntax "in the way")

Extended Tree Transducer
Extended Multi Bottom-up Tree Transducer (MBOT)
- variant of [M., 2010]
- rules of the form NT → (r, ⟨r1, ..., rn⟩)
  - nonterminal NT
  - right-hand side r of context-free grammar rule
  - right-hand sides r1, ..., rn of regular tree grammar rule
  - synchronization via a map from the nonterminals of r1, ..., rn to the nonterminals of r
Example rules:
- ART-NN-VV → (advice, ⟨ART(eine), NN(Auskunft), VV(geben)⟩)
- [Figure: a rule with left-hand side NP-VV; its string side is "ART-NN-VV about Rule 143", and its tree side consists of an NP fragment containing PP(APPR(zu), NN(Artikel), CD(143)) and a separate VV]

Extended Multi Bottom-up Tree Transducer
Rule application
1. synchronous nonterminals
2. suitable rule
3. replacement
[Figure: in the rule for NP-VV, the nonterminal ART-NN-VV is rewritten with the rule ART-NN-VV → (advice, ⟨ART(eine), NN(Auskunft), VV(geben)⟩); afterwards the string side reads "advice about Rule 143" and the tree side contains "eine Auskunft zu Artikel 143" and "geben"]

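The MBOT replacement step differs from the STSG case in that one nonterminal on the string side is linked to several positions on the tree side. The Python sketch below illustrates this under stated assumptions: the encodings `("NT", label, link)` and `("#", link, index)`, the functions `substitute` and `apply_mbot_rule`, and the simplified form for NP-VV are hypothetical. The sketch handles only rules whose right-hand sides contain no further nonterminals.

```python
def substitute(tree, link, components):
    """Replace each placeholder ("#", link, i) by the i-th output tree of the rule."""
    if isinstance(tree, tuple) and tree[0] == "#":
        _, tree_link, index = tree
        return components[index] if tree_link == link else tree
    if isinstance(tree, tuple):
        return (tree[0],) + tuple(substitute(c, link, components) for c in tree[1:])
    return tree  # a terminal word


def apply_mbot_rule(form, link, rule):
    string_side, trees = form
    lhs, rule_string, rule_trees = rule
    # Steps 1 and 2: locate the selected nonterminal; index() raises
    # ValueError if the rule's left-hand side does not occur with this link.
    position = string_side.index(("NT", lhs, link))
    # Step 3: one replacement in the string, several replacements in the trees.
    new_string = string_side[:position] + rule_string + string_side[position + 1:]
    new_trees = [substitute(t, link, rule_trees) for t in trees]
    return new_string, new_trees


# Simplified right-hand side of a rule for NP-VV (shape assumed).
form = ([("NT", "ART-NN-VV", 1), "about", "Rule", "143"],
        [("NP", ("#", 1, 0), ("#", 1, 1),
          ("PP", ("APPR", "zu"), ("NN", "Artikel"), ("CD", "143"))),
         ("#", 1, 2)])
advice_rule = ("ART-NN-VV", ["advice"],
               [("ART", "eine"), ("NN", "Auskunft"), ("VV", "geben")])
new_string, new_trees = apply_mbot_rule(form, 1, advice_rule)
```

The single word "advice" thus contributes three tree fragments, two inside the NP and one (the verb) outside of it, which is how MBOT rules capture the discontinuity.
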
Rule extraction
following [M., 2011]
[Figure: the reduced sentence pair "PPER would like your advice about Rule 143 PP" with the German parse tree; extractable rules marked in red]

Extended Multi Bottom-up Tree Transducer
Advantages
- complicated discontinuities
- also available in Moses [Braune et al., 2013]
- binarizable, composable
Disadvantages
- output not regular (as tree language)
- not symmetric (input context-free; output not)

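Discontinuity
[Figure: the aligned sentence pair "He bought a new and fuel-efficient car" / "Er hat ein neues und sparsames Auto gekauft" with a German parse tree (tags PPER, VAFIN, ART, ADJA, KON, ADJA, NN, VVPP; inner nodes CAP, NP, VP, S); "bought" corresponds to the discontinuous "hat ... gekauft"]
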
Extended Multi Bottom-up Tree Transducer
Theorem [Engelfriet et al., 2009]
  l-XTOP^R = l-XBOT
Proof. Standard construction trading input-deletion for output-deletion; see l-TOP ⊆ l-BOT by [Engelfriet '75].
[Figure: diagram of the classes ln-XMBOT, ln-XBOT, l-XBOT, ln-MBOT, ln-XTOP, l-XTOP, l-MBOT, lε-XMBOT, sen-XTOP]

Extended Multi Bottom-up Tree Transducer
Theorem [Engelfriet et al., 2009]
  l-XMBOT = ln-XMBOT
Proof.
- guess subtrees that will be deleted
- process them using look-ahead

Extended Multi Bottom-up Tree Transducer
Theorem [Engelfriet et al., 2009]
  lε-XMBOT = l-MBOT
Proof.
- decompose large left-hand sides using "multi"-states
- attach finite effect of ε-rules

Extended Multi Bottom-up Tree Transducer
Theorem [M., 2014]
  ln-MBOT ⊈ (ln-XTOP^R)*
Theorem [Gildea, 2012]
  yd_out(ln-MBOT) = LCFRS

Summary
Model \ Criterion          ROT   SYM   PRES   PRES⁻¹   COMP
ln-TOP                     ✗     ✗     ✓      ✓        ✓
l-TOP                      ✗     ✗     ✓      ✓        ✗_2
l-TOP^R                    ✗     ✗     ✓      ✓        ✓
TOP^R                      ✓     ✗     ✗      ✓        ✗_∞
lnsε-XTOP                  ✓     ✓     ✓      ✓        ✗_2
lns-XTOP                   ✓     ✗     ✓      ✓        ✗_∞
lsε-XTOP^(R)               ✓     ✗     ✓      ✓        ✗_2
lε-XTOP                    ✓     ✗     ✓      ✓        ✗_4
lε-XTOP^R                  ✓     ✗     ✓      ✓        ✗_3
(s)l-XTOP^(R)              ✓     ✗     ✓      ✓        ✗_∞
XTOP^(R)                   ✓     ✗     ✗      ✓        ✗_∞
l(n)-XMBOT                 ✓     ✗     ✗      ✓        ✓
XMBOT                      ✓     ✗     ✗      ✓        ✗_∞
reg.-preserving l-XMBOT    ✓     ✗     ✓      ✓        ✓
invertible l-XMBOT         ✓     ✓     ✓      ✓        ✓

Evaluation
Task                System         BLEU
English → German    STSG           15.22
                    MBOT           15.90
                    phrase-based   16.73
                    hierarchical   16.95
                    GHKM           17.10
English → Arabic    STSG           48.32
                    MBOT           49.10
                    phrase-based   50.27
                    hierarchical   51.71
                    GHKM           46.66
English → Chinese   STSG           17.69
                    MBOT           18.35
                    phrase-based   18.09
                    hierarchical   18.49
                    GHKM           18.12
from [Seemann, Braune, M., 2015]

Literature
Selected references
- Arnold, Dauchet: Morphismes et Bimorphismes d'Arbres. Theoret. Comput. Sci. 20, 1982
- Engelfriet: Bottom-up and Top-down Tree Transformations: A Comparison. Math. Systems Theory 9, 1975
- Engelfriet, Maneth: Macro Tree Translations of Linear Size Increase are MSO Definable. SIAM J. Comput. 32, 2003
- Engelfriet, Lilin, Maletti: Extended Multi Bottom-up Tree Transducers: Composition and Decomposition. Acta Inf. 46, 2009
- Rounds: Mappings and Grammars on Trees. Math. Systems Theory 4, 1970
- Thatcher: Generalized² Sequential Machine Maps. J. Comput. System Sci. 4, 1970

Current Research
Decoding
- input regular tree language
- extended CYK algorithm for translation (parse the input; translation develops)
Observations
- phrase-based system makes no search errors [Chang, Collins, 2011]
- STSG and MBOT do
  - heuristics (??? BLEU)
  - exact decoding with syntax forest (+2–3 BLEU)

Current Research
Rule extraction
- too many extractable rules
  - which restrictions? [Seemann, Braune, M., 2015]
  - efficient representation (maybe symbolic)
- only best syntax tree
  - rule extraction with syntax forest (ambitious)
[Figure: the word-aligned sentence pair "I would like your advice about Rule 143 concerning inadmissibility" with the German parse tree]
Translation models
- only word-based systems for word alignment
  - efficient restrictions of modern systems
  - unsupervised learning
- models for semantics-based translation
  - graph-based models