Main Profile

At A Glance

Forest-based Search Algorithms in Parsing and Machine Translation

Google Tech TalksMarch, 14 2008ABSTRACTMany problems in Natural Language Processing (NLP) involves anefficient search for the best derivation over (exponentially) manycandidates, especially in parsing and machine translation. In thesecases, theconcept of "packed forest" provides a compact representation of thehuge search spaces, where efficient inference algorithms based onDynamic Programming (DP) are possible.In this talk we address two important open problems within thisframework: exact k-best inference which is often used in NLPpipelines such as parse reranking and MT rescoring, and approximateinference when the search space is too big for exact search.We first present a series of fast and exact k-best algorithms onforests, which are orders of magnitudes faster than previously usedmethods on state-of-the-art parsers such as Collins (1999). We thenextend these algorithms for approximate search when the forests aretoo big for exact inference. We discuss two particular instances ofthis new method, forest rescoring for MT decoding with integratedlanguage models, and forest reranking for discriminative parsing. Inthe former, our methods perform orders of magnitudes faster thanconventional beam search on both state-of-the-art phrase-based andsyntax-based systems, with the same level of search error ortranslation quality. In the latter, faster search also leads tobetter learning, where our approximate decoding makes whole-Treebankdiscriminative training practical and results in the best accuracy todate for parsers trained on the Treebank.This talk includes joint work with David Chiang (USC InformationSciences Institute).Liang Huang (2008). Forest Reranking: Discriminative Parsing with Non-Local Features. Proceedings of ACL 2008 (to appear). Huang and David Chiang (2007). Forest Rescoring: FasterDecoding with Integrated Language Models.Proceedings of ACL 2007. Huang and David Chiang (2005). Better k-best Parsing.Proceedings of IWPT 2005. Liang HuangLiang Huang is a final-year PhD student at the University ofPennsylvania, co-supervised by Aravind Joshi and Kevin Knight (USC/ISI). He is mainly interested in the theoretical aspects ofcomputational linguistics, in particular, efficient algorithms inparsing and machine translation, generic dynamic programming, andformal properties of synchronous grammars. He also works on applyingcomputational linguistics to structural biology.
Length: 01:01:19


Questions about Forest-based Search Algorithms in Parsing and Machine Translation

Want more info about Forest-based Search Algorithms in Parsing and Machine Translation? Get free advice from education experts and Noodle community members.

  • Answer