I'm a computational linguist and computational biologist,
most fascinated by
the mathematical connections between the two.
CL (parsing, translation, generation),
compbio (RNA/protein folding),
ML (structured prediction, weakly-supervised learning).
efficient algorithms and provable theory in the above areas.
2018/09: Ph.D. student Mingbo Ma defended (4th PhD graduate) and will become a Research Scientist at Baidu Research (Silicon Valley).
2018/08: 4 papers (3 long, 1 short) in EMNLP 2018.
2018/07: Funded by NSF, NIH, and Intel.
2018/05: SIGMOD 2018 Best Paper Finalist.
2018/03: csrankings.org ranks OSU 15th in NLP (mostly our group), 20th in broad AI, and 36th in CS (in the US).
2018/02: We released the world's fastest RNA secondary structure prediction server,
powered by the first linear-time prediction algorithm,
based on our earlier work in computational linguistics.
It is orders of magnitude faster than existing ones, with comparable or even higher accuracy. Code on Github.
2017/10: Serving as an Senior Area Chair (syntax/parsing) for ACL 2018.
2016/07: Ph.D. alumni (USC) Ashish Vaswani joined Google Brain as a Research Scientist.
2015/09: Postdoc Feifei Zhai joined IBM T. J. Watson as a Research Staff Member.
2015/09: Our group moved to Oregon State University after three years at the City University of New York.
2015/07: We received a Yahoo! Faculty Engagement Award.
My research interests are mainly in the algorithmic and formal aspects of computational linguistics (esp. parsing and machine translation)
and their applications in computational structural biology.
The key questions that motivate my research are:
Why are computers so bad at understanding and processing natural language?
Can we teach computers to process natural language the way we humans do,
that is, both fast and accurate?
Or, can computers process natural language the way they process programming languages in spite of the inherent ambiguity of the former?
So in the past 10 years or so,
I have been focusing on linear-time incremental algorithms for parsing and translation,
inspired by both human sentence processing (psycholinguistics)
and compiler theory.
Recently, my group has two main research directions:
(a) simultaneous translation
(including incremental predictive parsing, and incremental text-to-speech),
(b) linear-time incremental algorithms for computational biology,
where I recently adapted incremental parsing to RNA folding
(resulting in the first linear-time RNA folding algorithm),
thanks to a deep mathematical connection between linguistics and structural biology
(underneath the surface,
both are modeled by context-free and mildly context-sensitive grammars; cf. the Chomsky Hierarchy).
I am also interested in using incremental parsing as a model for human sentence processing,
as well as using incremental RNA folding as a model for cotranscriptional folding.
Both humans and nature are fundamentally incremental and simultaneous!
Mingbo Ma*, Liang Huang*, Hao Xiong, Renjie Zheng, Kaibo Liu, Baigong Zheng,
Chuanqiang Zhang, Zhongjun He, H. Liu, X. Li, H. Wu, Haifeng Wang (2019).
STACL: Simultaneous Translation with Integrated Anticipation and Controllable Latency
using Prefix-to-Prefix Framework.
Proceedings of ACL 2019.
[demos and media reports]
Liang Huang, Hao Zhang, Daniel Gildea, and Kevin Knight (2009).
Binarization of Synchronous Context-Free Grammars. Computational Linguistics, 35 (4). Conference version appeared at NAACL 2006.
(The core linear-time synchronous binarization algorithm
was inspired by the Graham Scan for Convex Hull.
It was a rather unexpected connection.)
grammars and dynamic programming for computational biology
For representative or recent publications please see the "Research" Tab.
For a more complete list you can visit my google scholar page (or by year).
Some time in the future I will compile a real publication list, but for the time-being please use google scholar.
You can also use Semantic Scholar (by AI2) to see the influences of my work.
This brief note develops a k-best extension of the powerful Knuth (1977) algorithm, which itself extends the classical Dijkstra (1959) shortest-path from graphs to directed hypergraphs (AND-OR graphs) and context-free grammars.
My extension was quite straightforward, following a simple k-best Dijkstra extension by Mohri and Riley (2002), combined with the lazy k-best frontier idea from my k-best CKY parsing work (Huang and Chiang, 2005).
I thought this algorithm was only of theoretical interest,
and was surprised (and glad) when Pauls and Klein (2009) further developed it to "k-best A* parsing" by adding admissible heuristics,
which won the ACL 2009 Best Paper Award.
See also  below on the relationship between A*, Knuth, and Dijkstra.
As a side note, the k-best parsing algorithms (Huang and Chiang, 2005)
were used in the award winning papers
in ACL 2005, NAACL 2006, ACL 2008, and ACL 2009.
I used to work on computational biology during my PhD, which resulted in this paper.
Although published in the top chemistry journal,
it does not get much attention at all, which I believe was due to
the difficulty of the technique (dynamic programming) for a chemistry audience
(a computational biology audience would have been much better -- they use DP a lot, e.g. Smith-Waterman is edit-distance; even CKY is widely used in comp-bio).
(more DP details such as function operators left to the Supplement).
I present a unified algebraic theory of dynamic programming based on
monotonic weight functions,
semirings (distributivity implies monotonicity),
and hypergraphs (distributivity is the key to factorization).
Here DP includes both Viterbi-style (topological order, requires acyclicity of the graph) and Dijkstra-style (best-first, requires superiority of the semiring);
also DP includes both optimization (using idempotent semirings)
and summation/expection (using inside/expectation semirings).
This view of DP is much broader than those in the textbooks.
The only implementation paradigm not covered in my framework is "memoized top-down recursion",
which, as the CLRS book also pointed out (Sec. 15.1), could also
be viewed as the DFS version of topological sort
(whereas Viterbi, or bottom-up, resembles a BFS version of topological sort).
A* search on graphs and hypergraphs (e.g. A* parsing)
are also discussed as generalizations of Dijkstra 1959 and Knuth 1977.
It was the most popular tutorial in both COLING 08 and NAACL 09.
We wrote this book in college while I was training alongside the SJTU team
which won the World Champions for the first time.
When it was first published
I had just come to the States and was no longer active in the programming contest community,
so I had no clue that it would become a best-seller in the field of
programming contests (I initially thought it would only sell about a thousand copies).
I wrote one of the three chapters of the book, the one on computational geometry, my second favorite topic in algorithms after dynamic programming.
After many years of hiatus,
I finally resumed contest coaching, at USC (encouraged by David Kempe)
and later at CUNY (encouraged by Jerry Waxman),
helping the former win its first Southern California Regionals in 2011 (and again in 2012),
and helping the latter achieve its best ranking in Greater New York Regionals in 10 years.
I developed a course in CUNY to train for ACM/ICPC
(and also for job interviews as a by-product),
which became students' favorite class in the CS Dept.
I hope to write a new book based on the course notes (with Python code for each problem).
I really think ACM/ICPC should allow Python in all Regionals and the World Finals (many Regionals in the US
already support Python, including my current region, the Pacific Northwest).
I also advocate for more algorithmic problems in the Regionals (esp. in the US),
and more geometry problems in the Finals.
Teaching is always rewarding, and is indeed the best part of being a professor.
Welcome to the Algorithms for Computational Linguistics (ACL) Group!
We study efficient algorithms and provable theory in
computational linguistics, computational structural biology,
and structured machine learning.
I am very fortunate to have been working with the following students:
BS/MS, CS, Tongji. Joined ACL Group in Fall 2017 (officially Winter 2018).
Interned at Fudan University and Baidu Research.
Liang Zhang, BS, EE, Nankai; MS, EE, HKUST; MS, Finance, LBS.
Joined ACL Group Summer 2018.
Junkun Chen, BS, CS, Central South; MS, CS, Fudan. Interned at Bytedance AI Lab.
Sizhen Li, BS, EE, MS, CS, BUPT. Interned at Bytedance AI Lab.
(#4, OSU, 2018) Mingbo Ma, BS, EE, Jilin. Joined ACL Group in 2013. Interned at IBM TJ Watson, Apple, and Baidu Research. Defended Sep. 2018.
First job: Research Scientist at Baidu Research (Silicon Valley).
(#3, OSU, 2017) Kai Zhao, BS, CS, USTC. Joined ACL Group in 2012. Interned at IBM TJ Watson, MSR, and Google Research. 11 ACL/EMNLP/NAACL papers. Defended May 2017. First job: Research Scientist at Google (NYC).
(#1, USC, 2014) Ashish Vaswani, Ph.D. at USC (co-advised by David Chiang). Interned at IBM TJ Watson and Google Research. Thesis defended May 2014. First job: Research scientist at USC/ISI.
Current job: Senior Research Scientist at Google Brain.
Primary author of the attention is all you need paper.
(#2, CUNY) Feifei Zhai (Ph.D., CAS/IA), postdoc, 2014--2015. First job: research staff member at IBM TJ Watson.
(#1, CUNY) Lemao Liu (Ph.D., Harbin), postdoc, 2013--2014. First job: researcher at NICT, Nara, Japan.
MS Thesis Graduates (4)
Matthew Meyn, BS, Music, Santa Clara; BS, CS, OSU (e-campus). Defended CS MS Thesis, June 2019.
He Zhang, BS/MS, EE, Beihang. Defeneded CS MS thesis, Sep 2018.
First job: Research Engineer at Baidu Research USA.
Kaibo Liu, BS, Physics, PKU; MS, EE, PKU. Defended CS MS thesis, Dec 2018.
First job: Research Engineer at Baidu Research USA.
Luyao Zhang, BS, ZJU; PhD, USC. Defended CS MS thesis, Sep 2016.
First job: CS Lecturer, Oregon State.
My past summer interns include
Sasha Rush (Ph.D., MIT, now Prof. at Harvard),
Yoav Goldberg (Ph.D., Ben Gurion, now Prof. at Bar Ilan),
Heng Yu (Ph.D., CAS/ICT, now with Samsung),
and Zhuoran Yu (M.S., NYU, now with Google NYC's NLP research team).
To prospective students:
please use csrankings.org
for a data-driven and more up-to-date CS ranking than
the purely subjective and totally obsolete US News system.
If you look at the past ten years (2007--2017),
we are ranked
15th in NLP (mostly our group), 18th in vision, 21st in narrow AI (AAAI/IJCAI), and 20th in broad AI.
So the AI group is super strong.
Other strong groups include Software Engineering (11th),
Crypto (15th), and HCI (23rd).
The overall CS ranking is 36th and is improving.
I am from Shanghai, China and speak Wu as my native language.
Ph.D., Computer Science, University of Pennsylvania, 2008. (old old homepage)
B.S., Computer Science, Shanghai Jiao Tong University, 2003. summa cum laude. (minor studies in French and English)
Assistant Professor, School of EECS, Oregon State University, 2015/9--present. (on leave at Baidu Research USA, 2018/6-2019/9)
Distinguished Scientist and Head, Institute of Deep Learning USA, Baidu Research, 2018/6-2019/9 (on leave from OSU).
Research Scientist (part-time), IBM Watson Group, 2014/6--2017/1.
Assistant Professor, Queens College and Graduate Center, City University of New York (CUNY), 2012/8--2015/8.
Research Assistant Professor, University of Southern California (USC), 2010/7--2012/8.
Computer Scientist, USC Information Sciences Institute, 2009/7--2012/8.
Research Scientist, Google Research (Mountain View), 2009/1--7.
Visiting Scholar, Hong Kong Univ. of Science and Technology, 2008/10--2008/11.
Visiting Scholar, Institute of Computing Technologies, Chinese Academy of Sciences, 2007/10--2008/1.
Summer Intern, USC/ISI, 2005/5--10 and 2006/5--10.
NSF Panelist, 2014, 2015, 2017 x 2, 2019 x 2
Grant Reviewer, Foreign NSFs: Hong Kong (RGC), 2017; Canadian (NSERC), 2017; Dutch (NWO), 2017; Israeli (ISF), 2015
Area Chair (sentence-level semantics), ACL 2019
Area Chair (syntax and parsing), EMNLP 2018
Senior Area Chair (syntax and parsing), ACL 2018
Area Chair (syntax and parsing), IJCNLP 2017
Area Chair (syntax and parsing), EMNLP 2016
Area Chair (machine translation), ACL 2014
Area Chair (syntax and parsing), ACL 2012
Program Co-Chair, IWPT 2013
Textbook Reviewer, Cambridge Univ. Press, 2016, and Oxford Univ. Press, 2010
Liang Huang is currently an Assistant Professor of EECS at Oregon State University and Principal Scientist at Baidu Research USA.
Before that he was Assistant Professor for three years at the City University of New York (CUNY) and a part-time Research Scientist with IBM's Watson Group. He graduated in 2008 from Penn and has worked as a Research Scientist at Google and a Research Assistant Professor at USC/ISI. Most of his work develops fast algorithms and provable theory to speedup large-scale natural language processing, structured machine learning,
and computational structural biology. He has received a Best Paper Award at ACL 2008, a Best Paper Honorable Mention at EMNLP 2016, several best paper nominations (ACL 2007, EMNLP 2008, and ACL 2010), two Google Faculty Research Awards (2010 and 2013), a Yahoo! Faculty Research Award (2015), and a University Teaching Prize at Penn (2005). His research has been supported by DARPA, NSF, Google, and Yahoo.
He also co-authored a best-selling textbook in China on algorithms for programming contests.
Note: since 2006, following David Chiang
(who wrote some of the most elegant large-scale Python code in this world),
almost all my code has been written in Python2.7, with some Python extension libraries in C
(hacking CPython source code).
I also love functional programming and declarative programming in general
(OCaml, Haskell, and Prolog), but hate C++, Perl, and Matlab which are ugly.
Compared to Python/Haskell/Ocaml, languages like C/C++ and Java are
stone-age artifacts; don't use them unless absolutely necessarily (such as for Python libraries).
I dislike Matlab because it is so numerically and engineering oriented,
which is not ideal for elegant, symbolic, algorithmic, or large-scale programming by
computer scientists, statisticians, or mathematicians (I like Mathematica and R much better).
Written in C as a Python extension module based on collections.defaultdict.
Much faster and slimmer (4 times less memory usage) than David Chiang's svector.
Builtin support for averaged parameters in online learning (e.g. perceptron, MIRA, etc.).
Note: for decoding (e.g. parsing), defaultdict is fast enough (mine is even faster by doing dot-product in C, which is also possible via Cython), but for learning (e.g. perceptron), defaultdict
becomes terrible on big data because Python float/int are immutable, which caused too many unnecessary hash operations. Using my hvector can make your learner up to 5 times faster.
This parser/reranker is described in the following paper:
Liang Huang (2008). Discriminative Parsing with Non-Local Features.
Proceedings of ACL 2008. (Best Paper Award)
errata: Following Charniak, the dev set was section 24, not section 22.
This software has three components:
The forest-dumping version of Charniak parser.
The forest reranker.
The perceptron trainer.
Currently part 1 is downloadable as a standalone package.
Parts 2 and 3 are being packaged for release.
Important: If you're using 64-bit Ubuntu, it is recommended that you install Python from source code (see Python.org).
The default Python2.7 in those Ubuntus (at least 12.04) has an obscure floating point problem
which gives inconsistent results.
Papers which use multiple pieces of my software: there are
many papers from other researchers
which use a certain piece of my software,
but there are also some which combine different pieces together, for example:
Disclaimer: I am known to be highly opinionated, and some points below might sound offensive to some readers.
I am a big fan of Classical Music.
The composers I admire the most are
Johann Sebastian Bach (whose music is so mathematical),
Peter Ilych Tchaikovsky (whose melodic talent is second only to Mozart),
and Antonin Dvorak (whose music blends Bohemia with America).
I also love, among others, (in chronological order)
Ludwig van Beethoven,
and Sergei Rachmaninoff.
Yes, I do have a preference for Baroque, Slavic, and melodic beauty ("lyricism").
On the other hand, I don't have a taste or much respect for Richard Wagner
(whom I found rather disgusting) or Franz Lizst.
Compared to Frederic Chopin or Nicolo Paganini,
Lizst has almost nothing original to himself
(like comparing Clementi to Mozart). I don't like Haydn either: he is innovative in form,
but his melodic talent is certainly
not comparable to those I love.
Even "his" most beautiful work, String Quartet Op. 3 No. 5
that contains the famous 2nd movement Andante Cantabile ("the Serenade")
is now firmly attributed to Hoffstettter (an admirer of Haydn).
When I first heard that movement without knowing this, I was like,
"There is no way Haydn could have written this!"
Some people even called Boccherini "Haydn's wife" (in terms of stylistic similiarity);
to me that's bullshit -- Boccherini wrote many achingly beautiful themes (such as
this and that) that
Hadyn could never have written.
Brahms, on the other hand, does have superb melodic talent,
but he is so reluctant in showing it.
I think the ideal teaching style in computer science
should be something like this "illustrated lecture" below
(passionate, and demo on your instrument (the computer) as much as you can):
A Personal History of Languages
I grew up speaking Wu,
but in a multilingual environment.
Back in the old days, Shanghai was just as multicultural as New York City today
with speakers and communities of all kinds of languages or dialects.
When I grew up, my parents spoke Shanghainese,
and my grandparents Ningbonese,
which I understood but could not speak well;
the majority of our neighbors, however,
spoke yet another distinctive language called
Lower Yangtze Mandarin,
which is a "linear interpolation" between Wu and Northern Mandarin,
and because of this "interpolation" nature I am still fluent in it today.
I started learning Standard Mandarin as a de facto first foreign language
in the elementary school,
but like most of my peers, I ended up with a heavy southern accent.
During college I took up French seriously and got admitted into one of the Grandes Ecoles in Paris,
but forgot all of it after moving to the US;
those memories somehow all came back when I later took up Spanish and Italian.
On the other thand, living in the US helped me
get rid of my heavy accent in Mandarin
where finally the "training data" around me
had more native samples than non-native ones.
Living here also exposed me to other Chinese languages and dialects
which I never heard back in Shanghai,
such as the Upper Yangtze Mandarin (aka "Sichuan"), Cantonese, and Hokkien (aka "Taiwanese"),
and more interestingly, various English and Spanish dialects.
I still enjoy learning new languages and dialects today.
If you've studied at least one Romance languages (French, Italian, Spanish, Portugues, Romanian, etc.), then
Latin shouldn't be too hard.
Studying Latin has many benefits, but to me the sound changes in Latin and descendant languages really helped me understand more about Indo-European. Just the hard-vs-soft "c" alone tells you so much about Indo-European history, Centum vs. Satem split, etc.
Here is a great guide on Latin pronunciation written by a fellow computational linguist
Michael Covington (yes, you probably knew him for the "Convington algorithm", an O(n^2) algorithm for dependency parsing).
Enjoy some beautiful music in Latin (Operatic Pop or Catholic):
My interests in Latin were, once again, influenced by David Chiang.
What to do when stuck in traffic jam...
Well, driving in LA is like hell, so my former colleagues at ISI had developed various ways
of doing something useful while stuck on the freeways.
The NLP group at ISI, being a set of language lovers, is particularly fond of listening to the
Pimsleur language programs
while driving (someone even got a ticket!), which is arguably the best language learning tool,
and I myself did it for Spanish which worked out great.
However, Pimsleur doesn't work that well for Japanese and much worse for Chinese
where the writing systems are (partially or completely)
not phonetic which means you do have to read on the page.
You might be able to do some simple Japanese conversations with Pimsleur,
but reading kanas and kanjis requires different training;
Chinese is much worse which made Pimsleur basically useless even for basic conversations
because of homonyms and tones (Jon May used to complain about similar-sounding words such as
yiyue "January", yiyuan "hospital", and yinyue "music"; the tones are also different
but most westerners can't perceive it).
In the same logic, Pimsleur works the best for Italian and Spanish,
where the spellings are the most regular
Jason Riesa did it for Persian (among the many languages he studied) and claimed it to be really easy
partly because it's a well-behaved Indo-European (which makes sense).
Besides Pimsleur, I also listened to a great many The Teaching Company courses while driving,
which has a large catelog of courses in all subjects, and
they made it in a way that you can get much of it just by listening (which you can do while cooking, jogging, etc)
although they also have videos if you wanna study more carefully.
The two professors I recommend the most are
Robert Greenberg of UC Berkeley
for classical music and music theory,
and John McWhorter of Columbia for linguistics.
Both Pimsleur and Teaching Company CDs are extremely expensive, but you can borrow them
from local libraries.
For English-language courses, I liked this one in particular: Buddhism and Modern Psychology, by Robert Wright of Princeton,
and all courses by Curtis (String Quartet, Beethoven Piano Sonatas,
and Western Music History).
For Chinese-language courses, I am deeply moved by the two from Lu, Shih-Hao of National Taiwan University: Shiji, and Qin Shi Huang.
I wish he could offer more online and I think Taiwanese students are so lucky to be able to sit in his classrooms;
compared to those in Taiwan, many history/humanities professors in the mainland are just a joke (except
a handful of real scholars, such as Gao, Hua who wrote
How did the Red Sun Rise).
For those learning algorithms and programming, the one by PKU is recommended; Liu Jiaying is a real fun teacher.
I highly recommend the two documentaries by the Hungarian-American George Csicsery on legendary mathematicians:
N is a Number on Paul Erdos,
and Counting from Infinity on Yitang Zhang.
In the Erdos film, many scenes were shot on the Penn campus (the Math/Physics building DRL in particular)
as he visited Prof. Herbert Wilf frequently.
In the Yitang film, I love the way he speaks, with a heavy but cute Shanghainese accent,
even though he moved to Beijing when he was 13.
He is my model as a pure scholar (who does not need recognition or funding)
and a dedicated teacher.
Aravind Joshi's Legacy in Computational Biology
People that I learned the most from
So far I have been extremely grateful to these people from whom I learned a great deal of:
and to a slightly lesser extent:
I certainly learned a whole lot from many many other people. So I apologize if your name is not listed here. :)