Martin Henk van den Berg

Managed groups developing and deploying natural language components within research and product groups. Developed both back-end and front-end tools and technology for Natural Language applications. Has extensive experience in the area of document analysis for the purposes of summarization, knowledge extraction and search.

Architected a multi-year knowledge extraction project in cooperation with the Natural Language and Knowledge Representation groups at Xerox PARC that eventually spun out as the Wikipedia search startup Powerset. Was a member of the early (pre-Google) web-search team at IBM Almaden.

Co-designed one of the first big-data parsing projects (“Data-Oriented Parsing”) with Remko Scha.

Co-authored 24 granted patents; 4 additional Intel patents filed.

As a formal semanticist with an interest in linguistics above the level of the sentence, I try to understand how the structure of discourse informs the hearer about the meaning encoded. Using the Linguistic Discourse Model (a theory of discourse structure developed by Livia Polanyi) and versions of dynamic logic (for discourse meaning), I try to understand how different anaphors find their antecedents in texts and dialogs, how they get their meaning once they have found them, and what that tells us about the way we encode information structure.

Other subjects I am interested in are the semantics of questions and answers, the encoding of information structure, and storing that information efficiently in a database for semantic document search and general knowledge representation.

Experience

Research Scientist

Intel Corp., Santa Clara, CA — 2012–present

Senior Research SDE

Microsoft Corp. — SVC, Mountain View, CA — 2008–2012

Senior Scientist

Powerset, San Francisco, CA — 2006–2008

Research Scientist, Senior Research Scientist

FXPAL, Palo Alto, CA — 1998–2002, 2002–2006

Postdoctoral Research Fellow

IBM, Almaden Research Center, San Jose, CA — 1997–1998

Postdoctoral Research Fellow

IBM, Santa Teresa, San Jose, CA — 1996–1997

Visiting Scholar

CSLI, Stanford University, Stanford, CA — 1995

Teaching Assistant

Department of Computational Linguistics, University of Amsterdam, Amsterdam, NL — 1994–1996

Research Assistant

Department of Computational Linguistics, University of Amsterdam, Amsterdam, NL — 1989–1993

Professional Services

Education

PhD — Computational Linguistics

University of Amsterdam, Amsterdam, the Netherlands — 1996

Thesis: Some Aspects of the Internal Structure of Discourse. The Dynamics of Nominal Anaphora.
Supervisors: Professors Remko Scha and Johan van Benthem.

Undergraduate ("doctoraal" degree) - Theoretical Physics

University of Amsterdam, Amsterdam, the Netherlands — 1989

Thesis: Invariants of Transformations. Comparing Relativistic and Non-relativistic Space-Time Structures.

Publications

Discourse Structure and Sentiment with: Livia Polanyi, in: 2011 IEEE 11th International Conference on Data Mining Workshops (ICDMW), pp. 97-102, 2011.
LiveTree: An Integrated Workbench for Discourse Processing with: Gian Lorenzo Thione, Chris Culy, Livia Polanyi, in: Proceedings of the ACL2004 Workshop on Discourse Annotation, Barcelona, Spain, July 25-26, 2004.
Sentential Structure and Discourse Parsing with: Livia Polanyi, Chris Culy, Gian Lorenzo Thione, David Ahn, in: Proceedings of the ACL2004 Workshop on Discourse Annotation, Barcelona, Spain, July 25-26, 2004.
Hybrid Text Summarization: Combining External Relevance Measures with Structural Analysis with: Livia Polanyi, Chris Culy, Gian Lorenzo Thione, in: Proceedings of the ACL2004 Workshop Text Summarization Branches Out, Barcelona, Spain, July 25-26, 2004.
A Rule Based Approach to Discourse Parsing with: Livia Polanyi, Christopher Culy and Gian Lorenzo Thione, in: Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue, Cambridge, MA, USA, pp. 108-117, May 1, 2004.
Discourse Structure and Sentential Information Structure: An Initial Proposal with: Livia Polanyi and David Ahn, in: Journal of Logic, Language and Information 12(3):337-350, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2003.
Making Ontologies Work for Resolving Redundancies Across Documents with: Livia Polanyi, FXPAL, Dick Crouch, PARC/NLTT and Danny Bobrow, Cleo Condoravdi, John Everett, Valeria Paiva and Reinhard Stolle, PARC/RDC, in: Communications of the ACM, February 2002, Vol. 45, Number 2, pp. 55-60, 2002.
Counting Concepts with: Dick Crouch and Cleo Condoravdi, in Robert van Rooy and Martin Stokhof (eds), Proceedings of the Thirteenth Amsterdam Colloquium, ILLC/Department of Philosophy, University of Amsterdam, 2001.
Preventing Existence with: Dick Crouch, PARC/NLTT and Danny Bobrow, Cleo Condoravdi, John Everett, Valeria Paiva and Reinhard Stolle, PARC/RDC, in: Proceedings of the ACM Conference on Formal Ontology in Information Systems, 2001.
Questions as First Class Citizens in: Paul Dekker (ed), Proceedings of the Twelfth Amsterdam Colloquium, ILLC/Department of Philosophy, University of Amsterdam, 1999.
Logical Structure and Discourse Anaphora Resolution with: Livia Polanyi, in: Dan Cristea, Nancy Ide, Daniel Marcu (eds.), Proceedings of the Workshop on The Relation of Discourse/Dialogue Structure and Reference, 37th Annual Meeting of the Association for Computational Linguistics, 1999.
Distributed Hypertext Resource Discovery Through Examples with: Soumen Chakrabarti and Byron Dom, in: Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, Michael L. Brodie (eds.), VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, pp. 375-386. Morgan Kaufmann.
Focused Crawling: a New Approach to Topic-Specific Web Resource Discovery with: Soumen Chakrabarti and Byron Dom, in: Computer Networks, Volume 31, pp. 1623-1640. Elsevier. 1999. Amsterdam, the Netherlands.
Some Aspects of the Internal Structure of Discourse. The Dynamics of Nominal Anaphora PhD-thesis, ILLC-DS-1996-3, Institute for Logic Language and Computation (ILLC), University of Amsterdam, 1996.
Dynamic Generalized Quantifiers in: van der Does, J. and J. van Eijck (eds.), Quantifiers, Logic, and Language, CSLI Lecture Notes No. 54, Stanford, California, 1996.
Discourse Grammar and Dynamic Logic in: Dekker, P. and M. Stokhof, (eds.): Proceedings of the Tenth Amsterdam Colloquium, ILLC/Department of Philosophy, University of Amsterdam, 1996.
Discourse Structure and Discourse Interpretation with: L. Polanyi, in Dekker, P. and M. Stokhof, (eds.): Proceedings of the Tenth Amsterdam Colloquium, ILLC/Department of Philosophy, University of Amsterdam, 1996.
Discourse Grammar and Verb Phrase Anaphora with: H. Prüst and R. J. H. Scha, in: Linguistics and Philosophy 17, pp. 261-327, 1994.
Plurality in: R.E. Asher and J.M.Y. Simpson (eds), The Encyclopedia of Language and Linguistics, Pergamon Press, pp. 3198-3200, 1994.
A Corpus-based Approach to Semantic Interpretation with: R. Bod and R. Scha, in Dekker, P. and M. Stokhof, (eds.): Proceedings of the Ninth Amsterdam Colloquium, ILLC/Department of Philosophy, University of Amsterdam, 1994.
A Direct Definition of Generalized Dynamic Quantifiers in Dekker, P. and M. Stokhof, (eds.): Proceedings of the Ninth Amsterdam Colloquium, ILLC/Department of Philosophy, University of Amsterdam, 1994.
Full Dynamic Plural Logic in: Bimbó, K. and A. Máté, (eds.), Proceedings of the Fourth Symposium on Logic and Language, Budapest, 1993.
Dynamic semantics in: Pinkal, M., R. Scha, and L. Schubert, (eds.), Semantic Formalisms in Natural Language Processing, No. 57 in Dagstuhl-Seminar-Report, Schloß Dagstuhl, Wadern, Germany, 1993.
A formal grammar tackling verb-phrase anaphora with: H. Prüst and Remko Scha, Technical Report CL-91-03, ILLC, Amsterdam, 1991.
A Coherent Approach to Underspecification in Natural Language Discourse with: H. Prüst, in: Caenepeel, M. et al. (eds.), Workshop on Discourse Coherence, University of Edinburgh, 1991.
Common Denominators and Default Unification with: H. Prüst, in: T. van der Woude and W. Sijtsma (eds.), Computational Linguistics in the Netherlands. Papers for the first CLIN meeting, OTS-WP-CL-91-001 in OTS Working Papers, University of Utrecht, 1990.
A Dynamic Logic for Plurals in: Stokhof, M. and L. Torenvliet (eds.), Proceedings of the Seventh Amsterdam Colloquium, ITLI, University of Amsterdam, 1990.

Patents

Microsoft (Inventions developed at Powerset)

FXPAL

IBM