28 April 2009
2:30 - 4:00   Mehrnoosh Sadrzadeh
What is the vector space content of what we say?
A compact categorical approach to distributed meaning

Mehrnoosh Sadrzadeh (joint work with S. Clark, B. Coecke, A. Preller),
Oxford University Computing Laboratory, visiting McGill University, Montreal

The two opposing approaches to the semantics of natural language are the symbolic and the distributed models of meaning. The former is type-logical and compositional, but says nothing about the meaning of words. The latter provides vector space meanings for words (based on their contexts), but says nothing about the meaning of sentences. Developing a compositional distributed model of meaning has been a challenge for computational linguists. Building on a proposal by linguists Clark and Pulman (Cambridge and Oxford), we merge these two approaches by using the axiomatics of compact closed categories and their diagrammatic toolkit of proofs. The key to this solution is our choice of Lambek's Pregroups as a syntax calculus, and the fact that both a Pregroup seen as a posetal category and the category of finite dimensional vector spaces are compact closed. We compute the meaning of a sentence by pre-composing the tensor product of the vectors of the words therein with the map of its syntactic structure. The pre-composition is a prescription for how to apply the units and counits of the adjunctions of the compact closed category.
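As a rough illustration of this prescription, the following Python sketch computes the meaning of a transitive sentence. It assumes a toy noun space N and sentence space S of dimension 2, a verb of pregroup type n^r s n^l represented as a tensor in N (x) S (x) N, and a reduction that applies a counit (inner product) on each side of the verb. The words, the numbers, and the function name sentence_meaning are all made up for the example and are not taken from the talk.

```python
# Toy word vectors in a 2-dimensional noun space N (illustrative values only).
john = [1.0, 0.0]
mary = [0.0, 1.0]

# A transitive verb of type n^r s n^l lives in N (x) S (x) N.
# likes[i][j][k]: i = subject index, j = sentence index, k = object index.
likes = [
    [[0.0, 0.9], [1.0, 0.1]],
    [[0.8, 0.0], [0.2, 1.0]],
]


def sentence_meaning(subject, verb, obj):
    """Apply a counit to (subject, verb) and to (verb, object).

    This contracts the subject with the first index of the verb tensor and
    the object with the last index, leaving a vector in the sentence space S.
    """
    dim_s = len(verb[0])
    return [
        sum(
            subject[i] * verb[i][j][k] * obj[k]
            for i in range(len(subject))
            for k in range(len(obj))
        )
        for j in range(dim_s)
    ]


meaning = sentence_meaning(john, likes, mary)
print(meaning)
```

Because john and mary are basis vectors here, the result simply reads off the entries likes[0][j][1] of the verb tensor; with non-basis word vectors the same contraction yields a weighted sum of those entries.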

Surprisingly enough, the categorical quantum mechanics language of Abramsky and Coecke (Oxford) provides a nice intuitive explanation for the application of these maps: they create entangled Bell states (function abstraction) and take inner products (function application), and as such allow information to flow among the words within a sentence. The use of two or more Bell states in the meaning map of negative sentences resembles the entanglement swapping protocols of quantum information.
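The information flow can be sketched in a few lines of Python, under the assumption that the unit is the unnormalised Bell state sum_i e_i (x) e_i and the counit is the ordinary inner product. Taking the inner product of a word vector with one half of a Bell state leaves the same vector on the other half, which is the "yanking" identity of compact closed categories. The helper names bell_state, inner_product and flow_through_bell are hypothetical, chosen only for this example.

```python
def bell_state(dim):
    """Unnormalised Bell state sum_i e_i (x) e_i, stored as a dim x dim array."""
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]


def inner_product(u, v):
    """Counit on a finite dimensional real vector space: the inner product."""
    return sum(a * b for a, b in zip(u, v))


def flow_through_bell(vector):
    """Contract `vector` with the first leg of a Bell state.

    What remains on the second leg is a copy of the input vector.
    """
    dim = len(vector)
    bell = bell_state(dim)
    return [
        inner_product(vector, [bell[i][j] for i in range(dim)])
        for j in range(dim)
    ]


word = [0.3, 0.7, 0.2]  # illustrative word vector, not real data
flowed = flow_through_bell(word)
print(flowed)
```

The sketch covers only a single Bell state; the chained use of several Bell states mentioned for negative sentences would compose contractions of this kind.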