First-Order Logic: Syntax and Semantics
---------------------------------------

What are some of the limitations of propositional logic?

Propositional logic assumes that the world or system being modeled can be described in terms of a fixed, known set of propositions. This assumption can make it awkward or even impossible to specify many pieces of knowledge.

For example, consider the following general statement about people.

  "If a person is rich then they have a nice car."

We could encode this knowledge in propositional logic by creating a large set of rules, one rule for each person,

  BobIsRich => BobHasNiceCar
  TonyIsRich => TonyHasNiceCar
  JonIsRich => JonHasNiceCar
  .......
  .......

We see that propositional logic requires us to expand the concise general statement into many statements about specific people. This is a completely impractical and unintuitive way to represent and communicate such knowledge.

As another example, consider the following general statement about natural numbers.

  "If n is a natural number, then n+1 is also a natural number."

We could try to encode this statement in propositional logic as,

  Natural1 => Natural2
  Natural2 => Natural3
  ......
  ......

However, we would need an infinite number of propositional formulas, one for each natural number. We could place a bound on the natural numbers we are willing to consider, but this is unsatisfactory. For example, with such a bound we would not be able to correctly answer many simple queries about natural numbers, e.g. "Does there exist a largest natural number?". Any true AI should certainly be able to reason about the natural numbers, yet propositional logic does not allow for this.

What is missing from propositional logic?

In both of these examples we need the ability to talk directly about objects (e.g. people and numbers) and to write down logical statements that generalize (or quantify) over those objects. First-order logic gives us this ability.
The above examples can be encoded in first-order logic as,

  \forall x Rich(x) => \exists y [Owns(x,y) & Car(y) & Nice(y)]

and

  \forall x Natural(x) => Natural(x+1)

where \forall and \exists are the universal and existential quantifiers respectively. As we will see, the syntax and semantics of first-order logic allow us to explicitly represent objects and relationships among objects, which provides us with much more representational power than the propositional case. First-order logic, for example, can be used to represent number theory, set theory, and even the computations of Turing machines.

Syntax:

Figure 8.3 in your book gives a nice grammar for the syntax of first-order logic. Six types of symbols are used to construct first-order formulas (not counting parentheses).

1) constant symbols, which will be interpreted as representing objects, e.g. Bob might be a constant.

2) function symbols, each having a specified arity (i.e. number of input arguments). Function symbols will be interpreted as functions from the specified number of input objects to objects. For example, let FatherOf be a single-arity function symbol, then the natural interpretation of FatherOf(Bob) would be Bob's father. Zero-arity function symbols are considered to be constants.

3) predicate symbols, each having a specified arity. Single-argument predicates can be thought of as specifying properties of objects. For example, let Rich be a single-arity predicate, then Rich(Bob) would be used to denote that Bob is rich. Multi-arity predicates denote relations among objects. For example, let Owns be a two-arity predicate, then Owns(Bob,Car) indicates that Bob owns Car (here Car is a constant naming a particular car, not the predicate Car(y) used above). Zero-arity predicate symbols are treated as propositions as in propositional logic, so first-order logic subsumes propositional logic.

4) variable symbols, which will be used as "place holders" for quantifiers.

5) universal and existential quantifier symbols, which will be used to quantify over objects.
   For example, \forall x Alive(x) => Breathing(x) is a universally quantified statement that uses the variable x as a place holder.

6) logical connectives and negation.

Using these symbols there are two main types of strings in first-order logic. First there are "terms", which will be interpreted as objects. Intuitively, terms are used to index or name objects in the world. Second, there are "formulas", which will be interpreted as either true or false.

Terms are defined recursively:

- constants and variables are terms
- any function symbol applied to the appropriate number of terms is also a term.

So for example, Bob, FatherOf(Bob), and FatherOf(FatherOf(Bob)) are all terms.

Formulas are defined recursively:

- "primitive formulas" or "atoms", which are simply a predicate symbol applied to the appropriate number of terms
- a logical combination of formulas
- an expression of the form "quantifier variable formula"

For example,

  TallerThan(Bob,FatherOf(Bob))

is a primitive formula, which indicates that Bob is taller than his father. Also,

  TallerThan(Bob,FatherOf(Bob)) & TallerThan(FatherOf(FatherOf(Bob)),Bob)

is a logical combination of formulas that says Bob is taller than his father but shorter than his grandfather. Finally,

  \exists x TallerThan(FatherOf(x),x)

is a quantified formula that says there exists a person who is shorter than their father.

Any string that is not formed according to the above syntactic rules is not a well-formed formula.

It will be helpful to be familiar with some common terminology regarding formulas.

- A "ground formula" is a formula without variables. A ground atom is an atom without variables. Ground atoms are often used to state specific facts such as Rich(Bob).
- A "closed formula" is a formula in which every occurrence of a variable is bound by a quantifier.
- We say a formula has "free variables" if it is not closed. For example, if x is a variable then Rich(x) has x as a free variable.
  Typically free variables are treated as universally quantified in first-order logic. Our primary reason for introducing the concept of free variables is to help define the semantics of formulas (described below).

Although we have not yet defined the semantics of first-order logic, let's consider some example formulas along with their intuitive natural language interpretations.

* "Not all birds can fly."
  ~(\forall x Bird(x) => Fly(x)), which is the same as \exists x Bird(x) & ~Fly(x)

* "All birds cannot fly." (read as "no bird can fly")
  \forall x Bird(x) => ~Fly(x), which is the same as ~(\exists x Bird(x) & Fly(x))

* "If anyone can solve the problem, then Hilary can."
  (\exists x Solves(x,problem)) => Solves(Hilary,problem)

* "Nobody in the Calculus class is smarter than everyone in the AI class."
  ~[\exists x TakesCalculus(x) & (\forall y TakesAI(y) => SmarterThan(x,y))]

* "John hates all people who do not hate themselves."
  \forall x [(Person(x) & ~Hates(x,x)) => Hates(John,x)]

Semantics:

As for all logics, the first step in defining the semantics is to define the models of first-order logic. Recall that one of the benefits of using first-order logic is that it allows us to explicitly talk about objects and relations among them. Thus, our models will contain objects along with information about the relationships among objects.

More formally, a first-order model is a tuple (D,I) where D is a non-empty domain of objects and I is an interpretation function. The domain D is simply a set of objects or elements and can be finite, infinite, even uncountable. The interpretation function I assigns a meaning or interpretation to each of the available constant, function, and predicate symbols as follows:

- If c is a constant symbol then I(c) is an object in D. Thus, given a model, a constant can be viewed as naming an object in the domain.
- If f is a function symbol of arity n then I(f) is a total function from D^n to D. That is, the interpretation of f is a function that maps every tuple of n domain objects to an object in the domain.
- If p is a predicate symbol of arity n then I(p) is a subset of D^n. That is, a predicate symbol is interpreted as a set of tuples from the domain. If a tuple O=(o1,...,on) is in I(p) then we say that p is true for the object tuple O.

For example, suppose that we have one predicate TallerThan, one function FatherOf, and one constant Bob. A model for these symbols might be the following:

  D = {BOB,JON,NULL}
  I(Bob) = BOB
  I(TallerThan) = {(BOB,JON)}

Recall that I(FatherOf) is a function, so to give the interpretation of FatherOf we will just show the value for each input:

  I(FatherOf)(BOB) = JON
  I(FatherOf)(JON) = NULL
  I(FatherOf)(NULL) = NULL

Another possible model might be,

  D = {BOB,JON}
  I(Bob) = BOB
  I(TallerThan) = {(BOB,JON),(JON,BOB)}
  I(FatherOf)(BOB) = BOB
  I(FatherOf)(JON) = JON

In the above, it is important to note the distinction between "Bob", which is a constant (a syntactic entity), and "BOB", which is an object in the domain (a semantic entity). The second model is not what we might have in mind (e.g. the objects are fathers of themselves and TallerThan is inconsistent), but it is still a valid model. It is the job of the knowledge base to rule out such unintended models from consideration by placing appropriate constraints on the symbols.

Before we define the semantics of strings, we will need to introduce one more piece of notation for dealing with variables. Given a model M=(D,I), a variable x, and an object o \in D, we define the "extended model" M[x->o] as a model that is identical to M, except that I is extended to interpret x as o, i.e. I(x)=o.

We are now ready to define the meaning or interpretation of strings (terms and formulas) relative to a given model M=(D,I). Recall that we will denote the interpretation of a string phi relative to a model M by phi^M.

First, we define recursively the semantics for a term t.
- If t is a constant or variable then t^M = I(t) (note that if t is a variable we assume that M has been "extended" to interpret that variable).
- If t is of the form f(t1,...,tn), where f is a function symbol and the ti are terms, we have t^M = I(f)(t1^M,...,tn^M).

So we see that each term t is interpreted as a single object of the domain D (different terms may denote the same object). Given the first model above, call it M1, we have the following,

  FatherOf(Bob)^M1 = I(FatherOf)(Bob^M1) = I(FatherOf)(BOB) = JON

and

  FatherOf(FatherOf(Bob))^M1 = I(FatherOf)(FatherOf(Bob)^M1) = I(FatherOf)(JON) = NULL

Now, given the ability to interpret terms, we can define the interpretation of a formula phi relative to M. The interpretation of any formula is either true or false and is given by the following recursive definition.

- If phi is a primitive formula of the form p(t1,...,tn), where p is a predicate and the ti are terms, we have
    phi^M = true, if (t1^M,...,tn^M) \in I(p)   (here "\in" means "is an element of")
          = false, otherwise

- If phi is of the form phi1 c phi2, where c is a logical connective, we have
    phi^M = phi1^M c phi2^M   (i.e. apply the truth table of c to the two truth values)

- If phi is of the form ~phi1, where phi1 is a formula, we have
    phi^M = ~phi1^M

- If phi is of the form \exists x phi1, where phi1 is a formula (that may or may not involve the variable x), we get
    phi^M = true, if there exists an o \in D such that phi1^M[x->o] = true
          = false, otherwise

- If phi is of the form \forall x phi1, where phi1 is a formula (that may or may not involve the variable x), we get
    phi^M = true, if for all o \in D we have phi1^M[x->o] = true
          = false, otherwise

Notice the use of extended models M[x->o] to define the semantics of quantified formulas.
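The recursive definitions above translate almost line for line into a small evaluator. The following is only a minimal sketch under stated assumptions, not a reference implementation: the nested-tuple encoding of terms and formulas, the names eval_term and eval_formula, and the representation of I as a Python dict are all choices made for this illustration. The model is the first example model (D = {BOB,JON,NULL}), with I(TallerThan) taken to be the single pair (BOB,JON), which is also an assumption of the sketch.

```python
# Assumed encoding (hypothetical, for illustration only):
#   term    : "Bob" or "x" (constant/variable name), or ("FatherOf", t1, ..., tn)
#   formula : ("atom", p, t1, ..., tn), ("not", f), ("and", f, g), ("or", f, g),
#             ("implies", f, g), ("exists", x, f), ("forall", x, f)
# The interpretation I is a dict: constants/variables map to objects, function
# symbols map to dicts keyed by argument tuples, predicates map to sets of tuples.

def eval_term(t, D, I):
    """Return t^M, the domain object denoted by term t."""
    if isinstance(t, str):                      # constant or (extended) variable
        return I[t]
    f, *args = t                                # f(t1,...,tn)
    return I[f][tuple(eval_term(a, D, I) for a in args)]

def eval_formula(phi, D, I):
    """Return phi^M as a Python bool."""
    op = phi[0]
    if op == "atom":
        _, p, *args = phi
        return tuple(eval_term(a, D, I) for a in args) in I[p]
    if op == "not":
        return not eval_formula(phi[1], D, I)
    if op == "and":
        return eval_formula(phi[1], D, I) and eval_formula(phi[2], D, I)
    if op == "or":
        return eval_formula(phi[1], D, I) or eval_formula(phi[2], D, I)
    if op == "implies":
        return (not eval_formula(phi[1], D, I)) or eval_formula(phi[2], D, I)
    if op in ("exists", "forall"):
        _, x, body = phi
        # {**I, x: o} plays the role of the extended model M[x->o].
        results = (eval_formula(body, D, {**I, x: o}) for o in D)
        return any(results) if op == "exists" else all(results)
    raise ValueError(f"not a well-formed formula: {phi!r}")

# First example model; the TallerThan extension is an assumption of this sketch.
D1 = {"BOB", "JON", "NULL"}
I1 = {
    "Bob": "BOB",
    "TallerThan": {("BOB", "JON")},
    "FatherOf": {("BOB",): "JON", ("JON",): "NULL", ("NULL",): "NULL"},
}

grandfather = eval_term(("FatherOf", ("FatherOf", "Bob")), D1, I1)
atom_value = eval_formula(("atom", "TallerThan", "Bob", ("FatherOf", "Bob")), D1, I1)
some_taller = eval_formula(
    ("exists", "x", ("atom", "TallerThan", "x", ("FatherOf", "x"))), D1, I1)
all_taller = eval_formula(
    ("forall", "x", ("atom", "TallerThan", "x", ("FatherOf", "x"))), D1, I1)
print(grandfather, atom_value, some_taller, all_taller)
```

Because the quantifier cases loop over every object in D, this approach only works for finite domains; it illustrates the definition rather than a general decision procedure.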
As an example consider the first of the above models (call it M),

  D = {BOB,JON,NULL}
  I(Bob) = BOB
  I(TallerThan) = {(BOB,JON)}
  I(FatherOf)(BOB) = JON
  I(FatherOf)(JON) = NULL
  I(FatherOf)(NULL) = NULL

and the atomic formula TallerThan(Bob,FatherOf(Bob)). To compute the interpretation TallerThan(Bob,FatherOf(Bob))^M we need to check whether (Bob^M,FatherOf(Bob)^M) \in I(TallerThan). Since (Bob^M,FatherOf(Bob)^M) = (BOB,JON), which is in I(TallerThan), we get that TallerThan(Bob,FatherOf(Bob))^M = true. For the second of the above models we get that this same formula is false (verify this on your own).

As an example involving quantifiers consider the formula,

  \exists x TallerThan(x,FatherOf(x))

Consulting the above definition of the semantics we get that [\exists x TallerThan(x,FatherOf(x))]^M is true if and only if we can find an object o in D such that TallerThan(x,FatherOf(x))^M[x->o] is true. BOB is such an object (verify on your own), so we see that the interpretation of the quantified formula is true.

Your book gives several nice examples of first-order knowledge bases that you should read about and understand.

The notion of entailment for first-order formulas is defined exactly as for propositional logic. That is, KB |= phi if all the models of KB are also models of phi. For example, KB might be a set of formulas about the natural numbers and phi might state that there is no largest prime number. If KB accurately captures the natural numbers then phi should be entailed.

Computing whether KB |= phi holds or not (i.e. deduction) is much more difficult for first-order logic than in the propositional case, and will be our next topic.

Basic things to make sure you can do:
-------------------------------------

1) Determine which strings are well-formed formulas and terms of first-order logic.

2) Translate English statements into first-order logic expressions and vice versa.

3) Determine the interpretation of any formula or term relative to any model, showing the recursive steps. Likewise, given a formula, construct models in which the formula is true and models in which it is false.
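
For practice with item 3, one way to find models in which a formula is true or false is to enumerate every interpretation over a small fixed domain and test each one. The sketch below is only an illustration under stated assumptions: the two-element domain {A,B} and its object names are hypothetical, the formula \exists x TallerThan(FatherOf(x),x) is hard-coded in the function holds rather than parsed, and the constant Bob is left out because the formula does not mention it.

```python
from itertools import product

D = ["A", "B"]                       # hypothetical two-object domain
pairs = list(product(D, repeat=2))   # all of D x D

def all_functions(domain):
    """Yield every total function domain -> domain as a dict."""
    for values in product(domain, repeat=len(domain)):
        yield dict(zip(domain, values))

def all_relations(tuples):
    """Yield every subset of the given tuples as a set."""
    for mask in product([False, True], repeat=len(tuples)):
        yield {t for t, keep in zip(tuples, mask) if keep}

def holds(father_of, taller_than):
    # Truth value of  \exists x TallerThan(FatherOf(x),x)  in the model whose
    # interpretation is I(FatherOf) = father_of, I(TallerThan) = taller_than.
    return any((father_of[o], o) in taller_than for o in D)

true_models, false_models = [], []
for father_of in all_functions(D):
    for taller_than in all_relations(pairs):
        target = true_models if holds(father_of, taller_than) else false_models
        target.append((father_of, taller_than))

print(len(true_models), "models make the formula true")
print(len(false_models), "models make the formula false")
print("one falsifying model:", false_models[0])
```

Over this domain there are 4 choices for I(FatherOf) and 16 for I(TallerThan), so 64 interpretations are checked. Note that this kind of enumeration only explores one finite domain; entailment is defined over all models, so a search like this can exhibit a model or countermodel but cannot by itself establish that KB |= phi.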