Implementing the Prolog Unification algorithm in Python? Backtracking

Question:

I’m trying to implement Unification, but having problems.. already got dozen of examples,but all they do is to muddy the water. I get more confused than enlightened :

http://www.cs.trincoll.edu/~ram/cpsc352/notes/unification.html

https://www.doc.ic.ac.uk/~sgc/teaching/pre2012/v231/lecture8.html
[code below is based on this intro]

http://www.cs.bham.ac.uk/research/projects/poplog/paradigms_lectures/lecture20.html#representing

https://norvig.com/unify-bug.pdf

How can I implement the unification algorithm in a language like Java or C#?

The art of Prolog one …and several others.
The biggest problem is that I could not figure clear statement of the problem. More mathy or lispy explanations are confusing me even more.

As a good start it seems a good idea to follow the representation to be list based (like in the lispy cases) i.e. :

pred(Var, val)  =becomes=> [pred, Var, val] 
p1(val1, p2(val2, Var1)) ==> [p1, val1, [p2, val2, Var1]]

except how do you represent lists themselves !? i.e. [H|T]

I would love if you can show me a Python pseudo code and/or more detailed algorithm description or a pointer to one.

Some points I grasp is the need to separate the code in general-unifier and var-unification, but then I cant see the mutual-recusive case ! … and so on.


As a side note : I would also love for you to mention how would you handle Unification on Backtracking. I think I have backtracking squared-away, but I know something has to happen to substitution-frame on backtracking.


Added an answer with the current code.

http://www.igrok.site/bi/Bi_language.html

http://www.igrok.site/bi/TOC.html

https://github.com/vsraptor/bi/blob/master/lib/bi_engine.py

Asked By: sten

||

Answers:

I will quickly summarize the chapter about Unification Theory by Baader and Snyder from the Handbook of Automated Reasoning:

Terms are built from constants (starting with a lower case letter) and variables (starting with an upper case letter):

  • a constant without arguments is a term: e.g. car
  • a constant with terms as arguments, a so called function application, is a term. e.g. date(1,10,2000)
  • a variable is a term, e.g. Date (variables never have arguments)

A substitution is a map assigning terms to variables. In the literature, this is often written as {f(Y)/X, g(X)/Y} or with arrows {X→f(Y), Y→g(X)}. Applying a substitution to a term replaces each variable by the corresponding term in the list. E.g. the substitution above applied to tuple(X,Y) results in the term tuple(f(Y),g(X)).

Given two terms s and t, a unifier is a substitution that makes s and t equal. E.g. if we apply the substitution {a/X, a/Y} to the term date(X,1,2000), we get date(a,1,2000) and if we apply it to date(Y,1,2000) we also get date(a,1,2000). In other words, the (syntactic) equality date(X,1,2000) = date(Y,1,2000) can be solved by applying the unifier {a/X,a/Y}. Another, simpler unifier would be X/Y. The simplest such unifier is called the most general unifier. For our purposes it’s enough to know that we can restrict ourselves to the search of such a most general unifier and that, if it exists, it is unique (up to the names of some variables).

Mortelli and Montanari (see section 2.2. of the article and the references there) gave a set of rules to compute such a most general unifier, if it exists. The input is a set of term pairs (e.g. { f(X,b) = f(a,Y), X = Y } ) and the output is a most general unifier, if it exists or failure if it does not exist. In the example, the substitution {a/X, b/Y} would make the first pair equal (f(a,b) = f(a,b)), but then the second pair would be different (a = b is not true).

The algorithm nondeterministically picks one equality from the set and applies one of the following rules to it:

  • Trivial: an equation s = s (or X=X) is already equal and can be safely removed.
  • Decomposition: an equality f(u,v) = f(s,t) is replaced by the equalities u=s and v=t.
  • Symbol Clash: an equality a=b or f(X) = g(X) terminates the process with failure.
  • Orient: an equality of the form t=X where t is not another variable is flipped to X=t such that the variable is on the left side.
  • Occurs check: if the equation is of the form X=t, t is not X itself and if X occurs somewhere within t, we fail. [1]
  • Variable elimination: of we have an equation X=t where X does not occur in t, we can apply the substitution t/X to all other problems.

When there is no rule left to apply, we end up with a set of equations {X=s, Y=t, ...} that represents the substitution to apply.

Here are some more examples:

  • {f(a,X) = f(Y,b)} is unifiable:
    decompose to get {a=Y, X=b} and flip to get {Y=a, X=b}
  • {f(a,X,X) = f(a,a,b)} is not unifiable:
    decompose to get {a=a,X=a, X=b}, eliminate a=a by triviality, then eliminate the variable X to get {a=b} and fail with symbol clash
  • {f(X,X) = f(Y,g(Y))} is not unifiable:
    decompose to get {X=Y, X=g(Y)}, eliminate the variable X to get {Y=g(Y)}, fail with occurs check

Even though the algorithm is non-deterministic (because we need to pick a equality to work on), the order does not matter. Because you can commit to any order, it is never necessary to undo your work and try a different equation instead. This technique is usually called backtracking and is necessary for the proof search in Prolog, but not for unification itself.

Now you’re only left to pick a suitable data-structure for terms and substitutions and implement the algorithms for applying a substitution to a term as well as the rule based unification algorithm.

[1] If we try to solve X = f(X), we would see that X needs to be of the form f(Y) to apply decomposition. That leads to solving the problem f(Y) = f(f(Y)) and subsequently Y = f(Y). Since the left hand side always has one application of f less than the right hand side, they can not be equal as long we see a term as a finite structure.

Answered By: lambda.xy.x

I get more confused than enlightened

Been there, done that.

Note: For any source code referenced I did not test the code and can not say it is valid, they are given as an example and look correct enough that I would load them up and run test cases against them to determine their validity.

First: You will get much better search results if you use the correct terminology, use backward chaining instead of Backtracking. e.g. backward-chaining/inference.py

Second: Understand that your question has three separate phases listed.
1. Unification algorithm
2. Backward chaining that uses Unification
3. A data structure for a list. You would not implement this as Python source code but as text to be passed to your functions. See: cons

You should first develop and fully test unification before moving onto backward chaining. Then fully develop and test backward chaining before creating a list data structure. Then fully test your list data structure.

Third: There is more than one way to implement the unification algorithm.
a. You noted the one that uses transformation rules, or noted as A rule based approach in Unification Theory by Baader and Snyder, e.g. delete decompose etc.
b. I prefer the algorithm noted as Unification by recursive descent in Unification Theory by Baader and Snyder given in this OCaml example or Python example
c. I have seen some that use permutations but can’t find a good reference at present.

Fourth: From personal experience, understand how each phase works by using pen and paper first, then implement it in code.

Fifth: Again from personal experience, there is lots of information out there on how to do this but the math and technical papers can be confusing as many gloss over something critical to a self-learner or are too dense. I would suggest that instead you focus on finding implementations of the source code/data structures and use that to learn.

Sixth: compare your results against actual working code, e.g. SWI-Prolog.

I can’t stress enough how much you need to test each phase before moving on to the next and make sure you have a complete set of test cases.

When I wanted to learn how to write this in a functional language the books on AI 1 2 3 and The Programming Languages Zoo were invaluable. Had to install environments for Lisp and OCaml but was worth the effort.

Answered By: Guy Coder

This so far works for all cases I come up with (except one case which requires occurs check, which I have not done yet):

def unify_var(self, var, val, subst):
#   print "var> ", var, val, subst

    if var in subst :   
        return self.unify(subst[var], val, subst)
    elif isinstance(val, str) and val in subst : 
        return self.unify(var, subst[val], subst)
    #elif (var occurs anywhere in x) then return failure
    else :
        #print "%s := %s" % (var, val)
        subst[var] = val ; return subst

def unify(self, sym1, sym2, subst):
    #print 'unify>', sym1, sym2, subst

    if subst is False : return False
    #when both symbols match
    elif isinstance(sym1, str) and isinstance(sym2, str) and sym1 == sym2 : return subst
    #variable cases
    elif isinstance(sym1, str) and is_var(sym1) : return self.unify_var(sym1, sym2, subst)
    elif isinstance(sym2, str) and is_var(sym2) : return self.unify_var(sym2, sym1, subst)
    elif isinstance(sym1, tuple) and isinstance(sym2, tuple) : #predicate case
        if len(sym1) == 0 and len(sym2) == 0 : return subst
        #Functors of structures have to match
        if isinstance(sym1[0], str) and  isinstance(sym2[0],str) and not (is_var(sym1[0]) or is_var(sym2[0])) and sym1[0] != sym2[0] : return False
        return self.unify(sym1[1:],sym2[1:], self.unify(sym1[0], sym2[0], subst))
    elif isinstance(sym1, list) and isinstance(sym2, list) : #list-case
        if len(sym1) == 0 and len(sym2) == 0 : return subst
        return self.unify(sym1[1:],sym2[1:], self.unify(sym1[0], sym2[0], subst))

    else: return False

FAIL cases are supposed to fail :

OK: a <=> a : {}
OK: X <=> a : {'X': 'a'}
OK: ['a'] <=> ['a'] : {}
OK: ['X'] <=> ['a'] : {'X': 'a'}
OK: ['a'] <=> ['X'] : {'X': 'a'}
OK: ['X'] <=> ['X'] : {}
OK: ['X'] <=> ['Z'] : {'X': 'Z'}
OK: ['p', 'a'] <=> ['p', 'a'] : {}
OK: ['p', 'X'] <=> ['p', 'a'] : {'X': 'a'}
OK: ['p', 'X'] <=> ['p', 'X'] : {}
OK: ['p', 'X'] <=> ['p', 'Z'] : {'X': 'Z'}
OK: ['X', 'X'] <=> ['p', 'X'] : {'X': 'p'}
OK: ['p', 'X', 'Y'] <=> ['p', 'Y', 'X'] : {'X': 'Y'}
OK: ['p', 'X', 'Y', 'a'] <=> ['p', 'Y', 'X', 'X'] : {'Y': 'a', 'X': 'Y'}
 ================= STRUCT cases ===================
OK: ['e', 'X', ('p', 'a')] <=> ['e', 'Y', ('p', 'a')] : {'X': 'Y'}
OK: ['e', 'X', ('p', 'a')] <=> ['e', 'Y', ('p', 'Z')] : {'X': 'Y', 'Z': 'a'}
OK: ['e', 'X', ('p', 'a')] <=> ['e', 'Y', ('P', 'Z')] : {'X': 'Y', 'Z': 'a', 'P': 'p'}
OK: [('p', 'a', 'X')] <=> [('p', 'Y', 'b')] : {'Y': 'a', 'X': 'b'}
OK: ['X', 'Y'] <=> [('p', 'a'), 'X'] : {'Y': ('p', 'a'), 'X': ('p', 'a')}
OK: [('p', 'a')] <=> ['X'] : {'X': ('p', 'a')}
-----
FAIL: ['e', 'X', ('p1', 'a')] <=> ['e', 'Y', ('p2', 'Z')] : False
FAIL: ['e', 'X', ('p1', 'a')] <=> ['e', 'Y', ('p1', 'b')] : False
FAIL: [('p', 'a', 'X', 'X')] <=> [('p', 'a', 'a', 'b')] : False
(should fail, occurs) OK: [('p1', 'X', 'X')] <=> [('p1', 'Y', ('p2', 'Y'))] : {'Y': ('p2', 'Y'), 'X': 'Y'}
================= LIST cases ===================
OK: ['e', 'X', ['e', 'a']] <=> ['e', 'Y', ['e', 'a']] : {'X': 'Y'}
OK: ['e', 'X', ['a', 'a']] <=> ['e', 'Y', ['a', 'Z']] : {'X': 'Y', 'Z': 'a'}
OK: ['e', 'X', ['e', 'a']] <=> ['e', 'Y', ['E', 'Z']] : {'X': 'Y', 'Z': 'a', 'E': 'e'}
OK: ['e', 'X', ['e1', 'a']] <=> ['e', 'Y', ['e1', 'a']] : {'X': 'Y'}
OK: [['e', 'a']] <=> ['X'] : {'X': ['e', 'a']}
OK: ['X'] <=> [['e', 'a']] : {'X': ['e', 'a']}
================= FAIL cases ===================
FAIL: ['a'] <=> ['b'] : False
FAIL: ['p', 'a'] <=> ['p', 'b'] : False
FAIL: ['X', 'X'] <=> ['p', 'b'] : False
Answered By: sten