Epilog is a language for writing logic programs. A logic program is a finite set of rules.
What is the history of Epilog? Where did it come from, what problem was it intended to solve?
What is the relationship between Epilog and Prolog?
What is the relationship between Epilog and Datalog?
In logic, a term is an expression that refers to an object in the world. In the context of Epilog, terms are constants or compound expressions.
Ground terms are terms that contain no variables, e.g. a, b, pair(a,b), pair(a,pair(a,b)). Compound terms (compound expressions like pair(a,b)) are sometimes called complex or composite terms.
Terms can be distinguished from logical sentences, which are expressions denoting conditions that are either true or false. Basic sentences are expressed with predicates in Epilog.
The simplest terms in Epilog are constants. A constant is a string of lower case letters, digits, underscores, and periods, or a string of arbitrary ASCII characters enclosed by double quotes. e.g.:
will
3.14
"hi there"
It's useful to distinguish constants used for a few different purposes: symbols, constructors, predicates, and operation constants.
Symbols (aka "object constants") directly represent objects in the world, e.g. mike, will. This is the simplest use of a constant.
Constructors (aka "function constants") are used to create compound names for objects, e.g. tuple in tuple(a,b,c).
Predicates (aka "relation constants") are used to represent relationships on objects, e.g. parent in parent(rob,will).
Operation constants are used to represent operations, which are defined by rules describing changes to base relations.
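To make the distinction concrete (a sketch; the names below are invented for illustration, not taken from the text):
parent(rob,will)
location(will,point(3,4))
Here parent and location are predicates, rob and will are symbols, and point is a constructor. The click in the operation rule shown later is an operation constant.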
A literal is formed by an n-ary predicate and n arguments, which can be ground terms or variables. You can also negate the whole thing, giving you a negative literal. Otherwise you have a positive literal, also called an atom. An atom in which all arguments are ground terms is also called a fact (or "base relation", "factoid", "datum", "unit clause"). A fact represents a statement that a certain relation holds between objects.
p(a,b) is a fact, and therefore an atom, and therefore a positive literal.
p(a,X) is an atom and therefore a positive literal.
~p(a,b) and ~p(a,X) are negative literals, and therefore neither atoms nor facts.
A pattern is an expression that may include variables.
An instance of an expression is that expression with all variables consistently replaced with ground terms.
goal(a,b) :- p(a,b) & ~q(b) is an instance of goal(X,Y) :- p(X,Y) & ~q(Y).
The set of bindings of variables to terms that gets us an instance of an expression is called a substitution. e.g. for the above:
{X<-a,Y<-b}
A substitution is a matcher for a pattern and an instance if applying the substitution to the pattern yields the instance.
Matching is the process of finding a matcher for a given pattern and instance. This is computationally inexpensive in Epilog.
Unification is the process of determining whether any two expressions can be unified, i.e. made identical by some substitution. Again, this is inexpensive in Epilog.
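For example (a small illustrative sketch):
p(X,Y) matches the instance p(a,b) with matcher {X<-a,Y<-b}
p(X,b) and p(a,Y) unify via the substitution {X<-a,Y<-b}, both becoming p(a,b)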
Rules are expressions that define new relationships in terms of existing relationships. Query rules and view rules are written as expressions consisting of an atom as the head followed by zero or more literals, like so:
grandparent(X,Z) :- parent(X,Y) & parent(Y,Z)
There are also operation rules, which are written slightly differently.
Rules read as reverse implications: the head goal is true when the subgoals are true.
It's useful, for our purposes, to write rules so that they have certain desirable characteristics. Here we discuss safety, semipositivity, and stratification.
A rule is safe if and only if every variable appearing in the head or a negative literal also appears in at least one positive literal in the body.
Safe:
goal(X) :- p(X,Y,Z) & ~q(X,Z)
Not safe:
goal(X,Y,Z) :- p(X,Y)
Say p(a,b) is true. Then goal(a,b,Z) will be true no matter what we substitute for Z. Computationally problematic.
Not safe:
goal(X,Y,X) :- p(X,Y) & ~q(Y,Z)
Say p(a,b) and q(b,c) are true. We can conclude goal(a,b,a) is true by substituting anything other than c for Z. That is, q(Y,Z) only needs to be false for some value of Z. It is incorrect, but very common, to read the negation as universal and think that the subgoal is true when there is no Z for which q(Y,Z) is true.
A semipositive program is one in which negations apply only to base relations, i.e. there are no subgoals with negated views.
The nice thing about a semipositive program is that we can easily compute its extension.
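For instance (a sketch with invented relations, where p and q are base relations and r, s are views):
r(X) :- p(X) & ~q(X)
s(X) :- r(X)
This program is semipositive because the only negated subgoal, ~q(X), is on a base relation; negating the view r anywhere would break semipositivity.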
A set of view definitions is stratified if its rules can be partitioned into strata such that, for any rule r1, the rules defining the relations in its positive subgoals appear in the same stratum as r1 or lower, and the rules defining the relations in its negative subgoals occur in some strictly lower stratum.
To rephrase, a rule r in stratum n can use relations defined in strata 1...n in positive subgoals, and relations defined in strata 1...n-1 in negative subgoals.
To rephrase again, if you want to use a rule r2 in a negative subgoal of rule r1, r1 needs to go in a stratum above r2's.
e.g.
r(X,Y) :- q(X,Y)
r(X,Z) :- q(X,Y) & r(Y,Z)
s(X,Y) :- p(X) & p(Y) & ~r(X,Y)
can be stratified as follows:
Stratum 2
s(X,Y) :- p(X) & p(Y) & ~r(X,Y)
-------------------------------
Stratum 1
r(X,Y) :- q(X,Y)
r(X,Z) :- q(X,Y) & r(Y,Z)
The good thing about a stratified ruleset is that it avoids a certain kind of ambiguity. Consider the following unstratifiable rule:
s(X,Y) :- ~s(Y,X)
Take s(a,b) to be true and s(b,a) to be false, and the rule is satisfied. But it is also satisfied if we take the opposite, s(b,a) true and s(a,b) false! So there is ambiguity inherent in s.
Like a semipositive program, a safe and stratified program has just one extension, and that extension can be computed.
A dataset is a collection of facts. Formally, any subset of the Herbrand base for a vocabulary is a dataset in that vocabulary.
A vocabulary is a collection of constants.
Herbrand base: the set of all facts that can be formed from the constants in a vocabulary, e.g. for a vocabulary with symbols a, b and predicate r/2: {r(a,a),r(a,b),r(b,a),r(b,b)}.
Herbrand universe: the set of all ground terms that can be formed from a vocabulary; infinite if the vocabulary contains constructors, e.g. {pair(a,b),pair(a,pair(a,b)),...}
The extension of a logic program is the set of all facts that can be inferred from its dataset.
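For example (a sketch): given the dataset
parent(art,bob)
parent(bob,coe)
and the rule grandparent(X,Z) :- parent(X,Y) & parent(Y,Z), the extension is {parent(art,bob), parent(bob,coe), grandparent(art,coe)}.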
A non-empty, finite set of query rules (typically just one) forms a query.
Query rule: A rule used for a query. A couple of constraints make a rule a query rule, among them that the head uses a dedicated query relation (conventionally named goal).
Applying a rule r to a dataset Δ gives us the set of all instances of r such that every positive subgoal of the instance is in Δ and no negative subgoal of the instance is in Δ. Notice the semantics of negation are a bit subtle here, and do not always give the results you might expect.
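For example (a sketch): applying goal(X) :- p(X) & ~q(X) to Δ = {p(a), p(b), q(b)} yields only the instance goal(a) :- p(a) & ~q(a), since p(a) is in Δ and q(a) is not; the instance for X<-b is excluded because q(b) is in Δ. So we conclude goal(a) but not goal(b).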
Be careful with universal negation. Say e.g. we want childless(X) to hold if X is no one's parent.
Incorrect but tempting:
childless(X) :- person(X) & ~parent(X,Y)
~parent(X,Y) holds if there is some Y for which X is not a parent.
Correct:
isparent(X) :- parent(X,Y)
childless(X) :- person(X) & ~isparent(X)
Defining the rule isparent allows us to express the universal quantification.
We can define relations in terms of other relations, e.g. grandparent(X,Z) :- parent(X,Y) & parent(Y,Z). We'll call these view relations, as opposed to base relations.
View rules are written similarly to query rules, but they are less constrained, e.g. the head can use any view relation (such as grandparent).
Loosening the constraints gives us enough leash to get tangled, so there are some best practices for datasets and rulesets, viz. compatibility, semipositivity, stratification, and safety.
Operation rules are rules that change base relations (facts). They add dynamism to logic programs, useful for e.g. event handling, database management, constraint propagation… Expressed as e.g.
click(X) :: ~p(X) ==> p(X) & highlight(X)
that is: when a click(X) action occurs in a state where p(X) does not hold, p(X) is added and highlight(X) is performed.
As with query and view rules, safety is a consideration.
Operation definition: A collection of operation rules sharing the same operation in the head. Usually such a collection is finite.
Action: The application of an operation to specific objects, denoted by an operation constant followed by n terms in parentheses, e.g. f(a,b).
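As a sketch of an operation definition (the relation names are invented), here are two operation rules sharing click in the head:
click(X) :: ~selected(X) ==> selected(X)
click(X) :: selected(X) ==> ~selected(X)
Assuming a negated effect removes the corresponding fact, these two rules together make the action click(a) toggle whether selected(a) holds.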
Three equivalent ways to write lists:
[a,b,c]
a!b!c!nil
cons(a,cons(b,cons(c,nil)))
Ordered lists can be used to represent sets reasonably efficiently; see the text.
cons can also be used to represent trees, e.g. cons(cons(a,b),cons(c,d)).
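As a sketch of taking list terms apart with rules (member is a name introduced here for illustration, not claimed to be built in):
member(X,X!L)
member(X,Y!L) :- member(X,L)
The first rule says X is a member of any list whose first element is X; the second recurses into the rest of the list.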
evaluate represents equations, e.g. evaluate(plus(times(3,3),times(2,3),1),16)
countofall counts the values of a variable satisfying a condition, e.g. childless(X) :- person(X) & evaluate(countofall(Y,parent(X,Y)),0)
Identity operators: same and distinct.
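For example (a sketch), distinct prevents an object from being paired with itself:
sibling(X,Y) :- parent(Z,X) & parent(Z,Y) & distinct(X,Y)
Without distinct(X,Y), every child of Z would also count as its own sibling.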
It is usually faster to order subgoals in a way that binds variables as late as possible. e.g.
goal(X,Y) :- p(X) & r(X,Y) & q(X)
is slower than
goal(X,Y) :- p(X) & q(X) & r(X,Y)
Intuitively, it is cheaper to carry forward a set of possible answers of size O(X) than one of size O(X*Y).
goal(X,Y) :- p(X,Y) & q(Y) & q(Z)
is equivalent to
goal(X,Y) :- p(X,Y) & q(Y)
If q(Y) is true, we can use the same binding for Z and get that q(Z) is true.
Detecting redundant subgoals can be done mechanically; see the text.
Consider the rules:
goal(X) :- p(X,b) & q(b) & r(Z)
goal(X) :- p(X,Y) & q(Y)
Any answer produced by the first rule is also produced by the second. The first rule is therefore redundant and can be eliminated.
A logic program is a finite set of rules.
An existentially quantified goal G is a logical consequence of a program P if there is an instance A of G and a clause in P with a ground instance A <- B_1, ..., B_n, n >= 0, such that B_1, ..., B_n are logical consequences of P.
Put another way: G is a logical consequence of P iff G can be deduced from P by a finite number of applications of the rule of universal modus ponens.
The meaning of a program P, M(P), is the set of ground goals deducible from P. Given some intended meaning M, P is correct if M(P) is a subset of M, and complete if M is a subset of M(P).
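For example (a sketch): suppose the intended meaning M is {parent(art,bob), parent(bob,coe)}. A program consisting only of the fact parent(art,bob) is correct (everything it deduces is in M) but not complete; a program asserting both of those facts plus parent(coe,art) is complete but not correct.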
The rules of deduction are:
Identity: from P, deduce P.
Generalization: given a substitution ϴ and an instance Pϴ, the existential query P is a logical consequence of Pϴ.
Instantiation: from P and a substitution ϴ, deduce an instance Pϴ.
Universal modus ponens: from B and A <- B we can deduce A.
Note that identity and instantiation are just special cases of modus ponens.
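A small worked sketch: from the fact q(a) and the ground instance p(a) <- q(a) of the rule p(X) <- q(X), universal modus ponens lets us deduce p(a).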