
Prolog

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 194.134.189.73 (talk) at 01:46, 2 June 2004. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Prolog is a leading logic programming language. The name Prolog is short for PROgramming in LOGic (from the French "programmation en logique"). It was created by Alain Colmerauer around 1972 as an attempt to make a programming language that enables the expression of logic instead of carefully specified instructions for the computer.

Prolog is used in many artificial intelligence programs and in computational linguistics (especially natural language processing). Its syntax and semantics are considered very simple and clear. (The original goal was to provide a tool for computer-illiterate linguists.) Much of the research leading up to modern implementations of Prolog came from spin-off effects of the Fifth Generation Computer Systems project (FGCS), which chose a variant of Prolog named Kernel Language for its operating system.

Prolog is based on first-order predicate calculus; however, it is restricted to allow only Horn clauses. Execution of a Prolog program is effectively an application of theorem proving by first-order resolution. Fundamental concepts are unification, tail recursion, and backtracking.

Data types

Prolog does not employ data types in the way most common programming languages do. It is more accurate to speak of Prolog's lexical elements than of data types.

Atoms

Text constants are introduced by means of atoms. An atom is a sequence of letters, digits and underscores that begins with a lower-case letter. If an atom containing other characters is needed, it is usually surrounded with apostrophes (e.g. '+' or 'Hello').

Numbers

Prolog has integers and floating-point numbers. Arithmetic built-ins such as is/2 accept both, so many programs do not need to differentiate between them.

Variables

Variables are denoted by a string consisting of letters, digits and underscore characters, beginning with an upper-case letter. In Prolog, a variable is not a container that can be assigned to (unlike in procedural programming languages). Its behaviour is closer to that of a pattern, which becomes increasingly specified through unification.

The so-called anonymous variable is written as a single underscore (_).
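
The way unification gradually fills in a variable can be illustrated with a small sketch. The predicate names unify_demo/2 and rebind_fails/0 and the functor point/2 are hypothetical, chosen only for this example.

```prolog
% Unification binds variables so that both sides become the same term.
% point/2 is just a data term here; nothing is "assigned".
unify_demo(X, Y) :-
    point(X, 2) = point(1, Y).   % binds X to 1 and Y to 2

% Once bound, a variable cannot be bound to a different value.
rebind_fails :-
    X = 1,
    X = 2.                        % fails: 1 does not unify with 2
```

Under these assumptions, the query ?- unify_demo(X, Y). answers X = 1, Y = 2, while ?- rebind_fails. fails.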

Terms

Terms are the only way Prolog can represent complex data. A term consists of a head, also called the functor (which must be an atom), and parameters (of unrestricted type). The number of parameters, the so-called arity of the term, is significant. A term is identified by its head and arity, usually written as functor/arity.
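
As a brief sketch of the functor/arity idea, the standard built-ins functor/3 and =../2 can take a term apart. The wrapper names describe_term/3 and term_parts/2 are hypothetical.

```prolog
% describe_term(+Term, -Name, -Arity): name and arity of a compound term.
describe_term(Term, Name, Arity) :-
    functor(Term, Name, Arity).

% term_parts(+Term, -Parts): the functor followed by the arguments, as a list.
term_parts(Term, Parts) :-
    Term =.. Parts.
```

For instance, ?- describe_term(g(A, rst), Name, Arity). answers Name = g, Arity = 2, and ?- term_parts(f(x, y), Parts). answers Parts = [f, x, y].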

Lists

A list isn't a standalone data type, because it is defined by a recursive construction (using term '.'/2):

  1. atom [] is an empty list
  2. if T is a list and H is an element, then the term '.'(H, T) is a list.

The first element, called the head, is H, which is followed by the contents of the rest of the list, designated T or tail. The list [1, 2, 3] would be represented internally as '.'(1, '.'(2, '.'(3, []))). A syntactic shortcut is [H | T], which is mostly used to construct rules. The entirety of a list can be processed by processing the first element, and then the rest of the list, in a recursive manner.

For the programmer's convenience, lists can be constructed and deconstructed in a variety of ways.

  • Element enumeration: [abc, 1, f(x), Y, g(A,rst)]
  • Prepending single element: [abc | L1]
  • Prepending multiple elements: [abc, 1, f(x) | L2]
  • Term expansion: '.'(abc, '.'(1, '.'(f(x), '.'(Y, '.'(g(A,rst), [])))))
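
The notations above can be tried out with a short sketch that splits a list with [H|T] and walks it recursively. The predicate names first_and_rest/3 and member_of/2 are hypothetical.

```prolog
% first_and_rest(+List, -Head, -Tail): split a non-empty list.
first_and_rest([H|T], H, T).

% member_of(?X, +List): X is the head of the list,
% or it is a member of the tail.
member_of(X, [X|_]).
member_of(X, [_|T]) :-
    member_of(X, T).
```

With these definitions, ?- first_and_rest([abc, 1, f(x)], H, T). answers H = abc, T = [1, f(x)], and ?- member_of(1, [abc, 1, f(x)]). succeeds.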

Strings

Strings are usually written as a sequence of characters surrounded by double quotes. They are often represented internally as lists of character codes (ASCII codes in older systems).
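
How a double-quoted string is represented depends on the Prolog system and its settings, so the sketch below uses the standard built-in atom_codes/2 instead to show what a list of character codes looks like. The wrapper name codes_of/2 is hypothetical.

```prolog
% codes_of(+Atom, -Codes): the character codes that spell out Atom.
codes_of(Atom, Codes) :-
    atom_codes(Atom, Codes).
```

On a system using ASCII-compatible codes, ?- codes_of(abc, Codes). answers Codes = [97, 98, 99].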


Facts

Programming in Prolog is very different from programming in a procedural language. In Prolog you supply a database of facts and rules; you can then perform queries on the database. The basic unit of Prolog is the predicate, which is defined to be true. A predicate consists of a head and a number of arguments. For example:

cat(tom).

Here 'cat' is the head, and 'tom' is the argument. Here are some sample queries you could ask a Prolog interpreter based on this fact:

?- cat(tom).
     yes.
?- cat(X).
     X = tom;
     no.
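
Typing ; at the prompt asks the interpreter for the next solution. As a sketch, the standard built-in findall/3 collects all solutions at once. The second fact cat(felix) and the name all_cats/1 are hypothetical additions for this example.

```prolog
cat(tom).
cat(felix).      % hypothetical extra fact, not part of the example above

% all_cats(-Cats): every X for which cat(X) can be proven, in clause order.
all_cats(Cats) :-
    findall(X, cat(X), Cats).
```

With these two facts, ?- all_cats(Cats). answers Cats = [tom, felix].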

Predicates are usually defined to express some fact the program knows about the world. In most cases, using predicates requires a certain convention. For example, which of the two versions below would signify that Pat is the father of Sally?

father(sally,pat). 
father(pat,sally). 

In both cases 'father' is the head and 'sally' and 'pat' are arguments. However, in the first case Sally comes first in the argument list, and in the second Pat comes first (the order in the argument list matters). Reading the fact as "Pat is the father of Sally", the first case is an example of a definition in Verb Object Subject order, and the second of Verb Subject Object order. Since Prolog does not understand English, both versions are fine as far as it is concerned; however, it is good programming style to stick to one convention throughout a single program, so as to avoid writing something like

father(pat,sally).
father(jessica,james).

Some predicates are built into the language, and allow a Prolog program to perform routine activities (such as input/output, using graphics and otherwise communicating with the operating system). For example, the predicate write can be used for output to the screen. Thus,

 write('Hello') 

will display the word 'Hello' on the screen.

Rules

The second type of statement in Prolog is the rule. An example of a rule is

light(on) :- switch(on). 

The ":-" means "if"; this rule means light(on) is true if switch(on) is true. Rules can also make use of variables; variables begin with capital letters while constants begin with lower case letters. For example,

father(X,Y) :- parent(X,Y),male(X). 

This means "if someone is a parent of someone and he's male, he is a father". The antecedent and consequent are in reverse order to that normally found in logic. In logic, it is possible to place multiple predicates in a consequent, joined with conjunction, for example:

a,b,c :- d.

which is simply equivalent to three separate rules (the form that standard Prolog systems expect, since they allow only a single predicate in the head of a clause):

a :- d. 
b :- d. 
c :- d.

What is not allowed are rules like:

a;b :- c.

that is "if c then a or b". This is because of the restriction to Horn clauses.

Evaluation

When the interpreter is given a query, it tries to find facts that match the query. If no outright facts are available, it attempts to satisfy all rules that have the fact as a conclusion. For example, consider the following Prolog code:

sibling(X,Y) :- parent(Z,X), parent(Z,Y).
father(X,Y) :- parent(X,Y), male(X).
mother(X,Y) :- parent(X,Y), female(X).
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
mother(trude, sally).
father(tom, sally).
father(tom, erica).
father(mike, tom).
male(tom).
female(trude).
male(mike).

This results in the following query being evaluated as true:

?- sibling(sally, erica).
     yes.

The interpreter arrives at this result by matching the rule sibling(X, Y), binding X to sally and Y to erica (colloquially, substituting sally for X and erica for Y). This means the query can be expanded to parent(Z,sally), parent(Z,erica). Matching this conjunction is done by looking at all possible parents of sally. However, parent(trude,sally) doesn't lead to a viable solution, because if 'trude' is substituted for Z, parent(trude,erica) would have to be true, and no such fact (or any rule that can satisfy this) is present. So instead, tom is substituted for Z, and erica and sally turn out to be siblings nonetheless.
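
In the listing above, father/2 and mother/2 are defined in terms of parent/2 and vice versa. Read as logic this is harmless, but a depth-first Prolog system that tries clauses in textual order may keep recursing between these rules without ever reaching the facts. The following is a minimal sketch that terminates, assuming the two derived rules for father/2 and mother/2 are simply left out.

```prolog
% Base facts, as in the example above.
mother(trude, sally).
father(tom, sally).
father(tom, erica).
father(mike, tom).

% parent/2 is derived from the facts only. The rules deriving father/2
% and mother/2 from parent/2 are omitted (an assumption of this sketch),
% so the search cannot loop between them.
parent(X, Y) :- father(X, Y).
parent(X, Y) :- mother(X, Y).

% Two people are siblings if they share a parent Z.
sibling(X, Y) :- parent(Z, X), parent(Z, Y).
```

Here ?- sibling(sally, erica). succeeds with Z bound to tom, after the attempt with trude fails as described. Note that this definition also makes sally a sibling of herself; adding the goal X \== Y to the rule would exclude that.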

The code

mother(X,Y) :- parent(X,Y), female(X).
parent(X,Y) :- father(X,Y).

might seem suspicious. After all, not every parent is a father. But it is true that any father is a parent. On the other hand, someone is only someone's mother if she is both that person's parent and female.

To infer that all fathers are male, you'd need to code

male(X) :- father(X,_).

which simply doesn't care who the child is (the underscore is an anonymous variable).

Negation

Typically, a query is evaluated to false by virtue of not finding any positive rules or facts that support the statement. This is called the Closed World Assumption: it is assumed that everything worth knowing is in the database, so there is no outside world that might contain heretofore unknown evidence. In other words, if a fact is not known to be true, it is assumed to be false.

A rule such as

legal(X) :- \+ illegal(X).

can only be evaluated by exhaustively searching for all things illegal and comparing them to X; if no illegal fact can be found that matches X, X is legal. This is called negation as failure. The prefix operator \+ is the standard Prolog spelling of "not provable" (some systems also accept not/1).
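
A small sketch makes the behaviour concrete. The illegal/1 facts are hypothetical sample data.

```prolog
% Hypothetical sample facts.
illegal(theft).
illegal(fraud).

% X is legal if illegal(X) cannot be proven.
legal(X) :- \+ illegal(X).
```

With these facts, ?- legal(chess). succeeds and ?- legal(theft). fails. The query ?- legal(X). with an unbound X also fails, because illegal(X) can be proven for some X; negation as failure only tests a goal and never enumerates the things that are legal.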

Execution

Prolog is a logic language, so in theory you shouldn't care about how it executes. However, sometimes it is prudent to take into account how the inference algorithm works, to prevent a Prolog program from running unnecessarily long.

For example, we can write code to count the number of elements in a list.

elems([],0).
elems([_|T], X) :- elems(T, Y), X is Y + 1.

This simply says: if the list is empty, the number of elements is zero. If the list is non-empty, then X is one higher than Y, which is the number of elements in the remainder of the list without the first element. (The first element itself is not needed, so it is matched with the anonymous variable.)
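
Since tail recursion is one of the fundamental concepts mentioned earlier, here is a sketch of the same count written with an accumulator, so that the recursive call is the last goal in the clause. The names elems_acc/2 and elems_acc/3 are hypothetical.

```prolog
% elems_acc(+List, -N): N is the number of elements in List.
elems_acc(List, N) :-
    elems_acc(List, 0, N).

% elems_acc(+List, +Acc, -N): Acc counts the elements seen so far.
elems_acc([], N, N).
elems_acc([_|T], Acc, N) :-
    Acc1 is Acc + 1,
    elems_acc(T, Acc1, N).       % last call, so no work remains afterwards
```

For example, ?- elems_acc([a, b, c], N). answers N = 3. Whether the tail call actually runs in constant stack space depends on the implementation.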

In this case, there is a clear distinction between the cases in the rules' antecedents. But consider the case where you need to decide whether to keep gambling in a casino:

gamble(X) :- gotmoney(X).
gamble(X) :- gotcredit(X), \+ gotmoney(X).

If you have money, you keep gambling. If you've lost it all, you need to borrow money, or else there is no more gambling. The predicate gotmoney(X) might be very costly to evaluate; for example, it might access your internet banking account to check your balance, which takes time. But the same goes for gotcredit.

In theory, Prolog implementations might evaluate these rules out of order, so you might as well have written:

gamble(X) :- gotcredit(X), \+ gotmoney(X).
gamble(X) :- gotmoney(X).

This is fine, because the two options exclude each other. However, checking whether you can get a loan is not necessary if you know you have money. So in practice, Prolog implementations try the rules in the order in which you wrote them. You can use the cut operator to tell the interpreter to skip the second option if the first suffices. For example:

gamble(X) :- gotmoney(X),!.
gamble(X) :- gotcredit(X), \+ gotmoney(X).

This is called a green cut operator. The ! simply tells the interpreter to stop looking for alternatives. But you'll notice that if you don't have money, the interpreter still needs to check the second rule, and it will. Checking for gotmoney in the second rule is pretty useless since you already know you don't have any; otherwise the second rule wouldn't be evaluated in the first place. So you can change the code to

gamble(X) :- gotmoney(X),!.
gamble(X) :- gotcredit(X).

This is called a red cut operator, because it is dangerous to do this. You now depend on the proper placement of the cut operator and the order of the rules to determine their logical meaning. Cut-and-paste accidents lurk in dark corners. If the rules got mixed up, you might now max out your credit card before spending your cash.
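
One way to keep the "check the money first, otherwise check the credit" logic in a single clause, so that it no longer depends on the order of two separate rules, is the standard if-then-else construct ( If -> Then ; Else ). The sketch below assumes hypothetical sample facts and the hypothetical name gamble_ite/1.

```prolog
% Hypothetical sample facts: alice has cash, bob only has credit.
gotmoney(alice).
gotcredit(alice).
gotcredit(bob).

% gotcredit(X) is only tried when gotmoney(X) cannot be proven.
gamble_ite(X) :-
    (   gotmoney(X)
    ->  true
    ;   gotcredit(X)
    ).
```

With these facts, ?- gamble_ite(alice). and ?- gamble_ite(bob). both succeed, and ?- gamble_ite(carol). fails. Like the cut, the arrow commits to the first solution of its condition, so this is a restructuring of the same control flow rather than a purely logical reading.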

Implementations

References
