In this chapter you’ll...
- Be introduced to the Jess programming language
- Analyze a rule-based program
- See familiar examples of rule-based systems
Rule-based software is in regular use in practically every business, school, and home. In this chapter, we’ll look at some examples of how rules are used to solve common problems. Because most programmers learn best by doing, you’ll start by writing a rule-based program of your own.
“The answer, please?”
The stern voice startles you. You were dozing in Mrs. Rosencrantz’s high school math class again. You realize at once that she’s been talking to you.
You look at the blackboard. It’s one of those word puzzles, the logic kind. Mrs. Rosencrantz is waiting for you to solve it. You quickly scan what she’s scrawled on the board with her crone’s hand:
- A foursome of golfers is standing at a tee, in a line from left to right. Each golfer wears different colored pants; one is wearing red pants. The golfer to Fred’s immediate right is wearing blue pants.
- Joe is second in line.
- Bob is wearing plaid pants.
- Tom isn’t in position one or four, and he isn’t wearing the hideous orange pants.
- In what order will the four golfers tee off, and what color are each golfer’s pants?”
You get the gist of it right away, but how on earth are you supposed to figure it out? There’s no formula to use, no analytic procedure for deriving a solution. Algebra was one thing, but this? Why weren’t you paying attention in class?
Rules to the rescue! A rule-based program can satisfy Mrs. Rosencrantz by efficiently finding the one combination of names, positions, and colors that fits all the constraints. You can directly translate the problem statement into rules, and the rules will find the solution.
Let’s write that program to see how a rule-based system would solve this problem. You’ll write the program in the Jess language. Don’t be concerned that you don’t know the Jess language yet—right now, I’d just like you to understand the approach. You’re going to:
1. Choose a way to represent the possible combinations of men’s names, positions, and pants colors.
2. Write one rule that describes the problem.
The first step is to define data structures to represent the smallest useful pieces of a possible solution to the problem: a link between a name and either a position or a color:
(deftemplate pants-color (slot of) (slot is)) (deftemplate position (slot of) (slot is))
A deftemplate is a bit like a class declaration in Java. While class objects have member variables, deftemplates have slots. Each slot is a placeholder for a specific piece of information. For example, the pants-color template has a slot named of for a person’s name and a slot named is to hold a color. Whereas a Java class is a definition of a type of object, a template is a definition for a type of fact (a fact is basically what it sounds like: a piece of possibly useful information.) A pants-color fact represents the idea that one specific golfer (named in the of slot) has a certain color pants (named in the is slot.)
You’ll use these templates to create facts representing each of the possible combinations. There are 32 of them altogether—for example:
(pants-color (of Bob) (is red)) (position (of Joe) (is 3))
You can write a rule to create all 32 of these facts and put them into working memory, a kind of scratch space Jess uses to store the facts it knows:
(defrule generate-possibilities => (foreach ?name (create$ Fred Joe Bob Tom) (foreach ?color (create$ red blue plaid orange) (assert (pants-color (of ?name) (is ?color)))) (foreach ?position (create$ 1 2 3 4) (assert (position (of ?name) (is ?position))))))
This code loops (using foreach) over the four names given in the problem and creates (using assert) a pants-color fact for each of the possible name/color pairs and a position fact for each name/position pair, for a total of 32 facts. The function create$ returns a list of its arguments.
Now that you’ve written a rule to create all the possible combinations, you’ll write a second rule to search through them to find the subset of facts that represent the solution. This is the fun part. You’ll translate each sentence in the problem statement directly into code. First, note that you use a symbol starting with a question mark, like ?c, to write a variable in Jess. You’ll use the variable ?c to represent “some color”; ?p to represent “some position”; ?n to mean “some name”; and ?c1...?c4 and ?p1...?p4 to represent Fred, Joe, Bob, and Tom’s pants color and position, respectively.
Here’s the first useful sentence, The golfer to Fred’s immediate right is wearing blue pants:
(defrule find-solution ;; There is a golfer named Fred, whose position is ?p1 ;; and pants color is ?c1 (position (of Fred) (is ?p1)) (pants-color (of Fred) (is ?c1)) ;; The golfer to Fred's immediate right ;; is wearing blue pants. (position (of ?n&~Fred) (is ?p&:(eq ?p (+ ?p1 1)))) (pants-color (of ?n&~Fred) (is blue&~?c1))
In this code snippet, the variable ?n represents the unknown name of the person to Fred’s right, ?p1 is Fred’s unknown position, ?c1 is the unknown color of Fred’s pants, and ?p is the unknown golfer’s position. In these patterns, & means and and ~ means not, so (name ?n&~Fred) means that this person’s name, call it ?n, is not Fred. Here’s the next line (Joe is second in line):
;; Joe is in position #2 (position (of Joe) (is ?p2&2&~?p1)) (pants-color (of Joe) (is ?c2&~?c1))
Note that you must be careful to read between the lines of the problem as you write this rule. You know every golfer is in a different position, so you can say with confidence that ?p2&2&~?p1—Joe’s position, call it ?p2, the value of which is 2, is not the same as Bob’s position ?p1. It’s possible that ?p2 and ?p are the same, though: Joe might be to Fred’s immediate right, so you don’t mention ?p here.
Now the next line of the problem, Bob is wearing plaid pants:
;; Bob is wearing the plaid pants (position (of Bob) (is ?p3&~?p1&~?p&~?p2)) (pants-color (of Bob&~?n) (is plaid&?c3&~?c1&~?c2))
By now you know a lot about Bob’s position ?p3 and pants color ?c3. You know ?p3 is not the same as ?p1 or ?p2, and you also know it’s not the same as ?p. Why? Because the golfer in position ?p wears blue pants, and Bob’s pants are plaid, and the golfers in ?p1 and ?p2 are named Fred and Joe, not Bob.
;; Tom isn't in position 1 or 4 ;; and isn't wearing orange (position (of Tom&~?n) (is ?p4&~1&~4&~?p1&~?p2&~?p3)) (pants-color (of Tom) (is ?c4&~orange&~blue&~?c1&~?c2&~?c3))
There are only four positions, but you know Tom’s position is not 1, not 4, and not ?p1, ?p2, or ?p3—more constraints than there are possibilities. This is actually a good sign; it suggests that you have more than enough information to solve the puzzle. You can place a similar number of constraints on Tom’s fashion choices.
All that is left is to print out the set of variables ?p1...?p4 and ?c1...?c4 that solves the problem:
=> (printout t Fred " " ?p1 " " ?c1 crlf) (printout t Joe " " ?p2 " " ?c2 crlf) (printout t Bob " " ?p3 " " ?c3 crlf) (printout t Tom " " ?p4 " " ?c4 crlf crlf))
The symbol => separates the if part of the rule from the then part—it specifies what the rule should do if all the requirements are satisfied. Here it prints a table of results. If you enter the code for the problem into Jess and then run it, you get the answer directly. The source for this problem is in the file rosencrantz.clp, and you can run it like this:
C:\Jess61> java –classpath jess.jar jess.Main rosencrantz.clp Fred 1 orange Joe 2 blue Bob 4 plaid Tom 3 red
You see that exactly one set of variables satisfies the problem. Joe turns out to be the mysterious man in the blue pants, and Fred (the tasteless golfer in the orange ones) will tee off first.
What would happen if some of the information was missing? For example, suppose you didn’t know that Joe was second at the tee—how would this affect the results? Let’s give it a try. If you change Joe’s section of the program like so:
;; We don't know anything about Joe, really (position (of Joe) (is ?p2&~?p1)) (pants-color (of Joe) (is ?c2&~?c1))
and then run it, you get the following result:
Fred 3 orange Joe 4 blue Bob 1 plaid Tom 2 red Fred 2 orange Joe 4 blue Bob 1 plaid Tom 3 red Fred 2 orange Joe 1 blue Bob 4 plaid Tom 3 red Fred 1 orange Joe 2 blue Bob 4 plaid Tom 3 red Fred 1 orange Joe 3 blue Bob 4 plaid Tom 2 red Fred 1 orange Joe 4 blue Bob 3 plaid Tom 2 red
Now there are six different solutions, and Jess finds and reports them all. This is another strength of rule-based programming: Rule-based systems can degrade gracefully in the presence of incomplete information. You didn’t have to build this quality into the program—it’s just there.
When you wake up from the recurring math-class nightmare, you may be relieved to think that solving logic puzzles is a far cry from your normal duties as a programmer. But that’s really not so. Ill-defined problems like this logic puzzle abound in business environments. A human-resources application may need to flag personnel with a suspicious pattern of insurance claims. A requirement for a financial application might be to recommend buying securities that look promising. Manufacturing software could raise an alarm and shut down an assembly line if quality-assurance results indicate there may be a production problem.
What do suspicious, promising, and may be a problem mean, exactly? They probably can’t be expressed as equations—it’s possible they can’t be given a precise definition at all. If they each can be described as a set of constraints and guidelines, however, then a rule-based program can implement them easily. In the rest of this chapter, we’ll look at some common applications of rule-based programming.
You are probably affected by rule-based software every day, whether you realize it or not. Many commonplace activities in a modern office are controlled by rules. Let’s look at a few examples.
“I can’t keep up with all the email I get!” is a common complaint in this Net-connected world. Huge numbers of email messages are sent, delivered, and read every day. Rules to the rescue! There are many technological solutions to the problem of sorting email, and almost without exception, these solutions are simple rule-based systems.
The venerable program sendmail (http://www.sendmail.org/) delivers most of the email on the Internet. It sports a famously cryptic rule-based configuration language. The following example is lifted verbatim from the sendmail web site. It shows a sendmail rule that translates BITNET addresses like decvax!user and research!user into email@example.com and firstname.lastname@example.org, respectively:
# translate UUCP addresses to SMTP ones LOCAL_RULE_3 UUCPSMTP(`decvax', `decvax.dec.com') UUCPSMTP(`research', `research.att.com')
Microsoft Outlook, Eudora, Netscape Messenger, OS X Mail, and other popular mail clients include the ability to automatically sort messages into mailboxes according to the sender’s address, the recipient’s address, the subject, and other characteristics. In all cases, the user programs the filter in a rule-based language. For these graphical mail clients, the rule language is generally a pattern of checkboxes and pop-up menus in a mail-filtering dialog box.
Programs like procmail (http://www.procmail.org/) and filter (http://www.math.fu-berlin.de/~guckes/elm/elm.index.html) do this same kind of filtering in batch mode on Unix systems. I’d personally perish under the weight of the hundreds of messages I receive each day without procmail to sort through and organize my mail. Instead of dialog boxes, these programs offer a simple textual language. For example, here are two of the rules in my procmail configuration:
# Put all messages mentioning Jess in their subject line # into the IN.jess-users folder. :0: * ^Subject:.*jess Mail/IN.jess-users # Send all messages from "Out-of-Office Autoreply" to # /dev/null – the UNIX "Trash can." :0: * ^From:.*Autoreply /dev/null
Other programs exist that can automatically filter mail at a mail server to remove spam or to strip viruses from attachments. Again, these programs offer a wide range of features and interfaces, generally programmed via rules.
Email handling is one common example of how rules can make life easier for individuals. Now let’s look at some applications of rules in the enterprise.
When a complex, customizable product like a computer is sold to a customer, the seller must make sure the order corresponds to a functional system. If a given peripheral requires a specific cable, then the order must include that cable. Likewise, if the chassis can hold only two full-height disk drives, then the order better not include four of them. For many kinds of custom-manufactured goods, hundreds or thousands of these kinds of restrictions exist, making order validation a difficult and painstaking process.
The XCON system and its predecessors, developed at Digital Equipment Corporation (DEC), are well-known examples of using rule-based systems in this application area. The original XCON included 210 rules for validating orders for DEC hardware. By 1989, XCON included 17,500 rules and knew about 31,000 hardware components. The estimated savings to DEC at the time was $40 million annually due to increased accuracy, reduced testing costs, and higher customer satisfaction, compared to configuration by human workers. Such systems have become common not only in manufacturing but in the mail-order and Internet sales industry, where rule-based systems help to recommend related products, optimize packaging, and perform other routine order-configuration tasks. Part VI of this book describes a series of web-enabled systems for order configuration built around the Jess rule engine using Java servlets and JavaServer pages.
1 D. O’Connor and V. Barker, “Expert Systems for Configuration at Digital: XCON and Beyond,” Communications of the ACM 32, no. 3 (1989).
Corporations invariably define policies and strategies that specify how the business should respond to events ranging from individual sales to hostile takeover attempts. A business rule is a policy or strategy written in executable form, such that a computer can follow it. Here are two simple examples of business rules governing common situations:
IF employee's length of service > 1 year AND employee category is regular employee AND employee contributes to 401k plan THEN employee is vested in 401k plan END IF customer order is more than ten units AND customer type is wholesaler THEN deduct 10 percent from order END
If a company’s business rules are implicit—not written as rules per se, but embedded in procedural logic—and scattered throughout corporate computer applications, then a change in a single policy might require significant programmer effort to implement. Furthermore, if business rules are to be embedded directly into application software, it becomes difficult to use commercial, off-the-shelf (COTS) products, increasing the company’s development costs. The corporation will be forced to make a choice between containing development costs and making policy adjustments in response to changing circumstances.
The solution to this dilemma is to remove the business rules from the individual applications, make them explicit, and embed them in a centralized rule engine for execution. Any business policy can then theoretically be changed at a single point. The rule engine is often embedded in a network-based server so that it can be accessible across an enterprise.
This enterprise-level use of a rule engine is probably the fastest growing and most visible market for rule-based systems programming today. Some application servers, like BEA’s WebLogic, include integrated rule engines. Other vendors like ILOG sell rule engines meant to be used with third-party servers. There are literally dozens of rule engines to choose from, targeted toward this product niche. Part VII of this book discusses the use of rule engines in general and the Jess rule engine in particular in enterprise applications based on the Java 2 Enterprise Edition (J2EE) architecture.
Rule-based programs are everywhere. Their applications include everything from mail filtering to order configuration, and from monitoring chemical plants to diagnosing medical problems. Rule-based programs excel at solving problems that are difficult to solve using traditional algorithmic methods, and they work even when the input data is incomplete.
In the next chapter, we’ll refine our definition of a rule-based system. You’ll learn about the history of rule-based systems and about their architecture. We’ll also look at the development cycle for rule-based programs. By the end of the next chapter you’ll be ready to begin learning how to write your own rule-based software.