Chapter 10. Designing the application – Jess in Action: Rule-Based Systems in Java

Chapter 10. Designing the application

In this chapter you’ll...

  • Design deftemplates for the Tax Forms Advisor
  • Partition the application with defmodules
  • Write code to ask questions of the user

In this chapter, you will begin to develop the Tax Forms Advisor system described in chapter 9. You will decide what the facts should look like and how the rules should be divided into modules. You’ll also design some I/O functions and other infrastructure the rules in the system need. In chapter 11, you’ll write the actual rules on the foundation you develop here.

The design process you’ll follow in this chapter is idealized: There are no false starts or backtracking. In truth, designing a system like this usually involves experimentation, especially when you’re still gaining experience. Don’t be discouraged; on the contrary, feel free to experiment with different approaches to implementing this application and the others in this book.

In previous chapters of this book, you’ve entered code directly at the Jess> prompt. This approach is great for experimenting, but when you’re developing an application, you’ll want to save the code in a text file instead. You can then execute your code either by using Jess’s (batch) function (which executes the contents of a file full of Jess code) or by specifying the filename on the command line like this:

C:\> java -classpath jess.jar jess.Main taxes.clp

The .clp extension is traditional but not required. Using a specific extension consistently is helpful, because you may be able to train your programmer’s editor to recognize Jess code by the filename.

10.1. Organizing the data

As you know, Jess rules work by pattern-matching on facts. Therefore, before you can write any rules, you need to have some idea what the facts will look like. Of course, in one of those classic chicken-and-egg problems, you don’t know what the facts should look like until you see the rules. How do you get started?

Generally, the knowledge-engineering effort suggests some possible fact categories. If you record the knowledge as proposed rules or rule-like pseudocode (perhaps using the index-card method described in chapter 9), the possible fact types will be explicitly laid out. Otherwise, you’ll have to read through the collected knowledge to get a feel for the kinds of facts that are required. The whole process is subjective, and there is no “right” answer. With practice, you’ll get a feeling for what will work and what will not.

Looking through chapter 9’s collected knowledge for the Tax Forms Advisor, you can see some possible candidates for deftemplate types:

  • form: A specific tax form
  • user: The operator of the system
  • deduction: A way of reducing your taxable income
  • credit: A way of reducing your tax burden
  • dependent: A person the user cares for

Thinking about the general organization of the application suggests a few more possibilities:

  • question: A question the system might ask the user
  • answer: An answer given by the user
  • recommendation: A note that the system will recommend a specific form

These eight templates are good candidates for inclusion in the system. Next you need to decide what form they will take—ordered or unordered facts? And for the unordered ones, what slots should they have?

10.2. Filling in details

Most facts in this system will represent physical or conceptual objects, rather than commands or actions. An object generally has observable properties—color, mass, and so on. To represent an object and its properties as a fact, you can use an unordered fact, declaring an explicit deftemplate with multiple slots, one for each property.

The user fact will clearly play a central role. If you look back at the knowledge collected in section 9.3.1, you can see that the user’s income and number of dependents are each fairly important and are each referenced in more than one place. These two items are therefore good candidates to be slots in a user deftemplate, which might look like this:

(deftemplate user
    (slot income)
    (slot dependents))

This is a good start, but you need to worry about one detail: default slot values.

10.2.1. Default slot values

Jess’s mathematical functions generally throw an exception if you pass in a nonnumeric argument:

Jess> (+ 1 2)
3
Jess> (+ one two)
Jess reported an error in routine Value.numericValue
    while executing (+ one two).
    Message: Not a number: "one" (type = ATOM).
    Program text: ( + one two ) at line 2.
    ......

An empty slot in an unordered fact contains the value nil, which is a symbol, not a number. If you write a rule that matches this empty slot and uses a mathematical function to do it, an exception will be thrown during pattern-matching, like this:

Jess> (deftemplate number (slot value))
TRUE
Jess> (defrule print-big-numbers
    (number (value ?v&:(> ?v 10000)))
    =>
    (printout t ?v " is a big number." crlf))
TRUE
Jess> (assert (number))
Jess reported an error in routine Value.numericValue
    while executing (> ?v 10000)
    while executing rule LHS (TEQ)
    while executing rule LHS (TECT)
    while executing (assert (MAIN::number (value nil))).
    Message: Not a number: "nil" (type = ATOM).
    Program text: ( assert ( number ) ) at line 13.
    ...

If you plan to use mathematical functions on the left-hand side (LHS) of a rule, it makes sense to add numeric defaults to any slots intended to hold numeric values. The income and dependents slots of the user template will hold numbers, so you should modify the template to look like this:

(deftemplate user
    (slot income (default 0))
    (slot dependents (default 0)))

Now the income and dependents slots will be created holding numeric values, and you won’t encounter this kind of error.
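To see the defaults at work, here is a minimal sketch that pairs the user template with a rule containing a numeric test. The rule name high-income and the 10000 threshold are assumptions made up for this illustration; they are not part of the Tax Forms Advisor. Enter it in a fresh Jess session:

```jess
(deftemplate user
    (slot income (default 0))
    (slot dependents (default 0)))

;; Hypothetical rule: the name and threshold are illustrative only.
(defrule high-income
    (user (income ?i&:(> ?i 10000)))
    =>
    (printout t "Income " ?i " is above 10000." crlf))

;; No slot values given: both slots take the default 0, so the
;; (> ?i 10000) test compares two numbers and simply fails to match.
(assert (user))

;; Explicit slot values override the defaults.
(assert (user (income 25000) (dependents 2)))

;; Only the second fact activates the rule, so this prints:
;;   Income 25000 is above 10000.
(run)
```

Without the defaults, the first assert would fail with the same "Not a number" error shown earlier.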

10.3. More templates

A form has a code name like 1040A. It also has a descriptive name, like Federal income tax short form. Therefore, the form template might look like this:

(deftemplate form (slot name) (slot description))

Because the system will ask the user a number of questions, you need a generic way to represent a question and its answer. Although you don’t know yet what the question-asking mechanism will look like, because you haven’t written the code, you can guess that the following two templates might be a good start:

(deftemplate question (slot text) (slot type) (slot ident))
(deftemplate answer (slot ident) (slot text))

The question template ties a symbolic name for a question (in the slot ident) to the text of the question. You’ll use the working memory as a convenient database in which to look up the question text by identifier, so that if two rules might need to ask the same question, you won’t have to duplicate the text. You’ll use the type slot to hold an indication of the expected category of answer (numeric, yes or no, and so on). The answer template ties the answer to a question. A question and its corresponding answer will have the same symbolic value in their ident slots. Once an answer for a given question exists, you won’t ask it again. (You’ll develop the code that uses these templates in section 10.6.)
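As a rough illustration of how a shared ident ties a question to its answer, the sketch below asserts one of each and joins them in a rule. The identifier income, the question text, and the rule name show-answered-questions are hypothetical names chosen only for this example:

```jess
(deftemplate question (slot text) (slot type) (slot ident))
(deftemplate answer (slot ident) (slot text))

(deffacts sample-question-and-answer
    (question (ident income) (type number)
              (text "What is your annual income?"))
    (answer (ident income) (text 15000)))

;; The variable ?id appears in both patterns, so the rule matches
;; only a question and an answer that carry the same identifier.
(defrule show-answered-questions
    (question (ident ?id) (text ?q))
    (answer (ident ?id) (text ?a))
    =>
    (printout t ?q " -> " ?a crlf))

(reset)
(run)
```

Any rule can find the answer to a question the same way, by using the identifier in an answer pattern without knowing the question text.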

Finally, a recommendation needs a slot to hold a form, and perhaps an explanation:

(deftemplate recommendation (slot form) (slot explanation))

You’ve defined templates named user, form, question, answer, and recommendation. It turns out that this collection is sufficient for your needs. Let’s consider why these templates are enough.

10.4. Templates you don’t need

The other possible templates (dependent, credit, and deduction) probably won’t be a part of the application. Looking back at chapter 9, you don’t see anything in the collected knowledge that requires you to store information about individual dependents; you need only the total number of dependents, which you’ll store in the user template. As a result, you won’t need a dependent template after all.

The argument for not including credit and deduction is more involved, and it’s related to an important architectural decision. If you stored credits and deductions as facts, you could write a generic set of rules to operate on these facts. The advantage to this architecture is that new forms could be added simply by augmenting the set of credits and deductions—that is, by asserting new facts. You could do this by extending a set of deffacts that would be read at application startup. In general, adding new facts would be an easier way to add new tax forms than modifying the rules. If you hard-coded the credit and deduction information into the rules, though, then you could only extend the application by modifying the rules.

On the other hand, the generic rules might be hard to understand, and that would itself make the code more difficult to modify. For the small set of forms this application will work with, I think hard-coding the tax information will lead to a cleaner, simpler application. If you needed to work with 100 forms, or 1,000, the other approach would be worth considering. For this system, though, you won’t need credit or deduction facts; all the tax laws will be encoded directly in the rules.
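For comparison, here is a sketch of what the generic, data-driven alternative might look like. It is not the design this book follows. The credit template's slots, the rule name, and the sample facts are all assumptions invented for illustration, and the values are placeholders rather than real tax rules:

```jess
(deftemplate user
    (slot income (default 0))
    (slot dependents (default 0)))
(deftemplate recommendation (slot form) (slot explanation))

;; Hypothetical template: each credit names the form that claims it
;; and an income ceiling below which it applies.
(deftemplate credit
    (slot name)
    (slot form)
    (slot max-income (default 0)))

(deffacts sample-data
    ;; Placeholder values only.
    (user (income 30000))
    (credit (name sample-credit) (form sample-form) (max-income 50000)))

;; One generic rule covers every credit fact. Supporting a new form
;; would mean adding a credit fact, not writing a new rule.
(defrule recommend-form-for-credit
    (user (income ?i))
    (credit (name ?n) (form ?f) (max-income ?max&:(< ?i ?max)))
    =>
    (assert (recommendation (form ?f) (explanation ?n))))

(reset)
(run)
(facts)
```

The rule is compact, but everything interesting about a credit has to be squeezed into slots. Real tax conditions rarely fit such a uniform shape, which is part of why the hard-coded approach is simpler for a small set of forms.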

10.5. Organizing the rules

You’ve defined five templates to serve as data structures for the application. Now let’s turn our attention from data to actual code. The first order of business is to sketch out a rough structure for how the rules will be organized.

The Tax Forms Advisor needs to do four things:

  1. Initialize the application
  2. Conduct an interview with the user to learn about her tax situation
  3. Figure out what tax forms to recommend
  4. Present the list of forms to the user, removing any duplicate recommendations in the process

These four steps map nicely onto four separate processing phases, each with an independent set of rules. You can put the rules for each phase into a separate defmodule (as described in section 7.6) and take advantage of the support Jess offers for partitioning a problem into steps. The four modules are named startup, interview, recommend, and report, respectively.

Defmodules partition not only the rules of an application, but also the facts. You need to decide which of the templates ought to go into which of the modules. You can do this by looking at which module’s rules need access to the data. Remember that if two or more modules need to share a deftemplate, it should go into the module MAIN. Examination of the list of templates and of the modules listed here shows that every template will be needed by at least two modules. For instance, the question and answer templates need to be shared between the interview and recommend modules, whereas recommendation is needed by both recommend (which asserts recommendation facts) and by report (which displays information derived from them). As a result, all of the deftemplates you define will be in module MAIN. This is not unusual.
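The layout described above might be sketched as follows. Defining a module makes it the current module for the constructs that follow, so the shared templates are qualified with MAIN:: to keep them there regardless of where they appear in the file. The print-banner rule is a hypothetical placeholder, and the use of focus with several module names reflects my understanding that the first module listed is the first to run:

```jess
;; Shared templates live in MAIN so that every module can see them.
(deftemplate MAIN::user
    (slot income (default 0))
    (slot dependents (default 0)))
(deftemplate MAIN::form (slot name) (slot description))
(deftemplate MAIN::question (slot text) (slot type) (slot ident))
(deftemplate MAIN::answer (slot ident) (slot text))
(deftemplate MAIN::recommendation (slot form) (slot explanation))

;; One module per processing phase.
(defmodule startup)
(defmodule interview)
(defmodule recommend)
(defmodule report)

;; Hypothetical placeholder rule so that one phase has something to do.
(defrule startup::print-banner
    =>
    (printout t "Starting the Tax Forms Advisor" crlf))

(reset)
;; Stack the four phases; each runs until its agenda is empty,
;; then the focus passes to the next one.
(focus startup interview recommend report)
(run)
```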

10.6. Building the infrastructure

Very often, many of the rules in a rule-based system follow a repeating pattern. You know this application needs to ask the user a series of questions and record the answers in the working memory. You can develop code to ask a question and receive an answer as a kind of subroutine, and all the rules that need this capability can call it. Not only does this approach simplify the code for your system, but it also makes it easier to change the interface: if you need to upgrade from a text-based interface to a graphical kiosk, you may only need to change this one part of the system.

10.6.1. Simple text-based I/O

Recall (from section 3.1.4) Jess’s printout function, which you can use to print to standard output. This function can accept any number of arguments and can perform rudimentary formatting (you can control where newlines go by using the special symbol crlf as an argument). There is also a function read that reads a single input token from standard input, returning what it reads. This suggests you can put these two functions together into a deffunction that emits a prompt and reads the response, like this:

(deffunction ask-user (?question)
    "Ask a question, and return the answer"
    (printout t ?question " ")
    (return (read)))

You should test this function to make sure it works (assuming you’ve entered the code for ask-user in the file taxes.clp):

Jess> (batch taxes.clp)
TRUE
Jess> (ask-user "What is the answer?")
What is the answer? 42
42

I entered 42 as the answer, and the function returned 42; it appears to work fine.

So far, ask-user doesn’t do any error checking. You’d like it to only accept answers appropriate to the given question—for example, only yes or no, or only a number. You need another function—one that can check the form of an answer. Here’s one:

(deffunction is-of-type (?answer ?type)
    "Check that the answer has the right form"
    (if (eq ?type yes-no) then
      (return (or (eq ?answer yes) (eq ?answer no)))
    else (if (eq ?type number) then
           (return (numberp ?answer))
          else (return (> (str-length ?answer) 0)))))

The second parameter to this function, ?type, can be yes-no, number, or anything else. If it is yes-no, the function returns FALSE unless ?answer is the symbol yes or no. If ?type is number, then the function returns TRUE only if ?answer is a number (using the built-in numberp function to test for this condition). If ?type is anything else, is-of-type returns TRUE unless ?answer is the empty string.

Now it is easy to rewrite ask-user to use is-of-type for error checking. While you’re at it, you can use the new ?type parameter to enhance the prompt by adding a hint about the possible answers:

(deffunction ask-user (?question ?type)
    "Ask a question, and return the answer"
    (bind ?answer "")
    (while (not (is-of-type ?answer ?type)) do
      (printout t ?question " ")
      (if (eq ?type yes-no) then
        (printout t "(yes or no) "))
      (bind ?answer (read)))
    (return ?answer))

Again, you should test these new functions:

Jess> (is-of-type yes yes-no)
TRUE
Jess> (is-of-type no yes-no)
TRUE
Jess> (is-of-type maybe yes-no)
FALSE
Jess> (is-of-type abc number)
FALSE
Jess> (is-of-type 123 number)
TRUE
Jess> (ask-user "What is the answer?" yes-no)
What is the answer? (yes or no) 42
What is the answer? (yes or no) yes
yes

This time when I entered 42 as the answer, the function rejected it. When I typed yes instead, the function returned yes.

10.6.2. Fetching the question text

The question template has a slot to hold the text of a question and another slot to hold a unique identifier. Similarly, the answer template associates that same identifier with an answer. You’d like to call something from the right-hand side (RHS) of a rule in the interview module using just the identifier, and have that something look up the question text, ask the question, and assert an answer fact.

There are two ways to fetch something from working memory: using a defquery or using a defrule. Of the two, rules are cheaper computationally, because invoking a query always involves clearing part of the Rete network and asserting one or more facts. Therefore, your subroutine could take the form of a single defrule in its own defmodule. If that defrule has the auto-focus property (so that it fires as soon as it’s activated, regardless of what other rules may be on the agenda) and uses return on its RHS to resume the previous module focus as soon as it has run, then the interview module can call it as a subroutine just by asserting a fact to activate it. The trigger fact looks like (ask id), where id is a question identifier. Such a rule can look like this:

(defmodule ask)
(defrule ask::ask-question-by-id
    "Ask a question and assert the answer"
    (declare (auto-focus TRUE))
    ;; If there is a question with ident ?id...
    (MAIN::question (ident ?id) (text ?text) (type ?type))
    ;; ... and there is no answer for it
    (not (MAIN::answer (ident ?id)))
    ;; ... and the trigger fact for this question exists
    ?ask <- (MAIN::ask ?id)
    =>
    ;; Ask the question
    (bind ?answer (ask-user ?text ?type))
    ;; Assert the answer as a fact
    (assert (MAIN::answer (ident ?id) (text ?answer)))
    ;; Remove the trigger
    (retract ?ask)
    ;; And finally, exit this module
    (return))

I’ve explicitly qualified all the fact names with MAIN::. Although doing so may not be strictly necessary, it helps to avoid confusion. All of your templates are defined in the module MAIN, and therefore they can be shared by all the other modules you define.

You can test this rule after defining a deffacts to hold a few sample questions. You should definitely put this test deffacts into a file, rather than just entering it interactively—you’ll use it again and again to test the rules as you develop them.

 

Note

You should be thinking about putting together a complete test harness now. The details here will vary depending on your platform. On UNIX, you might write a shell script to execute your test code, and on a Windows operating system, you might use a .BAT file (or run the same UNIX scripts using Cygwin).[1] The important thing is to make it convenient to run your test code, and run it often, ideally after each change you make to the developing system. Watch for changes that lead to test failures; if you catch them right away, it is easy to back them out while they are still fresh in your mind. Appendix C describes one technique for automated testing of Jess language code.

1 Cygwin is a porting layer that lets UNIX tools run on Windows. The Cygwin home page is at http://www.cygwin.com.

 

Here are some test facts you can use to test ask-question-by-id:

(deffacts MAIN::test-facts
    (question (ident q1) (type string)
              (text "What is your name?"))
    (question (ident q2) (type number)
              (text "What is your estimated annual income?"))
    (question (ident q3) (type number)
              (text "How many dependents do you have?")))

To test the rule, you just need to reset, assert an appropriate ask fact, and run. You can use (watch all) to help see what happens:

Jess> (batch taxes.clp)
TRUE
Jess> (reset)
TRUE
Jess> (assert (ask q2))
<Fact-4>
Jess> (watch all)
TRUE
Jess> (run)
FIRE 1 ask::ask-question-by-id f-2,, f-4
What is your estimated annual income? 15000
==> f-5 (MAIN::answer (ident q2) (text 15000))
<== f-4 (MAIN::ask q2)
<== Focus ask
==> Focus MAIN
<== Focus MAIN
1

When you enter (run), the rule ask-question-by-id grabs the module focus and fires. It asks the question and asserts a new fact holding the answer. Then the focus returns to the original module (MAIN, in this case). The subroutine, then, consists of one module, one rule, and two functions, and you can call it just by asserting a fact.
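To illustrate calling the subroutine from another module, here is a sketch of how a rule in the interview module might use it. It assumes taxes.clp already contains the templates, ask-user, is-of-type, the ask module, and the test deffacts shown above. The rule names request-income and report-income are hypothetical and serve only to demonstrate the calling convention:

```jess
;; Assumes the code developed in this chapter is in taxes.clp.
(batch taxes.clp)

(defmodule interview)

;; Hypothetical caller: requests an answer to question q2 by
;; asserting the trigger fact, unless an answer already exists.
(defrule interview::request-income
    (not (MAIN::answer (ident q2)))
    =>
    (assert (MAIN::ask q2)))

;; Hypothetical consumer: reacts once the answer is in working memory.
(defrule interview::report-income
    (MAIN::answer (ident q2) (text ?income))
    =>
    (printout t "Recorded income: " ?income crlf))

(reset)
(focus interview)
(run)
```

When request-income asserts the trigger fact, the auto-focus rule in the ask module takes over, asks the question, and returns. The focus then drops back to interview, where report-income can match the new answer fact.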

10.7. Summary

In this chapter, you began to turn the knowledge you developed in chapter 9 into a concrete rule-based system. You determined the form the data in working memory will take, you partitioned the system into modules that represent the phases of processing, and you developed some input/output functionality you will use later.

In the next chapter, you will develop the rules that form the actual application. We’ll pay special attention to testing techniques, so that you’ll trust the components of the application to work well individually and as a complete system.