									                                                                                  Proceeding of The National Conference
                                                                               on Undergraduate Research (NCUR) 2002
                                                                                     University of Wisconsin-Whitewater
                                                                                                       April 25-27, 2002

  Efficient Method Of Financial Data Processing: Development Of An XBRL
            Based, Expert System Loan Processing Web Application
                                            Joseph J. Hallett III
                                      Department of Computer Science
                                            Colgate University
                                            Thirteen Oak Drive
                                        Hamilton, NY 13346. USA

                              Faculty Advisor: Dr. Alexander Nakhimovsky

         In August of 1999 the American Institute of Certified Public Accountants, six information technology
companies, and the five largest accounting and professional services firms of the time announced the development
of an XML-based specification for the preparation and exchange of financial data. The intention of these
organizations was to improve access to and lower distribution costs of financial information. Presently called the
Extensible Business Reporting Language (XBRL), this specification, though still in its formative stages, has gained
strong support as the global business community’s electronic financial reporting schema. The present research aims
to further the acceptance and modularity of XBRL by developing a general-purpose application. In addition
to implementing the XBRL standard, this application develops a general methodology for the common task of
analyzing a business loan application. The application integrates XBRL document processing with an expert system
encapsulating rule-based financial logic. For expert system development, the application uses the Java Expert
System Shell (JESS) developed at Sandia National Laboratories. JESS (based on the earlier CLIPS) utilizes the Rete
Algorithm developed by Charles L. Forgy at Carnegie-Mellon University to apply the rules to the application’s
current knowledge base. With the addition of the Rete Algorithm, CLIPS/JESS reduces the computational
complexity of expert systems from an exponential order (associated with the standard “rules finding facts”
approach) to a linear order. This results in an efficient method of financial data processing.
Keywords: XML, XBRL, Expert System

1. Organization
          In outline, this paper proceeds as follows. Section 2 provides the reader with a basic working knowledge of
the technologies that the loan processing application will utilize, primarily the XML standard and expert systems.
Section 3 discusses the financial logic encapsulated by the expert system that underlies the business loan decision
process. Section 4 analyzes the dynamic capabilities of the simulated loan expert in comparison to its human
counterpart. Section 5 presents the design of the Web-based loan processing application. The final section
summarizes the advantages of an XML-based, expert system loan-processing tool and discusses future developments
for the application.

2. Background Information
2.1. XML and XBRL
          The World Wide Web Consortium (W3C) published the eXtensible Markup Language (XML) 1.0 as a
Recommendation in 1998. Since that time XML has become an integral component of distributed Internet applications.
Data formatting, data processing and data transformation are among the numerous applications based on XML.
          XML is the Internet-ready successor of the Standard Generalized Markup Language (SGML). SGML is an
international standard for the system-independent representation of text in electronic form. SGML was originally
developed for the publishing industry, without any intention of Internet application use. But as electronic publishing
became increasingly popular the need for system-independent data transmission presented itself. In response, the
W3C announced its XML recommendation (derived from SGML) to meet the large-scale demand for data
exchange over the Internet.
          XML, like SGML, is not as much a language as it is a framework for defining and using markup languages.
To better understand XML it is necessary to understand the essence of a language. All languages are composed of
two basic elements: the vocabulary and the syntax. Therefore, to define a language is to define its vocabulary and
its syntax. The vocabulary of a language is easily defined by listing its words. The syntax, or the way in which
words are put together to form constituents of the language (phrases, sentences, or computer commands), is defined
by syntactic rules called a grammar. XML has the ability to list the vocabulary and define the grammar of a markup
language. XML uses the standard notation of a Document Type Definition (DTD) to perform this task.
          Another important attribute of a language is its semantics. XML languages are formal or uninterpreted
languages because their definition does not impose a meaning. This is XML’s greatest asset. Not only does XML’s
unconstrained interpretation allow an unlimited range of data descriptions, but it also allows different applications
to assign different semantics to the same data set. This flexibility of data representation provides a bridge of
cooperation to parties feuding over a common language or data format.
          Interoperability is another advantage of the XML technology. This feature stems from XML’s ability to
exist in either linear text form or in tree form. A textual representation of an XML instance document is easy to
transmit through a network, and is easy to parse into a tree. Two standards for working with the tree form have
developed. XPath is a language for selecting sets of nodes within an XML tree representation. The Document Object
Model (DOM) was developed to allow access to and modification of XML data via a programming language (Java, C++,
Python, and Visual Basic have all implemented DOM). A tree representation certainly makes the processing of XML more
intuitive, but the data must be converted back into a textual form to allow future transmissions. Together these two
forms of XML data representation provide a powerful tool for communication between humans and computers.
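
The round trip between the text form and the tree form can be sketched with the XML APIs that ship with the Java platform (DOM parsing, XPath selection, and serialization through a Transformer). This is a minimal illustration, not code from the application; the class name XmlRoundTrip and the element names loan, amount and term are assumptions chosen for the example and are not taken from any XBRL taxonomy.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class: text form -> tree form -> text form.
public class XmlRoundTrip {

    // Parse linear XML text into a DOM tree.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Evaluate an XPath expression against the tree and return the string value of the result.
    public static String select(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    // Convert the tree back into text so that it can be transmitted again.
    public static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up instance document, for illustration only.
        String xml = "<loan><amount>50000</amount><term>5</term></loan>";
        Document doc = parse(xml);
        System.out.println(select(doc, "/loan/amount")); // prints 50000

        // Modify the tree through the DOM, then return to the text form.
        doc.getElementsByTagName("term").item(0).setTextContent("8");
        System.out.println(serialize(doc));
    }
}
```

XPath is used here only to read a value, while the DOM call performs the modification, mirroring the division of labor described above.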

2.1.1. XBRL as an XML Language

          The application discussed in this paper will make use of the advantages of XML by processing data in the
form of the extensible Business Reporting Language (XBRL). The development of XBRL was announced in
August of 1999 as an XML-based specification for the preparation and exchange of financial data. The intent of
XBRL development was to benefit four categories of users: financial information preparers, intermediaries in the
preparation and distribution process, users of financial information, and vendors of software or services to any of the
other three groups. XBRL provides these users with a standard format to prepare financial documents that can later
be presented in a variety of ways using the eXtensible Stylesheet Language (XSL). In addition, XBRL defines a
standard format for the exchange of financial information between software applications.
          XBRL makes use of several W3C recommendations, including XML 1.0, XML Namespaces, XSL, and the
working draft of XML Schema. The XBRL Schema defines the core low-level components of XBRL. Rather than
defining the vocabulary and syntax of an XML instance document, the XBRL Schema provides another level of
abstraction, adding the extensible aspect to XBRL. The XBRL Schema is composed of XSD and DTD files which
express how XBRL taxonomies are to be built. By complying with the XBRL Schema, an XBRL
taxonomy delineates a tag set and grammar for a specific area of financial data recording. At present only
two XBRL taxonomies exist, and one remains in draft form. This paper will use the Financial Reporting
for Commercial and Industrial Companies, US GAAP taxonomy developed by Sergio de la Fe Jr., Charles Hoffman,
and Elmer Huh. A supplemental taxonomy defining additional financial elements pertaining to the loan evaluation
process will also be used. A draft form of the International Accounting Standards (IAS) Specification is available
for reference from the official XBRL website. Other taxonomies under development include Financial
Reporting for US Federal Departments and Agencies, US Financial Reporting for Mutual Funds, and Financial
Reporting for Commercial and Industrial Companies in accordance with the German GAAP, the Australian GAAP,
and the New Zealand GAAP.

2.2. Expert Systems

          Now that the data representation methodology of the loan-processing application has been discussed, the
focus of this paper turns to the decision making process. In order for the application to be of any
practical use, it must output an accurate response to the user’s loan request. Such accuracy requires the
application to mimic the knowledge base of an experienced loan officer. This knowledge base is encapsulated and
applied by an expert system shell. Professor Edward Feigenbaum of Stanford University, an early pioneer of expert
system technology, defines an expert system to be “… an intelligent computer program that uses knowledge and
inference procedures to solve problems that are difficult enough to require significant human expertise for their
solution.” [1] Joseph Giarratano and Gary Riley further specify expert systems as an emulation of a human expert
rather than a simulation. [1] According to Giarratano and Riley, an expert system must act like its human model in
all respects. Ideally, expert system engineers would be able to construct a general problem-solving machine, with an
infinite knowledge base and comprehension of all problem-solving techniques. Until this ideal solution is reached,
expert systems are confined to a specific problem domain. The problem domain of this application’s expert system
is finance, in particular, loan acceptance criteria. The expert system’s knowledge base is known as the knowledge
domain. A knowledge domain is always a subset of the problem domain, although an omniscient expert system
may possess a knowledge domain that is equivalent to the problem domain.
          The knowledge of an expert system can be expressed in several ways. One common method is in the form
of “if-then-else” type rules. A simple comparison of the input to a set of “if-then-else” rules will implement the
desired actions. The expert system that our application will utilize is the Java Expert System Shell (JESS). JESS
represents its knowledge not only in the form of rules but also as objects. This allows rules to use pattern matching
on the fact objects as well as input data.
          Expert systems differ from conventional applications in that the problem of the expert system often has no
algorithmic solution. As a result inference must be used to determine the most reasonable solution. Methods of
inference as outlined by Giarratano and Riley are listed below:
       Deduction: Logical reasoning in which conclusions must follow from their premises.
       Induction: Inference from the specific case to the general.
       Heuristics: Rules based on experience.
       Generate and Test: Trial and error.
       Abduction: Reasoning back from a true conclusion to a premise that may have caused the conclusion.
       Default: In the absence of specific knowledge, assume general knowledge by default.
       Autoepistemic: Self-knowledge.
       Nonmonotonic: Previous knowledge may be incorrect when new evidence is obtained.
       Analogy: Inferring a conclusion based on the similarities to another situation.
       Intuition: Based on no proven theory. The answer unexpectedly appears, possibly due to unconsciously
          recognizing an underlying pattern. This method is not yet implemented by expert systems. [1]

          Common sense reasoning would be categorized as a combination of any two of the above described
inference methods. Since inference is an integral part of an expert system, it is helpful to implement an explanation
facility within the expert system to explain the reasoning that occurred during the expert system’s deductions.

2.2.1 The Java Expert System Shell and the Rete Algorithm

          The loan-processing application described in this paper makes use of JESS, developed at Sandia National
Laboratories in California. JESS is based on the earlier CLIPS expert system tool developed by NASA in 1985.
JESS is written entirely in Sun Microsystems’ Java programming language. It therefore embeds cleanly
within our Java-based application, keeping the application 100% portable. JESS makes use of the Rete Pattern
Matching Algorithm, developed by Charles L. Forgy at Carnegie-Mellon University, to determine which rules have
their conditions satisfied. [1] The need for this algorithm stems from the inefficiency of the simplistic
“rules-find-facts” approach. Due to the dynamic nature of our application, the facts may be modified, added, or removed.
These changes can cause previously unsatisfied rules to meet their conditions, requiring the inference engine to
execute another cycle of rule-fact comparison. However, an expert system enacting the process described above
exhibits a property called temporal redundancy: the actions of the rules typically alter only a small percentage of
the facts. Therefore, by applying every rule to the entire fact base on every cycle, numerous unnecessary
computations are made. The Rete algorithm is designed to take advantage of temporal redundancy. This is accomplished
by saving the state of the pattern matching after each execution cycle and re-computing only the changes to the
fact base. For example, if two of three patterns were matched for a particular rule during the first execution cycle,
then the second execution cycle would only need to be concerned with the fulfillment of the third pattern based upon
the updated facts. By taking advantage of temporal redundancy the Rete algorithm reduces the computational
complexity of expert systems from an exponential order (associated with the standard “rules finding facts”
approach) to a linear order. The major disadvantage of the Rete algorithm, however, is that it is very memory
intensive; saving the state using partial matches for each rule can consume significant amounts of memory,
particularly if the rule base is large. This is a typical time-space trade-off in algorithm design.
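
The idea of saving match state between cycles can be illustrated with a deliberately simplified Java sketch. It is not the Rete network itself and not JESS code: it models a single rule whose patterns are independent tests on individual facts, and it keeps one memory of matching facts per pattern so that only newly asserted facts are tested. The class name IncrementalMatcher, its methods, and the fact strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Simplified illustration of temporal redundancy (assumption: one rule, no joins between patterns).
public class IncrementalMatcher {
    private final List<Predicate<String>> patterns;
    private final List<Set<String>> memories = new ArrayList<>(); // saved matches, one set per pattern
    private long patternTests = 0;                                // counts pattern evaluations performed

    public IncrementalMatcher(List<Predicate<String>> patterns) {
        this.patterns = patterns;
        for (int i = 0; i < patterns.size(); i++) {
            memories.add(new HashSet<>());
        }
    }

    // Only the new fact is tested; facts matched in earlier cycles are not re-examined.
    public void assertFact(String fact) {
        for (int i = 0; i < patterns.size(); i++) {
            patternTests++;
            if (patterns.get(i).test(fact)) {
                memories.get(i).add(fact);
            }
        }
    }

    // Retraction only updates the saved state; no pattern is re-evaluated.
    public void retractFact(String fact) {
        for (Set<String> memory : memories) {
            memory.remove(fact);
        }
    }

    // The rule is satisfied when every pattern has at least one saved match.
    public boolean ruleSatisfied() {
        for (Set<String> memory : memories) {
            if (memory.isEmpty()) {
                return false;
            }
        }
        return true;
    }

    public long patternTests() {
        return patternTests;
    }

    public static void main(String[] args) {
        List<Predicate<String>> rule = new ArrayList<>();
        rule.add(f -> f.equals("bankruptcy=no"));
        rule.add(f -> f.equals("lawsuit=no"));
        rule.add(f -> f.equals("taxes-owed=no"));

        IncrementalMatcher matcher = new IncrementalMatcher(rule);
        matcher.assertFact("bankruptcy=no");
        matcher.assertFact("lawsuit=no");
        matcher.assertFact("taxes-owed=yes");
        System.out.println(matcher.ruleSatisfied()); // false: the third pattern has no match

        // Second cycle: one fact changes, the two earlier matches are reused.
        matcher.retractFact("taxes-owed=yes");
        matcher.assertFact("taxes-owed=no");
        System.out.println(matcher.ruleSatisfied()); // true
        System.out.println(matcher.patternTests());  // 12
    }
}
```

A naive matcher that re-tests every pattern against every fact on each cycle would perform 9 tests per cycle in this example, 18 in total, against 12 here; the gap widens as the fact base grows while each cycle changes only a few facts. The per-pattern memories are also where the memory cost mentioned above comes from.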

3. Financial Logic
          Below is a listing of the typical financial criteria utilized by the business loan evaluation process. The term
“typical financial criteria” is used because, after querying several banks, it was concluded that every loan is evaluated
on a case-by-case basis. This adds significant difficulty when attempting to emulate a loan expert electronically.
This problem is discussed in further detail in Section 4. The financial information required by the application can be
divided into three sections: loan request information, business information, and business owner or guarantor
information.

Loan Request Information:
     Loan Type: Term Loan or Line of Credit Loan.
     Amount of Capital Requested.
     Repay Term: 1-8 years.
     Interest Rate: Fixed or Variable.

Business Information:
     Age of the Business.
     Number of Employees.
     Total Capital in Business Accounts.
     Business Net Income for the Previous Year.
     Business Structure: Proprietorship, Limited Partnership, Corporation, Limited Liability Company, General
        Partnership, or Other.
     Has the Business Ever Been Threatened by Claim or Lawsuit?
     Has the Business Ever Filed for Bankruptcy?
     Does the Business Owe Any Taxes for Years Prior to the Current Year?

Business Owner/Guarantor Information:
     Percent Ownership/Guarantee.
     Annual Adjusted Gross Income.
     Personal Assets.
     Monthly Obligations.
     Has the Owner/Guarantor Ever Been Threatened by Claim or Lawsuit?
     Has the Owner/Guarantor Ever Filed for Bankruptcy?
     Does the Owner/Guarantor Owe Any Taxes for Years Prior to the Current Year?
     Any Other Owners/Guarantors?

These data are used by the application’s expert system component to determine the applicant’s loan approval status.
The JESS rules are listed below. The rules are divided into two categories: business rules, for input data pertaining
directly to the business, and owner/guarantor rules (a separate set of rules is created for each owner’s or guarantor’s
input data). The application has been implemented in such a way that the framework of each rule is static, but the
values that the rules assert are dynamic. The dynamic values are enclosed in question marks (?).
          Each JESS rule returns a Boolean value. At the termination of the JESS execution the resultant values are
checked. If all rules have returned a true value then the loan is approved and the monthly payments are displayed.
Otherwise the loan is rejected and the rules causing the rejection are displayed.

JESS uses the following business rules to assess the loan:
     (capital request + (capital request * ?interest rate?)) <= (total business capital * ?percentage of total
        business capital that must be met?)
     (business annual gross income) >= ((capital request / repay term) + ((capital request / repay term) *
        ?interest rate?) + (12 * business monthly obligation))
     (?credit score cutoff?) <= (applicant's credit score)

JESS uses the following owner/guarantor rules to assess the loan:
     ((capital request * percent guaranteed) + ((capital request * percent guaranteed) * ?interest rate?)) <= (total
        owner/guarantor assets * ?percent of the assets that must be met?)
     (income) >= ((capital request / repay term) + ((capital request / repay term) * ?interest rate?) + (12 *
        owner/guarantor monthly obligations))
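
The inequalities above translate directly into Boolean checks. The Java sketch below is one possible rendering for illustration only; it is not the JESS rule syntax used by the application, and the class name LoanRules, the method names, and the sample figures in main are assumptions. Rates and percentages are taken to be fractions (0.08 for 8%), which the source does not state.

```java
// Hypothetical plain-Java rendering of the business and owner/guarantor rules.
public class LoanRules {

    // Business rule 1: principal plus interest must not exceed the required share of business capital.
    public static boolean capitalCovers(double request, double rate,
                                        double totalCapital, double requiredShare) {
        return request + request * rate <= totalCapital * requiredShare;
    }

    // Business rule 2 and owner/guarantor rule 2: annual income must cover one year of principal,
    // the interest on that amount, and twelve months of existing obligations.
    public static boolean incomeCovers(double annualIncome, double request, double repayTermYears,
                                       double rate, double monthlyObligations) {
        double yearlyPrincipal = request / repayTermYears;
        return annualIncome >= yearlyPrincipal + yearlyPrincipal * rate + 12 * monthlyObligations;
    }

    // Business rule 3: the credit score must reach the cutoff.
    public static boolean scoreMeetsCutoff(double cutoff, double creditScore) {
        return cutoff <= creditScore;
    }

    // Owner/guarantor rule 1: the guaranteed portion plus interest must not exceed
    // the required share of the guarantor's assets.
    public static boolean guarantorAssetsCover(double request, double percentGuaranteed, double rate,
                                               double assets, double requiredShare) {
        double guaranteed = request * percentGuaranteed;
        return guaranteed + guaranteed * rate <= assets * requiredShare;
    }

    public static void main(String[] args) {
        // Made-up figures: a 100,000 request over 5 years at 8%.
        System.out.println(capitalCovers(100000, 0.08, 300000, 0.5));            // true
        System.out.println(incomeCovers(60000, 100000, 5, 0.08, 2000));          // true
        System.out.println(incomeCovers(40000, 100000, 5, 0.08, 2000));          // false
        System.out.println(guarantorAssetsCover(100000, 0.5, 0.08, 80000, 0.5)); // false
    }
}
```

Passing the thresholds in as parameters corresponds to the dynamic ?...? values: the shape of each rule stays fixed while the bank-specific numbers vary.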

         Notice that the third business rule involves the applicant’s credit score. It is in the calculation of this score
that most of the financial criteria listed above are utilized. The applicant’s credit score is calculated by applying
the If-Then-Else rules below. As above, the dynamic values are enclosed in question marks (?).

The business credit score is calculated as follows:
     ADD (((business' number of employees) - (?zero points number of employees?)) / (?employees
        increment?)) * (?X points?)
     ADD (((age of the business) - (?zero points age?)) / (?age increment?)) * (?X points?)
     IF (business has experienced a lawsuit) THEN (?X points?) ELSE (?Y points?)
     IF (business has experienced bankruptcy) THEN (?X points?) ELSE (?Y points?)
     IF (business owes taxes) THEN (?X points?) ELSE (?Y points?)

The owner/guarantor credit score is calculated as follows:
     IF (owner/guarantor has experienced a lawsuit) THEN (?X points?) ELSE (?Y points?)
     IF (owner/guarantor has experienced bankruptcy) THEN (?X points?) ELSE (?Y points?)
     IF (owner/guarantor owes taxes) THEN (?X points?) ELSE (?Y points?)

        Each owner/guarantor score is then adjusted proportionally to the percentage of the loan for which they are
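
The scoring scheme can be sketched in Java as follows. Every number in this example (zero points, increments, and point values) is invented for illustration, since the source leaves them as dynamic ?...? values. The sketch also assumes real-valued rather than integer division, and reads the proportional adjustment as a multiplication of the raw owner/guarantor score by the guaranteed fraction; neither choice is confirmed by the source. The class name CreditScoreSketch is hypothetical.

```java
// Hypothetical credit score calculation with made-up point values.
public class CreditScoreSketch {

    // ADD ((value - zero point) / increment) * points per increment.
    static double scaled(double value, double zeroPoint, double increment, double pointsPerIncrement) {
        return (value - zeroPoint) / increment * pointsPerIncrement;
    }

    // IF condition THEN X points ELSE Y points.
    static double flag(boolean condition, double pointsIfTrue, double pointsIfFalse) {
        return condition ? pointsIfTrue : pointsIfFalse;
    }

    public static double businessScore(int employees, double ageYears,
                                       boolean lawsuit, boolean bankruptcy, boolean owesTaxes) {
        double score = 0;
        score += scaled(employees, 0, 10, 5); // assumed: 5 points per 10 employees
        score += scaled(ageYears, 0, 1, 2);   // assumed: 2 points per year in business
        score += flag(lawsuit, -10, 5);       // assumed point values
        score += flag(bankruptcy, -30, 10);
        score += flag(owesTaxes, -15, 5);
        return score;
    }

    public static double ownerScore(boolean lawsuit, boolean bankruptcy, boolean owesTaxes,
                                    double percentGuaranteed) {
        double raw = flag(lawsuit, -10, 5) + flag(bankruptcy, -30, 10) + flag(owesTaxes, -15, 5);
        return raw * percentGuaranteed; // assumed form of the proportional adjustment
    }

    public static void main(String[] args) {
        System.out.println(businessScore(20, 4, false, false, false)); // 38.0
        System.out.println(ownerScore(false, false, false, 0.5));      // 10.0
    }
}
```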

4. Capability Analysis
          Though the financial expertise of this application mimics the decision making process of a human loan
evaluator, the system does not represent an exact emulation. As mentioned above, loan decisions are often made on
a case-by-case basis, where certain facts are evaluated in one case and not considered in another. For example, a
short-term medical disability or a recent divorce could lead the bank to relax loan acceptance requirements,
assuming good standing in all other aspects of the loan. However, the decision making process that would
incorporate this type of data would necessitate foresight or even empathy, two capabilities that a digital loan
evaluator does not yet possess. Of course, the application could be prompted to inquire as to the health and marital
status of each loan applicant, but this gives rise to another problem: how deeply should the application investigate
the applicant’s life? It was deemed advantageous to disregard the set of all case-specific facts because of its
seemingly infinite size and the lack of knowledge needed to prioritize its members. This choice prevents the complete
emulation of an expert loan decision maker for the time being, and will be investigated further in the future.

5. Application Process
         Figure 1 shows a diagram depicting the process of the loan application. The application makes use of Sun
Microsystems’ JavaServer Pages to instantiate Java classes and return the final response. At any time, if the Java
classes encounter an HTTP exception, the exception is passed to the error.jsp file, which is responsible for presenting
the exception to the browser. Initially, the user is prompted by the Singerie.jsp file to enter the local file name or
the URL of the XBRL instance document. The inputted document name is then passed to the
DispatcherInstanciation.jsp file. DispatcherInstanciation.jsp will then instantiate and initialize the Dispatcher class.
The doIt() method of the Dispatcher class will first call the doPost() method of the FileTransfer class, passing this
method the HttpServletRequest that holds the XBRL file name. The doPost() method will upload the desired XBRL
file to the server machine by means of the com.oreilly.servlet.MultipartRequest class defined by Jason Hunter. [2]
           After the XBRL file has been uploaded to the server, a Java file object is created from the file and returned
to the Dispatcher class. At this point, the Dispatcher class calls the makeDOM() method in the UseXBRL class to
create a DOM representation of the XBRL file object. The XBRL DOM provides a means for accessing and
manipulating XBRL data. After the DOM has been created, the Dispatcher class will call the grabNode() method of
the UseXBRL class. This method accepts a string as a parameter and parses the XBRL DOM for the element
corresponding to the string value. If found, the element’s value is returned to the doit method of the Dispatcher class
which then will save the value into the appropriate JavaBean class (LoanRequestBean, BusPropBean or
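
A lookup of this kind can be sketched with the standard DOM API. The code below is not the application's UseXBRL class, whose source is not reproduced in this paper; the class name NodeLookupSketch, the choice to return null when no element is found, and the element names in main are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical stand-in for a makeDOM/grabNode pair.
public class NodeLookupSketch {

    // Build a DOM tree from the raw bytes of an uploaded document.
    public static Document makeDom(byte[] xmlBytes) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
    }

    // Return the text of the first element with the given tag name, or null if none exists.
    public static String grabNode(Document dom, String nodeName) {
        NodeList matches = dom.getElementsByTagName(nodeName);
        if (matches.getLength() == 0) {
            return null;
        }
        return matches.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Made-up element names, not taken from an XBRL taxonomy.
        String xml = "<report><netIncome> 42000 </netIncome></report>";
        Document dom = makeDom(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(grabNode(dom, "netIncome")); // prints 42000
        System.out.println(grabNode(dom, "assets"));    // prints null
    }
}
```

Returning null for a missing element leaves it to the caller to decide whether the absent value is an error or simply an optional field.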
           Once all the data have been extracted from the DOM, the application will initiate the loan evaluation phase.
First, the credit score is calculated by means of the calculateScore method within the ScoreCalc class. This score, as
well as the business and owner/guarantor XBRL data, is passed to the Jess class for expert system analysis. The Jess
class will return a Boolean array to the Dispatcher class, in which each index signifies a rule decision. If the
Dispatcher class finds that the applicant received a true value for each expert system rule, then the loan is approved.
The Dispatcher class will pass an approval message to DispatcherInstanciation.jsp, which will print this message to
the browser. In addition to the loan approval decision, the application will output the applicant’s credit score
calculation and the JESS facts, all rules, and the rules fired for each expert system execution.
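
The final decision step, in which a Boolean array of rule results is reduced to an approval or to a list of the rules that caused rejection, can be sketched as below. The class name DecisionSketch, the rule labels, and the message strings are hypothetical; the application's own Dispatcher code is not shown in this paper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reduction of per-rule Boolean results to a loan decision.
public class DecisionSketch {

    // Collect the names of every rule whose result was false.
    public static List<String> failedRules(boolean[] results, String[] ruleNames) {
        List<String> failed = new ArrayList<>();
        for (int i = 0; i < results.length; i++) {
            if (!results[i]) {
                failed.add(ruleNames[i]);
            }
        }
        return failed;
    }

    // Approve only when every rule returned true; otherwise report the failing rules.
    public static String decide(boolean[] results, String[] ruleNames) {
        List<String> failed = failedRules(results, ruleNames);
        return failed.isEmpty() ? "APPROVED" : "REJECTED: " + String.join(", ", failed);
    }

    public static void main(String[] args) {
        String[] names = {"business capital", "business income", "credit score"};
        System.out.println(decide(new boolean[] {true, true, true}, names));  // APPROVED
        System.out.println(decide(new boolean[] {true, false, true}, names)); // REJECTED: business income
    }
}
```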


[Figure 1 diagram. Labels: Singerie.jsp (Home Page); DispatcherInstanciation.jsp; Local XBRL File Name or URL;
HTTP Request (Upload File); HTTP Response (HTML formatted Loan-Processing Decision); HTTP Exception;
If error: HTTP Response (Stack Trace); doit(); doPost(HttpServletRequest req, HttpServletResponse resp);
Uploaded File Object; MakeDOM(); GrabNode(String nodeName); String of XBRL Node Value; calculateScore();
Business and Owner/Guarantor Attributes and Credit Score; (Expert System); Raw Loan-Processing Response.]

                                              Figure 1 Application Process Map

6. Summary and Future Developments
           An XBRL-based, JESS loan-processing web application provides a portable and cross-platform method to
assess business loan requests that is more efficient than the standard “rules finding facts” approach. This method
takes advantage of the highly communicative abilities of XBRL so that all client types with an XBRL document are
able to request a loan evaluation. In addition, the automated expertise enabled by the application’s JESS component
allows increased availability, reduced cost, permanent and steady expertise, and fast response to all application user
requests. For these reasons this application embodies an advantageous approach to loan processing.
           The author plans to expand upon the research presented in this paper by pursuing the integration of an expert
system with XML data processing. By revamping the current loan-processing application and developing two XML
DTDs for expert system rules and facts, the author would like to create a general-purpose “black box” for more
efficient rule-based XML processing. This generalized application will receive two XML documents as input: one
that contains the expert system facts data, and one that contains the expert system rules data. Based on these
documents the application will create a CLIPS/JESS knowledge base and rule engine. By applying the derived rules
to the knowledge base the application will output a yes/no result as well as a description of all intermediate logical
steps: the result of all rules fired, fact alterations, and partial pattern matches. CLIPS/JESS provides these
capabilities. Such an application would allow linear-order rule-based computation of any XML data as well as
encourage the highly communicative and flexible data representation of facts and rule-based expertise that XML

7. Acknowledgements
         The author wishes to express his appreciation to both the Oneida Savings Bank and the Alliance Bank in
Hamilton, New York for their patient assistance with the financial criteria portion of this application. A debt of
gratitude is also extended to Vanessa Kramer for her levity and enduring support.

8. References
(1)    Giarratano, J. and Riley, G., 1994, Expert Systems: Principles and Programming, PWS Publishing
Company, Boston, MA, pg. 1, 119.
(2)    Hunter, J. and Crawford, W., 1998, Java Servlet Programming, O’Reilly and Associates, Inc, Sebastopol,
(3)    McLaughlin, B., 2000, Java and XML, O’Reilly and Associates, Inc, Sebastopol, CA.

