10 Questions to Ask When Evaluating a Data Quality Solution

A White Paper
By Thomas Brennan and Steven Kleinmann
                                                        5/12/2004                           WHITE PAPER “10 QUESTIONS”

Executive Summary
Selecting a DQ application can be a complex endeavor. There are many issues to consider that can
directly impact the success or failure of DQ deployment within your enterprise.

Stalworth Inc. has compiled a list of 10 key questions to ask vendors of data quality solutions.
The answers to those questions are provided in this white paper to help you determine which DQ
product will best meet your needs.

To summarize the following pages, your data quality solution should have these attributes:

1) Completeness: It should have all the elements required to achieve data quality, including
preconfigured application connectors for quick installation, prepackaged rules for a quick start,
a powerful rules generation engine, and comprehensive monitoring and auditing functionality.

2) Enterprise design: Unlike other solutions that work only with flat files, your enterprise data
quality solution must be able to function directly with relational databases, and ensure that
alteration of records is reflected in the tables that relate to those records.

3) Ease of use: Configuration, rule setup, and rule testing should be manageable by your business
analysts without the intervention of IT personnel. The user interface should have the familiarity
of a Web application, and command syntax should be “plain English.”

4) Sophisticated rules engine: The power of a data quality application is in its rule engine. It
must employ intelligence that enables users to create rules that incorporate not only Boolean
(Yes/No) logic, but also “fuzzy” logic where more subtlety can be exercised in rules creation.

5) Best record capability: Once duplicate records have been detected, the data quality application
requires the functionality to extract from among the duplicates the “best” (most accurate) record
and use it as a basis for merging the “best data” from the remaining duplicates.

6) Streamlined testing and QA: A modern data quality application enables business users (not
programmers) to create rules and test them during the creation process against current data that
will not be affected by testing.

7) Comprehensive data capture: Your data quality application must scrutinize data at all points of
entry – input from staff, suppliers and customers, data migrations, and so on. Scrutinizing the
database alone is not sufficient.

8) Ease of integration: The data quality solution you select should be Java-based and come
equipped with custom connectors tailored for the enterprise applications you run, be they from
Siebel, SAP, Oracle, or PeopleSoft.

9) Processing power: With the volumes of data processed in today’s enterprise environment, your
data quality solution should be designed to minimize strain on server resources and at the same
time have the power to merge millions of records in hours rather than days.

10) Vendor support: Your data quality vendor should offer staff expertise and end-to-end services
for installation, configuration, and support, along with partnering relationships with systems
integrators who are familiar with your organization’s unique requirements.

The remainder of this paper addresses each of these important issues by presenting questions that
should be asked of the data quality vendor, followed by answers that address the needs and
concerns of an organization seeking a comprehensive, enterprise-scale data quality solution.

© 2004 Stalworth Inc.                           www.stalworth.com                                        Page 1 of 12

              Does the data quality product provide a complete solution, including pre-
              packaged DQ business rules and built-in testing, monitoring and audit functions, with
              pre-configured application connectors for quick and easy connection to your enterprise
              applications (Siebel, SAP, Oracle, PeopleSoft)?

Integrating a data quality product with an enterprise application requires a major commitment of
time and skilled resources from both IT and analysis departments. Integration requires an in-depth
knowledge of the application’s schema, and knowledge of both the application’s and the data
quality product’s API, as well as a significant development effort. Speed of integration is greatly
enhanced if the DQ product’s API is tailored specifically to your enterprise application.

A DQ product’s application connectors should interrogate databases automatically to map their
table structures. For enterprise applications (e.g., Siebel, SAP, Oracle, PeopleSoft), connectors
should ideally also understand the relations between the tables automatically. Leading-edge DQ
products accomplish this by interrogating the application metadata.
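
As a rough illustration of what "interrogating the database" can mean, the sketch below reads table structures and foreign-key relations from database metadata. It uses Python's built-in sqlite3 module purely as a stand-in for an enterprise database; the account and contact tables are hypothetical, and a commercial connector would read the application's own metadata layer rather than SQLite's.

```python
import sqlite3

def map_schema(conn):
    """Return ({table: [columns]}, [(child_table, child_col, parent_table, parent_col)])."""
    tables = {}
    relations = []
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for name in names:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        tables[name] = [row[1] for row in conn.execute(f'PRAGMA table_info("{name}")')]
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{name}")'):
            relations.append((name, fk[3], fk[2], fk[4]))
    return tables, relations

# Hypothetical two-table schema used only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contact (
        id INTEGER PRIMARY KEY,
        account_id INTEGER REFERENCES account(id),
        phone TEXT
    );
""")
tables, relations = map_schema(conn)
```

Here the relations list tells the caller that contact.account_id points at account.id, which is the kind of knowledge a connector needs before it can merge records safely.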

Pre-packaged business rules provide a foundation for the business user. There are six types of
business rules: Correction & Standardization, Search & Replace, Duplicate Identification, Best
Record, Best Data and Merge Disqualification. Each has unique subtleties. A data quality solution
that includes prepackaged rules can get the data quality process underway much more quickly be-
cause it gives you a running start in the complex composition of each rule type.

The same complexity applies to duplicate record identification. If two customer records have the
same first name, last name, company name and phone number, they may be duplicates. Then again,
if two different people named Bob Smith each own a McDonald’s franchise and each provided the
corporate 800 number, they may not be. The business implications of deleting a record that is not
a duplicate can be serious.

A leading data quality solution should be delivered with pre-configured application connectors
and pre-packaged business rules that are based on industry best practices. Pre-configured
application connectors enable seamless integration with commercial applications and database
platforms. Pre-packaged business rules provide a robust model that a company can tailor to its
specific needs. Both greatly reduce the time required to implement the data quality solution.

Questions to ask a data quality vendor:
    Does the product have preconfigured application connectors for quick integration with pack-
    aged applications from Siebel, SAP, Oracle, and PeopleSoft?

    Does the product automatically interrogate the database for tables and fields?

    What out-of-the-box knowledge does the product have of the database schema for Siebel,
    SAP, Oracle or PeopleSoft?

              Was the product developed specifically to address data quality issues in enterprise ap-
              plications? Is the product’s functionality limited within enterprise applications?

Most previous-generation data quality products were designed to manage data in flat files: simple,
one-dimensional data structures resembling Excel spreadsheets. But today’s enterprise applications
are built on top of relational databases that contain complex relationships among many data
structures. Enterprise applications introduce logical layers on top of the database, adding yet
more complexity.

Modifying flat-file functionality does not adequately address the data quality needs of enterprise
applications. Third-party utilities are usually required to do the back-and-forth conversion from
flat file to database format, slowing processing time and increasing load on servers and other re-
sources. For example, a duplicate record merge using first-generation DQ technology could bog
down your servers for days or even weeks.

Only data quality products whose architecture has been designed from the outset to process rela-
tional databases without third-party add-ons can provide the performance and sophisticated rule
sets required for efficient operation at the enterprise level.

Questions to ask a data quality vendor:

    Can your DQ product merge duplicate records in a relational database (as opposed to a flat
    file) without the need of third-party utilities?
    Does the product manage all aspects of correctly reconciling the associated child records of
    a corrected record, even if the application does not? If the answer is yes, ask for an explana-
    tion of how the procedure works.
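
To make the child-record question concrete, here is a minimal sketch of one way a merge can reconcile child rows: re-point them at the surviving record, then delete the duplicate, all in one transaction. It assumes SQLite (via Python's sqlite3 module) as a stand-in database, and the function name, table names and column names are hypothetical; a vendor's actual procedure may differ considerably.

```python
import sqlite3

def merge_accounts(conn, survivor_id, duplicate_id, child_tables):
    """Re-point child rows to the survivor, then delete the duplicate, atomically."""
    with conn:  # commits on success, rolls back if any statement fails
        for table, fk_column in child_tables:
            conn.execute(
                f"UPDATE {table} SET {fk_column} = ? WHERE {fk_column} = ?",
                (survivor_id, duplicate_id),
            )
        conn.execute("DELETE FROM account WHERE id = ?", (duplicate_id,))

# Hypothetical data: two duplicate accounts, each with one contact.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contact (
        id INTEGER PRIMARY KEY,
        account_id INTEGER REFERENCES account(id),
        name TEXT
    );
    INSERT INTO account VALUES (1, 'Acme Corp'), (2, 'ACME Corporation');
    INSERT INTO contact VALUES (10, 1, 'Ann'), (11, 2, 'Raj');
""")
merge_accounts(conn, survivor_id=1, duplicate_id=2,
               child_tables=[("contact", "account_id")])
contact_owners = conn.execute(
    "SELECT account_id FROM contact ORDER BY id").fetchall()
account_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

After the merge, both contacts belong to account 1 and no orphaned child rows remain.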

              How easy is the product to configure? Can a business analyst design, build and test
              business rules or is a programmer’s assistance required?

Business analysts generally determine how data should be cleansed, what criteria should be used
to determine if records are duplicates, under what conditions records should be merged and, if
merged, how to construct a “best record.” Therefore it makes sense that business users, not IT
staff, should define the rules that drive these data quality processes.

First-generation data quality products require low-level programming to create business rules.
Since technicians know nothing about business rules, a complicated working arrangement must
be put into place, where analysts specify what elements the rules should contain, and technicians
write the rules and test them against sample data.

A new generation of data quality applications enables business analysts to write the rules them-
selves, without the need of programmers. By incorporating a familiar Web-browser user inter-
face, a simple point-and-click composition process, and plain-English rule syntax, next-
generation data quality applications have moved DQ operations from the IT department to where
they belong: the PCs and laptops of the organization’s business analysts.

Before choosing a data quality vendor, assess your internal organization and determine the roles
that business users and IT would play in data quality configuration and implementation. If your
business analysts determine the criteria for manipulating, modifying, merging and reconciling
data, they should create the business rules themselves. The easier it is for your analysts to develop
data quality rules, the faster the transition to enterprise-wide data quality will be.

Questions to ask a data quality vendor:

    Are IT resources needed to configure business rules?
    If so, to what extent does the vendor recommend that IT staff be involved in rule
    configuration?

              Can the data quality product create complex and intelligent business rules?

A data quality product should be able to execute rules that mimic the decisions and actions a
human would take. An intelligent rules engine allows the user to describe data attributes (for
example, what identifies a particular customer record as a duplicate) very precisely and in great
detail.

Lack of an intelligent rules engine inhibits the positive identification of duplicate records, which
in turn prevents the customer from implementing the all-important merge process. If duplicates
cannot be identified with certainty, or if only a small percentage of duplicates can be identified,
companies must either hire people to manually complete the task, or live with poor quality data.
Without powerful capabilities for positively identifying duplicate records, data quality goals are
not achieved and companies are left with unmanageable volumes of duplicate records.

A data quality product should be able to execute logic that greatly reduces the need for human in-
tervention, so that processes like identifying duplicates and merging them can be almost entirely
automated. Automated duplicate identification and merge is key to handling enterprise-scale vol-
umes of data and maximizing the ROI from your enterprise applications.

The methods used for duplicate identification vary greatly among data quality products. Some
vendors employ tokens: strings of characters created from pieces of other data. For example, a to-
ken could be a first name concatenated to a last name, which is also concatenated to the first line
of an address, which is also concatenated to the phone number. Sometimes a token will be built
from parts of fields, like the first five characters of the last name instead of the entire last name.
This primitive form of duplicate identification dates back to first-generation DQ systems working
on flat files.
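
The token approach described above can be sketched in a few lines. The field names and the exact recipe (first name, first five characters of the last name, first address line, phone) are assumptions chosen to mirror the description, not any vendor's actual format.

```python
def build_token(record):
    """Concatenate pieces of a record into a single match token."""
    parts = [
        record["first_name"],
        record["last_name"][:5],   # only the first five characters of the last name
        record["address1"],
        record["phone"],
    ]
    return "".join(part.strip().upper() for part in parts)

a = {"first_name": "Robert", "last_name": "Smithson",
     "address1": "12 Main St", "phone": "555-0100"}
b = dict(a, first_name="Bob")          # same person, nickname entered
c = dict(a, last_name="Smithsen")      # typo beyond the fifth character

token_a, token_b, token_c = build_token(a), build_token(b), build_token(c)
```

Records a and c produce the same token because the typo falls outside the truncated last name, but a and b do not match even though "Bob" and "Robert" are the same person. That brittleness is why token matching is regarded here as primitive.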

More advanced products use an algorithm whose mathematical formulas compare any relevant
data and discern duplicates. Other products will use multiple algorithms that act in concert to
scrutinize data more closely.

Still other products provide the ability to use a variety of algorithms—since one algorithm might
work best on a particular kind of data—as well as a variety of Boolean operators. This capability
enables the enterprise to build a sophisticated web of rules that “traps” duplicate data through
simultaneous comparison of data elements from many angles. This approach allows records to be
examined much the way a human would, but with greater consistency and speed than a human
could provide. These next-generation data quality products are fully configurable, enabling the
user to tailor these rules to their specific business data.
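
A minimal sketch of a rule that combines fuzzy comparisons with Boolean operators follows. It assumes the standard library's difflib.SequenceMatcher as the similarity algorithm and an arbitrary 0.85 threshold; commercial products use their own algorithms and tuning, so treat the names and numbers here as illustrative only.

```python
import re
from difflib import SequenceMatcher

def similarity(a, b):
    """Fuzzy score between 0 and 1 from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def digits(phone):
    """Strip punctuation so differently formatted numbers compare equal."""
    return re.sub(r"\D", "", phone)

def is_duplicate(rec_a, rec_b, threshold=0.85):
    name_close = similarity(rec_a["name"], rec_b["name"]) >= threshold
    phone_same = digits(rec_a["phone"]) == digits(rec_b["phone"])
    address_close = similarity(rec_a["address"], rec_b["address"]) >= threshold
    # Boolean operators tie the individual fuzzy comparisons together.
    return name_close and (phone_same or address_close)

a = {"name": "Jon Smith", "phone": "(555) 010-0199", "address": "12 Main Street"}
b = {"name": "John Smith", "phone": "555-010-0199", "address": "12 Main St."}
c = {"name": "Mary Jones", "phone": "555-010-0199", "address": "12 Main Street"}

a_matches_b = is_duplicate(a, b)
a_matches_c = is_duplicate(a, c)
```

Records a and b are flagged as duplicates despite the spelling and formatting differences, while a and c are not, even though they share a phone number and address.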

Tokens do not allow for this intelligent scrutiny of data, nor do simple rules. The ability to exe-
cute a series of intelligent rules is necessary to mimic the decision patterns that a human would
use to identify duplicates. Only a modern data quality application with a powerful rules engine
can enable the creation of rules with sufficient intelligence and subtlety to detect duplicates with
the necessary degree of accuracy and speed to make enterprise-wide data quality feasible, given
limited resources, both human and machine-based.

Additionally, it is important that a DQ product have the ability to traverse complex data relations
in order to make decisions. If two account records are identified as duplicates, it may be neces-
sary to examine associated records to determine which of the two records should survive the
merge process. In order to examine contract records associated with accounts, the data quality
product must have the ability to relate them. Older, flat-file-oriented data quality products can be
severely limited in this vital area.
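
The idea of examining associated records to decide which duplicate survives can be sketched as follows. The data layout and the "most active contracts wins" criterion are assumptions made for illustration; real survivorship rules would be defined by the business analyst.

```python
def choose_survivor(duplicate_accounts, contracts):
    """Pick the account with the most active contracts among a duplicate group."""
    def active_contracts(account):
        return sum(
            1 for c in contracts
            if c["account_id"] == account["id"] and c["status"] == "active"
        )
    # max() returns the first of several equally ranked accounts.
    return max(duplicate_accounts, key=active_contracts)

# Hypothetical duplicate group and its related contract records.
accounts = [
    {"id": 1, "name": "Acme Corp"},
    {"id": 2, "name": "ACME Corporation"},
]
contracts = [
    {"account_id": 1, "status": "expired"},
    {"account_id": 2, "status": "active"},
]
survivor = choose_survivor(accounts, contracts)
```

Neither account record on its own reveals which one is still in use; only the related contract data does.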

Many data quality products can compare one account record with another account record, or one
contact record with another contact record, but are limited in their ability to scrutinize parent and
child records, to make decisions based on related data, or to understand complex data relations.
Humans can make these associations without conscious thought. Because ancillary data is often
critical to the decision-making process, the ability to scrutinize it can be the difference between
success (a high yield of “positive” duplicates) and failure (a low yield of positives and many
“false positives” or “suspects” requiring human scrutiny).

Questions to ask your data quality vendor:

    Can the product compare data from Accounts, Contacts, and Agreements in a single rule to
    identify an Account duplicate?
    Can the product execute a series of rules in a single operation to positively identify duplicates
    for a given customer record?

              Does the data quality product provide powerful functionality for creating a best
              record and salvaging the best data?

In database applications, merging two or more duplicate customer records involves saving one
record – the “best record” – and deleting the others. For example, if a duplicate group consists
of five records, merging involves saving one record and deleting the other four.

When consolidating several records into one, you want to ensure that the surviving record contains
all of the best data elements from “non-surviving” records. To ensure this and to automate the
best record creation process, intelligent rules are required. First, intelligent rules are needed
to identify which record within a group of duplicate records has more desirable data than the
others. Once the best record is selected, intelligent rules are needed to identify the valuable
data elements within the records that are about to be deleted, so that this data can be copied to
the surviving record before the non-surviving records are deleted. This ensures that the
surviving record is a composite of the most accurate and valuable data.

First-generation DQ products have little or no capability for creating a best record or salvaging
best data. Deleting non-surviving records without salvaging good data they contain results in the
loss of valuable data. Some products have the ability to populate a null (empty or invalid) field
with non-null data. For example, if a surviving record does not have a phone number and a non-
surviving record does, some products can recognize the null and update the surviving record with
the phone number. This is helpful, but far from robust. It is essential that a data quality product
have the ability to save data elements that a human would if they were scrutinizing the data
manually. For example, a state-of-the-art DQ product should be able to identify which of two
phone numbers has a legitimate area code and prefix combination for a given address, which of
two phone numbers that have had an area code split has the more recent area code, or which of
two addresses is a legitimate USPS address.
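
The best-record and best-data steps can be sketched as below. The survivor criterion (most populated fields) and the per-field rule mechanism are assumptions for illustration; the simple null-fill branch corresponds to the basic capability described above, and the optional field rules stand in for the more intelligent checks a state-of-the-art product would apply.

```python
def is_blank(value):
    return value is None or str(value).strip() == ""

def merge_duplicates(group, field_rules=None):
    """Pick a survivor from a duplicate group and copy the best data into it.

    field_rules maps a field name to a function that chooses the best value
    from the non-blank candidates (hypothetical mechanism).
    """
    field_rules = field_rules or {}
    # Best record: the one with the most populated fields (a simplistic criterion).
    best = max(group, key=lambda rec: sum(not is_blank(v) for v in rec.values()))
    survivor = dict(best)
    for field in survivor:
        candidates = [rec[field] for rec in group if not is_blank(rec.get(field))]
        if not candidates:
            continue
        if field in field_rules:
            survivor[field] = field_rules[field](candidates)
        elif is_blank(survivor[field]):
            survivor[field] = candidates[0]   # basic null-fill
    return survivor

group = [
    {"id": 1, "name": "Acme", "phone": None, "zip": "02110", "city": "Boston"},
    {"id": 2, "name": "Acme Corp", "phone": "617-555-0100", "zip": "", "city": ""},
]
# Hypothetical rule: prefer the longest (most complete) company name.
merged = merge_duplicates(group, {"name": lambda names: max(names, key=len)})
```

The surviving record keeps its own id, zip and city, gains the phone number from the record to be deleted, and takes the fuller company name through the field rule.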

Questions to ask your data quality vendor:

    Can an Account within a duplicate group be automatically identified as a survivor based on
    the status of a contract, without significant external preprocessing to associate the data?
    Can a phone value be identified as best data based on the legitimacy of the area code and

              Does the data quality product efficiently support the testing and QA of the data
              quality processes? Does the product provide robust functionality that allows business
              users to easily simulate the exact data quality processes before testing?

In any data quality implementation, there are three main steps that must be executed before
deployment: rule definition, product configuration and quality assurance (QA). Of these three
elements, QA normally consumes more than 70 percent of project time and more than 80 percent of
the resources required.

A first-generation data quality QA process typically involves the following complex steps:

    a.   A test system is procured
    b.   IT resources (including a DBA) build a copy of the production environment
    c.   Business analysts formulate data quality rules
    d.   IT staff program and execute the rules
    e.   IT staff write scripts to test the rules’ effects on test data and validate results
    f.   Business analysts manually scrutinize the data to validate the results and then create test
         data sets to ensure that the rules are exercised as intended
    g.   End users manually scrutinize the data to validate the results

Assuming that the desired results were not achieved in the first iteration of this process (this is
almost always the case) steps “b” through “g” are revised and repeated. Depending on the amount
of data being tested, these processes are often repeated at least two and sometimes more than five
times. Some steps in this process can be difficult to achieve logistically. DBAs are not always
available to restore data between tests. Thus, QA can easily take from 4 to 8 weeks and consume
anywhere from 3 to 20 or more full-time staff resources.

Next-generation data quality products that can simplify, automate or eliminate much of the stan-
dard testing and QA processes can significantly reduce the cost of implementation and therefore
the total cost of ownership. Another important benefit of fast and efficient testing without pro-
grammer intervention is a higher-quality set of business rules.
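
One way a product might let users test a rule on actual data without risking it is to apply the rule inside a transaction, capture the result, and roll back. The sketch below assumes SQLite through Python's sqlite3 module (in its default transaction mode) as a stand-in database; the function name, table and rule are hypothetical.

```python
import sqlite3

def simulate_rule(conn, update_sql, preview_sql):
    """Apply a rule, capture what the data would look like, then roll back."""
    try:
        conn.execute(update_sql)                  # opens an implicit transaction
        return conn.execute(preview_sql).fetchall()
    finally:
        conn.rollback()                           # the stored data is left untouched

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (id INTEGER PRIMARY KEY, state TEXT);
    INSERT INTO contact (state) VALUES ('Mass.'), ('MA'), ('Calif.');
""")
preview = simulate_rule(
    conn,
    "UPDATE contact SET state = 'MA' WHERE state = 'Mass.'",
    "SELECT id, state FROM contact ORDER BY id",
)
stored = conn.execute("SELECT id, state FROM contact ORDER BY id").fetchall()
```

The preview shows the standardized values, while the stored rows still hold the original data, so no test system or restore step is needed for this kind of check.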

Question for your data quality vendor:

    Does the DQ product allow me to test rules on actual data without configuring a test system
    or risking the data? What features does the product provide to streamline the QA process be-
    sides batch job reports?

              Does the data quality product address bad data coming in from all points of entry as
              well as the bad data in your database?

It is likely that your CRM system obtains data from a number of sources and that new data is
constantly being added. Bulk data migrations, persistent system integrations, third-party EAI
products and application users provide a constant stream of newly created data. In order to keep
bad new data from replicating throughout the enterprise database, it is necessary to clean this
new data as soon as it comes in.

For this reason, it is important that a data quality solution provide features and functions for
cleaning and consolidating data as it is entered. This requires cleansing and consolidation
functionality that works in both batch and persistent mode; an application interface that cleans
data as it is being entered or, preferably, prevents bad data from being entered at all; and APIs
that integrate smoothly with your custom-developed interfaces and current open-systems interface
standards.
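
As a small illustration of preventing bad data at the point of entry, the sketch below validates and normalizes a contact before it is stored and returns the existing record instead of creating a duplicate. The class, the exact-match key and the 10-digit US phone assumption are all hypothetical simplifications.

```python
import re

class ContactStore:
    """In-memory stand-in for an application's contact table."""

    def __init__(self):
        self._by_key = {}

    def add(self, name, phone):
        """Return (record, created); reject invalid phones, reuse existing matches."""
        digits = re.sub(r"\D", "", phone)
        if len(digits) != 10:                     # assumes 10-digit US numbers
            raise ValueError("phone must contain 10 digits")
        key = (" ".join(name.lower().split()), digits)
        if key in self._by_key:
            return self._by_key[key], False       # duplicate prevented at entry
        record = {"name": " ".join(name.split()), "phone": digits}
        self._by_key[key] = record
        return record, True

store = ContactStore()
first, created_first = store.add("Bob Smith", "(617) 555-0100")
second, created_second = store.add("bob  smith", "617.555.0100")
```

The second call differs only in case, spacing and phone formatting, so it returns the record created by the first call rather than adding a new one.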

One-time batch processing of a database is only the starting point. Robust, interactive application
functionality brings to bear an army of application users who can clean your data (many of whom
previously helped create the dirty data problem), most often doing so in real time while the cus-
tomer is on the phone or on the Web, thus ensuring even greater data accuracy and completeness.
APIs that interact with your current and future system integrations are necessary to process the
volumes of data that are being automatically copied from system to system.

Questions for your data quality vendor:

    Does the product provide a batch data cleansing solution?
    Does the product have APIs for modern interface standards, such as Web Services? What
    languages do the APIs support?
    Does the product provide full functionality embedded in commercial applications to interact
    with the user, preventing duplicate creation and ensuring clean data entry?
    Does the product include powerful interactive search functionality within commercial appli-
    cations, and not just cleansing and/or matching?

              Is the data quality product built on technology platforms that provide maximum
              compatibility and ease of integration as well as the greatest possible
              flexibility, scalability and availability?

All of the major enterprise applications (Siebel, SAP, Oracle, PeopleSoft) are built on the Java
platform in order to provide the greatest possible flexibility, scalability and availability. XML and
Web Services also maximize compatibility and ease of integration. Leading data quality products
are also developed on these platforms, and for the same reasons. They provide robust Java and
Web Services APIs as well as capabilities for importing and exporting data in XML format.

It is important when evaluating data quality products that you ensure that the product will work
with leading enterprise applications, that it will work with the greatest number of the applications
in your enterprise, and that the platform will be stable. If a DQ product is based on an older tech-
nology platform, you may soon find yourself involved in a major conversion project.

Questions for your data quality vendor:

    What language is the product written in?
    Does the data quality product have Java and/or Web Services APIs?

              Is the data quality product a high performance database application product?

Performance in a data quality solution is vitally important for two reasons. First, it must have
the capability to handle the tens of millions of records that populate enterprise systems with
both speed and minimal load on servers and other enterprise resources. A next-generation DQ
product can complete large cleansing and consolidation processes in hours, not the days, weeks,
or even months required by earlier DQ technologies.

Of all the processes that will be run in a full-scale data cleansing operation, record merge is by
far the most time consuming. It is common for merge to consume 80 percent or more of all data
quality processing time. For example, in a Siebel database with 20 million records and a 20
percent duplication rate, 4 million records might need to be merged during an initial cleansing
operation. A typical first-generation data cleansing product for the Siebel market merges records
at a maximum rate of 1,200 records per hour. To merge 4 million records would require 139 days of
dedicated server time. Running a merge process for 139 days may not be feasible: the system may be
taken offline for maintenance, it may be unavailable during backups, or it simply may crash while
the resource-intensive merge is running.
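
The 139-day figure follows directly from the numbers quoted above, as this short calculation shows. The function name is hypothetical, and the calculation assumes the merge runs around the clock at the stated maximum rate.

```python
def merge_days(total_records, duplicate_rate, records_per_hour):
    """Days of continuous processing needed to merge the duplicate records."""
    records_to_merge = total_records * duplicate_rate
    hours = records_to_merge / records_per_hour
    return hours / 24

# 20 million records, 20 percent duplicates, 1,200 merges per hour.
days = merge_days(20_000_000, 0.20, 1_200)
```

Four million merges at 1,200 per hour is about 3,333 hours, or just under 139 days. The same function can be used to compare vendors' quoted merge rates against your own record counts.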

The second reason that performance is a vitally important attribute of a DQ product is that
high-performance architecture makes possible the creation of more complex, more intelligent
business rules.

To illustrate this advantage, it helps to consider how the human brain works to identify duplicate
records. A person will look at a contact's first name, then their last name and maybe their phone
number. They may look to see if the first name is equivalent (e.g., Bob equals Robert), if the area
code on the phone number is different while digits 4 to 10 are the same, or whether or not the ad-
dress is the same. While doing so they will probably take into consideration other factors like typ-
ing errors. Each element or variation of an element that is considered consumes more brain cycles
(system time). In order to emulate the way the human brain scrutinizes records, a data quality
product must be able to execute many rules with multiple variations just to perform a single
function like identifying a duplicate address. Because of the intense processing required, most
data quality products resort to the use of tokens (incorporating weak logic) or simplistic
business rules (using Yes/No logic only). A new-generation data quality solution incorporates
advanced architecture that combines performance with intelligence, and applies both to the data
within the time and resource limitations inherent in every organization.
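
Two of the human checks mentioned above, first-name equivalence and a phone number whose area code differs while digits 4 to 10 match, can be sketched as follows. The nickname table is a tiny hypothetical sample and the rule is deliberately simple; each extra variation of this kind adds processing cost, which is the point being made.

```python
import re

# Tiny illustrative nickname table; a real product would need a far larger one.
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william", "liz": "elizabeth"}

def canonical_first_name(name):
    key = name.strip().lower()
    return NICKNAMES.get(key, key)

def local_number(phone):
    """Last seven digits, i.e. digits 4 to 10 of a 10-digit US number."""
    return re.sub(r"\D", "", phone)[-7:]

def looks_like_same_contact(a, b):
    checks = [
        canonical_first_name(a["first"]) == canonical_first_name(b["first"]),
        a["last"].strip().lower() == b["last"].strip().lower(),
        local_number(a["phone"]) == local_number(b["phone"]),
    ]
    return all(checks)

bob = {"first": "Bob", "last": "Smith", "phone": "617-555-0100"}
robert = {"first": "Robert", "last": "Smith", "phone": "(857) 555-0100"}
other = {"first": "Robert", "last": "Smith", "phone": "(857) 555-0199"}

same_person = looks_like_same_contact(bob, robert)
different_person = looks_like_same_contact(bob, other)
```

The first pair matches despite the nickname and the changed area code; the second pair does not because the local number differs.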

Questions for your data quality vendor:

    How many customer records does the product merge within a commercial enterprise applica-
    tion in an hour?
    What logic would typically be used to positively identify duplicate records?

               Does the vendor offer end-to-end services to help you execute your data qual-
               ity project? Does the company have staff expertise not just in data quality, but also in
               systems integration and data migration? Does the vendor have partnerships with other
               solution providers who can contribute specialized skills and services?

A successful data-quality initiative transcends mere technology. To fully realize the potential
benefits of your enterprise application, you require comprehensive consulting and support ser-
vices from recognized data-quality experts who understand both the technical and business impli-
cations of a data quality initiative. In order to understand and address the complex data quality
issues within an enterprise application, the vendor must have significant experience implementing
enterprise applications.

For large organizations, especially Global 2000 enterprises, cleansing huge volumes of customer
data can have serious consequences for good or bad, depending on how the project was planned
and implemented. Having a next-generation data quality application gives you the tools required
for the job, but you also need a vendor whose consulting staff can impart the skills and techniques
your in-house staff will, through observation and supervised experience, acquire for themselves.

Every enterprise application reflects the unique culture of its vendor (Siebel, SAP, Oracle, or
PeopleSoft) and of the organization in which it is installed. While the data quality solution provider
can’t be expected to know all the possible combinations within a given installation, it should have
partners who do and who can be brought in when the occasion warrants.

Data quality initiatives are often part of a large data migration project, or are triggered by
one. The ideal solution is to bring in a DQ vendor with expertise in both areas. Not only is the
product installed and configured, but the data is also merged under the supervision of experts
who pass along their expertise to their clients.

Questions for your data quality vendor:

    Does the vendor have staff consultants or partners who can help manage the DQ process
    concurrently with other major enterprise data processes?
    Can the vendor’s staff or partners assist the client in developing data quality strategy as well as
    business rules that meet the organization’s specific needs for its enterprise applications?
    Has the vendor managed and executed large-scale implementations? Do they partner with
    systems integrators to provide a broader range of resources?

Seek Out a Total Data Quality Solution
The questions above lead to an inescapable conclusion: the data quality vendor you select must
provide a next-generation product that is usable by your staff analysts with minimal IT involvement.
It should be supplied with database and application connectors that facilitate installation, a
simulator to speed up rule testing, and monitoring and auditing functions. The product’s
architecture must be designed from inception to work with relational databases, not flat files. It must
be fast enough to minimize stress on time and resources. Its rules engine must be capable of
handling the most sophisticated rule sets, yet its user interface should be simple enough for
business users, with point-and-click functionality and plain-English syntax.

But beyond the product, the vendor must also have the expertise – in-house or through partnering
arrangements with respected integrators – to provide the end-to-end service that ensures your data
quality implementation will be a smooth one, from planning to configuration and on to day-to-day
operation. Data quality is an emerging sector, and few vendors have the product or depth of
experience to provide the answers you require to the above questions. A few do, however, and we
at Stalworth wish you a successful search for the data quality solutions provider who can meet
your needs.

Data Quality Checklist
If your company is evaluating a data quality solution that will meet your current and future enterprise
application needs, the checklist below provides some points to consider.

Enterprise Data Quality    Next-Generation Data Quality:                     Typical First-Generation
Requirements               DQ*Plus from Stalworth                            Data Quality Solution

Services-Oriented          J2EE, Java, XML, SOAP, Web Services               C, Cobol

Relational Database        Designed from inception to manage data in         Designed from inception to manage data in
Orientation                relational databases                              flat files, limited ability to merge records or
                                                                             execute functions in a relational database

Pre-Packaged Connectors    Auto-interrogating connectors for Siebel,         Very limited integration to commercial
                           SAP, Oracle and PeopleSoft. JDBC connectors       applications, intense programming or “dump to
                           for custom systems.                               flat file” required for custom systems

Pre-Packaged               For Siebel, SAP, Oracle and PeopleSoft,           Not available
Business Rules             developed by experts based on industry

Ease of Implementation     Designed for the business user, intuitive,        Designed for IT users, complex configuration
                           point-and-click user interface and                and programming
                           plain-English rules setup

Rapid QA Cycle             SIMULATOR works with real data to reduce          Multiple iterations of rules definition, data
                           the QA cycle by 80 percent or more                loading, rules execution and results analysis
                                                                             required, involving analysts, IT
                                                                             intermediaries and users; takes weeks or months

Intelligent, Powerful      Employs human-intelligence-like fuzzy and         Token-based or flat file oriented use of
Rules Engine               Boolean logic to execute intelligent and          algorithms: rigid, inflexible, low-yield
                           complex business rules. Rules are applied
                           to data sequentially to examine it from
                           diverse perspectives, correcting data and
                           identifying duplicates with far greater accu-

Automated Merge            Intelligent, powerful rules engine identifies     Very limited.
                           duplicates with certainty, providing the
                           confidence to merge automatically, dramatically
                           increasing yield while reducing manual labor

Robust “Best Record”       Intelligent, powerful rules engine employs        Very limited or non-existent.
Capability                 sophisticated logic to identify the best data
                           elements within duplicate records, combining
                           this data into a best record

Data Quality Everywhere    Robust batch and persistent data quality to       Limited.
                           cleanse and consolidate existing data in
                           your data store, interactive data quality to
                           clean data or prevent the introduction of bad
                           data at point of entry, Java and Web Services
                           APIs to facilitate integration with third
                           party (Tibco, WebMethods, etc.) or custom

High Performance           Designed to handle the largest volumes of         Single threaded
                           data, multi-threaded, massive parallel
                           processing, merge records 24 times faster than
                           competing solutions
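
The “Best Record” row describes combining the best data elements from a set of duplicates into one record. The checklist does not say how any product chooses those elements, so the following Python sketch uses an assumed rule purely for illustration: for each field, keep the non-empty value from the most recently updated record. The function name, the field names and the "updated" date field are all hypothetical.

```python
from datetime import date


def best_record(duplicates: list[dict]) -> dict:
    """Merge a group of duplicate records into one.

    Assumed survivorship rule (illustrative only): for each field, keep the
    non-empty value from the most recently updated record that has one.
    Fields that are empty in every record are left out of the result.
    """
    # Newest first, so the first non-empty value seen per field is the freshest.
    ordered = sorted(duplicates, key=lambda r: r["updated"], reverse=True)
    merged: dict = {}
    for record in ordered:
        for field, value in record.items():
            if field not in merged and value not in (None, ""):
                merged[field] = value
    return merged


# Illustrative duplicates (invented for this sketch).
group = [
    {"name": "Acme Corp", "phone": "", "email": "info@acme.example",
     "updated": date(2003, 1, 5)},
    {"name": "Acme Corporation", "phone": "650-555-0100", "email": "",
     "updated": date(2004, 3, 1)},
]

print(best_record(group))
```

Here the newer record supplies the name and phone number, while the older record fills in the email address that the newer one lacks. A real rule set would likely weigh more than recency, such as source reliability or field completeness, which is worth raising with the vendor.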

Stalworth and Blue Hammock Contact Information
Stalworth, Inc. is a leading provider of Customer Data Integration (CDI) solutions. Stalworth's suite of
integrated products enables businesses to maximize profitability and customer satisfaction by leveraging
customer data within applications and across the enterprise. DQ*Plus is a comprehensive data quality
solution that was designed for enterprise applications to enable companies to build a complete, accurate,
consolidated and up-to-date view of their customer accounts.

Blue Hammock specializes in Data Management and Customer Relationship Management (CRM) consulting
and integration services. In conjunction with leading Data Management and CRM technologies, Blue
Hammock’s strategic and implementation services are the glue that connects people with technology.

                                               Stalworth, Inc.
                                           Worldwide Headquarters
                                         1840 Gateway Drive, Ste 200
                                          San Mateo, CA 94404-4029
                                            Phone: (650) 378-1448
                                             Fax: (650) 378-1463


                              Tom Brennan, President: tom.brennan@stalworth.com
                        Steve Kleinmann, Vice President: steve.kleinmann@stalworth.com
                                   Stalworth Product Info: info@stalworth.com

                                               Blue Hammock
                                           Worldwide Headquarters
                                       445 Fort Pitt Boulevard, 4th Floor
                                             Pittsburgh, PA 15219
                                           Tollfree (877) 559-BLUE
                                            Phone: (412) 258-1200
                                             Fax: (412) 258-1201


