Doing More with LESS: Logic
Embedded in SpreadSheets
Andre Valente (KSVentures and USC/ISI)
David Van Brackle (ISX)
Hans Chalupsky (KSVentures and USC/ISI)
Gary Edwards (ISX)
LESS: Logic Embedded in SpreadSheets
Logic Embedded in SpreadSheets (LESS)
will provide the user with a system which
combines the power of logic-based
knowledge representation and reasoning
with the familiar and easily-mastered user
interface paradigm of a spreadsheet
LESS was developed under an SBIR
contract from DARPA
Project Manager: David Gunning
Motivation
Spreadsheets are Widespread but Error Prone
[Abraham & Erwig, 2003]
55 Million users (= programmers) of spreadsheets in
2005
Spreadsheets contain alarming rates of errors [Erwig
and Burnett 2002]
Spreadsheets have their strengths…
Rapid model development
Interactive “What If” feedback
Scalable application of easily defined computational
logic
… and weaknesses
Hard to express complex logical statements
Data structure hides relationships
Logic spreadsheets can
Improve quality (less errors)
Increase maintainability
Support the construction of more
explicit and complex models
Be a practical platform for knowledge
acquisition
Key Challenges
Basic Perspective:
A pragmatic, scruffy, engineering approach
Start with open mind – try things out, see what works
Usefulness – How to develop a useful tool, not just an
academic exercise?
Answer: a real hybrid
Enrich the functionality of a spreadsheet while maintaining
on its core interaction paradigm
Spreadsheet should not be a GUI for knowledge bases
Logic should not be just a function library in the
spreadsheet
Usability – How to get the syntax and the human interaction
right to make it usable
Answer: four tiers
Hide as much logic as needed
Allows different trade-offs
The LESS Model
Tier 4:
App Templates
Tier 3: Full
Integration of
Spreadsheet and
Logic System
Tier 2: Tabular-oriented
Knowledge Integrated into
Spreadsheet GUI
Tier 1: Logic functions available programmatically
in formulas
Tier 1: Logic functions available
programmatically in formulas
A B C D
PowerLoom functionality 1 Person Age Car Happy?
2 Fred 14
available to Spreadsheet 3 Joe 17 Ford Yes
functions as function calls 4 Sam 33 Saab
Lifting Mappings – pushing ASSERT( “Age”, $A2, B2 )
info into the KB
Query Mappings –
extracting info from the KB IF( ASK( “Happy”, $A3 ), “Yes” )
Tier 2: Tabular-oriented Knowledge
Integrated into Spreadsheet GUI
Instance Tables: A collection of
instances of a given class will be
represented as a special spreadsheet
table
Creating a new Instance Table is
equivalent to creating a Class in an
Ontology
Rules are entered as text, much like
Spreadsheet Functions
Tier 3: Full Integration of
Spreadsheet and Logic System
Increased usability
Hide away details of logic formulations
Develop special GUIs integrated into the
spreadsheet
Wizards
Special cell editors
TreeTable metaphors
Requires tighter integration of logic and
spreadsheet data models
Tier 4: Application Templates
Templatize working LESS applications
Hide even more of logic to users
Better deployment, packaging mechanisms
Domain Paks
Templates captured by experts in a given field
and widely distributed
Production tools (meta level)
Mechanisms for connecting to existing data
(e.g. in DBMSs)
LESS Architecture
LESS Components
LESS integrates:
PowerLoom, a powerful logic KR&R system
Microsoft Excel, the most widely used
spreadsheet in the world
LESS is implemented as an extension (add-
in) to Microsoft Excel
Status: prototype implementation to be
demonstrated here
LESS Excel add-in has 1.6Mb (includes
PowerLoom libraries)
Implemented in C++ and Visual Basic
Invoke Spreadsheet
Functions
LESS – a mapping layer between Excel and PowerLoom
Invoke PowerLoom
Excel Spreadsheet
Functions
Component Interactions
PowerLoom
Push Knowledge
To Spreadsheet
Pull Knowledge
From PowerLoom
Push Knowledge
Into PowerLoom
PowerLoom
PowerLoom is result of 20 years of research
and practical use of KR&R systems at
USC/ISI
Powerful and expressive
Unique “pragmatic” stance to tackle real
world problems
Usability more important than theoretical
"neatness"
Allow trade-offs between
expressivity/completeness
A KR system for real-world applications
Used in HPKB, CoABS, RKF, EELD, etc.
PowerLoom Highlights
Representation in fully-expressive first-order predicate logic
KIF syntax with many extensions to standard FOL:
Type, set & cardinality relations (e.g., subset-of, range-cardinality)
Second-order definitions via holds
Classical negation and negation-by-failure, defaults
Frame-style definition language as syntactic sugar
Defconcept, defrelation, deffunction, definstance, defrule
Incremental monotonic and non-monotonic updates
Interleave definitions, assertions, retractions with retrieval and
inference
Context mechanism separates name and assertion spaces with
inheritance
Provides structuring mechanism, facilitates hypothetical reasoning
Several reasoning mechanisms
Reasoning specialists architecture for “plug-ins” (e.g., time reasoner)
RDBMS interface based on lifting axioms
PowerLoom Status and
Distribution
Written in STELLA
Available in Lisp, C++ and Java
Both STELLA and PowerLoom are now Open Source
Recent development for PowerLoom
GNU or Mozilla licenses
Current release: PowerLoom 3.0.2.beta
Distributed as Lisp, C++ and Java source
More recently in STELLA source
~600 downloads world-wide
~400 subscribers to the mailing lists
Why Have a Logic
Spreadsheet at All?
A Key Insight: Relational vs.
Positional Referencing (1)
Spreadsheets use positional referencing to refer to
information
G5 = relative cell reference
$G$5 = absolute cell reference
A5:D5 = range of cells {A5 B5 C5 D5}
Positional Referencing is very intuitive in the
beginning…
But it is highly error-prone
More complicated than it seems at a first glance
Spreadsheets try hard do update references when
changes are made (e.g., a line is inserted), but errors
are frequently introduced
Also, restricted to two dimensions
A Key Insight: Relational vs.
Positional Referencing (2)
Logic uses relational referencing to refer
to information
Meaning is conveyed by relations and assertions
Information is referenced by open variables as
well as role in relations
Example:
Logic function (hours ?person ?month ?task)
Means the number of hours a ?person spends on
a ?task in a ?month
N-ary relations and more powerful
proposition specification languages add
power to spreadsheets
A Key Insight: Relational vs.
Positional Referencing (3)
Key insight of LESS is to add relational
referencing to spreadsheets
Mixed model allies best of both worlds
Cells can still be used to refer to data
Cell Variables can be bound to contents of cells
Logic propositions express relationship
between several information elements
LESS Features – Highlights
LESS Features – Highlights
Explicit relationships
Multi-dimensional relations
Better, Explicit (Business) Rules
Powerful information querying
Platform for Knowledge
Acquisition/Engineering
Platform for deploying Knowledge-
Based Systems
About Our Examples
Three main examples:
Pilot training spreadsheet
Budget spreadsheet
Grading spreadsheet
All examples are real-life spreadsheets
Pilot training example based on a spreadsheet used by ANG
149th Tactical Fighter Squadron
Budget spreadsheet was used by a small business for
writing cost proposals for government contracts
Grading spreadsheet was used to grade an actual course as
well as an exercise for a class in spreadsheets
Sometimes simplified for presentation purposes
Syntax in these slides is simplified (we are still adding
syntactic sugar)
LESS Features – Highlights
Explicit relationships
Multi-dimensional relations
Better, Explicit (Business) Rules
Powerful information querying
Platform for Knowledge
Acquisition/Engineering
Platform for deploying Knowledge-
Based Systems
Positional Referencing Looks Easier
Than It Is (1)
For example, B6 contains the formula
=SUM(B3:B5)
Simple, easy…
A B C D E
1 Task 1 Hours
2 Person Jan-05 Feb-05 Mar-05 Apr-05
3 Fred 12 4 12 14
4 Wilma 4 7 15 3
5 Barney 10 8 3 6
6 Total 26 19 30 23
Positional Referencing Looks Easier
Than It Is (2)
Even such a simple sheet generates errors
when a new employee (line) is added!
Totals stayed the same but should have
been updated
A B C D E
1 Task 1 Hours
2 Person Jan-05 Feb-05 Mar-05 Apr-05
3 Fred 12 4 12 14
4 Wilma 4 7 15 3
5 Barney 10 8 3 6
6 Pedrita 10 7 9 9
7 Total 26 19 30 23
Positional Referencing Looks Easier
Than It Is (3)
What is the problem?
In positional referencing SUM(B3:B5)
really means “sum the three cells
above me”
This clearly is not what is intended
Intention of the user is “sum the Jan-
05 hours for all persons in Task 1”
Relational referencing allows a user to
say just that!
Relational Referencing in LESS
Makes Relationship Explicit (1)
With LESS, we want to express the relationships explicitly
(hours ?person ?month ?task)
Means the number of hours a ?person spends on a ?task in a ?month
In order to add relational information to a spreadsheet, we need to
add a lifting formula
One way is to add a lifting formula on top of each column:
=ASSERTMULTI(“hours”, A3:A5, B2, $A$1)
This expands into and performs a series of assertions, e.g.:
(hours A3 B2 A1) = (hours “Fred” Jan-05 “Task 1”) = 12
(hours A4 B2 A1) = (hours “Wilma” Jan-05 “Task 1”) = 4
A B C D E
1 Task 1 Hours
2 Person Jan-05 Feb-05 Mar-05 Apr-05
3 Fred 12 4 12 14
4 Wilma 4 7 15 3
5 Barney 10 8 3 6
6 Total 26 19 30 23
Relational Referencing in LESS
Makes Relationship Explicit (2)
Once the information is lifted into PowerLoom, we
can easily express a cell as “sum the Jan-05 hours
in Task 1 for all persons”
B6 contains the formula:
=SUM (RETRIEVEALL (“hours ?person Jan-05 “Task 1”)”))
Notice mix of logic and spreadsheet functions
A B C D E
1 Task 1 Hours
2 Person Jan-05 Feb-05 Mar-05 Apr-05
3 Fred 12 4 12 14
4 Wilma 4 7 15 3
5 Barney 10 8 3 6
6 Total 26 19 30 23
LESS Features – Highlights
Explicit relationships
Multi-dimensional relations
Better, Explicit (Business) Rules
Powerful information querying
Platform for Knowledge
Acquisition/Engineering
Platform for deploying Knowledge-
Based Systems
Multi-dimensional relations (1)
Problem: positional referencing only
works when information has at most
two dimensions (3-arity relations)
Intuitively easy to see: user cannot
make a range with three dimensions
Multi-dimensional relations (2)
A B C D E
1 Task 1 Hours
Example: calculate totals 2
3
Person
Fred
Jan-05
12
Feb-05 Mar-05
4 12
Apr-05
14
across tables 4 Wilma 4 7 15 3
5 Barney 10 8 3 6
Formula looks much less 6
7
Total 26 19 30 23
intuitive, e.g. 8
9
Task 2
Person Jan-05
Hours
Feb-05 Mar-05 Apr-05
B24=B3+B10+B17 10 Fred 4 7 4 7
11 Wilma 12 12 4 14
Can’t use ranges 12 Barney 10 8 10 8
13 Total 26 27 18 29
Extremely hard to maintain 14
15 Task 3 Hours
and error prone! 16 Person Jan-05 Feb-05 Mar-05 Apr-05
17 Fred 12 12 4 14
E.g., if a new task is added 18 Wilma 4 4 7 3
19 Barney 10 10 8 6
or there are many tasks 20 Total 26 26 19 23
21
What you really want to say is 22 Total Hours
“sum the hours in Jan-05 for
23 Person Jan-05 Feb-05 Mar-05 Apr-05
24 Fred 28 23 20 35
Fred in all tasks” 25
26
Wilma
Barney
20
30
23
26
26
21
20
20
27 Total 78 72 67 75
What happened?
Task table can be seen as a model with two
dimensions – person and monthly hours
BUT: Overall information model reflects
indexing in three dimensions – person,
month and task
When information needs to be aggregated
across more than two dimensions, the
spreadsheet model suffers
Pivot tables can overcome some limitations,
but only for some cases
LESS Supports Multi-Dimensional
Relations
A B C D E
1 Task 1 Hours
Assuming the same lifting 2
3
Person
Fred
Jan-05
12
Feb-05 Mar-05
4 12
Apr-05
14
described earlier, we can 4 Wilma 4 7 15 3
easily express “sum the
5 Barney 10 8 3 6
6 Total 26 19 30 23
hours in Jan-05 for Fred in 7
8 Task 2 Hours
all tasks” as: 9
10
Person
Fred
Jan-05
4
Feb-05 Mar-05
7 4
Apr-05
7
=SUM (RETRIEVEALL 11
12
Wilma
Barney
12
10
12
8
4
10
14
8
("(hours Fred Jan-05 ?task)”)) 13 Total 26 27 18 29
14
This can also easily be 15 Task 3 Hours
turned in to an Excel-like
16 Person Jan-05 Feb-05 Mar-05 Apr-05
17 Fred 12 12 4 14
template formula by 18
19
Wilma
Barney
4
10
4
10
7
8
3
6
mixing cell variables: 20
21
Total 26 26 19 23
=SUM (RETRIEVEALL 22
23
Total
Person Jan-05
Hours
Feb-05 Mar-05 Apr-05
("(hours A24 B$23 ?task)”)) 24 Fred 28 23 20 35
25 Wilma 20 23 26 20
26 Barney 30 26 21 20
27 Total 78 72 67 75
LESS Features – Highlights
Explicit relationships
Multi-dimensional relations
Better, Explicit (Business) Rules
Powerful information querying
Platform for Knowledge
Acquisition/Engineering
Platform for deploying Knowledge-
Based Systems
LESS Provides Better, Explicit Rules
Spreadsheets encode business rules in formulas
Several problems:
Rules are hidden from view – harder to audit, opaque
Rules have to be repeated everywhere they are used
– higher chance for errors, difficult to maintain
Another problem is that spreadsheet formulas are
very poor in expressing rules that are not strictly
mathematical formulas
No good mechanism for complex conditions
LESS allows a user to model explicit (business) rules
Better methodology (trend towards explicit business
rules)
Easier to debug and maintain
Improves visibility, communication
Better, Explicit Rules - Example (1)
In the example below, the formula to calculate the grade letter
(column H) requires nested conditionals:
IF($H4 (>= ?x 95) (letter-grade ?x "A")))
13 !=assert((=> (and (= ?x 90))(letter-grade ?x "A-")))
14 !=assert((=> (and (= ?x 88))(letter-grade ?x "B+")))
15 !=assert((=> (and (= ?x 85))(letter-grade ?x "B")))
16 !=assert((=> (and (= ?x 80))(letter-grade ?x "B-")))
17 !=assert((=> (and (= ?x 78))(letter-grade ?x "C+")))
18 !=assert((=> (and (= ?x 75))(letter-grade ?x "C")))
19 !=assert((=> (and (= ?x 70))(letter-grade ?x "C-")))
20 !=assert((=> ( (?n number))" )
25 Min Max Letter =POWERLOOM( "(deffunction letter-grade ((?x student)) :-> (?s string))" )
26 95 100 A =ASSERT("=>",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A26), P( "=",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A27), P( "=",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A28), P( "=",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A29), P( "=",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A30), P( "=",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A31), P( "=",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A32), P( "=",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A33), P( "=",P("and",P( "Final", "?x", "?g" ), P(">=","?g",A34), P( "= (and (= ?x ??P2) ((letter-grade ?x ??P3)))
50 =apply-template(B49,A39:C47)
LESS Features – Highlights
Explicit relationships
Multi-dimensional relations
Better, Explicit (Business) Rules
Powerful information querying
Platform for Knowledge
Acquisition/Engineering
Platform for deploying Knowledge-
Based Systems
Powerful Information Querying
Spreadsheets have limitations on how to
summarize information
Particularly if the information is symbolic
Basic problem is that it is hard to select
information from other parts of the
spreadsheet
Worse if information is in multiple tables and/or
has more than three dimensions
Special wizards such as Excel pivot tables
help overcome some of these deficiencies
But they still have restricted expressivity
LESS Demo – Squadron
Training Examples
Also – Tiers 2, 3
LESS Features – Highlights
Explicit relationships
Multi-dimensional relations
Better, Explicit (Business) Rules
Powerful information querying
Platform for Knowledge
Acquisition/Engineering
Platform for deploying Knowledge-
Based Systems
New Possibilities Opened with LESS
One way to look at logic spreadsheets is for
logic to add spice to spreadsheets
Delivers incremental improvements
(Only) user is spreadsheet user
We found that logic spreadsheets can be
good for knowledge engineers as well
Two types of applications:
Template-based knowledge acquisition
Reasoning
Template-based knowledge
acquisition (1)
A common problem in knowledge acquisition is to acquire knowledge about
many similar objects
KA approaches are hard to understand for SMEs
Require understanding of modeling decisions
For instance, not reasonable to ask an SMEs to model (example from
DARPA ARPI and JFACC Programs):
(defconcept F-15
:IS-PRIMITIVE Fighter-Aircraft
:IMPLIES (:AND (:FILLED-BY NICKNAME "Eagle")
(:FILLED-BY MADE-BY MCDONNELL-DOUGLAS)
(:EXACTLY 2 ENGINES)
(:ALL GUN M61A1)
(:FILLED-BY IN-SERVICE 1972)
(:FILLED-BY PRIMARY-FUEL JP-4)
(:FILLED-BY ALTERNATE-FUEL JP-5 JP-8 JET-A
JET-A-1 JET-B))
:DEFAULTS (:AND (:FILLED-BY CEILING 65000ft)
(:FILLED-BY COST 15000000)
(:FILLED-BY CREW-SIZE 1)))
Template-based knowledge
acquisition (2)
An approach to deal with this problem is to have
“knowledge templates”
Templates abstract the way information is used in the logic
formalism
Middleware code (e.g. a Lisp macro) “routs” the information in
the template to its place in the knowledge base (e.g., slot
filler, type, default value)
Example of template implemented as Lisp macro:
(def-aircraft-c F15
:types fighter-attack-aircraft
:nickname "Eagle"
:made-by McDonnell-Douglas
:engine-count 2
:primary-fuel JP-4
:alternate-fuel (JP-5 JP-8 Jet-A Jet-A-1 Jet-B))
Ideal division of labor:
SME enters information easily
Knowledge engineer designs templates to map into a KB
Template-based knowledge
acquisition (3)
Logic Spreadsheets are great for
implementing knowledge templates
Table format is natural for most SMEs
Spreadsheet is widely used (info may already be
in a spreadsheet!)
Information also available for other uses
Example:
A C D F G H
1 Aircraft Maker # Engines Primary Fuel Alternate Fuels …
2 F-15 McDonnel Douglas 2 JP-4 JP-5 JP-8 Jet-A Jet-A-1 Jet-B …
3 F-16 General Dynamics 1 JP-4 JP-5 JP-8 Jet-A Jet-A-1 Jet-B …
4 … … … … … …
LESS Features – Highlights
Explicit relationships
Multi-dimensional relations
Better, Explicit (Business) Rules
Powerful information querying
Platform for Knowledge
Acquisition/Engineering
Platform for deploying
Knowledge-Based Systems
Reasoning
Spreadsheets can contain full-blown knowledge-based
systems
Plays on availability of information already in spreadsheets
For instance, we can re-implement PROSPECT [Valente and
Schacchi, 2002] in LESS
Input was a list of (instances of) business process models
KB has knowledge about critiquing process models
Output is agenda of problems to evaluate
Applied to critique accounting processes in large corporation
LESS implementation would:
Use tables and templates to express input process models
KB can be provided explicitly or as a domain pak
Output agenda of problems in the process models can be
implemented as an instance table
Tier 2 Example
Create instance table
Creating instances