XML Parsing in Java
Creating a Dungeons and Dragons (Third Edition w/ 3.5 Revisions)
Character Generator in Java
Block IV--First Semester '07-'08
“We are stuck with technology when what we really want is stuff that works.”
“…we are trying to unravel the Mighty Infinite using a language designed to tell one another
where the fresh fruit was.”
My seminar project idea went through two basic stages: the project that
never was (due to, to be blunt, Steve Jobs), and the project that I ended up
My original concept was to use the school network to set up an
audioscrobbling server similar to the one you can find online at www.last.fm. I
envisioned people bringing in their iPods and connecting them to the school
computers to upload the music they'd been listening to, and perhaps pushing that
information out to all the computers so THS students could see what everyone
was listening to at the moment. This ended up falling apart due to Apple's policy
of not giving out any sort of development tools for the iPod. Back to the drawing
board, as it were.
The lightbulb finally turned on when I was creating a character for a
friend's Dungeons and Dragons game. I was clicking through roughly 8 tabs in
Firefox, had my calculator out, and a book precariously balanced across my
knees. It was then that I expressed that infamous programmer's sentiment (the
one that drives people to hunt through Internet forums far and wee for that one
guy’s method code, and argue about “big O notation”): “This could be done so
much more efficiently.”
I took some time to think about it: the character sheet was essentially a
bunch of whole numbers, some Boolean variables, and a String or two for good
measure. It would be easy to put together a PlayerCharacter object that would
store all the information.
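A minimal sketch of what such an object might look like. The class name, field names, and methods here are illustrative assumptions, not the fields of the actual PlayerCharacter class.

```java
// Hypothetical sketch: names are illustrative, not the project's real ones.
class PlayerCharacterSketch {
    // Strings for the free-form text on the sheet
    private final String name;
    private final String race;
    // Whole numbers for the six attributes, in STR, DEX, CON, INT, WIS, CHA order
    private final int[] attributes = new int[6];
    // A Boolean for a yes/no fact about the character (assumed example)
    private boolean spellcaster;

    PlayerCharacterSketch(String name, String race) {
        this.name = name;
        this.race = race;
    }

    String getName() { return name; }
    String getRace() { return race; }
    int getAttribute(int index) { return attributes[index]; }
    void setAttribute(int index, int score) { attributes[index] = score; }
    boolean isSpellcaster() { return spellcaster; }
    void setSpellcaster(boolean spellcaster) { this.spellcaster = spellcaster; }

    public static void main(String[] args) {
        PlayerCharacterSketch pc = new PlayerCharacterSketch("Tordek", "Dwarf");
        pc.setAttribute(0, 15); // index 0 is STR in this sketch
        System.out.println(pc.getName() + " the " + pc.getRace()
                + " has STR " + pc.getAttribute(0));
    }
}
```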
My next problem hit: the sheer scope of Dungeons and Dragons. The
Player's Handbook (the most basic of the rulebooks) contains 7 races, each of
which can belong to any of 11 character classes and select from roughly 50 feats,
30 skills, and 100 magic spells. (For those keeping track at home, that's about 12
million possible combinations from my, admittedly, very rough estimate.)
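The rough estimate can be reproduced by multiplying the counts together. This sketch assumes that each choice is independent and that a character picks exactly one option from each category, which is a simplification of the real rules.

```java
class CombinationEstimate {
    // Multiplies the counts quoted above, one pick per category.
    static long roughCombinations() {
        long races = 7;
        long classes = 11;
        long feats = 50;
        long skills = 30;
        long spells = 100;
        return races * classes * feats * skills * spells;
    }

    public static void main(String[] args) {
        System.out.println(roughCombinations()); // prints 11550000
    }
}
```

That works out to 11,550,000, or about 12 million.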
I discovered, however, that some forward-thinking kind soul had put most,
if not all of the basic game information, into some XML files. I reasoned that XML
was a common enough format, and there had to be some way for Java to
process the information so it could be used in a program. Sure enough, there
was: JDOM, the Java Document Object Model. I grabbed the XML files off the
Internet, what Dungeons and Dragons materials I possessed, a copy of the
JDOM class listings and a cursory tutorial from Oracle magazine, and went to
A Brief Primer: Dungeons and Dragons
Dungeons and Dragons grew out of a now-defunct game, Chainmail. It
was originally developed as a miniatures-based war-game, but later came to
have some more focus on role-playing (i.e. how players portrayed their
characters versus just trying to win every fight). The game, and its creator, Gary
Gygax, enjoyed some success in the mid-1970s, but then suffered when
Gygax's company, Tactical Studies Rules (later known as TSR) decided to (in
software development terms) “fork” D&D. The original, simple rules were known
as “Dungeons and Dragons, Second Edition”, whereas TSR's new, more
complex set of rules became known as “Advanced Dungeons and Dragons, First
The basic set, considered childish by those who played “AD&D” (a fair
assessment, considering TSR wanted to market the game to the board game set),
managed a stilted life of its own, getting a third and fourth edition in 1981 and
1983, respectively. It finally petered out with a fifth edition in 1991.
Advanced Dungeons and Dragons, however, only picked up steam. It
issued “Advanced Dungeons and Dragons, Second Edition” in 1989, and enjoyed
large amounts of popularity from the growing field of computer games, and its
steeper learning curve. Internal difficulties at TSR, however, led to the company
nearly going bankrupt. Wizards of the Coast, another game company, flush with
the success of their lucrative card games (Pokemon and Magic: The Gathering
having made them several million dollars), purchased TSR—and with it, the
Dungeons and Dragons license, in 1997.
Wizards ended the bizarre “fork”, and dropped “Advanced” from the name
of the game, referring to it simply as “Dungeons and Dragons”. They considered
their new edition the third edition of the AD&D rules, and named it as such. The
game was greatly streamlined, and characters were made far more customizable
than they had been in the past. They also introduced the OGL, or Open Gaming
License, which made it easier for people to write material that was compatible
with the game. (This actually ended up hurting Wizards in the long run, for
reasons that are far too long for me to explain within the scope of this paper.)
Wizards went on to introduce the unpronounceable “3.5 Edition”, which created
some basic rules changes. (This is the current edition of the game, and the one
that my program uses.)
Wizards is set to release Fourth Edition in June of this year, with the rules
further streamlined, and a full suite of computer tools that promise to replace the
various player-made programs that sprouted up in the Third/3.5 Edition era. (I'll
address those later in my paper.) The obvious response at this point is, “Well,
Mike, if they're going to release D&D 4.0 in June, why go to all this trouble?”
Someone on an Internet forum pointed out one answer: “There is enough 3.5
edition material out there to [play D&D] until the [end] of the universe.”
The game is a roleplaying game with an emphasis on tactical combat--the
only thing you have close to a gameboard is a grid on which characters move
during fights. Mostly, it's about playing your character [usually as a sort of
amateur-hour acting] through whatever adventure the Dungeon Master [read:
referee] has planned for you--this could be a simple crawl through an abandoned
temple to find treasure, or a hunt through a big city for a killer.
Characters are represented by six attributes. The first three, Strength
[STR], Dexterity [DEX], and Constitution [CON], are "physical attributes", which
determine how the character works in situations where physical strength or agility
is needed. Arnold Schwarzenegger, for example, would be a good example of a
high Strength character, while a marathon runner would have high Constitution.
The other three, Intelligence [INT], Wisdom [WIS], and Charisma [CHA] are
"mental attributes", which determine how the character works in situations when
mental quickness or personality is needed. Ken Jennings is the best example of
someone with high Intelligence, but Bear Grylls [the survivalist of "Man Versus
Wild"] is high Wisdom--Intelligence is "book smarts", while Wisdom is practical
knowledge. Charisma, defined as "interpersonal skills" is a little bit harder to pin
down--James Bond is a classic example of high Charisma, but few examples of
low Charisma come to mind. These attributes, along with choice of race [elves,
dwarves, humans, et cetera] define the character.
XML is somewhat hard to explain because it's not your usual “enter
command x to get result y”. XML, or Extensible Markup Language, is a way to
format data so that it can be read by a computer. XML is an outgrowth of
SGML (Standard Generalized Markup Language), but is much simpler, and the
user defines their own tags to describe whatever the document holds. Those familiar
with another outgrowth of SGML, HTML, will recognize some elements of XML.
A “well-formed” XML document (i.e. a standard document that's ready to
be read without errors, similar to source code that compiles) is given here:
<?xml version="1.0" encoding="UTF-8"?>
<!--
    Document   : test.xml
    Created on : January 3, 2008, 12:29 PM
    Author     : student
    Shows how XML works. -->
<root>
    <foo>bar</foo>
    <fullname>Michael<name>mike</name></fullname>
</root>
The first line, <?xml version="1.0" encoding="UTF-8"?>, is the XML
declaration. This is optional, but helpful if the document is going to be read
across a wide variety of platforms. It tells us that we are using XML 1.0 (by far
the most widely used version), and that the text is encoded in UTF-8 (how the
characters are represented as bytes…something outside the scope of this brief
introduction).
The next few lines, as any HTML users can tell, are commented out by the
<!-- and --> markers. Comments are fairly self-explanatory to anyone familiar with
programming: the usual snippets of notes on what's being represented. This
comment block is automatically generated for me by NetBeans, and gives me
fields to fill in with the document name, the date it was created, the author, and a
description of what the document does. Not all programming environments have
this, of course. All XML documents support the comment blocks.
The last few lines are actually the real meat of things, the actual document
and its markup tags. We have one root element that encapsulates the entire
document, similar to the <html> tags that surround an HTML page. The next
element (which is a child element of the root element) is the foo element, which
has as contents the text “bar”. Below that is another element, the fullname
element, which has as text “Michael”…but also a name element inside of itself,
which just has “mike”. So, foo and fullname are children of the root element, and
name is a child of fullname and a descendant of root. For the document to be
considered well-formed, every one of these elements must have both an opening
and a closing tag, and the tags must be properly nested. Of course, what sets
XML apart from HTML and makes it so useful is that there is no fixed set of tags:
get a root element, get some information tagged up inside of it, and you're ready
to go. However, this kind of freedom comes at a price: you have to be very careful
when you're working with it in a programming environment.
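The parent and child relationships described above can be checked in code. This is a minimal sketch that uses the DOM parser built into the JDK (not JDOM) so that it runs without extra libraries. Naming the root element "root" is an assumption, and the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class XmlStructureDemo {
    // Element names follow the description above; "root" is an assumed tag name.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<!-- Shows how XML works. -->"
            + "<root><foo>bar</foo><fullname>Michael<name>mike</name></fullname></root>";

    // Parses the text into a DOM tree; fails loudly if it is not well-formed.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // Returns the tag name of the element directly containing the first <tag>.
    static String parentOf(Document doc, String tag) {
        Node node = doc.getElementsByTagName(tag).item(0);
        return node.getParentNode().getNodeName();
    }

    // True if the parser accepts the text, false if it rejects it.
    static boolean isWellFormed(String xml) {
        try {
            parse(xml);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println("foo is inside " + parentOf(doc, "foo"));
        System.out.println("name is inside " + parentOf(doc, "name"));
        // A missing closing tag for foo makes the document ill-formed.
        System.out.println(isWellFormed("<root><foo>bar</root>"));
    }
}
```

The last call shows the "price" of the freedom: leave off one closing tag and the parser refuses the whole document.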
The Java Document Object Model
The other important part of the project to have some knowledge of is the
JDOM library, or the Java Document Object Model. The name sounds unwieldy,
but it makes some sense when you take it apart: the JDOM library provides a
model to turn an XML document into a Java Object.
The library is divided into 6 packages:
org.jdom contains all of the classes that represent an XML document and its
components: the Document class, the Comment class, the Element class, etc.
The input and output libraries are self-explanatory, and could be thought of as
two different ends of a continuum: the input library takes in XML data and builds
a JDOM Document from it, and the output library takes a JDOM Document and
writes it back out as XML. The transform and xpath libraries are for XSLT
transformations (for HTML sites, say) and looking up elements in the document,
respectively. Finally,
adapters is the black sheep of the family—users never actually interact with this,
as it is used by JDOM to translate the method calls that the user puts in into
The parsers are the important part for my project, as the program has to
parse the data from the XML document into the JDOM-ready format. JDOM
comes with two builder classes for this: DOMBuilder (DOM is the W3C's
language-neutral Document Object Model; JDOM is a Java-specific counterpart)
and SAXBuilder (SAX is the Simple API for XML). I ended up using SAXBuilder,
as that's what the tutorials I found for how to use JDOM were written with. For my
purposes there was little practical difference between the two: SAXBuilder reads a
file or stream through a SAX parser, while DOMBuilder converts a DOM tree that
has already been built in memory.
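To give a feel for what "SAX" means, here is a minimal sketch of event-driven parsing using the SAX parser that ships with the JDK. It is not JDOM's SAXBuilder, which uses a parser like this one behind the scenes to build its Document. The sample XML and the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventDemo {
    // Counts how many opening tags with the given name the parser reports.
    static int countElements(String xml, String tag) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // SAX calls this once per opening tag, in document order.
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the real feat file.
        String xml = "<feats><feat/><feat/><note/></feats>";
        System.out.println(countElements(xml, "feat")); // prints 2
    }
}
```

A SAX parser never holds the whole document in memory; it just announces each tag as it goes by. A builder like SAXBuilder listens to those announcements and assembles the tree.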
The best way to see how JDOM works is to go through and look at how I
would parse XML for storage. This example will use the Feat class, which I never
actually got to use in my project. A Feat object has 7 fields:
[String] name: the name of the feat
[String] type: Feats can be general, combat, or metamagic in the basic
game. (Other types exist, but in the interest of saving space, I won't go
[Boolean] multiple: Can this feat be had multiple times? Some feats
can be taken more than once (Weapon Focus—each time, it applies to
a different weapon), some cannot (Alertness—it grants two skill check
bonuses, and that's it).
[Boolean] stack: Do the effects of this feat stack if it's taken multiple
times? (Extra Turning, for example, stacks—you can keep getting its
effects if you take it multiple times. Weapon Focus is an example of a
feat that does not stack—it applies to a different weapon each time you
take it, rather than constantly increasing its bonus amount.)
[String] prereq: What prerequisites must the character meet before it
can take this feat? Some feats require a certain score in one of the six
attributes, others require that you complete a “tree” of feats before you
can gain its effects.
[String] benefit: What benefit does the feat give? For example, Weapon
Focus gives a +1 bonus on all attack rolls with the specified weapon.
[String] normal: How would the character function without the benefit?
For example, without Weapon Focus, the character would just make its
normal attack rolls.
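The seven fields listed above translate into a small class. This is a sketch only: the field names and types come from the list, but the class name, constructor, and toString method are assumptions rather than the project's actual code.

```java
// Sketch of a class holding the seven fields; everything beyond the
// field names and types is assumed for illustration.
class FeatSketch {
    final String name;      // the name of the feat
    final String type;      // general, combat, or metamagic
    final boolean multiple; // can the feat be taken more than once?
    final boolean stack;    // do its effects stack when taken again?
    final String prereq;    // what the character needs first
    final String benefit;   // what the feat gives
    final String normal;    // how the character functions without it

    FeatSketch(String name, String type, boolean multiple, boolean stack,
               String prereq, String benefit, String normal) {
        this.name = name;
        this.type = type;
        this.multiple = multiple;
        this.stack = stack;
        this.prereq = prereq;
        this.benefit = benefit;
        this.normal = normal;
    }

    @Override
    public String toString() {
        return name + " [" + type + "]";
    }

    public static void main(String[] args) {
        // Placeholder descriptions, not quoted from the rulebook.
        FeatSketch alertness = new FeatSketch("Alertness", "general", false, false,
                "No prerequisites.", "Bonuses on two skill checks.",
                "No bonuses on those checks.");
        System.out.println(alertness); // prints Alertness [general]
    }
}
```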
Now, the code for parsing the XML file that contains all of the feat
information (roughly 100-150 feats) looks like this, in the FAsObj class.
public class FAsObj
catch (Exception e)
System.err.println("Error in FAsObj");
public ArrayList FReturn()
ArrayList theList=new ArrayList();
if (o instanceof Element)
if (avail.getText().equals("SRD 3.5 Feats"))
String prerequisite="No prerequisites.";
Now, let's walk through the process of getting the list first. The FAsObj
constructor makes a new SAXBuilder to parse the document. It then creates the
Document object that we'll be crawling through by calling the build method on the
SAXBuilder, which gets passed the file name of the XML file, and then parses
through the document and drops it into the Document object.
Next, we get the root element of the document and store it in the root
Element object. We get all the children of the root named “feat”, since each
separate feat is delineated in the document by <feat></feat>. These elements are
stored as a List object, namedChildren. We can then get an Iterator of this List, which
can be used to fire through the List.
At this point, the FAsObj object is all set, and ready to have its big method
called—the FReturn method, which passes back an ArrayList of Feat objects.
FReturn uses the iterator we made in the constructor to parse through the
document. It keeps running through the Iterator until there is nothing left for it to
However, the Iterator can't tell whether it's looking at a comment,
element, or something else entirely. We have to put in an if-block, if (o instanceof
Element). instanceof is an operator that, like .equals() or the various inequality
symbols, gives back a true-or-false answer. It tells the computer to check whether
the object on its left is an instance of the class named on its right. My experience
with it thus far is limited, but it is certainly a powerful tool.
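A small sketch of the same check-then-cast pattern. To keep it runnable without JDOM, it uses String and Integer as stand-ins for Element and Comment; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InstanceofDemo {
    // Keeps only the Strings from a list of mixed objects, the same way
    // the parser keeps only the Elements and skips everything else.
    static List<String> onlyStrings(List<Object> mixed) {
        List<String> result = new ArrayList<>();
        for (Object o : mixed) {
            if (o instanceof String) {
                // Safe to cast: we just checked the type.
                result.add((String) o);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> mixed = Arrays.asList("feat", 42, "skill", null);
        System.out.println(onlyStrings(mixed)); // prints [feat, skill]
    }
}
```

Note that instanceof is false for null, so the null entry is skipped without any extra check.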
The next thing that happens is that the computer casts the mystery object
o to be an Element. (If it's not an Element, it's skipped, since we're only interested
in the elements of the document, not any comments that might exist.) The next if
has a story behind it: it's a result of one of the many mistakes I made while working on this
The XML files that I got off the Internet were stated to contain all of the
System Reference Document material—in other words, everything that you
would need to play a basic game. I shrugged this off, reasoning that that was
what I wanted. However, in a case of getting exactly what you ask for, the XML
file also contained the material for epic levels—a sort of variant on the game that
I wasn't prepared to deal with. The computer took forever to go through the list,
and I wasn't particularly interested in dealing with all of the extra coding for epic
levels. I had to find some way to handle all the epic material that I was going to
tell the computer to ignore.
Luckily, the XML document tagged each feat with an extra element,
<reference>, which states just where it got them from. After a quick eyeballing of
the document, I realized that everything I needed came under the “SRD 3.5
Feats” category. This if block, therefore, stops the program from going further if
the feat it's currently looking at is not part of that group.
After that, things are fairly self-explanatory. One interesting part to note is
the “prerequisites” code, which I took to referring to as the “imaginary friend
handler”—that is, it checks to make sure that something actually exists. This, of
course, was from another mistake I made.
As I was writing the first version of this Feat parser, I realized that it kept
crashing at certain feats, reporting some variation on the theme of a
NullPointerException. I tried everything that I could think of, and then ended up going
into the code itself to see what was wrong. I discovered that, rather than just
saying “No prerequisites” under the feat's <prerequisites> tag, the original author
had simply left the tag out entirely. Therefore, when the parser gets to something that I've
found (through trial and error) to “maybe not be there”, it goes into the “imaginary
friend handler”, making sure that what it gets back isn't a null.
Outside of that part, things are fairly normal. Once it has all its ducks in a
row, it adds the new Feat to the list, goes back to the beginning, and starts over
from there. Eventually, it gets the whole list all set and sends it back for whatever
use one may have for it.
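The whole walk-through can be condensed into one sketch. It is not the FAsObj code: it uses the DOM parser built into the JDK in place of JDOM so that it runs on its own, and it keeps only the name and prerequisites of each feat. The <feat>, <reference>, and <prerequisites> tags and the "SRD 3.5 Feats" label come from the description above; the <feats> root, the <name> tag, the sample data, and all class and method names are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class FeatListSketch {
    // Hypothetical sample data; the real file is far larger.
    static final String SAMPLE =
            "<feats><!-- a comment the loop must skip -->"
            + "<feat><name>Alertness</name><reference>SRD 3.5 Feats</reference></feat>"
            + "<feat><name>Weapon Focus</name><reference>SRD 3.5 Feats</reference>"
            + "<prerequisites>Proficiency with the chosen weapon</prerequisites></feat>"
            + "<feat><name>Some Epic Feat</name><reference>Epic</reference></feat>"
            + "</feats>";

    // The "imaginary friend handler": returns the text of the first direct
    // child with this tag, or the fallback if no such child exists.
    static String childText(Element parent, String tag, String fallback) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(tag)) {
                return n.getTextContent().trim();
            }
        }
        return fallback;
    }

    // Returns {name, prerequisites} pairs for every basic-game feat.
    static List<String[]> readFeats(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feat XML", e);
        }
        List<String[]> feats = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node o = children.item(i);
            // Skip comments, stray text, and elements that are not feats.
            if (!(o instanceof Element) || !o.getNodeName().equals("feat")) {
                continue;
            }
            Element feat = (Element) o;
            // Skip epic material: keep only the basic-game group.
            if (!childText(feat, "reference", "").equals("SRD 3.5 Feats")) {
                continue;
            }
            feats.add(new String[] {
                childText(feat, "name", "Unnamed feat"),
                childText(feat, "prerequisites", "No prerequisites.")
            });
        }
        return feats;
    }

    public static void main(String[] args) {
        for (String[] feat : readFeats(SAMPLE)) {
            System.out.println(feat[0] + ": " + feat[1]);
        }
    }
}
```

Giving the missing-tag check a default value means the loop never has to touch a null, which is the whole point of the handler.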
A Question of Design
The next most important part, once I had all the various pieces of code
that I would need to get the requisite information out of the XML documents, was
how to represent it. This was, looking back, almost an even larger problem than
the XML parsing, because of my experiences in the past. Let's take a look at
other character generation software.
The first up is what I consider the “gold standard”, which can be found at
http://www.pathguy.com/cg35.htm . It's surprisingly good for an amateur effort.
(This is just a small part of the program, the rolling of the character attributes.)
If you just said, “Wait, what?”…I don't blame you. This is my big problem
here: no one seems capable of creating an intuitive interface for doing this.
The way that this one is set up, you set your dice rolling method in the
dropdown box, and then hit the “Roll the Dice” button. You assign each attribute
in the radio buttons (you can't have two selections share a column or row). The
manual entry column is used for point buy, or when you get fed up with rolling
So, let's take a look at another program. Next on the list is PCGen, which
other people consider to be the gold standard. This is the first screen you get
when you're going into the program, ready to make your character.
If you thought that the last one was bad, you might as well just curl up and
cry at this point. You have to pick the sources you want to construct your
character from in the box on the right before you can actually do anything. (The
proper selection for making a basic Dungeons and Dragons character is
“RSRD”.) Then, you click on the button to load it in (right above the big red “2” in
this screenshot). This is the best part now: you have to go up to “File” and hit
To sum all this up in one sentence, I'm on the frontlines of the war against
poorly designed graphical user interfaces. Looking back now, I think that just about
anyone who has ever programmed, ever, can say that with some degree of pride.
On the flip side of the coin, there are some ideas that I wanted to
incorporate into my project. First up on the Good List is Redblade, a free
character generator that, while not being perfect, has some points worth
This is the very first screen you see, as you can see over on the left. I
don't like that they make you roll ability scores afterwards (note that your
progress through generation is measured over on the left, going from “Base” to
“Finished”), but I do like this screen. You can put in the character's name,
gender, race, and alignment right then and there. The dice pictures are actually
buttons: you click them to get a random result in the dropdown. It's clean and
easy to follow. That's not to say that it's without problems, but I like it.
I, of course, did not finish my project due to being stymied along the way by
various pieces of JDOM and Java that I simply didn't know how to code/how to
use properly. The first panel I created was the stat roller.
The way that this flows is simple. At the top of the screen, a generation
method is selected from the most popular ones I could find on the Internet, and
then "Roll" is clicked. The attributes rolled show up in the "???" spaces, and the
comboboxes each show the six attributes [STR, DEX, CON, INT, WIS, CHA].
One stat is selected per box, and then "Check and Accept" is clicked. If the
boxes have not been selected properly, the message at lower left changes to say
that one or more scores have been improperly assigned. If otherwise, the
selected stats are stored to the PlayerCharacter object that is constantly being
built behind the scenes, and the message shows that the user has selected a
valid score set.
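The "Check and Accept" test boils down to making sure each of the six attributes is picked exactly once. This is a minimal sketch of that rule, not the panel's actual code; the enum, class, and method names are assumptions.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

class StatAssignmentCheck {
    enum Ability { STR, DEX, CON, INT, WIS, CHA }

    // True only if every ability is picked exactly once across the boxes.
    static boolean isValid(List<Ability> picks) {
        Set<Ability> seen = EnumSet.noneOf(Ability.class);
        for (Ability pick : picks) {
            // A blank box (null) or a repeated pick makes the set invalid.
            if (pick == null || !seen.add(pick)) {
                return false;
            }
        }
        return seen.size() == Ability.values().length;
    }

    public static void main(String[] args) {
        List<Ability> good = Arrays.asList(Ability.CHA, Ability.STR, Ability.DEX,
                Ability.CON, Ability.INT, Ability.WIS);
        List<Ability> bad = Arrays.asList(Ability.STR, Ability.STR, Ability.DEX,
                Ability.CON, Ability.INT, Ability.WIS);
        System.out.println(isValid(good)); // prints true
        System.out.println(isValid(bad));  // prints false
    }
}
```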
The other panel that I created, which was still a work in progress at
the time that I started this paper, was the Race panel, which I was especially
I felt that other race selection screens always presented too much, or too
little information. Therefore, I grabbed the parts that I thought were the most
important: the favored class [a game mechanic that's far too complicated, and in
my opinion, poorly designed, to get into right now], special abilities [gnomes can
speak to any burrowing animal, changelings can alter their physical appearance
at will], and any skill and/or attribute adjustments. The user simply selects a race
from the dropdown and hits accept if everything appears to be acceptable. The
big drawback to this was that I could never seem to get the race data into the text
fields or the table object--something that I wish I'd known more about how to use
beforehand. Another problem that I had during development was the stat roller
panel--this has already been covered in excruciating detail in my weekly updates,
but I'll reiterate once again that while using radio buttons is great, Java's radio
button group class, for reasons I don't think I will ever be able to comprehend,
does not have any way for you to call a method to see what button is currently
selected out of the group.
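For what it's worth, Swing's ButtonGroup does have a getSelection() method, but it returns the selected button's ButtonModel rather than the button itself. One common workaround is to loop over the group's buttons and ask each one whether it is selected. This is a sketch of that workaround, with made-up class and method names.

```java
import java.util.Collections;
import javax.swing.AbstractButton;
import javax.swing.ButtonGroup;
import javax.swing.JRadioButton;

class SelectedButtonFinder {
    // Returns the selected button in the group, or null if none is selected.
    static AbstractButton selectedIn(ButtonGroup group) {
        for (AbstractButton button : Collections.list(group.getElements())) {
            if (button.isSelected()) {
                return button;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ButtonGroup group = new ButtonGroup();
        JRadioButton str = new JRadioButton("STR");
        JRadioButton dex = new JRadioButton("DEX");
        group.add(str);
        group.add(dex);
        dex.setSelected(true);
        System.out.println(selectedIn(group).getText()); // prints DEX
    }
}
```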
Down the Road
I would like to continue working on this at some point, although, looking
back at my paper now, I'm not so sure that I would like to continue with the 3.5
ruleset. I think that the rules are too baroque and there's just too much going on
for anyone to write one really good, clean program that can easily sum up a
character. Fourth Edition is supposed to be the big simplification of the rules, and
I think that that would be a great place to get onboard with any sort of program.
As far as XML and Java, I feel like I've learned a lot. I'm still a little
stunned that I never knew about the instanceof operator, not to mention the
usefulness of XML. I seem to see XML everywhere now, whether cruising around
the Internet or digging through files on my computer. I've greatly enjoyed the
freedom that the seminar has given me, and hope that I can have similar
experiences in the future.