Semester Project
Layout Driven Editor based on Document Templates
Author: Joël Schintgen, joel.schintgen@epfl.ch
Supervisor: Stéphane Sire, stephane.sire@epfl.ch
Professor: Christine Vanoirbeek, christine.vanoirbeek@epfl.ch
January 10, 2009
Contents
1 Introduction
  1.1 The goal of this project
  1.2 Why an XTiger editor?
2 The XTiger editor
  2.1 Language subset
  2.2 XTiger Editor User Interface
    2.2.1 use
    2.2.2 repeat
    2.2.3 attribute
    2.2.4 bag
3 Implementation
  3.1 Setting up the transformation
    3.1.1 Stylesheets
    3.1.2 Template or instance?
    3.1.3 The target language
  3.2 The editor
    3.2.1 use
    3.2.2 repeat
    3.2.3 bag
    3.2.4 attribute
  3.3 Save
  3.4 Logger
4 Future work
5 Conclusion
1 Introduction
Today, the internet is composed of billions of documents containing a lot of information, but most of them are designed to be read by humans, which makes it very difficult for computers to interpret and process their content. Semantic documents try to overcome this problem by explicitly encoding semantics together with the document's content. However, as the director of the W3C, T. Berners-Lee, stated in [3]: “Instead of asking machines to understand people’s language, it involves asking people to make the extra effort”. This “extra effort” is most probably the biggest obstacle to the development of the semantic web. XTiger is an XML language that adds semantics to existing XML schemas. Combining XTiger with an existing XML language, called the target language (typically XHTML), yields an XTiger template. Such a template helps to build documents that follow a defined structure and should reduce the extra effort mentioned above.
1.1 The goal of this project
The goal of this project was to develop a visual editor for creating documents using XTiger templates. As the name XTiger (Extensible Templates for Interactive Guided Editing of Resources) suggests, this editor should guide the author in editing a document that follows the model described by the XTiger template. The output of the editor falls into two categories: the instance on the one hand, and the final XML document on the other. The instance of a template can be reopened in an editor, together with the original template, for further editing, as all the needed XTiger elements are still present. In the final target document, by contrast, all information about the template (i.e. all XTiger tags) has been removed. The editor is written in JavaScript and relies on the XTigerTrans [13] library by Jonathan Wafellman.
1.2 Why an XTiger editor?
As mentioned before, the biggest problem of the semantic web is the “extra effort” that people have to make. This editor should help people to easily create XML documents following an XTiger template, without having to deal with XML markup. An existing example of such an editor is included in the web authoring tool and browser Amaya [9]. The ideal would be a WYSIWYG-style editor.
Figure 1: Illustration of the types of documents used with XTigerEdit
In this report I will first define an example XTiger template that is used as an illustration throughout the document. I will then present the interface, followed by the implementation, and finally propose some future extensions.
2 The XTiger editor
To illustrate the content generation using XTigerEdit, I will use an XTiger template that generates project descriptions on the MEDIA website (http://media.epfl.ch/Student%20Projects.html). This template is slightly modified in order to show all the implemented XTiger elements. The editor opens an XTiger template (or an instance together with the corresponding template) and shows the editable document. Saving the document produces an XTiger instance document. This document contains all the XTiger elements from the original template, without the head section. An instance together with the original template can be reopened with any XTiger editor (Amaya, for instance). Another possibility is to produce the final document by clicking on Export. This shows the edited document without any XTiger elements. This document cannot be re-edited.
Figure 2: The editor's interface with an opened template
2.1 Language subset
The use and repeat elements of the XTiger language are fully implemented. The attribute element is supported, with the exception of the use="prohibited" attribute. The bag element is very complex to represent. It is supported only with basic types and target language elements. Furthermore, the bag editor is not WYSIWYG.
2.2 XTiger Editor User Interface
The user interface of the editor is based on XTigerTrans [13]. It is only extended by a Save and an Export button. The choice of a transformation is removed, as the transformation into the editor is the only one that exists. The document body is presented in the original document style. Only the text inside XTiger elements is editable, and it is presented in a box surrounded by a dashed border (except for the attribute element). These boxes are all displayed in block style, which makes the editor's body look slightly different from the final document. To make the representation clearer and to keep the user from getting lost in a bunch of nested boxes, the border of a box becomes solid and its background becomes yellowish when the mouse is placed over it (fig. 2).
2.2.1 use
A use element is represented by a green bordered box containing the initial content. This content can be of one of the basic types (string, number or boolean), a target language element (e.g. an XHTML element when XHTML is the target language) or an XTiger component.
In the case of a string, a number or a target language element, a click on the value creates a textarea with a green background containing the initial text content of this element. The height of the textarea dynamically adapts to the number of rows needed to represent the content. Pressing the enter key saves and closes the textarea. As in other applications, a newline character is inserted by holding the control key while pressing enter. Clicking outside the textarea (blur) also saves and closes it.
Boolean values are represented by the strings “true” and “false”. A click on such a value switches it to the opposite one (i.e. from true to false or from false to true).
If more than one type is allowed, a dropdown menu appears on the right side of the element's name, which lets the user select the type of the use element. When a type is selected, the content of the box is adapted accordingly. In the example of the project descriptions described above, the first element is an xt:use element with the label Title and the basic type string as the only allowed type. The element below has two possible component types, which can be selected in the dropdown box.
A special case is the use element with the attribute option="set". This optional element is represented in the same way as the normal use element, with the only difference that a click on its icon sets or unsets it. When it is unset, only the icon and the label are visible. The element labeled requirements is an example of an optional use.
2.2.2 repeat
To let the user repeat a piece of the document multiple times, the repeat element is used. A repetition is represented by a blue bordered box containing use elements as described above. The box of the repeat item has an add icon on the right side of the label. A click on this icon adds a copy of the repeated element at the first position. Every box of a repeated use element has, in its top right corner, an add icon and a remove icon, which insert a new element below the box and remove the element, respectively. If the minimal or maximal number of elements allowed inside the repeat is reached, an error message is displayed (fig. 3).
Figure 3: Trying to delete the last occurrence of a repeated element, but the minimal number of occurrences is set to 1
2.2.3 attribute
When an attribute can be specified, an icon is shown in the top left corner of the corresponding element. When the mouse pointer is placed over this icon, a floating box appears, displaying all the attributes that can be set for this element. Three different types of attributes are implemented:
• The attribute's content can be any string. In this case a simple input field is shown.
• The content is one value out of a list. This is represented by a dropdown box containing the possible choices. If this attribute is optional (use=’optional’), an additional entry is put at the top of the list.
• The attribute's value is fixed. Here, the value is simply displayed without any possibility to edit it.
Figure 4: Editing the fragmentkind attribute of the fragment element
In the example template, the fragment element can have an attribute named fragmentkind of type string (fig. 4).
2.2.4 bag
The bag element is a more special XTiger element. It is a free content area and puts far fewer constraints on the structure of its content. Inside a bag, a list of target language elements (specified by the types attribute) is allowed as root elements, and inside these root elements all descendant element types allowed by the target language can be inserted. This specification makes it very difficult to represent a bag in a WYSIWYG editor.
Figure 5: Editing inside a bag element
A bag element is represented by an orange box, and a click on it brings up an editor. A first thought was to use an editor like FCKeditor [7] or Kupu [1]. These editors are, however, intended to edit the visual style of (X)HTML documents, whereas the goal of XTigerEdit is to edit documents with XHTML or any other XML language as target language, where two different elements may not be visually distinguishable. I therefore decided to implement my own editor, which does not hide the XML tags.
The editor has a section at the top with the tags that can be inserted at the current cursor position. A click on one of these tags inserts an opening and a closing tag of this element at the cursor position. If a part of the text is selected, the selection is surrounded by the tags, while making sure that the content remains well-formed XML. Furthermore, when typing, the & and < characters are replaced by their respective entities and are handled as one single character. Tags are also treated as a whole and can only be deleted either by placing the cursor to the left of the opening tag and pressing delete, or by pressing the backspace key with the cursor placed to the right of the closing tag. An example of a bag is the element labeled freecontent in the example template (see fig. 5).
3 Implementation
The generation of the editor's interface is mainly done through XTigerTrans, which was slightly modified and extended (especially to support the attribute element). It uses the scripts.js file, located in transformations/editor, which provides the callback functions described in section 3.2 and defines the event listeners used in the editor's interface. A second file, named editors.js, at the same path, defines the three classes XTigerUseEditor, XTigerBagEditor and XTigerAttributeEditor, which handle the editors of the use, bag and attribute elements respectively. The XTigerEdit class holds the internal instance (see below) and provides all the methods needed to modify it. The code has been tested on the Linux version of Firefox 3.0.5 and works on every Firefox version from 3 onwards. Few adaptations should be needed to make it work on every standard-compliant browser.
3.1 Setting up the transformation
The loading of the XTiger template is based on the viewer application of XTigerTrans. The template is first loaded into memory as a JavaScript XML Document. This document is passed to a new XTigerEdit object, where it is used as a working copy of the template (referred to as the internal instance in the rest of this document).
3.1.1 Stylesheets
XML documents can have an associated CSS stylesheet that defines how they are represented in the browser. For XTiger templates with a target language other than XHTML, this can be done by defining an xml-stylesheet processing instruction. The URLs of these stylesheets can be relative, but once the template is loaded in the editor, the browser resolves them relative to the path of the XTigerTrans transformation files. They therefore first have to be made absolute.
3.1.2 Template or instance?
As the editor can open templates to create new instances, as well as reopen existing instances created with some XTiger editor (Amaya, for instance), it has to distinguish between templates and instances. The main difference is that an instance does not have a head element. Therefore, if no head element is found, the original template is opened. The URL of this template can be found in the xtiger processing instruction. This URL, which can be relative, first has to be converted into an absolute one. After this, the original template can be opened and its head element is appended to the instance document.
Figure 6: Illustration of the transformation of a template using XTigerTrans and the interaction with the internal instance.
3.1.3 The target language
In some cases, the possible target language elements need to be known (for instance to determine the allowed elements inside a bag element). In XTigerTrans, these elements were stored inside the transformation file. In XTigerEdit, this transformation file is not intended to be touched by a user of the editor and should serve multiple target languages. The target language elements are therefore specified in a separate XHTML file, stored in the transformations/editor folder and named after the target language namespace with “/” replaced by “-” (e.g. the file for XHTML would be named http:--www.w3.org-1999-xhtml.xhtml).
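As a rough illustration of two preparatory steps described in this section (making stylesheet and template URLs absolute, and naming the target language file), the following JavaScript sketch may help. The function names and the example.org URLs are assumptions made for this illustration, and the standard URL constructor used here is a present-day API standing in for whatever resolution code XTigerEdit actually uses.

```javascript
// Sketch only: the helper names below are hypothetical, not taken from XTigerEdit.

// Resolve a possibly relative URL (stylesheet or template reference)
// against the URL of the document that contains it.
function toAbsoluteUrl(reference, documentUrl) {
  return new URL(reference, documentUrl).href;
}

// Derive the name of the target language description file from the
// namespace URI: every "/" becomes "-", and ".xhtml" is appended.
function targetLanguageFileName(namespaceUri) {
  return namespaceUri.replace(/\//g, "-") + ".xhtml";
}

// Hypothetical template location, used for illustration only.
console.log(toAbsoluteUrl("style/project.css", "http://example.org/templates/project.xtd"));
console.log(targetLanguageFileName("http://www.w3.org/1999/xhtml"));
```

Resolving against the URL of the template itself, rather than against the location of the transformation files, is what keeps relative references valid once the template is loaded in the editor.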
3.2 The editor
After every XTiger element transformation, XTigerTrans calls a callback function to perform additional actions. A second callback function is called at the end of the document transformation. These callback functions are defined in the transformations/editor/scripts.js file (as illustrated in figure 6). In the following, I will describe the generation and editing for each of the four XTiger elements.
3.2.1 use
For the use element, the type of its content is determined first, in order to select the correct entry in the dropdown box. If only one type is allowed, the selector is completely hidden. If the selected type is a primitive type or a target language element, a new editor object is created and the currentType attribute is set in the internal instance. If the selected type is a component, the content of this use has to be replaced by the generated content of the component. This transformation has to be done in the second callback function. Therefore, the path of the element is put into a queue, which is processed at the end of the document generation. Furthermore, the path of this element is added to the HTML document. This path is used to maintain a link between the element displayed in the editor and the internal instance.
To edit the content of a use element, the user clicks on its value. A textarea is then created whose height adapts to the size of the content (height = scrollHeight). When the new value has to be saved, the saveValue method of the editor is called, which sets the new value in the internal instance.
3.2.2 repeat
As with uses of component type, repeat elements need to be processed again after the transformation of their child elements. For every repeat element encountered during the transformation, the path is added to the HTML document, and the element is placed in a queue to be processed later. This queue is processed in the second callback function. Here, the occurrences of a repeated element are numbered so that they can be distinguished in an XPath query, and the add and remove icons are associated with every single occurrence.
What has to be pointed out here is the difference between the presentation of the repeated items in the editor and their representation in the internal instance. In the editor, an item can be added at every position and every item can also be deleted, as long as the restrictions given by the minOccurs and maxOccurs attributes are respected. Applying these additions and removals directly to the internal instance would invalidate the XPaths stored with every repeated item and break the link between the editor and the internal instance. As a solution, every new occurrence is added as the last child of the repeat element in the internal instance and at the correct place in the editor. A mapping table stores the relation between the two positions. Similarly, a deletion is only performed in the editor, and its mapping is set to null in the previously mentioned table. Only at the time of saving are the items in the internal instance reordered and deleted.
3.2.3 bag
The bag is the most complicated XTiger element. It allows different types as top level elements and as descendant elements. The original XTigerTrans implementation did not correctly implement the types of a bag, in that it did not distinguish between top level and descendant elements. I modified XTigerTrans to correctly output the values defined in the types, include and exclude attributes.
When a bag element is encountered, the content of the types attribute represents the allowed top level types. The descendant level types should then be all the descendant elements allowed by the target language, without those defined in the exclude attribute. In the current implementation, however, there is no way to determine which elements the target language allows at a specified place. As a simplification, I decided not to restrict the descendant level elements to those allowed by the target language at that place, but to present all elements of the target language except those specified in the exclude attribute, and to leave the correct usage up to the user.
As for the use, the element's content has to be clicked to edit it. The editor presented is a textarea with, at the top, a list of tags that can be inserted at the current cursor position. A click on one of these tags inserts a start tag at the selection start and an end tag at the selection end. The selected region is extended so that the result is always valid XML. A keypress event listener makes sure that the < and & characters are replaced by their respective entities (&lt; and &amp;) and that tags and entities are treated as a whole. This also ensures that the removal of an opening tag removes the closing tag as well, and vice versa. A problem that persists is that the user can paste any string into the textarea and in this way produce invalid XML. For this reason, a change event listener checks whether the textarea's content is valid and, if necessary, resets it to the previous content.
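The two basic text operations of the bag editor, replacing typed characters by entities and surrounding a selection with tags, can be sketched as follows. This is a minimal illustration with hypothetical function names, not the code of XTigerBagEditor. It assumes that the selection does not cut through an existing tag or entity, and it leaves out both the extension of the selection and the validity check on pasted text.

```javascript
// Sketch only: function names are hypothetical, not the XTigerBagEditor API.

// Replace a typed character by its entity where XML requires it.
function escapeTypedCharacter(ch) {
  if (ch === "<") return "&lt;";
  if (ch === "&") return "&amp;";
  return ch;
}

// Surround the selection [start, end) of the text with a start and an end tag.
// Assumes the selection does not cut through an existing tag or entity;
// the extension of the selection is not reproduced here.
function wrapSelection(text, start, end, tagName) {
  return text.slice(0, start) +
    "<" + tagName + ">" + text.slice(start, end) + "</" + tagName + ">" +
    text.slice(end);
}

console.log(wrapSelection("a bold word", 2, 6, "b")); // prints "a <b>bold</b> word"
```

With an empty selection (start equal to end), the same function inserts an empty element at the cursor position, which matches the behaviour described for a plain click on a tag.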
3.2.4 attribute
The attribute element was completely ignored in the original XTigerTrans implementation. Therefore, the attribute element first had to be implemented in XTigerTrans. Its implementation in the editor differs slightly from the three elements presented before: it is not the element's own content that is edited. Instead, the element gives instructions on how to edit the attributes of its parent element. The first time an attribute element for a target language element is encountered during the transformation, an XTigerAttributeEditor object is created, and every subsequent attribute for this element is added to this editor object. The editor is then displayed after the document transformation, in the second callback function. The attributes are set when a value in the attribute editor changes.
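The "one editor object per parent element" bookkeeping described above might be organised roughly as in the following sketch. AttributeEditorSketch is a hypothetical stand-in for XTigerAttributeEditor, whose real interface is not shown in this report. The element path and the lang attribute are invented for the example; only fragmentkind comes from the example template.

```javascript
// Sketch only: AttributeEditorSketch is a hypothetical stand-in for
// XTigerAttributeEditor, whose real interface is not shown in this report.
class AttributeEditorSketch {
  constructor(elementPath) {
    this.elementPath = elementPath; // path of the parent target language element
    this.attributes = [];           // attribute definitions collected so far
  }
  addAttribute(definition) {
    this.attributes.push(definition);
  }
}

const editorsByPath = new Map();

// Called for every attribute element met during the transformation.
function registerAttribute(elementPath, definition) {
  let editor = editorsByPath.get(elementPath);
  if (!editor) {
    // First attribute for this parent element: create its editor.
    editor = new AttributeEditorSketch(elementPath);
    editorsByPath.set(elementPath, editor);
  }
  editor.addAttribute(definition);
  return editor;
}

// Illustrative path and second attribute (assumptions).
registerAttribute("/html/body/div[1]", { name: "fragmentkind", type: "string" });
registerAttribute("/html/body/div[1]", { name: "lang", values: ["en", "fr"] });
console.log(editorsByPath.get("/html/body/div[1]").attributes.length);
```

Keying the editors by the path of the parent element means that a later pass, such as the second callback function, can display exactly one floating box per element.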
3.3 Save
There are two different ways to “save” the edited document. Save is, however, not the right term, because it suggests that a file is written, which is not possible with JavaScript. The first form is the XTiger instance, with all the XTiger elements present. The second is the final document, which does not contain any XTiger element and cannot be reopened for editing.
To save the instance, the head element is first removed if it exists, as it is neither allowed nor needed in the instance. After this, the items of any repeat elements have to be put in the correct order. Finally, an xtiger processing instruction containing the template's URL has to be added.
The final document, without XTiger elements, is then obtained by applying an XSLT transformation to the previously produced instance. This XSLT transformation removes all elements that belong to the XTiger namespace or that are descendants of optional use elements that have no initial=’true’ attribute. Furthermore, the xtiger processing instruction is removed again.
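The reordering of repeated items at save time relies on the mapping table introduced in section 3.2.2. The report does not show the actual data structure, so the following is only one possible sketch. RepeatMapping and its methods are hypothetical names, and the table is assumed to map the index of a child in the internal instance to its position in the editor.

```javascript
// Sketch only: RepeatMapping is a hypothetical name; the report does not
// show the actual data structure used by XTigerEdit.
class RepeatMapping {
  constructor(initialCount) {
    // positions[i] is the editor position of the i-th child in the
    // internal instance, or null once that child was deleted in the editor.
    this.positions = Array.from({ length: initialCount }, (_, i) => i);
  }

  // A new occurrence is shown at editorPosition but appended as the last
  // child internally; returns its (stable) internal index.
  add(editorPosition) {
    this.positions = this.positions.map(
      p => (p !== null && p >= editorPosition ? p + 1 : p));
    this.positions.push(editorPosition);
    return this.positions.length - 1;
  }

  // A deletion only marks the child; the internal instance is untouched.
  remove(internalIndex) {
    const removed = this.positions[internalIndex];
    if (removed == null) return; // unknown index or already removed
    this.positions = this.positions.map(
      p => (p !== null && p > removed ? p - 1 : p));
    this.positions[internalIndex] = null;
  }

  // At save time: drop deleted children and put the rest in editor order.
  applyOnSave(children) {
    const ordered = [];
    this.positions.forEach((p, i) => {
      if (p !== null) ordered[p] = children[i];
    });
    return ordered;
  }
}

const mapping = new RepeatMapping(2); // internal children: A, B
mapping.add(1);                       // C is shown between A and B
mapping.remove(0);                    // A is deleted in the editor
console.log(mapping.applyOnSave(["A", "B", "C"])); // prints [ 'C', 'B' ]
```

Because internal indices never change before saving, the XPaths stored with each repeated item stay valid throughout the editing session, which is the point of deferring the reordering.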
3.4 Logger
To help debug the code, I created a logger, which can be called to display debug messages in a separate window. Each log message is preceded by the time at which it was written and can take any colour value defined in CSS. To enable logging, the last line in editor/logging.js needs to be changed to
var logging = new Log(true);
Figure 7: The log window with some debug messages.
4 Future work
The editor presented in this report has some limitations. It could be extended in different ways:
• The implementation of the bag element is not done in a WYSIWYG way. The bag editor could be adapted to immediately reflect the visual effect of the tags and maybe hide the tags from the user, as is done in some rich text editors like Kupu [1] or FCKeditor [7]. The main difficulty here will be to hide the tags while still showing the difference between elements that may not be distinguishable by their visual style.
• The DTD of the target language could be taken into account in some way. This would mainly be useful for the bag, so that instead of presenting all target language elements, the editor presents only the ones that are really allowed.
• The interface could be done in a more WYSIWYG style. First of all, not every XTiger element should be presented in block style, as is done now. Instead, the block or inline style of the parent element should be adopted.
5 Conclusion
This project shows the implementation of a layout driven editor for XTiger templates. It also reveals some weaknesses of the XTiger specification, especially regarding the possibility of creating a fully WYSIWYG editor. It is nearly impossible to present all the details of the template without destroying the layout that the final document will take. The bag element, which tries to give a maximum of flexibility to the document author, is also very difficult to represent. All things considered, XTigerEdit is an almost-WYSIWYG editor using XTiger templates that implements the majority of the XTiger specification.
Acknowledgements
Finally, I want to thank Stéphane Sire for his helpful tips and the support he gave me in doing this project.
References
[1] Kupu. http://kupu.oscom.org/.
[2] Torsten Anacker. Formulare: Text an Cursorposition einfügen. http://aktuell.de.selfhtml.org/artikel/javascript/bbcode/, November 2007.
[3] Tim Berners-Lee. What the Semantic Web can represent. http://www.w3.org/DesignIssues/RDFnot.html, September 1998.
[4] David Flanagan. JavaScript: The Definitive Guide, Fifth Edition. O’Reilly Media, Inc., August 2006.
[5] Francesc Campoy Flores, Vincent Quint, and Irène Vatton. Templates, microformats and structured editing. In DocEng ’06: Proceedings of the 2006 ACM symposium on Document engineering, pages 188–197, New York, NY, USA, 2006. ACM.
[6] Danny Goodman. JavaScript & DHTML Cookbook. O’Reilly Media, Inc., August 2007.
[7] Frederico Caldeira Knabben. FCKeditor. http://www.fckeditor.net.
[8] Vincent Quint and Irène Vatton. Structured Templates for Authoring Semantically Rich Documents. In Proceedings of the 2007 international workshop on Semantically aware document processing and indexing, ACM International Conference Proceeding Series; Vol. 259, pages 41–48. ACM, May 2007.
[9] W3C. Amaya. http://www.w3.org/Amaya/.
[10] W3Schools. CSS2 Reference. http://www.w3schools.com/css/css_reference.asp.
[11] W3Schools. XML DOM Tutorial. http://www.w3schools.com/dom/default.asp.
[12] W3Schools. XPath Tutorial. http://www.w3schools.com/xpath/default.asp.
[13] Jonathan Wafellman. DocReuse: template based document editor. XTigerTrans, July 2008.
[14] Émilien Kia, Vincent Quint, and Irène Vatton. XTiger Language Specification, July 2008. Version 1.0.