WebStyles and Ontologies: Information Modeling and Software Engineering for the WWW

Max Mühlhäuser, Ralf Hauber, Theodorich Kopetzky
Telecooperation Group, Johannes Kepler University Linz, A-4040 Linz, Austria
e-mail: ( max | ralf | theo ) @ tk.uni-linz.ac.at


Abstract: The hypertext paradigm is simple yet powerful and has the potential to eventually catalyse an old dream of the software engineering community: to overcome the boundaries between code and data, between dynamic and static information, and to have a single common modeling concept. At present, however, the WWW does not offer sufficiently powerful concepts for making this dream come true. We present two closely related approaches which we view as missing links between hypertext today and the universal concept sought. Considering the Web as a collection of loosely connected sets of rather self-contained webs, the two approaches contribute to the power of webs rather than to the power of the atomic building blocks (nodes, links, anchors). To this end, the two-tiered GUTS approach for web typing is introduced.




1   Typing for Webs: Requirements and Benefits

We use the term "(a) web" to denote a meaningful, coherent set of nodes and links; webs can be represented as graphs. They may contain other webs hierarchically as "meta-nodes" and comprise links to the "outside world". In the context of the Workshop, we want to advocate typing concepts for webs which cover both structural and logical aspects. We believe that the re-use of Web information can be substantially improved with such concepts, where re-use may refer to finding information by query or navigation, or to incorporating Web information into augmented hypermedia; by the latter we mean systems which are built on top of hypermedia-based structured information (such as hypermedia-based learning systems, software engineering environments, decision support systems, and many more).

Typing for the atomic constituents of hypermedia (nodes, links, anchors) has been around for quite some time. Even in HTML, there is a primitive form of such typing in the sense that types of nodes are implicitly defined through the media and format (HTML, JPEG, GIF, Quicktime™ etc.), and types of links through the target type (local/remote document, "mailto:", etc.). More advanced hypermedia systems support user-defined types of nodes and links. Usually, this feature is used for modeling the application domain.

In addition to user-defined node and link types – which are present in many hypermedia systems, just not yet "really" in the standard Web – the following typing concepts are useful and can be considered as requirements for a re-use-oriented hypermedia system:

  1. Multi-level typing for atoms. In reality, there is often not a single application domain, but several: in hypermedia-based learning, for instance, there is the pedagogical domain, the subject matter domain, the media and formats of actual contents, and others. Adequate hypermedia systems thus ought to support several levels of type systems, and the mapping of one onto the other (e.g., "concept" – a node type of an instructional strategy – might be mapped onto "Newton’s law" – a node type from the subject matter domain – which in turn could be mapped onto "video" – a node type from the media domain).
  2. Consistent types with semantics. For a given application domain, it should be possible to define a set of node/link types together with their (generic) semantics. E.g., for instructional theory, it should be possible to define "concept", "fact", "procedure" etc. as nodes, to specify relations among and between these nodes as links, to specify semantics of how these nodes and links are used in instructional design, and to use the entire type system in a variety of applications.
  3. Web types emphasizing construction, restriction, element types, and hierarchy. A web type should obviously describe a family of coherent sets of nodes and links. We argue in favor of a constructive approach since it can be used for guiding a Web author through the process of creating and maintaining a web: a web type should thus describe how its elements are to be put together in order to create a web as an instance of the web type defined. Virtually all constructive approaches have shown that the "laws of construction" alone are not sufficient for specifying a type: restrictions in terms of boundaries and constraints are usually needed in addition. Last but not least, web types should refer to node and link types as described above. Moreover, successful models have typically supported hierarchies of abstractions; thus, it ought to be possible for a web to contain other webs as elements, not only nodes and links.
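
The three requirements above can be illustrated with a small data-structure sketch. The Python below is only one possible reading: the class names (NodeType, WebType), the field names, and the maps_to chain are assumptions introduced for illustration and do not describe an existing system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class NodeType:
    """A node type in one domain; maps_to points at a type one level further down."""
    name: str
    domain: str
    maps_to: Optional["NodeType"] = None

    def chain(self) -> List[str]:
        """Follow the mapping through all type levels (requirement 1)."""
        out, current = [], self
        while current is not None:
            out.append(f"{current.domain}:{current.name}")
            current = current.maps_to
        return out


@dataclass
class WebType:
    """A web type: labelled element types plus bounds (requirement 3)."""
    name: str
    # label -> (element type, lower bound, upper bound)
    elements: dict = field(default_factory=dict)

    def add(self, label, element, lower=1, upper=1):
        self.elements[label] = (element, lower, upper)
        return self

    def depth(self) -> int:
        """Nesting depth: a web type may contain other web types as elements."""
        inner = [e.depth() for e, _, _ in self.elements.values()
                 if isinstance(e, WebType)]
        return 1 + max(inner, default=0)


# The mapping example from requirement 1.
video = NodeType("video", "media")
newton = NodeType("Newton's law", "subject", maps_to=video)
concept = NodeType("concept", "instruction", maps_to=newton)

# A hypothetical hierarchy: a course web type contains lesson web types.
lesson = WebType("lesson").add("topic", concept).add("examples", video, lower=1, upper=5)
course = WebType("course").add("lessons", lesson, lower=1, upper=12)
```

Here concept.chain() walks from the instructional level down to the media level, and course.depth() reports that the course web type nests one further level of web types.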

The main benefits which might be drawn from the above-mentioned desired features of web types are listed below; since they are not automatically provided if the above requirements are fulfilled, the following list can be regarded as further requirements.

  1. Software re-use through web types. Maybe the most important advantage of web types is their potential for re-use. In particular, web types may hold "code" (rules, procedures) which can be inherited by the instances, i.e., concrete webs. Taking the example of hypermedia-based learning, most of the on-line hypermedia-based learning material available today is in essence exploratory hypermedia: nodes and links with little or no navigation support, adherence to an instructional strategy or user model, etc. There have been more advanced hypermedia-based learning systems, but the effort for creating the "intelligence" above the plain exploratory hypermedia has been enormous and could not be re-used. With the advent of web types, such efforts could – at least in part – be conserved by programming them in the context of (re-usable) web types.
  2. Reduction of cognitive load. The re-use of web types and of the "intelligence", i.e., code put on top of exploratory hypermedia, has an important side-effect: entire departments or organizations, even the "communities" which belong to an application domain, may decide to share common web styles. By doing so, the users will encounter "similar" webs and get accustomed to the commonalities, e.g., in "look & feel". Many of the drawbacks of hypertext in comparison to paper-based documents come from the lack of "culture" and "common look & feel". Web types can be an important step forward here.
  3. Improved information acquisition. The more type information is provided during the construction of a hypertext, the more such information can be used in the context of information retrieval. E.g., an author of instructional material can only narrow down a search for material based on a rule like "instructional theory XYZ is required to have at least one example linked to any concept" if such rules become explicit in the corresponding web type.
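
To illustrate the last point, the rule "at least one example linked to any concept" can be written down as an explicit, checkable predicate. The following Python is a minimal sketch; the function name, the dictionary-based web representation, and the sample node names are assumptions made for this illustration.

```python
def concepts_without_example(nodes, links):
    """Return the names of 'concept' nodes that have no link to an 'example' node.

    nodes: dict mapping node name -> node type name
    links: iterable of (source name, target name) pairs
    """
    # Sources that point at some node typed as "example".
    has_example = {src for src, dst in links if nodes.get(dst) == "example"}
    return sorted(name for name, node_type in nodes.items()
                  if node_type == "concept" and name not in has_example)


# Hypothetical web with two concepts and one example.
nodes = {
    "force": "concept",
    "inertia": "concept",
    "apple falling": "example",
}
links = [("force", "apple falling")]

violations = concepts_without_example(nodes, links)
print(violations)  # ['inertia']
```

Once the rule is explicit in this way, the same predicate can serve both as a construction constraint and as a filter during retrieval.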

2   Related Work

Looking for type concepts for webs, one finds that the Web community has concentrated on augmenting the power of nodes and links rather than the power of webs. Many observations back this claim, some of which are listed below:

While the Web (considered as a particular kind of hypermedia system) is by far the most widespread such system, even a de facto standard, it is far from the most advanced one. As a consequence, several of the most interesting contributions to the field have been made for hypermedia systems other than the Web. A short excerpt of such contributions – independent of whether they are Web-compliant (cf. [MM97]) or not – is given below. These contributions relate to fields as diverse as hypermedia authoring/design support, generics and dynamics in hypermedia, database schema approaches, and structural queries.

In summary, there have been considerable achievements in the attempt to provide design support for hypermedia; the most promising ones are based on type concepts of what we call webs in this paper. Most such developments relate to hypermedia systems other than the WWW. Even worse, these considerable achievements contrast with the rather moderate state of commercially available authoring tools for the WWW (a minor exception being the "site management" support given by systems like NetObjects Fusion™). The most general and most adequate representation of webs has been found to be a "graph" of nodes and links.

3   The GUTS Approach

The GUTS approach (generic unified typing system) described here builds on multi-year research at the hypermedia learning group of the first author. It is based on two principal approaches which realize the above-mentioned requirements:

Using learning systems again as an example, their lifecycle may be supported, e.g., via ontologies for instructional analysis, for instructional design, for domain analysis, and so on. As to alternative approaches, there are for instance ontologies that express rather traditional instructional concepts and rather advanced ones.

At this point, of course, the reader will not easily grasp how the introduction of an ontology might lead to support for a certain lifecycle phase or instructional concept, nor how WebStyle typing actually works. The authors therefore decided to introduce further details of their concept by way of example, rather than by describing abstract "architectures" or "models".

The key to understanding GUTS is its way of representing knowledge. In the teaching context, knowledge means content of courses together with information about the entities involved in the teaching-learning situation, for example content, courses and learners. The latter kind of knowledge is called meta-information.

Principal mechanisms for knowledge representation and inference as used in GUTS were thoroughly studied in the fields of semantic networks and graph grammars (see for example [Sowa91] and [Rozen97], respectively). As advocated earlier, the basic underlying data structure is the graph. Our extended notion of a graph, called WebStyles, comprises a grammar for expressing static (syntactic, structural) and dynamic (semantic, navigational) aspects. The following table may motivate these categories.

3.1   WebStyles

WebStyles are based on work on "generic and dynamic aspects of hypermedia" [Richartz96]. They consist of three parts: generic webs, procedures, and rules. The generic webs are in essence graph grammars (cf. [Rozen97]) and consist of nodes and links. A first overview of the symbols used with WebStyles is shown in figure 1.

Figure 1: WebStyle symbols

Transformations: The most notable transformations in WebStyles concern the instantiation of "sequence nodes" (transforming into "chains" of nodes-and-links) and "fan links" (transforming into "bunches" of links originating from the same "source" node); of course, there are more transformations like instantiating a node to a real node. The following figures help to grasp how nodes and links can be transformed.

Panels: a) transformations of a sequence node; b) transformation of a fan link; c) a simple web; d) web c) after n transformations

Figure 2: WebStyle transformations

In figure a), two transformations have been applied to a sequence node. Figure b) shows a possible transformation of a fan link. Figure c) presents a web consisting of two types of nodes and two types of links. By applying the transformations introduced in figures a) and b) to this web and mapping some generic nodes to instantiated nodes, the web shown in d) can be constructed.
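
The two transformations can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the naming scheme for instantiated nodes, and the representation of links as (source, target) pairs are assumptions, not the WebStyle implementation.

```python
def expand_sequence(name, n):
    """Instantiate a sequence node as a chain of n nodes joined by n - 1 links."""
    nodes = [f"{name}{i}" for i in range(1, n + 1)]
    links = list(zip(nodes, nodes[1:]))  # each node links to its successor
    return nodes, links


def expand_fan(source, target_name, n):
    """Instantiate a fan link as n links that all originate from the same source."""
    targets = [f"{target_name}{i}" for i in range(1, n + 1)]
    return targets, [(source, target) for target in targets]


# A sequence node "step" instantiated three times, a fan link instantiated twice.
chain_nodes, chain_links = expand_sequence("step", 3)
fan_targets, fan_links = expand_fan("menu", "item", 2)

print(chain_links)  # [('step1', 'step2'), ('step2', 'step3')]
print(fan_links)    # [('menu', 'item1'), ('menu', 'item2')]
```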

To find out which nodes and links are affected by a transformation, the so-called track-algorithm is used. This algorithm defines which nodes and links belong to a track and marks them. In a second step all the marked objects are copied and connected to the original web.
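
The two steps (mark, then copy and connect) might be sketched as follows. This is not the track-algorithm itself: the text does not say which elements belong to a track, so membership is passed in as a caller-supplied predicate, and the way copies are reconnected to the original web is likewise an assumption made for illustration.

```python
def copy_track(nodes, links, in_track, suffix="'"):
    """Two-step sketch: mark the elements of a track, then copy and reconnect them.

    nodes: set of node names; links: set of (source, target) pairs.
    in_track: placeholder predicate on node names (the real membership
    criterion is not given in the text).
    """
    # Step 1: mark nodes, and links whose two ends are both marked.
    marked_nodes = {n for n in nodes if in_track(n)}
    marked_links = {(s, t) for s, t in links
                    if s in marked_nodes and t in marked_nodes}

    # Step 2a: copy the marked elements under new names.
    rename = {n: n + suffix for n in marked_nodes}
    new_nodes = set(rename.values())
    new_links = {(rename[s], rename[t]) for s, t in marked_links}

    # Step 2b (assumed): connect copies wherever an unmarked node touched a marked one.
    for s, t in links:
        if s in marked_nodes and t not in marked_nodes:
            new_links.add((rename[s], t))
        elif t in marked_nodes and s not in marked_nodes:
            new_links.add((s, rename[t]))

    return nodes | new_nodes, links | new_links


# Hypothetical web: the track consists of the nodes "a" and "b".
nodes = {"intro", "a", "b", "end"}
links = {("intro", "a"), ("a", "b"), ("b", "end")}
all_nodes, all_links = copy_track(nodes, links, in_track=lambda n: n in {"a", "b"})
```

In this example the copied track a' to b' hangs between "intro" and "end" in parallel to the original one, so the web grows by two nodes and three links.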

Attributes, procedures, rules: Each WebStyle object has general attributes, like a name, and more specific attributes, like lower bound and upper bound. The bounds, for example, are used by the transformations and define how many nodes or links can be instantiated. Besides default procedures (like isTraversible, which tells whether an object may be traversed), user-defined procedures and rules may be attached to nodes and links. These procedures and rules may further influence the construction of a web (e.g., by constraining it) and may influence the navigation in such a web, too.
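
A minimal sketch of such an object is given below. Apart from isTraversible, which is named in the text, all names (WebStyleObject, add_rule, may_instantiate) are hypothetical, and the default return value of isTraversible as well as the form of a rule (a predicate over the instance count) are assumptions.

```python
class WebStyleObject:
    """Sketch of a generic node or link with bounds, a default procedure, and rules."""

    def __init__(self, name, lower=1, upper=1):
        if not 0 <= lower <= upper:
            raise ValueError("bounds must satisfy 0 <= lower <= upper")
        self.name = name
        self.lower = lower   # minimum number of instances
        self.upper = upper   # maximum number of instances
        self.rules = []      # user-defined predicates over the instance count

    def isTraversible(self):
        """Default procedure; assumed here to allow traversal unless overridden."""
        return True

    def add_rule(self, rule):
        self.rules.append(rule)
        return self

    def may_instantiate(self, count):
        """True if `count` instances respect the bounds and every attached rule."""
        within_bounds = self.lower <= count <= self.upper
        return within_bounds and all(rule(count) for rule in self.rules)


example = WebStyleObject("example", lower=1, upper=4)
example.add_rule(lambda count: count % 2 == 0)  # hypothetical rule: even counts only
```

With these settings, two instances are accepted, three violate the attached rule, and six exceed the upper bound.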

Further elements of generic nets: Two more types of objects are supported: alternatives and meta-nodes. Alternatives are used by the author of a web to offer a choice from different possibilities during construction. Meta-nodes help to model complex sub-webs and can be used e.g. to build tree-like structures. For more detailed descriptions cf. [Richartz96].

3.2   Ontologies

Knowledge representation involves classifying the ‘things’ to be represented, e.g. «Mars» is a «planet», «next» is an «order relation», «is a» is a «genus-species relation». Ideally the classes (concepts, types, the terms inside French quotes «…») are explicitly written down and put in relation with each other. This is called a theory, a conceptualization, or, as is fashionable, an ontology. (Ontology as a part of philosophy is the study of being, or, the basic categories of existence. With the indefinite article, the term "an ontology" is often used as a synonym for a taxonomy that classifies the categories or concept types in a knowledge base. [Sowa91], p. 3)

There are ontologies for ‘everything’. For instance, in instructional design, if one wants to use Gagné’s events of instruction [GBW92], one could define an ontology containing «gain attention», «indicate goal», «recall prior knowledge», «present material», «provide learning guidance», etc. Or, to be able to talk in terms of Reigeluth’s elaboration theory [Reigel87], one needs «fact», «concept», «principle», and «procedure».

In these examples we did not consider any relations or formalization of semantics. If one tries to work out these aspects, it soon becomes evident that something crucial is missing: How could such an ontology be defined? In which language? The answer is that a particular ontology is built in. The GUTS representation ontology is rich enough to capture the computational content of new, user-defined ontologies. It comprises objects («object», «theory», «abstraction», «type», «rule») and relations («relation», «genus-species», «instance», «composition», «equivalence», «order», «derivation», «functional», «context»).
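
How a user-defined ontology might be expressed in terms of the built-in relations can be sketched as follows. The relation names are taken from the list above; the triple representation, the class and method names, and the sample statements about Gagné's events are assumptions made for illustration.

```python
# Relation kinds of the representation ontology, as listed in the text.
RELATION_KINDS = {
    "relation", "genus-species", "instance", "composition", "equivalence",
    "order", "derivation", "functional", "context",
}


class Ontology:
    """A user-defined ontology stored as (subject, relation kind, object) triples."""

    def __init__(self, name):
        self.name = name
        self.triples = set()

    def relate(self, subject, kind, obj):
        # Only relation kinds of the built-in ontology may be used.
        if kind not in RELATION_KINDS:
            raise ValueError(f"unknown relation kind: {kind}")
        self.triples.add((subject, kind, obj))
        return self

    def species_of(self, genus):
        """All concepts declared as species of `genus`, directly or transitively."""
        found, frontier = set(), {genus}
        while frontier:
            current = frontier.pop()
            for subject, kind, obj in self.triples:
                if kind == "genus-species" and obj == current and subject not in found:
                    found.add(subject)
                    frontier.add(subject)
        return found


# Hypothetical fragment of an ontology for Gagné's events of instruction.
gagne = Ontology("events of instruction")
gagne.relate("gain attention", "genus-species", "event")
gagne.relate("indicate goal", "genus-species", "event")
gagne.relate("gain attention", "order", "indicate goal")
```

Restricting relate to the built-in kinds mirrors the idea that new ontologies are defined in terms of the representation ontology rather than in an arbitrary vocabulary.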

WebStyles with its representation ontology (the "basic, built-in" ontology that is used to define other ontologies) is a specialized representation language, as is KIF [GF92] with the so-called "frame-ontology". Sowa mentions that "the structure of a knowledge representation language depends critically on its ultimate goal" ([Sowa91] p. 157), and since WebStyles and KIF differ in purpose, their flavor, appearance and computational properties are different, although WebStyles could easily be mapped to KIF (and back).

4   WebStyles Implementation; Workshop Contribution

The predecessor of WebStyles (called PreScripts) was developed in C/C++, and a prototype of WebStyles in JavaScript. Java was chosen for the current implementation. The latter features graphical editing of WebStyle webs (this includes manipulation of the graph structure and the objects) and implements the complete track-algorithm which defines the semantics of web types.

In the workshop, we would like to demonstrate the WebStyle editor and discuss the value of ontologies and WebStyles for information and software modeling, for greatly increased WWW re-usability (for knowledge in expert domains, in particular), for sophisticated navigation support, for a hypertext "culture" and "common look & feel", and for the seamless modeling and implementation of (static) information and (dynamic) software.

5   References

[FS89] R. Furuta, P. D. Stotts. Programmable browsing semantics in Trellis. Proc. ACM Hypertext '89, pp. 27-42
[GBW92] R.M. Gagné, L.J. Briggs, W.W. Wager. Principles of Instructional Design. 4th Edition, Hbj College & School Div, 1992
[GF92] M. R. Genesereth, R. E. Fikes et al. Knowledge Interchange Format, Version 3.0 Reference Manual, Computer Science Department, Stanford University, Technical Report, 1992
[GHM94] K. Grønbæk, J.A. Hem, O.L. Madsen, L. Sloth. Cooperative Hypermedia Systems: A Dexter-based architecture. CACM 37, 2 (Feb. 1994), pp. 64-75. cf. http://www.daimi.aau.dk/~kgronbak/DEVISE/index.html
[MM97] A. Mendelzon, T. Milo. Formal Models of the Web, Proc. ACM Database Systems, Tucson, Arizona, June 1997
[Reigel87] C.M. Reigeluth (Ed.). Instructional Theories in Action. Lawrence Erlbaum Assoc, September 1987
[Richartz96] Martin Richartz. Generik und Dynamik in Hypertexten. Shaker Verlag, Aachen 1996 (in German)
[Rozen97] G. Rozenberg. Handbook of Graph Grammars and Computing by Graph Transformation: Foundations. World Scientific, 1997
[STH97] M. Salampasis, J. Tait, C. Hardy. HyperTree: A Structural Approach to Web Authoring. Software – Practice and Experience, Vol. 27(12), 1411-1426, December 1997
[SF89] P. Stotts, R. Furuta. Petri-net-based hypertext: document structure with browsing semantics. ACM ToIS 7(1), pp. 3-29
[Sowa91] J.F. Sowa. Principles of Semantic Networks. San Mateo, 1991
[SRB96] D. Schwabe, G. Rossi, S. D. J. Barbosa. Systematic hypermedia application design with OOHDM. Proc. 7th ACM Hypertext '96, pp. 116-128
[WebSQL] University of Toronto, http://www.cs.toronto.edu/~websql/