An Architecture for Personal Semantic Web Information Retrieval System
– Integrating Web services and Web contents
Haibo Yu
Graduate School of Information Science and Electrical Engineering
Kyushu University
6-1 Kasuga-Koen, Kasuga, Fukuoka 816-8580, Japan
yu@al.is.kyushu-u.ac.jp

Tsunenori Mine and Makoto Amamiya
Faculty of Information Science and Electrical Engineering
Kyushu University
6-1 Kasuga-Koen, Kasuga, Fukuoka 816-8580, Japan
{mine, amamiya}@al.is.kyushu-u.ac.jp

Abstract

The semantic Web and Web services technologies have provided both new possibilities and challenges to automatic information processing. There is a lot of research on applying these new technologies to current personal Web information retrieval systems, but no research addresses the semantic issues from the point of view of the whole life cycle and the architecture. Web services provide a new way of accessing Web resources, but until now they have been managed separately from conventional Web contents resources. In this paper, we point out new system requirements and propose a conceptual architecture for a personal semantic Web information retrieval system. It incorporates semantic Web, Web services and multi-agent technologies to enable not only precise location of Web resources but also the automatic or semi-automatic integration of hybrid Web contents and Web service resources.

1. Introduction

1.1 Motivation

With ever-increasing information overload, Web information retrieval systems are facing new challenges in helping people not only to locate relevant information precisely but also to access and aggregate a variety of information from different resources automatically.

The semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation [5]. It provides new possibilities for automatic Web information processing.

Currently, there is a lot of research, such as [25] [13] [27], trying to apply semantic Web technologies to Web information retrieval systems, but it all addresses only problems concerning certain phases or certain aspects of the total complex issues involved. There isn’t any research addressing the semantic issues from the point of view of the whole life cycle of information retrieval and the architecture.

However, for the reasons we show below, we argue that it is important to clarify the requirements of a Web information retrieval system architecture in order to apply semantic Web technology to it.

First, we need to ensure the semantics are not lost sight of during the whole life cycle of information retrieval, including publishing, querying, accessing, processing, storing and reusing. For example, current semantic Web portals such as SEAL [27] manage their semantic data internally for navigation and semantic searching, but when they publish data to the user, they transform their semantic data into HTML format in order to present human-understandable information which can be accessed through a browser. At this moment, the semantics are lost, and the user cannot use them for further semantic processing. So the interfaces involved in the whole life cycle of information retrieval tasks need to be reconsidered.

Second, efficient searching for high quality results is based on pertinent matching between well-defined resources and user queries, where the matching reflects user preferences. Just as in current Web usage, when users use search engines to search for specific information, the quality of the search results will be improved significantly if they are familiar with the mechanism of the indexing and make use of advanced functionalities to select and combine keywords well. In the same way, in a semantic Web information retrieval system, we also need to help users to submit pertinent queries and efficiently incorporate their preferences
based on mechanisms through which a provider categorizes and publishes its semantic data and Web services. So the description of Web site capability and the way of submitting queries incorporating user preferences should be consistently considered from an architectural point of view.

Web service mechanisms provide a good solution for application interoperability between heterogeneous environments. Though they have mainly been used for business processes until now, we have seen that WSRP [15] has been approved as a standard of OASIS [4] to integrate remote portlets, and we can predict that Web services will soon be used by Web portals for information gathering, display and delivery [14]. Web services will provide a new way of accessing Web information and play a vital role in Web information retrieval activities. However, the conventional “Web contents” resources target human consumption, whereas the new “Web services” resources target machine consumption. Thus they have been managed separately for publishing, discovering, accessing, and processing until now. On the other hand, in the semantic Web, contents are given well-defined meaning, and they are becoming data that can be understood and processed by machines as well. As both Web contents and Web services will be consumed by machines, this introduces the possibility and necessity of managing them together in a personal Web information retrieval system.

In this paper, we propose a conceptual architecture for a personal semantic Web information retrieval system. It incorporates semantic Web, Web services and multi-agent technologies to enable not only precise location of Web resources but also the automatic or semi-automatic integration of hybrid semantic information from Web content and Web service resources.

1.2 Approach

A conceptual architecture of our semantic Web information retrieval system is constructed based on the following three main ideas.

First, “all participants contribute to the semantic description consistently.” The Web information retrieval system concerns three main kinds of participants: the “consumer” which searches for Web resources, the “provider” which holds certain resources, and the “mediator” which enables the communication between the consumer and the provider. In order to guarantee semantic interoperability during the whole life cycle of information retrieval, all participants need to consistently contribute to the semantic description. The providers need to precisely describe their capabilities and the users need to pertinently describe their requirements as well. The mediator needs to correctly interpret the semantic dimension and to ensure that semantics are not lost sight of during the processing.

Second, “integrating Web contents with Web services.” As we mentioned earlier, Web services will provide a new way of retrieving Web information. In fact, Web users do not care how the system discovers, accesses and retrieves information, or from what kind of resources; they only care about final results which can be directly and efficiently used. So, the particular characteristics and the concrete realization details of both Web services and Web contents need to be hidden from users as much as possible. Therefore an integrated or unified management of Web contents and Web services needs to be carried out at different levels including the description of capabilities and requirements, querying, discovery, selection and aggregation.

Third, “providing a gateway to all the information that the user is interested in.” Since the user needs to access and process a variety of internal and external information, a gateway to all relevant information is necessary. Although Web portals are trying to provide such gateways, they are centralized resources using fixed organizational schema targeting uniform access by large numbers of people [25]. However, “no one size can fit all,” and even a portal with a wealth of resources cannot satisfy all the requirements of a user. As a user is only interested in certain parts of the resources provided by the portal, the personalization functionality and the integration of different Web portals are strongly required.

Currently, there are several Web portals providing personalization, such as “My Yahoo [3]” and “My AOL [2],” to aggregate desired channels, such as news, weather, or sports, and view personalized contents. However their customization functions are limited as they lack semantics and are separated from the user’s local information. Relevant Web information needs to be stored, modified, searched, and even published in the same way as existing local information provided by the user, and the integration of Web information with local user information is also necessary. We argue that a personalized “Myportal [28]” can satisfy all the Web usage requirements of a user. The “Myportal” is different from current personalized portals in the sense that it is located on a user’s own desktop or local server, is owned by the user her/himself, manages all the information based on semantic Web technologies, enables integration of existing local user information and Web information, and provides full personalization and flexible customization functions for all user-relevant information.

The rest of the paper is organized as follows: Section 2 outlines our conceptual architecture, and the components and communication interfaces of a personal semantic Web information retrieval system. Section 3 describes the integration of Web services and Web contents. In section 4 we explain the process flow of an information retrieval system. Related work is discussed in section 5 and the concluding remarks are summarized in section 6.

Figure 1. A Conceptual Architecture
[Diagram: Web Site 1, Web Site 2, …, Web Site n, each with a WSCD (GID, WCD, WSD) and a provider agent PSA 1, PSA 2, …, PSA n; a CSA; and “MyPortal” containing the UIA, SE, KB, Inference Engine, Database and User Preference.]

Figure 2. Web Site Capability Description
[Diagram: WSCD (Web Site Capability Description) comprises GID (General Information Description), WCD (Web Content Description) and WSD (Web Service Description); WSD comprises SWSD (Semantic Web Service Description) and CWSD (Concrete Web Service Description).]

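The layering in Figure 2 can be pictured as a nested record. The following is a minimal Python sketch, not the authors' implementation: the class and field names are hypothetical, only a few of the GID properties listed in Section 2.1 are shown, and the URLs are placeholders. The paper itself publishes the WSCD as an RDF file rather than as Python objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GID:
    """General Information Description: top-level overview of a Web site."""
    category: str
    topic: str
    language: str
    home_page_link: str


@dataclass
class WSD:
    """Web Service Description, expressed in two layers."""
    swsd_url: str  # semantic layer (OWL-S in the paper)
    cwsd_url: str  # concrete layer (WSDL in the paper)


@dataclass
class WSCD:
    """Web Site Capability Description = GID + WCD + WSD."""
    gid: GID
    # Hypothetical: links to the RDF metadata that makes up the WCD.
    wcd_urls: List[str] = field(default_factory=list)
    wsds: List[WSD] = field(default_factory=list)

    def offers_services(self) -> bool:
        """True if the site describes at least one Web service."""
        return bool(self.wsds)


site = WSCD(
    gid=GID("Education", "Semantic Web", "en", "http://example.org/"),
    wcd_urls=["http://example.org/wcd.rdf"],
    wsds=[WSD("http://example.org/search.owl", "http://example.org/search.wsdl")],
)
print(site.offers_services())  # True
```

Keeping the GID as a small flat record mirrors its role as an initial filter: it can be inspected before any of the linked WCD or WSD documents are fetched.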
2 A Conceptual Architecture

Our conceptual architecture for a personal semantic Web information retrieval system is illustrated in figure 1.

Because the P2P architecture provides a robust system which accommodates open and dynamic Web environments, we choose a P2P network architecture to connect consumers and providers.

Each provider describes its capabilities in what we call a WSCD (Web site capability description) and is assigned a PSA (provider search agent). Each consumer describes the user’s requirements including preferences. It is assigned a consumer search agent (CSA) and also has a user interface agent (UIA) that provides an intelligent unified interface to the user. The CSA and PSA function as mediators between a consumer and a provider by communicating with each other to fulfill the searching and accessing task. The consumer is constructed as a “Myportal” providing a gateway to all relevant information.

2.1 Web site capability description (WSCD)

Resource location is based on matching between user requirements and Web site capabilities, so a capability description of Web sites is necessary. We describe the layered capabilities of a Web site as shown in figure 2.

First, we semantically describe the general capabilities of the Web site, and we call this a “general information description (GID).” We argue that some explicit general ideas about a Web site are strongly required in order to precisely locate Web resources based on user preferences. Therefore a brief general information description of the Web site is defined at the top level. The GID gives an explicit overview of the Web portal capabilities, and can be used as the initial filter for judging congruence with user preferences. The GID includes descriptions of “Category,” “Topic,” “Type,” “Language,” “Scale,” “Audience,” “HomePageLink,” “Location,” “ServiceLink,” “Security,” and “Functionalities.”

Second, we give the Web content capability description (WCD) and Web service capability description (WSD). There are links from the GID to the WCD and WSD to facilitate the further matching and use of Web contents and Web services. In order to semantically describe the capabilities and support the concrete realization of services, we express the service capability description in two layers: “semantic Web service description (SWSD)” and “concrete Web service description (CWSD).” This hierarchical capability-describing mechanism enables semantic capability description and matchmaking at different levels. Currently, there are only a few drafts of standards available for describing semantic Web services, such as OWL-S [9] and WSMO [11], and none of them has been adopted by any standards body at the present time. As OWL-S is the first well-researched Web service ontology, and currently has numerous users from industry and academia, we use OWL-S for the semantic Web service description and WSDL [12] for the concrete Web service description.

The Web content description (WCD) is the metadata of Web contents. It is composed of knowledge bases of all domains involved. The domain ontologies are described in OWL [20] and the metadata is described in RDF [16].

The WSCD is put at the root directory of the Web site as an RDF file, and the WCD and WSD can be reached through its links.

For the details of our Web site capability description mechanism, one can refer to document [29].

2.2 “Myportal”

“Myportal” is a “one stop” that links the user to all the information s/he needs. It resides on the user’s own desktop or local server and is designed to satisfy the user’s personal information requirements and to be mastered freely by the user her/himself. The information can be shared by others with proper authority. The structure of “Myportal” is shown in figure 3.

Figure 3. Structure of “Myportal”
[Diagram: “Myportal” with its CSA and UIA. Information Accessing (function as a provider) serves humans (user himself, community member, public user) and machines (Web services, crawler and other applications). Information Collection & Aggregation (function as a consumer) is automatic (Web services, crawler) or semi-manual (browsing and searching). Core: Query Engine, Inference Engine, Knowledge Management, Web Services Management, and a Knowledge Warehouse (KW) holding domain knowledge, user preferences, WSCD and Web services.]

“Myportal” is composed of three main functional components: core component, consumer component and provider component.

The core component provides basic support for semantic technologies and information management. It consists of “Knowledge Warehouse (KW),” “Knowledge Management,” “Query Engine (QE)” and “Inference Engine (IE).” As a consumer, it will bring together a variety of necessary information from different resources automatically or semi-automatically for the user. It is assigned a CSA to fulfill the information retrieval tasks through communication with provider agents. As a provider, the contents and services of “Myportal” can be consumed by humans as well as machines. The human can be the user or other permitted persons, and the machine can be local or remote. The interfaces for browsing, searching and facilitating Web contents and services need to be provided. We described “Myportal” in a little more detail in document [28].

2.3 Mediator

In our architecture, we use a multi-agent system called KODAMA [30], which has been developed at Kyushu University, as our mediators. KODAMA is a high quality, large-scale multi-agent system which can operate in open environments. It is a global distributed computing architecture based on agent-oriented programming and has been demonstrated to be suitable for network-aware applications.

The agents in our system consist of the UIA, CSAs and PSAs.

The UIA receives requirements from the user, factors in missing or inherent information based on user preferences, breaks down and transforms the requirements into formal queries and sends them to the CSA. The CSA receives formal queries from the UIA, communicates with relevant agents, selects and invokes Web services, integrates the information and sends the results back to the UIA.

The PSA receives queries from a CSA and returns matching results to the CSA based on different preferences and requirements.

2.4 Communication interfaces

In order to fulfill the information retrieval task, the interfaces between providers and consumers, including the query language and the protocol for communicating those queries, need to be defined. As semantic Web information is based on RDF to represent data, a standard interface for querying and accessing RDF data is ideal for interoperability between heterogeneous environments. Currently, many query languages for RDF data have been created, but they lack both a common syntax and a common semantics. The W3C RDF Data Access Working Group (DAWG) has published working drafts of the RDF Query Language SPARQL [23] and the SPARQL protocol [8], which are expected to become standards in this field. The RDF Query Language SPARQL expresses queries over RDF graphs, and the SPARQL protocol for RDF defines a protocol for communicating those queries to an RDF data service. Applications can access and combine semantic Web information across the Web by combining the SPARQL query language and the protocol for RDF. Although our architecture is designed for any reasonable communication interfaces, we are currently planning to use the SPARQL RDF query language and the SPARQL protocol as our communication interfaces between providers and consumers.

2.5 The description of user requirements

The user requirements are reflected by his/her preferences, profile and constraints along with a query. We provide a user interface which enables the input of all this information. Input templates, default settings, and recommendation lists are also provided. The missing or inherent information will be inferred based on the user profile and preferences, and the requirements will be broken down and transformed into formal queries. The formal query is composed of three types of element fields: user preferences (UPs), content query (CQ) and Web service query (SQ). The responses will combine Web content and Web service information together. Even if the user does not explicitly describe their requirements on Web services for each query, searching for Web services potentially relevant to him/her will automatically be carried out unless s/he explicitly
refuses such searching.

2.6 Ontology considerations

The description of Web site capabilities and the management of data in “Myportal” must be based on formally defined vocabularies in order to make them machine-understandable and processable. Ontology is used to formally define terms and the relationships between them. Currently, the style of the ontology for the future semantic Web is still under discussion. A huge common ontology, or numerous small ontologies which are mapped to each other by mediators, are possible styles. Our analysis showed that a wide and shallow ontology for categorization is necessary, and that narrow and deep ontologies are also needed for the user’s specific interests such as research topic, business or hobby. Though it is not yet a reality, we assume at the current stage that the user and the providers use the same ontologies for the domains they are involved in.

The Web site capability ontology should include the following component ontologies.

1) The general information description ontology: In this ontology component, the terms used for the Web site general information description such as “type,” “location,” and the relationships between them and restrictions on them are formally defined.

2) The domain specific ontology: A domain specific ontology should be constructed in order to realize the interoperability between all the applications and users of that domain. The system can define its own ontology or reuse existing ones for the domains it is involved in.

3) The Web service ontology: This ontology component defines all the terms, relationships, and restrictions concerning Web services. Here we use the OWL-S [9] Web service ontology.

3 Integration of Web services and Web contents

Conventional Web contents target human consumption and are published with standard languages such as HTML, which can be accessed through a client browser. The standard HTTP protocol is used for the communication between a Web server and a Web client. Web services, on the other hand, target machine consumption, and are applications which can be realized on heterogeneous systems, published with a standard language such as WSDL [12] and accessed by applications through a standard protocol such as SOAP [17]. Due to their different usages by different consumers, Web contents and Web services have been managed separately until now.

However, in the semantic Web, information is marked up with metadata and can be manipulated by autonomous agents on behalf of their users. So Web contents are in the process of becoming data with well-defined meaning that can also be consumed by machines. Since they target the same consumer, Web services and Web contents have the necessary common ground to be managed together in a personal Web information retrieval system.

On the other hand, users also have requirements for the aggregation of different Web services and the integration of both Web services and Web contents in a personal Web information retrieval system. For example, there are many Web portals or search engines supporting searching functions (services), but users can only make use of their searching functions one at a time with a browser interface, and none of those search results can currently be aggregated together. In particular, when we use this kind of semantic search function, the semantic search results are transformed into HTML format for human consumption, with the semantic metadata described in RDF detached. Therefore it is necessary to deliver the semantic data through Web services and aggregate the semantic data from different Web services.

Our Web information retrieval system realizes unified management and integration of Web services and Web contents at different levels, including description, discovery, selection, and the aggregation of invocation results, as we describe in the following.

3.1 Descriptions of capabilities and requirements

On the provider side, as we described in section 2, we manage GID, WCD and WSD together as WSCD.

The WSD can be reached through the GID and is described in two layers: SWSD and CWSD. With unified management, the Web services and Web contents can share the same general information such as a category and the domain ontology. The hierarchical capability-describing mechanism enables semantic capability description and matchmaking at different levels. We use WSDL [12] and OWL-S for CWSD and SWSD respectively.

The WCD is the metadata of Web contents. It is composed of knowledge bases of all domains involved. The domain ontologies are described in OWL [20] and the metadata is described in RDF [16].

On the consumer side, we provide a template-style input interface, enabling users to input or select their preferences as well as query items from recommendation lists. The formal query is composed of three types of element fields: user preferences (UP), contents query (CQ) and Web service query (SQ). The responses will combine Web contents and Web services information together.

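The three-part formal query described above can be sketched as a small record type. This is an illustrative Python sketch only: the class name `FormalQuery`, its field layout, and the example values are assumptions, and the paper does not specify a concrete encoding. The content query is shown as a SPARQL string because Section 2.4 names SPARQL as the planned query language; the `ex:` vocabulary in it is made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FormalQuery:
    """Hypothetical container for the three element fields of a formal query."""
    user_preferences: Dict[str, str] = field(default_factory=dict)  # UP
    content_query: Optional[str] = None  # CQ, here a SPARQL query string
    service_query: Optional[str] = None  # SQ, here a requested service category

    def targets(self) -> List[str]:
        """Return which kinds of resources this query explicitly asks for."""
        kinds = []
        if self.content_query is not None:
            kinds.append("content")
        if self.service_query is not None:
            kinds.append("service")
        return kinds


query = FormalQuery(
    user_preferences={"Language": "en", "Category": "Research"},
    content_query=(
        "PREFIX ex: <http://example.org/vocab#>\n"
        "SELECT ?paper WHERE { ?paper ex:topic ex:SemanticWeb . }"
    ),
    service_query="BookSearch",
)
print(query.targets())  # ['content', 'service']
```

Carrying the preferences alongside both sub-queries lets a provider agent apply the same preference filter to Web contents and Web services, which is the point of managing them together.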
3.2 Discovery

There are three models on which Web service discovery is based: matchmaking, broker and P2P [7], and there is also centralized and decentralized searching for Web contents. Our Web information retrieval system is based on a P2P architecture, and the matching is realized by the PSAs on the provider side.

OWL-S is an ontology for Web services, but before we use the ontology of a specific Web service, we need to position a service within the broad array of services that exists in the world. OWL-S 1.1 [9] provides an example of profile-based class hierarchies [10] for categorizing Web services. We noticed that almost all Web service providers provide Web services for machine consumption as well as consistent Web information or browser-based services for human consumption at the same time. Thus the information for human and machine consumption is generally consistent and in the same category. Therefore we think it is reasonable to use the category information inside the GID to first find potential Web sites which possibly contain relevant Web services, and then do the detailed matching of existing services based on their OWL-S descriptions.

Information discovery is based on matching between user requirements and provider capabilities. We do matching at three levels. First, we match the Web site general description (GID) against user preferences to see whether they match at the overview level or not. Second, we do matching of Web contents, and finally we do the matching of Web services. A matching score is given by the matching at each level, and the scores are used for the final judgment of the relevance of Web contents and Web services.

There is research on semantic Web services such as [18] and [19]. We make use of their research and development results for our semantic Web service matchmaking and processes.

3.3 Selection

As we described immediately above, matching of user requirements with provider capabilities is done at three levels and a matching score is given by the matching at each level. PSAs send back their matching scores to the CSA, and the CSA judges and selects the most relevant Web services and Web contents based on a total consideration of those matching scores.

3.4 Aggregation

After selecting the most relevant Web services, the CSA will invoke those services. If the input information is not sufficient for triggering invocation, the CSA will request the user to provide the necessary information through the UIA. The results from different Web service invocations as well as the results of Web contents will be aggregated by the CSA into a refined final result based on user preferences and be sent to the user through the UIA. This result can be evaluated, modified and stored in the user’s “Myportal” knowledge warehouse for future reuse. The integration of different Web service invocation results and Web contents is based on their common RDF data model.

4 Process Flow

The total process flow of the Web information retrieval system is illustrated in figure 4.

Figure 4. Process Flow
[Flowchart. Consumer side: Profile & Preferences and the “Myportal” Knowledge Warehouse (KW); User: requirements description; UIA: completing missing information, transforming into formal query; SE: search inside “Myportal” KW; “Found relevant information?”; CSA: send requests to PSAs; list of relevant information (Web sites; Web sites and contents; Web sites, contents and services); Selection; Invocation; User: intervention; Invocation results; Integration; User: evaluation, modification and storing; UIA: modify preferences. Provider side: Capability Description (WSCD); PSA: matching GID with preferences (Score1); PSA: matching WCD with CQ (Score2); PSA: matching WSD with SQ (Score3); PSA: send matching result to CSA if total score > threshold; PSA: communication with CSA.]

Although we will not repeat the tasks of each information retrieval phase that have been described in the last section, we will emphasize the following aspects.

First, searching for relevant information inside the “Myportal” knowledge warehouse is carried out first, and only when we cannot find satisfactory information in “Myportal” do we continue the search with the other providers. As we tend to repeatedly and frequently use a certain amount of information from the Web but seldom or never use other information, it is essential to locally store frequently used information for the user, so that external access only happens when the request cannot be satisfied locally. Because the information that interests the user is a limited resource and external accessing time is decreased, the total retrieval time will be significantly decreased compared to a search of the vast open Web.

Second, the list of relevant information sent back from PSAs will be different depending on the user preferences
and Web site capabilities. The user has different possible Web usages, such as only locating certain kinds of Web sites, locating certain kinds of Web sites and their Web contents, only locating Web services, and locating all relevant resources including Web sites, Web contents and Web services. The provider may only have Web contents or have both Web contents and Web services. Therefore the PSA may send back different possible lists of relevant information resources as shown in figure 4.

Third, during the invocation, if the input information inside the query is not enough, the PSA will ask the user to input the missing information through the UIA. So user intervention may occur during invocation.

Fourth, the integrated information can be evaluated by the user and the evaluation results will be used for refinement of future searching. The information can also be modified and stored in the “Myportal” knowledge warehouse for future reuse. The user preferences will be automatically refined based on the searching and evaluation results.

5 Related work

In this section, we discuss some related work that is directly or indirectly of interest to our research work.

Francisco et al. [22] presented an architecture for an infrastructure to provide interoperability using trusted portals and implemented such an infrastructure based on Thematic Portals. The searching portals use semantic access points based on metadata for more precise searching of the resources associated with the potential sources of information. The proposed architecture supports specific and cross-domain searching, but only provides semantic representation for the capabilities of Web contents, not for their services, as far as we understand. Our semantic Web site capability description and pertinent description of user requirements and preferences provide interoperability for both Web contents and Web services.

RSS [26] and Atom [21] are lightweight multipurpose extensible metadata descriptions and syndication formats. They are XML-based applications and conform to the RDF specification. A brief description of Web site capability can be summarized with them and the summary can be used for online publication, retrieval and further transmission or aggregation. FOAF vocabulary [6] provides a collection of basic terms that can be used in machine-readable Web homepages for people, groups, companies and so on. The initial focus of FOAF has been on the description of people, but now it is under extension to express other kinds of things. RSS, Atom and FOAF vocabulary all focus on certain kinds of Web contents description such as news, Web blog or peo-

can not only be located but also used as a computational part of the information retrieval system. RSS, Atom and FOAF can be used for the Web contents capability description which is a part of our Web site capability description.

There are Web portals based on Semantic Web technology, such as KA2 [1] and SEAL [27], which support a semantic portal solution including ontology-based contents construction and maintenance, but they target uniform access by large numbers of people for human navigation and searching. SEAL provided an interface for a software agent but only for a crawler. None of them supports Web services for information aggregation and publishing at present, as far as we know. Our “Myportal” is a personalized gateway to all user-relevant information and it not only aggregates Web information but also shares its information through Web services.

Haystack’s per-user information environment [25] emphasizes the relationship between a particular individual and his corpus. It automatically captures and modifies its data and its retrieval process based on user behaviors in order to adapt its system to the user to realize personalization. This user information system has not been constructed from the Web portal point of view and doesn’t emphasize the support of machine interoperability between users enabling Web service functionalities and user information sharing. The semantic Web browser [24] can search and present possible Web services for the user, but it does not aggregate the invocation results of different Web services and Web contents as we propose. We refer to their ideas of personalization in information retrieval and filtering, but construct our user information system as a fully personalized Web portal, which supports Web services and can be accessed by others to form a basic unit of a P2P information retrieval system.

OWL-S [9] is an ontology of services which provides a mechanism for semantically expressing the capability of Web services. In our approach, we use OWL-S to describe Web portal service capabilities, and add another “General Information Description” layer above it to enable the unified management of Web services and Web contents. This will help in the precise location of Web portals as well as the efficient discovery and invocation of Web services.

6 Conclusion

In this paper, we addressed the main aspects of a semantic Web information retrieval system architecture trying to answer the requirements of next-generation semantic Web users. We proposed a mechanism for semantically describing the capabilities of Web sites, enabling automatic discov-
ple, they do not include Web services as we proposed. Our ery of Web sites and Web contents as well as Web services.
Web site capability description describes not only Web con- Our “Myportal” aims at constructing a fully personalized
tents but also Web services, so the resources of the portal user’s local Web portal, which is adapted to user preferences
and satisfies all the requirements of a user’s Web usage. The user Web portal can be used as a basic unit of a P2P information retrieval system.

In the future, we will realize a prototype of a multi-agent based P2P personal Web information retrieval system, and evaluate the effectiveness of our proposed architecture based on it. Currently, we assume that all the portals, users and agents in a community agree on the common ontology involved and use it to represent the semantics of Web portal capabilities and Web services, but such agreement is not easy to reach in reality. We need to give further consideration to these ontology-mapping issues in the future.

References

[1] KA2 Portal. http://ka2portal.aifb.uni-karlsruhe.de/.
[2] My AOL. http://my.aol.com.
[3] My Yahoo. http://my.yahoo.com/.
[4] OASIS: Organization for the Advancement of Structured Information Standards. http://www.oasis-open.org/home/index.php.
[5] T. Berners-Lee, J. Hendler, and O. Lassila. The Semantic Web. Scientific American, May, 2001.
[6] D. Brickley and L. Miller. FOAF Vocabulary Specification. Sept., 2004.
[7] M. Burstein and C. Bussler. A Semantic Web Services Architecture, Version 1.0, January, 2005. http://www.daml.org/services/swsa/note/swsa-note v3.html.
[8] K. G. Clark. SPARQL Protocol for RDF, January 14, 2005. http://www.w3.org/TR/rdf-sparql-protocol/.
[9] David Martin et al. OWL-S 1.1 Release, November, 2004. http://www.daml.org/services/owl-s/1.1/.
[10] David Martin et al. Profile-based Class Hierarchies – Explanatory remarks for ProfileHierarchy.owl, OWL-S 1.1, November, 2004. http://www.daml.org/services/owl-s/1.1/ProfileHierarchy.html.
[11] Dumitru Roman et al. D2v1.1. Web Service Modeling Ontology (WSMO), Feb. 10, 2005. http://www.wsmo.org/TR/d2/v1.1/20050210/.
[12] Erik Christensen et al. Web Services Description Language (WSDL) 1.1, March 15, 2001. http://www.w3.org/TR/wsdl.
[13] R. Guha, R. McCool, and E. Miller. Semantic Search. In Proceedings of WWW2003, pages 700–709, 2003.
[14] S. Han. Commercial Portal Products. In DERI Research Report, 2003-12-31.
[15] A. Kropp, C. Leue, and R. Thompson. Web Services for Remote Portlets Specification. August, 2003.
[16] F. Manola and E. Miller. RDF Primer, February 10, 2004. http://www.w3.org/TR/rdf-primer/.
[17] Martin Gudgin et al. SOAP Version 1.2 Part 1: Messaging Framework, June 24, 2003. http://www.w3.org/TR/soap12-part1/.
[18] Massimo Paolucci, Katia Sycara, Takahiro Kawamura. Delivering Semantic Web Services. In Proceedings of the Twelfth World Wide Web Conference, WWW2003, pages 111–118, May 2003.
[19] Massimo Paolucci, Katia Sycara, Takuya Nishimura, Naveen Srinivasan. Using DAML-S for P2P Discovery. In Proceedings of the First International Conference on Web Services, ICWS 2003, pages 203–207, June 2003.
[20] D. L. McGuinness and F. van Harmelen. OWL Web Ontology Language Overview, February 10, 2004. http://www.w3.org/TR/2004/REC-owl-features-20040210/.
[21] M. Nottingham. The Atom Syndication Format 0.3 (pre-draft), December, 2003. http://www.mnot.net/drafts/draft-nottingham-atom-format-02.html.
[22] F. Pinto, C. Baptista, and N. Ryan. Using Semantic Searching for Web Portal Interoperability. In International Workshop on Information Integration on the Web - Technologies and Applications, April 9-11, Rio de Janeiro - Brazil, April 2001.
[23] E. Prud’hommeaux and A. Seaborne. SPARQL Query Language for RDF, April 19, 2005. http://www.w3.org/TR/rdf-sparql-query/.
[24] D. Quan and D. R. Karger. How to Make a Semantic Web Browser. In Proceedings of WWW2004, pages 255–265, 2004.
[25] D. Quan, D. Huynh, and D. R. Karger. Haystack: A Platform for Authoring End User Semantic Web Applications. In Proceedings of ISWC2003, pages 738–753, 2003.
[26] RSS-DEV Working Group. RDF Site Summary (RSS) 1.0, 2000-12-06. http://web.resource.org/rss/1.0/.
[27] N. Stojanovic, A. Maedche, S. Staab, R. Studer, and Y. Sure. SEAL – a framework for developing SEmantic PortALs. In Proceedings of the International Conference on Knowledge Capture, pages 155–162, 2001.
[28] H. Yu, T. Mine, and M. Amamiya. Towards a Semantic MyPortal. In The 3rd International Semantic Web Conference (ISWC 2004) Poster Abstracts, pages 95–96, 2004.
[29] H. Yu, T. Mine, and M. Amamiya. Towards Automatic Discovery of Web Portals -Semantic Description of Web Portal Capabilities-. In Semantic Web Services and Web Process Composition: First International Workshop, SWSWPC 2004, LNCS 3387/2005, pages 124–136, 2005.
[30] G. Zhong, S. Amamiya, K. Takahashi, T. Mine, and M. Amamiya. The Design and Implementation of KODAMA System. IEICE Transactions on Information and Systems, E85-D(4):637–646, April, 2002.