By Abhishek Roy, B.Tech, Assam University

NOTE: This document is to be used with the slides I have uploaded for presentation purposes. It is in accordance with the contents of the slides (to explain during the seminar).

Web 3.0

Introduction: The Semantic Web is a "web of data" that enables machines to understand the semantics, or meaning, of information on the World Wide Web. It extends the network of hyperlinked, human-readable web pages by inserting machine-readable metadata about pages and how they are related to each other, enabling automated agents to access the Web more intelligently and perform tasks on behalf of users. The term was coined by Tim Berners-Lee, the inventor of the World Wide Web and director of the World Wide Web Consortium (W3C), which oversees the development of proposed Semantic Web standards. He defines the Semantic Web as "a web of data that can be processed directly and indirectly by machines."

[Aside: The term "metadata" is ambiguous, as it is used for two fundamentally different concepts. Although the expression "data about data" is often used, it does not apply to both in the same way. Structural metadata, the design and specification of data structures, cannot be about data, because at design time the application contains no data; in this case the correct description would be "data about the containers of data". Descriptive metadata, on the other hand, is about individual instances of application data, i.e. the data content. In this case a useful description (resulting in a disambiguating neologism) would be "data about data contents" or "content about content", thus metacontent. Descriptive, guide and the NISO concept of administrative metadata are all subtypes of metacontent.]

The Semantic Web as originally envisioned, a system that enables machines to understand and respond to complex human requests based on their meaning, has remained largely unrealized, and its critics have questioned its feasibility.
History (Web 3.0): The term "Web 3.0" was first coined by John Markoff of the New York Times in 2006, while it first appeared prominently in early 2006 in a blog article by Jeffrey Zeldman that was critical of Web 2.0 and associated technologies such as Ajax.

Evolution of Web 3.0

Pre-Web: To fully appreciate why open standards are so important to the Web, let's think back to the mid-1980s to early 1990s, when Internet service providers such as AOL, CompuServe, and Prodigy provided the first interfaces to the Internet for the general, non-geek population. In those days, content was strongly controlled by the providers, and so was user behavior. For example, Prodigy customers could only easily connect with other Prodigy customers. We refer to this lack of interoperability and freedom as a "walled garden." While there were short-term advantages to these walled-garden approaches, there were, and still are, also longer-term risks and costs to not adopting a more open platform.

Web 1.0 Defined: It's hard to define Web 1.0 for several reasons. First, Web 2.0 doesn't refer to a specific advance in Web technology; instead, it refers to a set of techniques for Web page design and execution. Second, some of these techniques have been around since the World Wide Web first launched, so it's impossible to separate Web 1.0 and Web 2.0 on a timeline. The definition of Web 1.0 completely depends upon the definition of Web 2.0. With that in mind, if Web 2.0 is a collection of approaches that are the most effective on the World Wide Web, then Web 1.0 includes everything else. As for what it means to be "effective," Tim O'Reilly says that it means providing users with an engaging experience so that they'll want to return to the Web page in the future. Here's a collection of strategies O'Reilly considers to be part of the Web 1.0 philosophy: Web 1.0 sites are static.
They contain information that might be useful, but there's no reason for a visitor to return to the site later. An example might be a personal Web page that gives information about the site's owner but never changes. A Web 2.0 version might be a blog or MySpace account that owners can frequently update. Web 1.0 sites aren't interactive. Visitors can only visit these sites; they can't impact or contribute to them. Most organizations have profile pages that visitors can look at but not alter, whereas a wiki allows anyone to visit and make changes. Web 1.0 applications are proprietary. Under the Web 1.0 philosophy, companies develop software applications that users can download, but users can't see how the application works or change it. A Web 2.0 application is an open-source program, which means the source code for the program is freely available. Users can see how the application works and make modifications, or even build new applications based on earlier programs. For example, Netscape Navigator was a proprietary Web browser of the Web 1.0 era, whereas Firefox follows the Web 2.0 philosophy and provides developers with all the tools they need to create new Firefox applications.

Web 1.0: Next, let's take a quick look back to the early days of the Web -- Web 1.0, if you will -- which was about generally static documents, linked together in simple ways. Tim Berners-Lee invented the Web in 1989, when he submitted the first proposal and design to colleagues at CERN, the high-energy particle physics lab on the French-Swiss border. Like most ground-breaking inventions, the Web was defined by three simple yet elegant technologies:
•Uniform Resource Locator or Identifier (URL or URI) to uniquely identify resources (e.g., documents, data) on the Web, and to know where to find those resources.
•Hypertext Markup Language (HTML) to represent content in terms of Web pages, and to express links.
•Hypertext Transfer Protocol (HTTP) to move Web data across the Internet.

It became clear to Tim early on that there was a fourth fundamental element required for the Web to succeed: openness. Just a few weeks ago, on 30 April, we marked the 15-year anniversary of Tim Berners-Lee's invention being made freely available to the world by CERN. This momentous decision helped pave the way for the Web as we know it today: a global, open, interoperable medium for communication, education, commerce, entertainment, and improved well-being.

Web 1.0 = Websites, e-mail newsletters and "Donate Now" buttons. Web 1.0 is one person or organization pushing content out to many people via websites and e-mail newsletters. The donation process is not interactive or public: you donate and then receive a "Thank You" email. It's one-way communication.

Web 2.0 = Blogs, wikis, and social networking sites. At its core, Web 2.0 is the beginning of two-way communication in the online public commons. People can post comments and converse with your organization in public for all to see. It's one person or organization publishing content to many on social networking sites, who then re-publish your content to their friends, fans, followers, connections, etc. Donating is a public experience: friends, fans, followers, and connections on social networking sites see your giving and fundraising activity through widgets, apps, and peer-to-peer fundraising tools, like fundraising pages.

Web 3.0 = Mobile websites, text campaigns and smartphone apps. Web 3.0 is all of the above, except that the Web experience is no longer limited to desktop and laptop computers used while stationary in one place. It's the Internet on the go, fueled by mobile phones and tablets. Mobile websites must be designed to be easily read on mobile devices.
Group text campaigns function like the e-mail newsletters of Web 1.0, driving traffic to your mobile website. Text-to-give technology allows quick, easy donations on your mobile phone, inspired by urgent calls to action. Smartphone apps enable content to be published and shared easily while on the go. Effectively donating via smartphone apps doesn't exist yet, but it's coming, very soon.

Web 1.0 + Web 2.0 + Web 3.0 = Integrated Web Communications. What's important to understand is that all three eras of the Web are complementary; they build on and serve one another, rather than replacing one another. They can also overlap. You use Web 2.0 tools to drive traffic to your website, to build your e-mail newsletter list, and to increase visits to your "Donate Now" buttons. You use your Web 2.0 communities to launch your Web 3.0 campaigns. And you use your Web 3.0 tools to grow your communities on social networking sites and to send supporters and donors to mobile versions of your e-mail newsletter "Subscribe" and "Donate Now" pages.

What's Web 3.0? Web 3.0 is the concept of the next evolution of the World Wide Web: linking, integrating, and analyzing data from various data sources to obtain new information streams. Web 3.0 also aims to link devices, generating new ways for machines to connect to the Web and exchange data among themselves. However, a standard definition of Web 3.0 has not yet emerged, since Web 3.0 is still mainly under development by the World Wide Web Consortium (W3C) (Steve Bratt, Fast Forward: Get Ready for Web 3.0, 2008, pp. 25-27). The most important purpose of Web 3.0, to link data, is supported by the semantic web. The semantic web is a web that can describe things in a way that computers can understand.
The system offers a common framework that helps data to be connected, shared and reused across applications, organizations and communities. The semantic web allows a person or a machine to begin with one database and then link through an open-ended set of databases which are connected not by wires, but by data referring to common things such as a person, place, idea, concept, etc. The semantic web mainly operates on the Resource Description Framework (RDF), which is a standard model for data interchange on the web. RDF can be written in XML, which can easily be exchanged between different types of computers running different operating systems (http://www.w3schools.com/rdf/rdf_intro.asp, [VIEWED 12/06/2008]). Meanwhile, RDF joins the structure of the web with Uniform Resource Identifiers (URIs) and allows the original data in each database to remain in its original form, such as XML, Excel, etc., because RDF builds an abstract layer separate from the underlying data format.

One of the important ideas behind the development of the semantic web is Artificial Intelligence. Artificial Intelligence (AI) is the field of computer science that aims to create machines capable of behavior that humans consider intelligent (Herbert Simon, An Introduction to the Science of Artificial Intelligence, 1997). Thereby, some parts of semantic web technology rely on Artificial Intelligence research, such as the model theory for RDF and knowledge representation for ontologies. However, the development of the semantic web also opens new perspectives for the Artificial Intelligence community, thanks to the benefits of URI linkage in RDF (http://www.w3.org/2001/sw/SW-FAQ#relAI, [VIEWED 14/06/2008]).

Another objective of Web 3.0 is a ubiquitous web that facilitates accessibility for anyone, anywhere, anytime, using any device. This objective seeks to break the barriers of bandwidth constraints, poor display on mobile devices, and the cost of data and devices.
Web 3.0 will then enable a web of linked devices to match the growth of the web of linked data, by using the Cascading Style Sheets (CSS) standard, which allows an HTML document to be displayed in different output styles, supports content adaptation, and uses smaller images (http://www.w3schools.com/css/css_intro.asp, [VIEWED 14/06/2008]). In summary, Web 3.0 consists of two main platforms: semantic technologies and the social computing environment. Semantic technologies represent open standards that can be applied on top of the current web. Meanwhile, the social computing environment means that Web 3.0 focuses on human-machine synergy and aims to organize the large number of current social web communities.

Semantic Web Architecture: The architecture of the semantic web is illustrated in the figure below. The first layer, URI and Unicode, follows the important features of the existing WWW. Unicode is a standard for encoding international character sets; it allows all human languages to be used (written and read) on the web in one standardized form. A Uniform Resource Identifier (URI) is a string of a standardized form that uniquely identifies a resource (e.g., a document). A subset of URI is the Uniform Resource Locator (URL), which contains an access mechanism and a (network) location of a document, such as http://www.example.org/. Another subset of URI is the URN, which identifies a resource without implying its location or means of dereferencing it; an example is urn:isbn:0-123-45678-9. The usage of URIs is important for a distributed internet system as it provides unambiguous identification of all resources. An international variant of URI is the Internationalized Resource Identifier (IRI), which allows Unicode characters in the identifier and for which a mapping to URI is defined. In the rest of this text, wherever URI is used, IRI can be used as well as a more general concept.
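The difference between a URL (which says where and how to fetch a resource) and a URN (which only names it) can be seen by pulling the two example identifiers apart with Python's standard library. This is only an illustrative sketch; the document path in the URL is an invented example:

```python
from urllib.parse import urlparse

# A URL is a URI that includes an access mechanism (the scheme, e.g. http)
# and a network location where the resource can be found.
url = urlparse("http://www.example.org/docs/index.html")
print(url.scheme)   # access mechanism: "http"
print(url.netloc)   # network location: "www.example.org"
print(url.path)     # resource within that location: "/docs/index.html"

# A URN is a URI that names a resource without saying where to find it:
# the whole name lives in the path, and there is no network location at all.
urn = urlparse("urn:isbn:0-123-45678-9")
print(urn.scheme)   # "urn"
print(urn.netloc)   # "" (empty: no location implied)
print(urn.path)     # "isbn:0-123-45678-9"
```

The empty `netloc` of the URN is exactly the point made above: the identifier commits to nothing about location or dereferencing.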
The Extensible Markup Language (XML) layer, with XML namespaces and XML Schema definitions, makes sure that there is a common syntax used in the semantic web. XML is a general-purpose markup language for documents containing structured information. An XML document contains elements that can be nested and that may have attributes and content. XML namespaces make it possible to use different markup vocabularies in one XML document. XML Schema serves for expressing the schema of a particular set of XML documents.

The core data representation format for the semantic web is the Resource Description Framework (RDF). RDF is a framework for representing information about resources in graph form. It was primarily intended for representing metadata about WWW resources, such as the title, author, and modification date of a Web page, but it can be used for storing any other data. It is based on subject-predicate-object triples that form a graph of data. All data in the semantic web use RDF as the primary representation language. The normative syntax for serializing RDF is XML, in the RDF/XML form. A formal semantics of RDF is defined as well.

RDF itself serves as a description of a graph formed by triples. Anyone can define a vocabulary of terms used for more detailed description. To allow standardized description of taxonomies and other ontological constructs, RDF Schema (RDFS) was created, together with its formal semantics, within RDF. RDFS can be used to describe taxonomies of classes and properties and to use them to create lightweight ontologies. More detailed ontologies can be created with the Web Ontology Language (OWL). OWL is a language derived from description logics and offers more constructs than RDFS. It is syntactically embedded into RDF, so, like RDFS, it provides additional standardized vocabulary. OWL comes in three species: OWL Lite for taxonomies and simple constraints, OWL DL for full description logic support, and OWL Full for maximum expressiveness and syntactic freedom of RDF.
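The subject-predicate-object triple model described above can be sketched in plain Python, with a graph as nothing more than a set of three-element tuples. The page and person URIs below are invented for illustration (example.org is reserved for examples), while the dc: (Dublin Core) and foaf: (FOAF) predicates are real, widely used metadata vocabularies:

```python
# Each RDF statement is a (subject, predicate, object) triple.
# Subjects and predicates are URIs; objects are URIs or literal values.
graph = {
    ("http://example.org/page1", "http://purl.org/dc/terms/title",   "My Home Page"),
    ("http://example.org/page1", "http://purl.org/dc/terms/creator", "http://example.org/alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name",   "Alice"),
}

# Walking the graph: each triple is an edge (predicate) from one node
# (subject) to another node or literal (object).
for s, p, o in sorted(graph):
    print(s, "--", p, "-->", o)
```

Note how the second triple's object is itself the subject of the third: that shared URI is what turns a pile of statements into a connected graph.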
Since OWL is based on description logic, it is not surprising that a formal semantics is defined for this language. RDFS and OWL thus both have defined semantics, and these semantics can be used for reasoning within ontologies and knowledge bases described in these languages. To provide rules beyond the constructs available in these languages, rule languages are being standardized for the semantic web as well. Two standards are emerging: RIF and SWRL.

For querying RDF data, as well as RDFS and OWL ontologies with knowledge bases, the SPARQL Protocol and RDF Query Language (SPARQL) is available. SPARQL is an SQL-like language, but it uses RDF triples and resources both for matching part of the query and for returning the results of the query. Since both RDFS and OWL are built on RDF, SPARQL can be used for querying ontologies and knowledge bases directly as well. Note that SPARQL is not only a query language; it is also a protocol for accessing RDF data.

It is expected that all the semantics and rules will be executed at the layers below Proof, and the result will be used to prove deductions. A formal proof, together with trusted inputs for the proof, will mean that the results can be trusted, as shown in the top layer of the figure above. For reliable inputs, cryptographic means are to be used, such as digital signatures for verifying the origin of the sources. On top of these layers, applications with user interfaces can be built.

Why is Web 3.0 important?

Web 3.0 improves data management: when content comes from various types of database structure, many applications are required to manage it. In addition, there are complex sets of data whose structure a computer cannot understand well enough to link them together. This problem can occur when combining sets of data from different origins somewhere on the web, in different formats such as Excel sheets or XHTML, or with different names for the same relation, e.g. in different languages.
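The core of SPARQL querying, matching triple patterns against a graph, can be sketched as a toy matcher in a few lines of Python. This is not a real SPARQL engine; the `?`-prefix variable convention is borrowed from SPARQL, and the abbreviated `ex:`/`dc:` names and the data are invented for illustration:

```python
# Toy triple-pattern matcher in the spirit of SPARQL: terms starting with
# "?" are variables and bind to whatever they meet; other terms must match
# the triple exactly. (Repeated variables in one pattern are not handled.)
def match(graph, pattern):
    results = []
    for triple in graph:
        binding = {}
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                binding[term] = value
            elif term != value:
                ok = False
                break
        if ok:
            results.append(binding)
    return results

graph = [
    ("ex:page1", "dc:title",   "My Home Page"),
    ("ex:page1", "dc:creator", "ex:alice"),
    ("ex:alice", "foaf:name",  "Alice"),
]

# Analogous to: SELECT ?who WHERE { ex:page1 dc:creator ?who }
print(match(graph, ("ex:page1", "dc:creator", "?who")))  # [{'?who': 'ex:alice'}]
```

As in SPARQL, the same triple vocabulary is used for the query and for the answer: the result is just a set of variable bindings drawn from the graph.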
The semantic web solves this problem by describing the relationships between data items, things and properties; the computer can therefore understand the relationships between sets of data and integrate them together. Figure 6 compares the two processes: normal data integration, in which a human is the center that merges the datasets, and data integration in the semantic web, where the system can merge the datasets automatically.

Web 3.0 supports accessibility of the mobile internet: the number of mobile subscribers had already surpassed 3 billion by the end of 2007. The global mobile penetration rate at the end of 2007 was 48%, and it is expected to keep growing in the near future, particularly in the BRIC economies (Brazil, Russia, India and China). Moreover, many mobile operators around the world are shifting their technology base from 2G to 3G, which represents a greater channel for accessing the internet via mobile devices (Vanessa Grey, ICT Market Trends, 2008, p. 6). Hence, Web 3.0 plays a main role in enhancing internet accessibility via mobile, because Web 3.0 builds on the Cascading Style Sheets (CSS) standard, which helps reduce page size to below 20 KB through smaller background images.

Web 3.0 stimulates creativity and innovation: the main concept of Web 3.0 promises that all global datasets will be linked together. Humans and machines will then be able to apply these information and knowledge datasets more efficiently than is possible today. This will drive the innovation process, in terms of idea generation and research and development (R&D), and make it easier to discover new business models.

Web 3.0 encourages Globalization: Web 3.0 aims to standardize data structure via the RDF data model. The datasets of current information on the World Wide Web will be unlocked from their existing data structures and integrated under one common standard.
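The automatic merging that Figure 6 illustrates works because independent datasets that use the same URI for the same thing can simply be unioned: the triples line up on the shared identifier with no human mapping step. A minimal sketch, with invented `ex:`-style names and data:

```python
# Two independently produced datasets describe the same person. Because
# both use the same identifier ("ex:alice") for her, merging is just a
# set union: no human has to reconcile the two schemas.
hr_records = {
    ("ex:alice", "foaf:name",   "Alice"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
}
conference_records = {
    ("ex:alice", "ex:spokeAt", "ex:websummit2008"),
}

merged = hr_records | conference_records

# Everything now known about ex:alice, drawn from both sources at once:
facts = sorted((p, o) for s, p, o in merged if s == "ex:alice")
for p, o in facts:
    print(p, o)
```

If the two sources had used different local identifiers for Alice instead, a human (or an extra mapping triple) would have been needed, which is exactly the human-in-the-center process on the left of Figure 6.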
This will speed up Globalization in the near future.

Web 3.0 enhances customer satisfaction: by using Artificial Intelligence concepts, Web 3.0 in effect adds a brain to the computer, so business units will find it easier to improve customer satisfaction in terms of Customer Relationship Management (CRM); for example, clients can be provided with a broader range of information about a product on the customer service webpage, or with related information drawn from other datasets.

Web 3.0 helps to organize collaboration on the social web: nowadays many people register as members of many social websites, and many weblogs have emerged. The SIOC project therefore aims to merge social web community information by using semantic web technology in RDF. The goal is to create distributed conversations across blogs, forums and mailing lists.

Purpose: The main purpose of the Semantic Web is to drive the evolution of the current Web by allowing users to use it to its full potential, thus allowing them to find, share, and combine information more easily. Humans are capable of using the Web to carry out tasks such as finding the Irish word for "folder," reserving a library book, and searching for a low price for a DVD. However, machines cannot accomplish all of these tasks without human direction, because web pages are designed to be read by people, not machines. The semantic web is a vision of information that can be interpreted by machines, so machines can perform more of the tedious work involved in finding, combining, and acting upon information on the web.

Semantic Web application areas are experiencing intensified interest due to the rapid growth in the use of the Web, together with the innovation and renovation of information content technologies. The Semantic Web is regarded as an integrator across different content, information applications and systems; it also provides mechanisms for the realisation of Enterprise Information Systems.
The rapidity of this growth provides the impetus for researchers to focus on the creation and dissemination of innovative Semantic Web technologies, where the envisaged 'Semantic Web' is long overdue. The terms 'semantics', 'metadata', 'ontologies' and 'Semantic Web' are often used inconsistently. In particular, these terms are used as everyday terminology by researchers and practitioners, spanning a vast landscape of different fields, technologies, concepts and application areas. Furthermore, there is confusion with regard to the current status of the enabling technologies envisioned to realise the Semantic Web. In a paper by Gerber, Barnard and Van der Merwe, the Semantic Web landscape is charted and a brief summary of related terms and enabling technologies is presented. The architectural model proposed by Tim Berners-Lee is used as the basis for a status model that reflects current and emerging technologies.

Need: When we search in Google for particular information, most of what we get on the first page are links to websites without any information useful to us. To find the website we need, we might have to use different keywords or go to the second or third results page (SERP). Without applying our own intelligence, we can't get the required result. Programs cannot see what people can. Google is a dumb machine discharging its bots throughout the Web, scanning for keywords. When it finds a keyword in a site it has already indexed, it presents the link to you. It is up to you to decide whether the site is actually useful. Hence, most of the time, the first search results from Google are not what you want; they contain technical jargon all over, or advertisements, not the specific thing you want. With the advent of Web 3.0, this is all going to change. Web 3.0 aims to make the Internet itself a huge database of information, accessible to machines as well as humans.
When Web 3.0 becomes popular, we will have a data-driven web, enabling us to unearth information from the net faster. You can get machines to contribute to your needs by searching for, organizing, and presenting information from the Web. That means that with Web 3.0 you can be fully automated on the Internet. Besides this, with machine intelligence, you can achieve tasks like the following very easily: automating share transactions; checking and deleting unwanted emails; creating and updating websites; and booking your movie tickets, airplane tickets, etc. Web 3.0 is going to be the era of artificial-intelligence-enabled programs crawling the Web.

Uses and examples: When we want to search for particular information, more often than not, we get the answers only after multiple searches. With Web 3.0, however, this task will be carried out in a single search. Once you read some examples of Web 3.0, this will become clearer. Suppose you want to go out for a movie of a specific genre and also want to eat out after the movie. You will type in a complex sentence and the search engine will fetch the answer for you. An example of a Web 3.0 query would be: "I want to go for an action movie and then eat at a good Chinese restaurant. What are my options?" Your query string will be analyzed by the Web 3.0 browser, which will look it up on the Internet, fetch all the possible answers, and organize the results for you. Certain health data can also be looked up on the Internet using Web 3.0. One example of a Web 3.0 health search: a patient might want to ascertain what he is suffering from, given the set of symptoms he is currently facing. As mentioned previously, after assessing the query, the web browser will fetch the results. However, there is a caveat here: the data may not be accurate, as there can be multiple diseases with similar symptoms.
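This "many diseases, same symptom" ambiguity is exactly what probabilistic reasoning is used for: rather than one answer, the machine reports each candidate with a probability. A one-rule Bayesian sketch; the diseases and all numbers below are invented purely for illustration, not medical data:

```python
# Bayes' rule: P(disease | symptom) = P(symptom | disease) * P(disease) / P(symptom)
# Two hypothetical diseases share the symptom "fever"; every figure is made up.
priors = {"flu": 0.10, "measles": 0.01}        # P(disease) before seeing the symptom
likelihood = {"flu": 0.90, "measles": 0.95}    # P(fever | disease)

# P(fever), restricted to these two hypotheses
evidence = sum(likelihood[d] * priors[d] for d in priors)

posterior = {d: likelihood[d] * priors[d] / evidence for d in priors}
for disease, p in posterior.items():
    print(f"P({disease} | fever) = {p:.3f}")
```

Both diseases explain the fever almost equally well, yet the posterior favors flu heavily, because its prior is ten times larger; this weighing of alternatives is what a plain keyword lookup cannot do.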
Challenges: Some of the challenges for the Semantic Web include vastness, vagueness, uncertainty, inconsistency, and deceit. Automated reasoning systems will have to deal with all of these issues in order to deliver on the promise of the Semantic Web.

Vastness: The World Wide Web contains at least 24 billion pages as of this writing (June 13, 2010). The SNOMED CT medical terminology ontology alone contains 370,000 class names, and existing technology has not yet been able to eliminate all semantically duplicated terms. Any automated reasoning system will have to deal with truly huge inputs.

Vagueness: These are imprecise concepts like "young" or "tall". Vagueness arises in user queries, in concepts represented by content providers, in matching query terms to provider terms, and in trying to combine different knowledge bases with overlapping but subtly different concepts. Fuzzy logic is the most common technique for dealing with vagueness.

Uncertainty: These are precise concepts with uncertain values. For example, a patient might present a set of symptoms that corresponds to a number of distinct diagnoses, each with a different probability. Probabilistic reasoning techniques are generally employed to address uncertainty.

Inconsistency: These are logical contradictions which will inevitably arise during the development of large ontologies, and when ontologies from separate sources are combined. Deductive reasoning fails catastrophically when faced with inconsistency, because "anything follows from a contradiction". Defeasible reasoning and paraconsistent reasoning are two techniques which can be employed to deal with inconsistency.

Deceit: This is when the producer of the information intentionally misleads the consumer of the information. Cryptographic techniques are currently utilized to alleviate this threat.

Conclusion: Web 3.0 is all about the back end of the Web, about creating extreme machine interfacing.
When the Web 3.0 interface becomes more popular, it will entirely change the way we access the Internet. We humans will no longer have to do the difficult work of researching on the Internet and finding the exact information; machines will do all these tasks better. We will only need to view the data, modify it in the way we want, and create whatever new thing we wish to create.

Final: The next evolutions of the so-called Web will be real 3D, and the inclusion of the other three senses: taste, touch, smell. But the term "Web" may be replaced, and the software integrated with other evolved software or a major add-on, like Google Maps Live.
- Web 1.0 was the Hypertext/CGI Web (the basics).
- Web 2.0 is the Community Web (for people: apps/sites connecting them).
- Web 3.0 is the Semantic Web (for machines).
Web 2.0 and Web 3.0 are a fork we are moving into now, where one is focused on internet architectures for people, community and usability, and the other on internet architectures for machines. Web 4.0 is when these technologies come together to form what I call the "Learning Web". This moves further into the area of Artificial Intelligence. The Learning Web is where the Web is actually learning by itself and is a user alongside human users, generating new ideas, information and products without direct human input. This may be possible on a large scale when more sensors, actuators, semantic structure and ontologies are advanced and in place, someday (maybe 10-15 years).