Web-Object Rank Algorithm For Efficient Information Computing


									                                                         (IJCSIS) International Journal of Computer Science and Information Security,
                                                                                                         Vol. 9, No. 2, February 2011

                  Dr. Pushpa R. Suri                                                     Harmunish Taneja
    Department of Computer Science and Applications,                            Department of Information Technology,
                Kurukshetra University                                          Maharishi Markendeshwar University,
         Kurukshetra, Haryana- 136119, India.                                      Mullana, Haryana- 133203, India
                pushpa.suri@yahoo.com                                               harmunish.taneja@gmail.com

Abstract - In recent years there has been considerable interest in analyzing the relative trust level of web objects. The web contains facts and assumptions on a global scale, which results in various criteria for trusting a web page. In this paper an algorithm is proposed which assigns a rank to every web object, such as a requested document on the web, that specifies the quality of that object or the relative level of trust one can place in that web page. It is used for object level information extraction for ranking search results and is implemented in C++. In this paper the behavior of the object rank for different values of the moister factor in a domain is analyzed. The results emphasize that the moister factor can be useful in rank computation and in exploring more web pages in alignment with the user’s requirements.

Keywords- Random Surfer Model, Information Computing, Web Objects, Information Retrieval System, Web Graph, Ranking, Object Rank.

I.    INTRODUCTION

Information computing in various web domains broadly means extracting web objects of an unstructured nature, such as text objects, that satisfy an information need from within large collections using document-level ranking, and thereby the structured information about real-world objects that is embedded in static web pages. Online databases exist on the web in huge amounts and are of an unstructured nature. Unstructured data refers to data which does not have a clear, semantically obvious structure [7]. In other words, information computing constitutes the process of searching, recovering, and understanding information from huge amounts of stored data. Information can be retrieved from the web by implementing searching techniques such as Keyword based Searching, Concept-based Searching, Hybrid Search, and Knowledge Base Search. In the case of object level information computing, domain based search is required. Every commercial information retrieval system tries to facilitate a user’s access to information that is relevant to his information needs. This paper highlights the ranking problem for domain based information retrieval: every owner of a document wants to improve its ranking and, to that end, can manipulate the document in many ways, for example by increasing the number of links to the page from dummy pages [1]. Object based information computing maintains the integrity of the search results based upon various lexicons. As the web contains contradictions and hypotheses on a huge scale, finding the relevant information using search engines is a tedious job. With the help of object level ranking [22], various objects in a domain can be prioritized, independent of the query, by a rank that describes the relative trust of the web page. The object rank of a page depends upon various factors associated with the web object.

The organization of the paper is as follows. Related work is presented in Section 2. Section 3 discusses the challenges of high quality search results. In Section 4, the Web_Object_Rank algorithm is proposed and discussed. The algorithm is implemented in Section 5. Finally, Section 6 concludes the paper on the basis of the results obtained.

II.   RELATED WORK

Google is a prototype of a large-scale search engine that makes heavy use of the structure present in hypertext [1]. Google is designed to crawl and index the web efficiently and produce much more satisfying search results than existing systems. Link Analysis Ranking [16] emphasizes that hyperlink structures are used to determine the relative authority of a web page and produce improved algorithms for the ranking of search results. The prototype with a full text and hyperlink database of web pages is available at [8]. In the current era there is much interest in using random graph models for the web. The Random Surfer model [9] and the Page Rank-based selection model [11] are described as two major models [10]. The Page Rank-based selection model tries to capture the effect that search engines have on the growth of the web by adding new links according to Page Rank. The Page Rank algorithm is used in the Google search engine [12] for ranking search results. PageRank is a link analysis algorithm used by the Google Internet search engine that assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web (WWW), with the purpose of "measuring" its relative importance within the set. Google is designed to be a scalable search engine with the primary goal of providing high quality search results over a rapidly growing WWW [18]. The PageRank theory suggests that even an imaginary surfer who is randomly clicking on links will eventually stop clicking. The probability, at any step, that the surfer will continue is a damping factor d [2]. The

                                                                162                             http://sites.google.com/site/ijcsis/
                                                                                                ISSN 1947-5500

damping factor (α) is eminently empirical, and in most cases the value of α can be taken as 0.85 [1]. Page Rank is the stationary state of a Markov chain [2, 7]. The chain is obtained by perturbing the transition matrix induced by a web graph with a damping factor that spreads part of the rank uniformly. The behavior of Page Rank with respect to changes in α is useful in link-spam detection [3]. The mathematical analysis of Page Rank as α changes shows that, contrary to popular belief, for real-world graphs values of α close to 1 do not give a more meaningful ranking [2, 21]. The order of displayed web pages is computed by the search engine Google as the PageRank vector, whose entries are the Page Ranks of the web pages [4]. The Page Rank vector is the stationary distribution of a stochastic matrix, the Google matrix. The Google matrix in turn is a convex combination of two stochastic matrices: one matrix represents the link structure of the web graph, and a second, rank-one matrix mimics the random behavior of web surfers and can also be used to fight web spamming. As a consequence, Page Rank depends mainly on the link structure of the web graph, but not on the contents of the web pages. Also, the Page Rank of the first vertex, the root of the graph, follows the power law [10]. However, the power undergoes a phase transition as parameters of the model vary.

Link-based ranking algorithms rank web pages by using the dominant eigenvector of certain matrices, such as the co-citation matrix or its variations [17]. Distributed page ranking on top of structured peer-to-peer networks is needed because the size of the web grows at a remarkable speed and centralized page ranking is not scalable [5].

Page ranking can use propagation rates that depend on the types of the links and the user’s specific set of interests [6]. Page filtering can be decided based on link types combined with some other information relevant to links. For ranking, a profile containing a set of ranking rules to be followed in the task can be specified to reflect the user’s specific interests [20]. Similarities of contents between hyperlinked pages are useful to produce a better global ranking of web pages [19].

III.     CHALLENGES

The primary focus of a Web Information Retrieval Support System (WIRSS) is to address the aspects of search that consider the specific needs and goals of the individuals conducting web searches [15]. The major goal is to provide high quality search results over a rapidly growing World Wide Web. Google employs a number of techniques to improve search quality, including page rank, anchor text, and proximity information. Decentralized content publishing is the main reason for the explosive growth of the web. Corresponding to a user query there are many documents that can be retrieved by a search engine, and every owner of a document wants to improve the ranking of its document. Commercial search engines have to maintain the integrity of their search results, and this is one reason why the efforts they make are not publicly available. Democratization of content creation on the web generates new challenges in WIRSS. This gives rise to the question of the integrity of web pages. In a simplistic approach, one might argue that only some publishers are trustworthy and others are not. One more challenge is that fast crawling technology is needed to gather the web objects and keep them up to date.

IV.      WEB_OBJECT_RANK ALGORITHM AND IMPLEMENTATION

The Page Rank of a web object can be defined as the fraction of time that the surfer spends on average on that object. The probability that the random surfer visits a web page is its Page Rank [1]. Evidently, web objects that are hyperlinked by many other pages are visited more often. The random surfer gets bored and restarts from another random web object with a probability termed the moister factor (m). The probability that the surfer follows a randomly chosen outlink is (1-m).

A Markov chain is a discrete-time stochastic process: a process that occurs in a series of time-steps in each of which a random choice is made [7]. There is one state corresponding to each web object. Hence, a Markov chain consists of N states if there are N web objects in the collection. A Markov chain is characterized by an N × N Probability Transition Matrix P each of whose entries is in the interval [0, 1]; the entries in each row of P add up to 1. The Markov Property states that each entry Pij is a transition probability that depends only on the current state i. A Markov chain’s probability distribution over its states may be viewed as a Probability Vector: a vector all of whose entries are in the interval [0, 1] and whose entries add up to 1. The problem of computing bounds on the conditional steady-state Probability Vector of a subset of states in finite, discrete-time Markov chains is considered in [7, 14].

A. Web_Object_Rank Algorithm: Features
Features of the Object Rank algorithm are as follows:
    • Query independent algorithm (assigns a value to every document independent of the query).
    • Content independent algorithm.
    • Concerned with the static quality of a web page.
    • The Object Rank value can be computed offline using only the web graph.
    • Object Rank is based upon the linking structure of the whole web.
    • Object Rank does not rank a website as a whole; it is determined for each web page individually.
    • The Object Ranks of the web pages Ti which link to page A do not influence the rank of page A uniformly.
    • The more outbound links there are on a page T, the less page A benefits from a link from T.
    • Object Rank is a model of the user’s behavior.

B. Web_Object_Rank Algorithm: Assumptions
    • If there are multiple links between two web objects, only a single edge is placed.
    • No self loops are allowed.
    • The edges could be weighted, but we assume that no weight is assigned to edges in the graph.
    • Links within the same web site are removed.
    • Isolated nodes are removed from the graph.
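The assumptions of Section IV-B can be illustrated with a small C++ sketch of the graph construction. This is not the authors' code: the helper names build_adjacency and connected_nodes are hypothetical, links are assumed to be given as (from, to) index pairs, and the removal of links within the same web site is assumed to have been done by the caller, since it needs site information that the adjacency matrix does not carry.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the paper): builds an unweighted adjacency
// matrix for n web objects from a list of directed links (from, to).
std::vector<std::vector<int>> build_adjacency(
    std::size_t n,
    const std::vector<std::pair<std::size_t, std::size_t>>& links) {
    std::vector<std::vector<int>> adj(n, std::vector<int>(n, 0));
    for (const auto& link : links) {
        assert(link.first < n && link.second < n);
        if (link.first == link.second) continue;  // no self loops
        adj[link.first][link.second] = 1;         // multiple links collapse to one unweighted edge
    }
    return adj;
}

// Hypothetical helper: returns the indices of the nodes that have at least
// one in-link or out-link, i.e. the nodes kept after isolated nodes are removed.
std::vector<std::size_t> connected_nodes(const std::vector<std::vector<int>>& adj) {
    const std::size_t n = adj.size();
    std::vector<std::size_t> keep;
    for (std::size_t i = 0; i < n; ++i) {
        bool linked = false;
        for (std::size_t j = 0; j < n && !linked; ++j) {
            linked = (adj[i][j] != 0) || (adj[j][i] != 0);
        }
        if (linked) keep.push_back(i);
    }
    return keep;
}
```

Storing 0/1 entries rather than link counts is what makes duplicate links and edge weights disappear, which matches the unweighted graph the algorithm takes as input.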


C.   Web_Object_Rank Algorithm
This algorithm is basically a query independent algorithm that takes a web graph as input and assigns a rank to every object, which can specify the relative authority of that web page. The variables used in the proposed algorithm are listed below.
    • moist_fact (m) is the moister factor: the probability that the random surfer restarts the search from another web object
    • 1-m is the probability that the random surfer searches web objects through randomly chosen outlinks
    • outlinks is the number of web objects linked from a particular page
    • N is the number of objects in the domain
    • prob[i][j] is the Probability Transition Matrix for all i, j ∈ 1 to N
    • adj[i][j] is the Adjacency Matrix for all i, j ∈ 1 to N
    • x is the Probability Vector
    • itr is the iteration count

D. Web_Object_Rank Algorithm

Step 1.     Create a web graph of the various objects in a domain.
Step 2.     Set prob[i][j] = adj[i][j]
Step 3.     Compute the number of outlinks from each node, say counter.
            For all i, j
            IF the web object has no outlinks (counter = 0)
            THEN prob[i][j] is equally distributed over all j
            ELSE prob values are distributed according to the number of outlinks:
                       IF (prob[i][j] = 1)
                       prob[i][j] = 1.0/counter
Step 4.     Multiply the resulting matrix by 1 − m.
Step 5.     Add m/N to every entry of the resulting matrix to obtain the Probability Transition Matrix.
                       For all i, j Do
                       prob[i][j] = (prob[i][j]*(1-m)) + (m/N);
Step 6.     Randomly select a node from 0 to N-1 to start the walk, say s_int.
Step 7.     Initialize the random surfer, and set itr, which keeps count of the number of iterations required, to 0.
Step 8.     Try to reach the steady state within 200 iterations; otherwise toggling occurs.
Step 9.     Multiply the Probability Transition Matrix with the Probability Vector to approach the steady state.
Step 10.    Check whether or not the system has entered the steady state.
Step 11.    Print the ranks stored in the Probability Vector x and EXIT.

V.    IMPLEMENTATION

This implementation is based upon the random surfer model [7] and the Markov chain [13, 14]. The random surfer visits the objects in the web graph according to a distribution, based on which the random surfer can be in one of the following four possible states at any time.

Initial state is the state of the system from which it starts its walk. The system is set in a random state by randomly selecting an object using a random function; the value corresponding to that web object in the Probability Vector is set to unity, and the rest of the values in the Probability Vector are zero. Steady state is the state in which the Probability Vector of the random surfer no longer changes; it is reached when the chain fulfills the properties of irreducibility and aperiodicity. To check whether the system has reached the steady state, two successive values of the Probability Vector are compared; they must be the same. Ideal state is the state of the random surfer in which the system achieves the steady state but at the same time the web object ranks are distributed uniformly over all documents. Toggling state is reached by the random surfer when the system is not able to reach the steady state and just toggles between two sets of object ranks.

Fig. 1. Web Graph
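Steps 2 to 10 and the steady, ideal and toggling states can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names transition_matrix, web_object_rank, RankResult and SurferState are hypothetical, the start node is passed in instead of being drawn at random so that runs are repeatable, a row without outlinks is assumed to be distributed as 1/N per entry, and "two successive values of the Probability Vector are the same" is approximated by a tolerance tol rather than exact equality.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

enum class SurferState { Steady, Ideal, Toggling };

struct RankResult {
    std::vector<double> ranks;  // Probability Vector x
    int iterations;             // itr
    SurferState state;
};

// Steps 2-5: row-normalise the adjacency matrix, scale by (1 - m), add m/N.
Matrix transition_matrix(const std::vector<std::vector<int>>& adj, double m) {
    const std::size_t n = adj.size();
    assert(n > 0 && m >= 0.0 && m <= 1.0);
    Matrix prob(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        assert(adj[i].size() == n);
        int counter = 0;  // number of outlinks of node i
        for (std::size_t j = 0; j < n; ++j) {
            if (adj[i][j] != 0) ++counter;
        }
        for (std::size_t j = 0; j < n; ++j) {
            double p = 0.0;
            if (counter == 0) {
                p = 1.0 / n;  // assumption: no outlinks means an equal share per entry
            } else if (adj[i][j] != 0) {
                p = 1.0 / counter;
            }
            prob[i][j] = p * (1.0 - m) + m / n;
        }
    }
    return prob;
}

// Steps 6-10: repeated vector-matrix multiplication from a unit start vector.
RankResult web_object_rank(const std::vector<std::vector<int>>& adj, double m,
                           std::size_t start, int max_iter = 200,
                           double tol = 1e-9) {
    const Matrix prob = transition_matrix(adj, m);
    const std::size_t n = prob.size();
    assert(start < n);
    std::vector<double> x(n, 0.0);
    x[start] = 1.0;  // initial state: unity at the start node, zero elsewhere
    for (int itr = 1; itr <= max_iter; ++itr) {
        std::vector<double> next(n, 0.0);
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                next[j] += x[i] * prob[i][j];
            }
        }
        double diff = 0.0;  // largest change between successive vectors
        for (std::size_t j = 0; j < n; ++j) {
            diff = std::max(diff, std::fabs(next[j] - x[j]));
        }
        x = next;
        if (diff < tol) {
            bool uniform = true;  // ideal state: every rank equals 1/N
            for (std::size_t j = 0; j < n; ++j) {
                if (std::fabs(x[j] - 1.0 / n) >= tol) uniform = false;
            }
            return {x, itr, uniform ? SurferState::Ideal : SurferState::Steady};
        }
    }
    return {x, max_iter, SurferState::Toggling};  // no steady state within max_iter
}
```

With m = 1 every entry of the transition matrix equals 1/N, so the vector becomes uniform after the first multiplication and the second one confirms it; this is the ideal state. With m = 0 on a pure cycle the unit vector only shifts from node to node, so the loop runs out of iterations and reports toggling.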


C. Results and Discussion
The web graph shown in Fig. 1 is used for analyzing various factors of the proposed algorithm. Variation in the graph structures used for analysis changes the performance of the algorithm. The graph shows 10 web objects in a domain that are interlinked as a strongly connected graph. Every two nodes of the graph are joined by a path with a small number of links. Oi is the ith web object in the domain, where i varies from 1 to 10. The adjacency matrix for the web graph of Fig. 1 is shown in Fig. 2.

 0     1     0      0     0      0     0      0     0         0
 0     0     1      0     0      0     0      0     0         0
 0     0     0      1     0      0     0      0     0         0
 0     0     0      0     1      0     0      0     0         0
 0     0     0      0     0      0     1      0     0         0
 0     0     0      0     1      0     0      0     0         0
 0     0     0      0     0      1     0      0     1         0
 0     0     0      0     0      1     0      0     0         0
 0     0     0      0     0      0     0      0     0         1
 0     0     0      0     0      0     0      1     0         0
           Fig. 2. Adjacency Matrix for all i, j ∈ 1 to 10

To analyze the convergence speed, the number of iterations required by the random surfer to reach a steady state is recorded in Table 1, and the corresponding graph is shown in Fig. 3. In Fig. 3 the value Infinity is shown as a large number of iterations (200 or more). The results show that as the moister factor approaches 1, the number of iterations generally decreases, although not strictly monotonically (for example, 33 iterations at 0.45 but 39 at 0.55).

        Table 1: Moister Factor vs No. of Iterations
       Moister Factor           No. of Iterations
              0                      Infinity
            0.05                     Infinity
             0.1                     Infinity
            0.15                     Infinity
             0.2                        83
            0.25                        73
             0.3                        62
            0.35                        46
             0.4                        41
            0.45                        33
             0.5                        35
            0.55                        39
             0.6                        24
            0.65                        21
             0.7                        20
            0.75                        22
             0.8                        16
            0.85                        12
             0.9                        11
            0.95                        10
              1                         2

Fig. 3. Moister Factor vs Number of Iterations

It is further observed that when the Moister Factor is equal to 1, the random surfer enters the Ideal state and the corresponding rank values of the web objects are as given in Table 2. The graph for the ideal state is shown in Fig. 4.

        Table 2: Ranks of objects at moister factor 1
            Object                 Computed Rank
              O1                        0.1
              O2                        0.1
              O3                        0.1
              O4                        0.1
              O5                        0.1
              O6                        0.1
              O7                        0.1
              O8                        0.1
              O9                        0.1
              O10                       0.1

Fig. 4. Random Surfer Ideal State

Figure 5 shows that for a Moister Factor less than 0.2, no rank is provided to any web object and the system enters the toggling state with a large number of


iterations for the given domain. Also, the ranks computed by the proposed algorithm for moister factor values from 0.2 to 1 are shown.

Fig. 5. Computed object ranks of web objects O1 to O10 at various moister factor values (MF = 0.2 to 1.0, in steps of 0.05)

From the above graphs and analysis, we can say that the moister factor plays a main role in this algorithm and that the performance of the algorithm can be improved if this factor is selected properly. The value of the moister factor can vary from 0 to 1, but in most cases the system enters the toggling state if the value selected is less than 0.2, and at the value 1 the system enters the ideal state, giving insignificant results. The value must be close to 1 but cannot be 1. As shown in Fig. 3, the system achieves a steady state in fewer iterations if the moister factor value is closer to 1.

REFERENCES
[1]    Sergey Brin, Lawrence Page, “The anatomy of a large-scale hypertextual web search engine”, Proceedings of the 7th International conference on World Wide Web 7, p.107-117, April 1998, Brisbane, Australia
[2]    Paolo Boldi, Massimo Santini, S. Vigna, “PageRank as a Function of the Damping Factor”, International World Wide Web Conference Proceedings of the 14th International conference on World Wide Web, Chiba, Japan, pages: 557 - 566, Year of Publication: 2005
[3]    Hui Zhang, Ashish Goel, Ramesh Govindan, Kahn Mason, and Benjamin Van Roy, “Making eigenvector-based reputation systems robust to collusion”, In Stefano Leonardi Editor, Proceedings WAW 2004, number 3243 in LNCS, pages 92–104. Springer-Verlag, 2004.
[4]    Nie Z., Wu F., Wen J.R., and Ma W.Y., “Extracting Objects from the Web”, 22nd International Conference on Data Engineering (ICDE’06), pp 1-3, Year: 2006.
[5]    Jianfeng Zheng, Zaiqing Nie, “Architecture of an Object-level Vertical Search”, IEEE, in the Proceeding of International Conference on Web Information Systems and Mining, pp 51-55, Year:
[6]    Zhanzi Qui, Matthias Hemmje, Erich J. Neuhold, “Using Link types in web page ranking and filtering”; IEEE Computer Society Proceedings of the Second International Conference on Web Information Systems Engineering (WISE'01) Volume 1; Page: 311, Year of Publication: 2001
[7]    Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze, “An Introduction to Information Retrieval”, Publisher: Cambridge University Press, New York, NY, USA, Pages: 461-470, Year: 2008
[8]    http://google.stanford.edu/
                                                                                                [9]    Blum, T.-H. H. Chan, and M. R. Rwebangira, “A
                                                                                                       random-surfer web-graph model”. In ANALCO '06:
    The current study was conducted to demonstrate how the
                                                                                                       Proceedings of the 8th Workshop on Algorithm
link structure of the web can be used to provide the ranking to
                                                                                                       Engineering and Experiments and the 3 rd Workshop
various documents. This ranking can be provided offline. With
                                                                                                       on Analytic Algorithmics and Combinatorics, pages
the help of this approach one can prioritize the various
                                                                                                       238--246, Philadelphia, PA, USA, 2006. Society for
documents on the web independent of the query. However a
                                                                                                       Industrial and Applied Mathematics.
complete score computation is based on various other factors.
                                                                                                [10]   Prasad Chebolu, Páll Melsted,” PageRank and the
In the proposed algorithm a damping factor is used that play a
                                                                                                       random surfer model”, Symposium on Discrete
very important role on the analysis of the algorithm. After the
                                                                                                       Algorithms Proceedings of the 19th annual ACM-
analysis it is concluded that damping factor must not be
                                                                                                       SIAM symposium on Discrete algorithms; Pages:
selected closer to zero. At the damping factor one, the system
                                                                                                       1010-1018.Year : 2008
enters into the ideal state and the ranking provided is
                                                                                                [11]   Gopal Pandurangan, Prabhakar Raghavan, Eli Upfal,
insignificant. As per evaluation the damping factor must be
                                                                                                       “Using PageRank to Characterize Web Structure”,
selected greater than or equals to 0.5. However, if we consider
                                                                                                       Proceedings of the 8th Annual International
convergence speed as only factor to evaluate the performance
                                                                                                       Conference on Computing and Combinatorics, page
than the best moister factor will be .95. The proposed algorithm
                                                                                                       No..330-339, August 15-17, 2002.
is query independent algorithm and does not consider query
during ranking.
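
The effect of this factor on convergence can be illustrated with a small sketch. It assumes a generic random-surfer style update in which `factor` is the weight placed on the uniform term 1/N and the remaining weight follows the out-links. This section does not restate the update rule of the proposed algorithm, so the formula, the function name `rank_objects`, and the four-object graph below are illustrative assumptions rather than the authors' implementation.

```python
def rank_objects(out_links, factor, tol=1e-6, max_iter=1000):
    """Iterate a random-surfer style update until the ranks stop changing.

    out_links[j] lists the objects that object j links to (assumption: every
    object has at least one out-link). `factor` is the assumed weight of the
    uniform term 1/N; the weight (1 - factor) is spread along the links.
    Returns (ranks, iterations_used, converged).
    """
    n = len(out_links)
    ranks = [1.0 / n] * n  # start from equal ranks
    for iteration in range(1, max_iter + 1):
        new_ranks = [factor / n] * n
        for j, targets in enumerate(out_links):
            # Each object passes an equal share of its rank to its targets.
            share = (1.0 - factor) * ranks[j] / len(targets)
            for i in targets:
                new_ranks[i] += share
        # Total absolute change between two successive rank vectors.
        change = sum(abs(a - b) for a, b in zip(new_ranks, ranks))
        ranks = new_ranks
        if change < tol:
            return ranks, iteration, True
    return ranks, max_iter, False


# Hypothetical graph: objects 1, 2 and 3 link to object 0, which links back to all three.
links = [[1, 2, 3], [0], [0], [0]]
for factor in (0.0, 0.2, 0.5, 0.95, 1.0):
    ranks, iterations, converged = rank_objects(links, factor)
    print(factor, iterations, converged, [round(r, 4) for r in ranks])
```

Under these assumptions the run with factor 0 never settles, because the ranks toggle between two states. Values closer to 1 need fewer iterations, and factor 1 gives every object the same rank, so the ranking carries no information.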


[12]   Google technology overview {http://www.google.com/intl/en/corporate/tech.html}, 2004.
[13]   R. Montenegro, P. Tetali, “Mathematical aspects of mixing times in Markov chains”,
       Foundations and Trends in Theoretical Computer Science, Volume 1, Issue 3 (May 2006),
       Pages: 237-354, Year: 2006.
[14]   Tugrul Dayar, Nihal Pekergin, Sana Younes, “Conditional steady-state bounds for a
       subset of states in Markov chains”, ACM International Conference Proceeding Series,
       Vol. 201, Proceeding from the 2006 workshop on Tools for solving structured Markov
       chains, Article No.: 3, Year: 2006.
[15]   Orland Hoeber, “Web Information Retrieval Support Systems: The Future of Web Search,
       Web Intelligence & Intelligent Agent”, Proceedings of the 2008 IEEE/WIC/ACM
       International Conference on Web Intelligence and Intelligent Agent Technology,
       Volume 03, Pages: 29-32, Year: 2008.
[16]   Allan Borodin, Gareth O. Roberts, Jeffrey S. Rosenthal, Panayiotis Tsaparas, “Link
       analysis ranking: algorithms, theory, and experiments”, ACM Transactions on Internet
       Technology (TOIT), Volume 5, Issue 1 (Feb. 2005), Pages: 231-297, Year: 2005.
[17]   R. Lempel, S. Moran, “Rank-Stability and Rank-Similarity of Link-Based Web Ranking
       Algorithms in Authority-Connected Graphs”, Publisher: Kluwer Academic Publishers,
       April 2005, Information Retrieval, Volume 8, Issue 2, Pages: 245-264, Year: 2005.
[18]   Sehgal, Umesh; Kaur, Kuljeet; Kumar, Pawan, “The Anatomy of a Large-Scale Hyper
       Textual Web Search Engine”, Computer and Electrical Engineering, 2009. ICCEE '09.
       Second International Conference on, Volume 2, 28-30 Dec. 2009, Page(s): 491-495,
       Year: 2009.
[19]   Kritikopoulos, A., Sideri, M., Varlamis, “Wordrank: A Method for Ranking Web Pages
       Based on Content Similarity”, Databases, 2007. BNCOD '07, 24th British National
       Conference on, 3-5 July 2007, Page(s): 92-100, Year: 2007.
[20]   Zaiqing Nie, Ji-Rong Wen and Wei-Ying Ma, “Object-level Vertical Search”, January
       7-10, 2007, Asilomar, California, USA, 3rd Biennial Conference on Innovative Data
       Systems Research (CIDR), Year: 2007.
[21]   Zhi-Xiong Zhang, Jian Xu, Jian-Hua Liu, Qi Zhao, Na Hong, Si-Zhu Wu, Dai-Qing Yang,
       “Extraction knowledge objects in scientific web resource for research profiling”,
       IEEE, Baoding, 12-15 July 2009, pp. 3475-3480, Eighth International Conference on
       Machine Learning and Cybernetics, Year: 2009.
[22]   Nie Z., Zhang Y., Wen J.R., and Ma W.Y., “Object-level Ranking: Bringing Order to web
       Objects”, In Proceeding of World Wide Web (WWW), 2007.

Dr. Pushpa R. Suri received her Ph.D. Degree from Kurukshetra University, Kurukshetra. She is
working as Associate Professor in the Department of Computer Science and Applications at
Kurukshetra University, Kurukshetra, Haryana, India. She has many publications in
International and National Journals and Conferences. Her teaching and research activities
include Discrete Mathematical Structure, Data Structure, Information Computing and Database
Systems.

Harmunish Taneja received his M.Phil. degree in Computer Science from Algappa University,
Tamil Nadu and Master of Computer Applications from Guru Jambeshwar University of Science and
Technology, Hissar, Haryana, India. Presently he is working as Assistant Professor in
Information Technology Department of M.M. University, Mullana, Haryana, India. He is pursuing
Ph.D. (Computer Science) from Kurukshetra University, Kurukshetra. He has published 11 papers
in International / National Conferences and Seminars. His teaching and research areas include
Database systems, Web Information Retrieval, and Object Oriented Information Computing.
