Embed
Email

On Building a Custom Research Paper Web Application

Document Sample

Shared by: Jun Wang
Categories
Tags
Stats
views:
0
posted:
1/5/2012
language:
pages:
16
A web application for browsing research papers









By: Rhea Dookeran 09‟

Overview

 Goal: To build an interactive web interface based on the

XML output of Amy‟s Machine Learning algorithms.



 Usage: A class tool for CS 349 that enables convenient

browsing and selection from a corpus of CS research

papers.



 Technology

 PHP

 XML

 HTML, CSS, JavaScript

Comparison

 CiteSeer (http://citeseer.ist.psu.edu/citeseer.html)

 Focuses on CS/IS literature

Positive

 Allows you to search by through citations,

acknowledgments and Google docs.

 Can assign tag to the document.

Negative

 Hovering shows one abstract at a time.

 Clicking on the link redirects you to a new page.

 Abstract and list of citations

CiteSeer

Comparison

 GoogleScholar (http://scholar.google.com/)

Positive

 Large corpus

 Does not cover one specific field of study.

 Familiar

 Follows the generic Google Search Engine Results Page (SERP)

layout. (Minus ads)



Negative

 Hard to compare papers. No abstract: only shows the line where

search term/s appear. Have to click on link to learn more.

Google Scholar

Comparison

 Citeulike (http://www.citeulike.org/)

Positive



 Can add tags to documents and filter by common tags

 Overall Web 2.0 style

Negative



 Not visually appealing. Must scroll down to see results.

 Must click on title to learn more (slide menu categories)

Citeulike

Our System‟s Key Features

 Allows user to compare papers side by side before choosing to

view the whole document. (slide menu and hover functionality)

 Result filtering:

 Author- List, click or search

 Keyword- Click, cloud or search

 Title- Cloud or search

 Easy to use and visually appealing

 Dynamic/intuitive interface

 Menus scroll with user

 PDF files open in a new tab (easily forgotten in other systems)

 Not built over a relational database.

So how does it work?

 XML (Extensible Markup Language)

 Tag based data storage format

 To allow one to describe a document in terms of its structure,

rather than its page layout

 Used PHP‟s Document Object Model (DOM) library

 „context-rich‟

 Tags can be defined to specify both element names and

attributes.



 In essence, the tags describe a flat-file database

So how does it work?

 PHP

 Widely used scripting language that can be embedded

into HTML

 Dynamic content using static file

 Reading demonstrated how to use the DOM Library to

parse XML.

 Runs on the web server and can run on most OS‟s

 Free download:

 Wamp Server running on PC

 Includes Apache, MySQL and PHP5 for windows

 http://www.wampserver.com/en/





Searching the Web



Arvind Arasu

Junghoo Cho

Hector Garcia-Molina

Andreas Paeckpe

Sriram Raghavan





An overview of current Web search engine design. After introducing a generic search engine architecture,

we examine each engine component in turn. We cover crawling, local Web page storage, indexing, and the

use of link analysis for boosting search performance. The most common design and implementation

techniques for each of these components are presented. For this presentation we draw from the literature

and from our own experimental search engine testbed. Emphasis is on introducing the fundamental

concepts and the results of several performance analyses we conducted to compare different designs.





Authorities, crawling, HITS, indexing, information retrieval, link analysis, PageRank, search engine









So how does it work?

 JavaScript (Jquery)

 Used to create slide effects

 One of many useful and extensive JavaScript libraries

 “…simplifies HTML document traversing, event handling,

animating, and Ajax interactions for rapid web development.”

 Code is concise thanks to “chainability”

 Object-oriented programming design pattern

 Every method within jQuery returns the query object itself,

allowing you to 'chain' upon it, for example:

 menuYloc=

parseInt($(name).css(“top”).substring(0,$(name).css(“top”).indexOf(“

px”))) – one line grabs the “100” out of “…top:100px;…” in css file

 Free and easy to download

Summary

 Process

 3 major iterations

 1.0- Basic display & browsing of all documents (10/page)

 2.0- Basic display s.t. keywords were active filtering links

 3.0

 Active author, and keyword links

 Floating side menus – list of authors and paper frequency, # to display

 Basic search (by title, author or keyword) functionality

 Cloud of most common words.





 Speed Bumps

 Dealing with non-alphanumeric ASCII characters

 Figuring out the most appropriate term search algorithm

 Version compatibility issues -> wamp server



 Positive

 Gained valuable experience using popular web application development tools.

 Had fun designing!

 Research sparked interest in semantic web

Future Plans

 Add tag assignment functionality.

 Integrate the system into the Semantic Web

using RDF.

 Expand upon filtering functionality

 Modularize code for use in other CS courses

Sources

 XML and PHP

 http://php.net/

 http://www.ibm.com/developerworks/library/os-

xmldomphp/

 http://www.sis.pitt.edu/~mbsclass/standards/fischer/XML

Uses.html



 PHP/Apache/MySQL PC download

 http://www.wampserver.com/en

 Jquery

 http://jquery.com/



Related docs
Other docs by Jun Wang
Management Two
Views: 6  |  Downloads: 0
Management training Red Cross branch offices
Views: 6  |  Downloads: 0
Management subjekt_ CR
Views: 5  |  Downloads: 0
Management Styles_1_
Views: 22  |  Downloads: 0
Management stratégique
Views: 5  |  Downloads: 0
Management Standards at CARE - CARE Academy
Views: 6  |  Downloads: 0
By registering with docstoc.com you agree to our
privacy policy

You are almost ready to download!

You are almost ready to download!