An Intelligent Agent Based Text-Mining System: Presenting Concept through Design Approach
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 9, No. 4, April 2011
Kaustubh S. Raval (1), M.Tech. (Computer Engineering), raval_kaustubh@yahoo.co.in
Ranjeetsingh S. Suryawanshi (2), M.Tech. (Computer Engineering), ranjeetsuryawanshi06@gmail.com
Professor Devendra M. Thakore (3), Department of Computer Engineering, dmthakore@bvucoep.edu.in
(1, 2, 3) Bharati Vidyapeeth Deemed University, College of Engineering, Pune – 411043.
Abstract – Text mining is a variation on a field called data mining and refers to the process of deriving high-quality information from unstructured text. In text mining, the goal is to discover unknown information, something that may not yet be known to people. The aim here is to design an intelligent agent based text-mining system that reads the input text and, based on the keyword, provides the matching documents (in the form of links) or options (statements) according to the user’s query. This paper presents the design approach for such an intelligent agent based text-mining system.

Keywords – Data Mining, Text Mining, Intelligent Agent.

I. INTRODUCTION
We first introduce the basic terms on which this work is built.

Data Mining: Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner. It derives business intelligence from the data warehouse by using advanced analytical techniques such as neural network heuristics, fuzzy logic, statistical analysis, etc.

Automated Data Mining: Using automated data mining, we can sweep through databases and discover previously unknown patterns. In their paper [1], Dr. V. Saravanan and J. Rajan proposed an automated data mining system which encompasses familiar data mining algorithms. According to them, the system will automatically select the appropriate data mining technique and the necessary fields from the database at the appropriate time, without expecting the users to specify the techniques and the parameters.

Text Mining: Text mining is a variation on a field called data mining and refers to the process of deriving high-quality information from unstructured text. ‘High quality’ in text mining
usually refers to some combination of relevance, novelty and interestingness. [3]

Intelligent Agents: Intelligent agents are software entities that carry out some set of operations on behalf of a user with some degree of independence or autonomy and, in doing so, employ some knowledge or representation of the user’s goals or desires. Software agents are useful for automating repetitive tasks, finding and filtering information, intelligently summarizing complex data, and so on. More importantly, just like their human counterparts, intelligent agents can learn from the managers and even make recommendations to them regarding a particular course of action. Agents share several common characteristics, such as the ability to communicate, cooperate, and coordinate with other agents in the system. Each agent is capable of acting autonomously, cooperatively, and collectively to achieve the collective goal of the system. The coordination capability helps manage problem solving so that cooperating agents work together as a single team. [9]

Motivation
The literature study of various research papers and my interest in the field of ‘Data Mining’ motivated me to take this up as my dissertation topic for post-graduation.
The study of an existing biomedical text-mining system named ‘PolySearch’ also provided insight into text-mining systems overall and thus led me to take up ‘Intelligent Software Agent Based Text Mining’ as my dissertation topic.
The working scenario of the ‘Google Search Engine’ has also been a motivating factor for taking up this topic as my dissertation work. The ‘Google Search Engine’ is the best example of an optimized intelligent software agent based text-mining system encompassing a very large domain of the web.

II. SYSTEM DESIGN
System design includes a use-case diagram and a sequence diagram. The use-case diagram depicts how the user interacts with the proposed intelligent agent based system, whereas the sequence diagram depicts the flow of actions carried out by the different agents in the system.

Fig. 1 User Interacting with system

As shown in Fig. 1, the user will type the text, then text miner agent 1, which is keyphrase-based,
will decide the keyword, then the intelligent agent will decide the context for that ‘keyword’, then text miner agent 2, which is keyword-based, will decide the meaning of the keyword in that particular context, find the related documents, calculate the weight matrix value, and attach that value to each document. Then the intelligent agent will rank the documents based on their weight matrix values.

Fig. 2 Sequence Diagram

Fig. 2 shows the sequence diagram, that is, the interaction between the different agents of the system.

III. SYSTEM DESCRIPTION
The system description covers the details of the overall working of the existing or proposed system.

Why Agents?
Text mining mainly draws on the field of information retrieval, which means finding documents that contain answers to questions rather than finding the answers themselves. To achieve this, statistical measures and methods are used to process the text data automatically and compare it with the given question. The issue here is how to automate the processing of text data, and that is where ‘Agents’ come into the picture.

System Architecture
Fig. 3 shows the architectural diagram of the intelligent agent based text-mining system. It includes all the components required to make the system workable, along with the relationships and interactions between them. There are mainly three agents, one dataset, the user category, and one cache/log component.

Working of the Intelligent Agent in two phases:
Phase 1:
Takes the input from Text Miner Agent 1 (that is, the key-phrase/keyword).
Finds the contexts (documents) for the key-phrase.
Phase 2:
Takes the input from Text Miner Agent 2, that is, the links and their associated weight matrix values.
Compares the weight matrix values of the various links and decides which one is the ‘close-to-best-match’ for the user’s query.
The link with the highest weight matrix value is ranked first, the link with the second highest weight matrix value is ranked second, the link
with the third highest weight matrix value is ranked third, and so on.
Displays the ranked links to the user.

Fig. 3 Architecture of Intelligent Agent Based Text-Mining System

Phases in working of Intelligent Agent
In the proposed intelligent agent based system, the intelligent agent works in two phases.

In phase 1, the intelligent agent prompts text miner agent 1, which is the ‘Key-order and Key-phrase based agent’, for the required ‘key-phrase’, based on which the various contents need to be determined. Fig. 4 shows the working of the intelligent agent in phase 1 as a flowchart.

Fig. 4 Working of Intelligent agent in phase 1

In phase 2, the intelligent agent takes its input from text miner agent 2, which is the ‘Keyword based agent’. The input contains the list of links (documents/options) with their associated ‘weight matrix values’. These links are retrieved by checking every context, each containing different documents, in which the ‘key-phrase’ or ‘keyword’ has appeared. Using the ‘Decision making algorithm’, the intelligent agent then decides which of the many links (documents/options) is the ‘close-to-exact-match’ for the information the user is looking for.

The link (document/option) with the highest ‘weight matrix value’ is taken to be the ‘close-to-best-match’, the link with the second highest ‘weight matrix value’ is the second best match, and so on. The links are then ordered and ranked according to their ‘weight matrix values’ and presented to the user. Fig. 5 shows the working of the intelligent agent in phase 2 as a flowchart.
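
The two phases described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of the proposed system. The paper does not define how the ‘weight matrix value’ is computed, so the sketch assumes a simple keyword-frequency score. The names `Link`, `find_contexts`, `weigh_links` and `rank_links`, the in-memory `DATASET`, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical in-memory dataset: context name -> {link: document text}.
DATASET = {
    "medicine": {
        "http://example.org/flu": "virus infection and virus vaccine",
        "http://example.org/diet": "healthy food",
    },
    "computing": {
        "http://example.org/malware": "computer virus removal",
    },
}


@dataclass
class Link:
    url: str
    context: str
    weight: float  # stands in for the paper's 'weight matrix value'


def find_contexts(keyword, dataset):
    """Phase 1: return the contexts in which the keyword appears."""
    keyword = keyword.lower()
    return [
        context
        for context, docs in dataset.items()
        if any(keyword in text.lower().split() for text in docs.values())
    ]


def weigh_links(keyword, contexts, dataset):
    """Role of text miner agent 2: attach a weight to each matching document.

    Assumption: weight = share of the document's words equal to the keyword.
    """
    keyword = keyword.lower()
    links = []
    for context in contexts:
        for url, text in dataset[context].items():
            words = text.lower().split()
            count = words.count(keyword)
            if count:
                links.append(Link(url, context, count / len(words)))
    return links


def rank_links(links):
    """Phase 2: order links from the highest to the lowest weight."""
    return sorted(links, key=lambda link: link.weight, reverse=True)


if __name__ == "__main__":
    contexts = find_contexts("virus", DATASET)
    ranked = rank_links(weigh_links("virus", contexts, DATASET))
    for rank, link in enumerate(ranked, start=1):
        print(rank, link.url, link.context, round(link.weight, 2))
```

Keeping context lookup, weighting and ranking in separate functions mirrors the division of work between the intelligent agent and the two text miner agents, so the assumed frequency score could be swapped for another weighting scheme without touching the ranking step.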
Fig. 5 Working of Intelligent Agent in phase 2

IV. APPLICATIONS
The proposed system would work as the base for some specific fields where there is a requirement for intelligent agent based text mining. Each of these fields has different requirements for the type of information, according to its various uses.

1) Medical Science
In the medical science field, new medicines and vaccines are being invented day by day, so doctors need to be aware of what is going on in their field. Moreover, doctors are concerned with curing patients properly using medicines and other means.
Thus, the system to be developed under this dissertation work will provide the base for a specific ‘Text-Mining System for Medical Science’ and provide an automated way of dealing with the details required for various diseases and their probable solutions.

2) Space Science
New research is always going on in the field of space science, mainly related to astronomy.
Scientists are working to find out how the earth was born, how the environment developed on earth, how all the planets were born, and how the perimeters were decided for every planet. All these types of questions require mining a great deal of information, and scientists have to look at each and every aspect of the information very carefully.
Thus, the system to be developed can work as the base for a ‘Text-Mining System for Space Science’ and provide useful information to scientists for their research work.

3) Engineering Technologies
Engineering is a field which encompasses various specific fields. All these fields have specific applications, and this requires dealing with a great deal of text content. Engineers in different fields need to find solutions to various technological and technical problems. Dealing with a huge amount of text data is not an easy task, so it is better to have an automated (intelligent agent based) system to perform all this work.
The intelligent agent based text-mining system works with a huge amount of data and retrieves the required data within a fraction of a second or within minutes (in an ideal condition). Thus the
intelligent agent based systems can speed up data retrieval and processing.
Thus, the system to be developed can work as the base for a ‘Text-Mining System for Engineering Technologies’ and provide useful information to scientists/engineers for their research work.

CONCLUSION
Based on these design specifications, the intelligent agent based text-mining system would be developed, in which the intelligent agent needs to incorporate two algorithms:
1) Decision making algorithm – to determine the possible contexts (documents) for the keyword.
2) Ranking algorithm – to rank the documents (options).

REFERENCES
[1] Dr. V. Saravanan and J. Rajan, “A Framework of an Automated Data Mining System using Autonomous Intelligent Agents”, International Conference on Computer Science and Technology, pages 700-704, 2008.
[2] Ranjit Bose and Vijayan Sugumaran, “IDM: An Intelligent Software Based Data Mining Environment”, IEEE, pages 288-2893, 1998.
[3] Vishal Gupta and Gurpreet S. Lehal, “A Survey of Text Mining Techniques and Applications”, Journal of Emerging Technologies in Web Intelligence, vol. 1, pages 60-76, August 2009.
[4] Ah-Hwee Tan, “Text Mining: The state of the art and the challenges”.
[5] J. You and J. Liu, “An Agent Based Visual Data Mining for Intelligent Web Browsing with E-Commerce Applications”, IEEE International Fuzzy Systems Conference, pages 936-939, 2001.
[6] Azuraliza Abu Bakar, Zulaiha Ali Othman, Abdul Razak Hamdan, Rozianiwati Yusof, Ruhaizan Ismail, “Agent Based Data Classification Approach for Data Mining”, IEEE, 2008.
[7] Dae Su Kim, Chang Suk Kim, and Kee Wook Rim, “Modelling and Design of Intelligent Agent System”, International Journal of Control, Automation, and Systems, Vol. 1, No. 2, pages 257-261, June 2003.
[8] Andreas Hotho, Andreas Nurnberger, and Gerhard Paaß, “A brief Survey of text mining”.
[9] Stuart Russell and Peter Norvig, “Artificial Intelligence, Chapter 2: Intelligent Agents – A Modern Approach”.

AUTHORS PROFILE
1) Kaustubh S. Raval graduated (B.E. – Computer Engineering) from Gujarat University, Ahmedabad, Gujarat, in 2009. He is currently pursuing an M.Tech. (Computer) with specialization in ‘Data Mining’ at Bharati Vidyapeeth Deemed University College of Engineering, Pune.
2) Ranjeetsingh S. Suryawanshi graduated (B.E. – Computer Engineering) from Pune University, Maharashtra, in 2005. He is currently pursuing an M.Tech. (Computer) with specialization in ‘Data Mining’ at Bharati Vidyapeeth Deemed University College of Engineering, Pune.
3) Professor D. M. Thakore graduated (B.E. – Computer Engineering) from Shivaji University, Sangali, Maharashtra, in 1990. He completed his M.E. (Computer) at Bharati Vidyapeeth University College of Engineering, Pune, in 2004. He is currently pursuing his Ph.D. with specialization in ‘Data Mining/Text Mining’ at Bharati Vidyapeeth Deemed University College of Engineering, Pune.