               Web OCR Services in Indian Language
                              Presented By:
       Tushar Patnaik            Bhupendra Kumar          Deepak Kumar Arya
   School of Information       School of Information     School of Information
        Technology                  Technology               Technology
       CDAC, Noida                 CDAC, Noida              CDAC, Noida
tusharpatnaik@cdacnoida.in   bhupendra.kumar@cdac.in   deepakarya@cdacnoida.in
   Introduction to OCR
 An OCR system translates scanned document images into
  machine-encoded text.




[Figure: Input Image -> OCR System -> Output text]
Introduction to Web Services
 A web service is a software system designed to support interoperable
  machine-to-machine interaction over a network.
 Web services build a distributed computing platform for the web.
 They can combine different web applications and components.
 A web application is an application that is accessed over a
  network such as the internet or an intranet. Web applications:
    require no complex procedure to deploy in large organizations
    need little or no disk space on the client
    are easy to upgrade, since all new features are implemented on the server
    are cross-platform compatible
    integrate with other web applications
Web OCR Services
 The OCR is implemented as a web application where the user
  can upload an image and generate text output on the fly through
  the web.
 Challenges:
    Multi-user support with session handling
    Handling of non-standard documents
   Administrative control over the user and application
   Controlled access to resources
   Scalability issues
Need for Web OCR Service
 The need to develop a web-based OCR arose from the goal of
  putting the stand-alone OCR for Indian scripts online and
  getting feedback from users about the availability of such
  a service.
 There was a need to provide an online service so that users
  worldwide can also take advantage of OCR for Indian scripts.
 To maintain large volumes of data through a digital library.
 To preserve old and historical documents in electronic format.
Why ASP.NET?
Comparison of Visual Studio, Netbeans and PHP by key criteria

Nature
   Visual Studio: ASP.NET has a dynamic nature and has broken new ground by
    entering into new languages (even developing some of its own). It is form
    oriented, object oriented and precompiled.
   Netbeans: It is also form oriented, object oriented and precompiled, like VS.
   PHP: PHP is still stuck to its scripting language. It is an old software
    with no newer versions.

Programming Languages
   Visual Studio: Visual Studio supports different programming languages by
    means of language services, which allow the code editor and debugger to
    support nearly any programming language.
   Netbeans: Written in Java but can run anywhere a JVM is installed.
   PHP: PHP code is embedded into the HTML source document and interpreted by
    a web server, which generates the web page document. It has also evolved
    to include a command-line interface capability and can be used in
    standalone graphical applications.

Compiler
   Visual Studio: Parallel compilation on multicore systems improves
    performance by a good 25-30% over previous versions on C# apps. It
    integrates web services hosting, which earlier had to be done separately
    by the users.
   Netbeans: Newer Lexer makes faster runtime compilation.
   PHP: PHP is a loosely typed, objects optional, fixed syntax,
    component-less, runtime interpreted, structured programming model. It is
    not precompiled and form oriented as VS is.

Space utilization
   Visual Studio: ASP.NET utilizes server space while running.
   Netbeans: It uses server space and not inbuilt memory.
   PHP: Inbuilt memory space is used by PHP while running.

Security
   Visual Studio: ASP.NET is reputed for creating sophisticated techniques to
    ensure the safety of confidential data. It is professional in nature and
    is used for corporate projects.
   Netbeans: Security techniques are present but not as great as in VS.
   PHP: PHP provides security but does not ensure as much as .NET. It is not
    as professional and secure as required for corporate projects.
      Architecture
 The OCR Web portal can be accessed by two different types
  of users.
 The administrator controls other users' activities and has
  write access to web applications.
 The web portal provides services to the users through
  back-end web applications such as OCR services, preprocessing
  services, text editing and other image processing facilities.
Web OCR Service: a View
[Screenshots, in order:]
 User's Dashboard ("No files to Display")
 File Upload
 Preview Uploaded File
 File Editing
 Edited File
 OCR File
 OCRed File with Output
 User's Dashboard after File has been OCRed
Web OCR Services using Grid

  Grid computing follows a service-oriented architecture. It
   provides hardware and software services and infrastructure
   for secure and uniform access to heterogeneous resources,
   and enables the formation and management of virtual
   organizations.
  “A computational grid is a hardware and software
   infrastructure that provides dependable, consistent,
   pervasive, and inexpensive access to high-end computational
   capabilities.”
   - “The Grid: Blueprint for a New Computing Infrastructure”,
   Foster & Kesselman
What can grid computing provide?

  • Exploit underutilized resources
    • the application must be executable remotely.
    • remote machine must meet any special hardware, software, or
      resource requirements imposed by the application.
  • Parallel CPU capacity
  • Access to additional resources
  • Resource balancing
  • Reliability
    • multiple copies
    • automatically resubmit jobs
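
The reliability point above (automatically resubmitting jobs) can be illustrated with a minimal Python sketch. The slides do not describe the actual grid middleware, so the worker functions, the `run_with_resubmit` helper and the retry limit are hypothetical names chosen only for illustration.

```python
from typing import Callable, Sequence


class WorkerFailure(Exception):
    """Raised by a (hypothetical) worker agent that cannot finish a job."""


def run_with_resubmit(job: str,
                      workers: Sequence[Callable[[str], str]],
                      max_attempts: int = 3) -> str:
    """Try the job on the workers in round-robin order until one succeeds."""
    last_error = None
    for attempt in range(max_attempts):
        worker = workers[attempt % len(workers)]
        try:
            return worker(job)
        except WorkerFailure as err:
            last_error = err  # remember the failure, then resubmit elsewhere
    raise RuntimeError(
        f"job {job!r} failed after {max_attempts} attempts") from last_error


def flaky_worker(job: str) -> str:
    """Stand-in for a node that is offline."""
    raise WorkerFailure("node offline")


def healthy_worker(job: str) -> str:
    """Stand-in for a node that returns OCR text."""
    return f"text for {job}"


result = run_with_resubmit("page1.tif", [flaky_worker, healthy_worker])
```

Here the first attempt fails on `flaky_worker` and the job is resubmitted to `healthy_worker`; a real manager would also need timeouts and some way to detect lost workers.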
        Basic Grid Architecture

[Figure: basic grid architecture. Labels: Users, Internet, Manager, Web OCR Services, Internet or Intranet, Worker Agent]
    Future Scope
 Multiple file upload with status bar.
 Animation/progress indicator at the time of OCR execution.
 Batch processing of files.
 Deciding the process-flow and saving the workflow for future use.
 Dictionary based corrections in the output of OCR.
 Controls for applying multiple types of text formatting like Bold,
  Italics, Underline etc.
 Zoom-in and Zoom-out functions for both input and output images.
 Conversion of the exe files of all the OCRs to DLL library files and
  integrating them.
 Authenticating user login through OCR CAPTCHA.
                  CONCLUSION
 The proposed system has been designed and implemented,
  providing the services defined.
 At present, OCRs for five scripts have been integrated.
 OCRs for seven more scripts are planned to be integrated over
  the next two years.
 The computational work of the OCR engine will be handled by
  the grid architecture.
 The number of users of the Web OCR services may not be large
  at present, but as more facilities and OCRs are included,
  a large number of users will benefit.
               References
 Software Works, “Comparison of dot net, J2ee, PHP”,
  http://software-orks.blogspot.com/2008/12/comparison-chart-net-j2ee-php.html
 MSDN, http://msdn.microsoft.com/en-us/netframework/aa496123
 Ian Foster, Carl Kesselman, and S. Tuecke, “The Anatomy of the
  Grid: Enabling Scalable Virtual Organizations”,
  International Journal of Supercomputer Applications, 15(3), Sage
  Publications, 2001, USA.
 Rajkumar Buyya and Srikumar Venugopal, “A Gentle
  Introduction to Grid Computing and Technologies”, CSI
  Communication, Vol. 9, July 2005.
                    Abstract
 This paper proposes a development methodology for web OCR
  services.
 The term Web services describes a standardized way of integrating
  Web-based applications using XML, SOAP, WSDL and HTTP.
 Rather than providing a GUI, Web services share business logic,
  data and processes through a programmatic interface across a network.
 Developers can then add the Web service to a GUI (such as a Web
  page or an executable program) to offer specific functionality to
  users.
 Services like optical character recognition, where the user can
  upload an image and get the text output on the fly through the web,
  are still not available on the web for Indian languages.

         Keywords- Web Services, OCR Services, Image processing
                   INTRODUCTION
 A web application is an application that is accessed over a network
    such as the internet or an intranet. Web applications are popular due to
    the ubiquity of web browsers, and the convenience of using a web
    browser as a client, sometimes called a thin client.
 Common web applications include webmail, online retail sales, online
  auctions, wikis and many other functions.
 Services like optical character recognition, where the user can upload
  an image and get the text output on the fly through the web, are still
  not available on the web for Indian languages.
 The framework for the OCR services uses ASP.NET in the middle-tier
  application logic. The framework supports multiple users,
  authentication, session handling, multiple file upload, user control over
  the technical flow, session saving and multilingual facilities for the user.
 It also supports handling of non-standard images, administrative control
  over client requests and resources, multilevel priorities for users,
  scalability (horizontal and vertical) and transparency when the
  application is replaced, repaired or upgraded.
 The Ministry of Information and Technology has constituted a
  consortium to develop Indian-language OCR so that all Indian
  languages can be digitized. CDAC Noida, as a consortium
  member, has developed a Web OCR service portal for internet
  users.
 The comparative study led us to select Visual Studio .NET,
  as it has a dynamic nature and has broken new ground by
  entering into new languages (even developing some of its own). It
  is form oriented, object oriented and precompiled, unlike PHP.
 VS Team System Database Edition has excellent database-code
  integration tools. LINQ code generators are another excellent
  feature. Winforms and ASP forms are great and better than those of
  Netbeans.
 In this paper, we propose a development methodology for the Web
  OCR services using Visual Studio 2010 .NET tools, with ASP.NET
  version 4 as the development technology.
                     Architecture
 The architecture in Figure 1 shows that the OCR Web portal can
  be accessed by two different types of users.
 The administrator's functions and controls are different from
  the normal user's controls.
 The web portal provides services to the users through back-end
  web applications such as OCR services, preprocessing services,
  text editing and other image processing facilities.

                                     Figure 1
 The use-case diagram is shown in Figure 2. The diagram describes
  the set of actions that the system can perform in collaboration
  with external users or actors.

                                Figure 2
• The OCR Web Portal is to incorporate an end-to-end OCR
system for different scripts, preprocessing modules, and
different levels of access for the end user and the administrator.
• The user can upload one or more input files through the web
portal after logging in to the server, and can then select
the OCR or preprocessing module to execute.
• The text output can be edited by the user through the
web portal, which therefore requires an online keyboard for each
script.
• The administrator controls of the web portal provide
the facility to control other users' activity and the
configuration of the OCR and preprocessing modules.

                                   Fig 3. Workflow
       MODULES AND PROCESSES
 This section provides a general description of the modules
  and where each fits in the global picture. The OCR Web
  Portal comprises the following modules.
     User activity and control
            • Login module
            • New registration module
     Keyboard
     Administrator activity and control
     OCR modules
     Preprocessing Modules
     Log creation and maintenance
     Output Generation modules
        A. User activity and control
 This module defines the role and access rights of the end user
  of the OCR Web Portal.
 The user module interfaces with the login and new registration
  modules.
 The login module checks the supplied credentials and
  verifies the user.
 The new registration module defines how a new user obtains
  credentials.
 The user module specifies the services provided to the end
  user and maintains sessions. The services include uploading one
  or more files, downloading the output data, selecting the OCR or
  preprocessing module to execute on the input file, editing
  the output text file using the online keyboard, and logging out.
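
As a rough illustration of the login and new registration modules, the Python sketch below stores a salted password hash at registration and verifies it at login. It is not the portal's ASP.NET implementation: the in-memory `_users` store, the function names and the iteration count are assumptions made for the example.

```python
import hashlib
import hmac
import os

# In-memory stand-in for the portal's user store (hypothetical).
_users = {}


def _hash_password(password: str, salt: bytes) -> bytes:
    """Derive a key from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(username: str, password: str) -> bool:
    """Create credentials for a new user; refuse duplicate usernames."""
    if username in _users:
        return False
    salt = os.urandom(16)
    _users[username] = (salt, _hash_password(password, salt))
    return True


def login(username: str, password: str) -> bool:
    """Verify the supplied credentials against the stored hash."""
    record = _users.get(username)
    if record is None:
        return False
    salt, stored = record
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(stored, _hash_password(password, salt))
```

A successful `login` is the point at which a real portal would create the session that the user module then maintains.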
        B. Online Keyboard module

 The module specifies the design and usage of the online keyboard
  used by the user module for text editing through the OCR
  web portal.
 This module generates an online keyboard for every script
  included in the OCR module.
 It also interfaces with the OCR selected by the user so that it
  can initialize the correct keyboard for the user on the web
  portal.
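
One way to pick the correct keyboard for the selected OCR is to map each script to its Unicode block and list the assigned characters. The Python sketch below does this under stated assumptions: the source does not name the five supported scripts, so the entries in `SCRIPT_BLOCKS` are examples only, and `keyboard_keys` is a hypothetical helper.

```python
import unicodedata

# Unicode block ranges for a few Indic scripts. Which scripts the portal
# actually supports is not stated in the source; these are examples.
SCRIPT_BLOCKS = {
    "devanagari": (0x0900, 0x097F),
    "bengali": (0x0980, 0x09FF),
    "gurmukhi": (0x0A00, 0x0A7F),
    "tamil": (0x0B80, 0x0BFF),
}


def keyboard_keys(script: str) -> list:
    """Return the assigned characters of the script's Unicode block."""
    start, end = SCRIPT_BLOCKS[script]
    # unicodedata.name returns the default "" for unassigned code points,
    # so those positions are skipped.
    return [chr(cp) for cp in range(start, end + 1)
            if unicodedata.name(chr(cp), "")]


keys = keyboard_keys("devanagari")
```

A production keyboard would group these characters into vowels, consonants, signs and digits rather than show the raw block order.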
C. Administrator activity and control
 The module specifies the control mechanism for the web
  portal.
 A user with administrator privileges needs to provide valid
  credentials to access the services. The services include
  checking the input and output files of normal users and
  controlling the configuration files for the OCR and
  preprocessing modules.
 The OCR and preprocessing modules read the
  configuration file before processing the input, so the file
  controls the technical flow.
 The administrator can change the configuration files
  to help generate better output for the user.
 The administrator can also access the various log files, as this
  module interfaces with the log creation and maintenance module.
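
The idea that a configuration file controls the technical flow can be sketched as follows. The JSON layout, the step names and the `run_pipeline` helper are hypothetical; the source does not describe the real configuration format, and the steps here only record their names instead of processing an image.

```python
import json

# Hypothetical configuration an administrator might edit.
CONFIG_TEXT = """
{
  "script": "devanagari",
  "preprocessing": ["binarize", "deskew"]
}
"""

# Placeholder steps: each appends its name instead of touching pixels.
STEPS = {
    "binarize": lambda trace: trace + ["binarize"],
    "deskew": lambda trace: trace + ["deskew"],
    "denoise": lambda trace: trace + ["denoise"],
}


def run_pipeline(config_text: str) -> list:
    """Read the configuration first, then run the listed steps in order."""
    config = json.loads(config_text)
    trace = []
    for name in config["preprocessing"]:
        if name not in STEPS:
            raise ValueError(f"unknown preprocessing step: {name}")
        trace = STEPS[name](trace)
    trace.append("ocr:" + config["script"])
    return trace


flow = run_pipeline(CONFIG_TEXT)
```

Changing the `preprocessing` list in the configuration changes the flow without touching the code, which is the kind of control the administrator module is described as having.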
      D. OCR Module/Preprocessing Module
 The module is responsible for generating XML files
  according to schemas, which provides a common interface
  for any OCR and preprocessing module.
 The current version of Web OCR contains OCR engines for five
  scripts.
 Other image editing facilities, such as image rotation,
  brightness control and image cropping, are also provided in
  the web portal.
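
To make the XML interface concrete, here is a minimal Python sketch that wraps recognized lines in an XML document. The element and attribute names (`ocrResult`, `line`, `image`, `script`, `n`) are invented for illustration; the actual schemas used by the portal are not given in the source.

```python
import xml.etree.ElementTree as ET


def build_result_xml(image_name: str, script: str, lines: list) -> str:
    """Wrap recognized text lines in a small XML document."""
    root = ET.Element("ocrResult", {"image": image_name, "script": script})
    for number, text in enumerate(lines, start=1):
        line = ET.SubElement(root, "line", {"n": str(number)})
        line.text = text
    # encoding="unicode" returns a str instead of encoded bytes.
    return ET.tostring(root, encoding="unicode")


xml_text = build_result_xml("page1.tif", "devanagari", ["नमस्ते", "भारत"])
```

Because every OCR or preprocessing module would emit the same structure, the portal could consume the output of any of them in the same way.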
     E. Log creation and Maintenance
 This module interfaces with the user module and the administrator
  module and logs information about activities on the web
  portal.
 The interface to the user module is used for creating a log of user
  activities, while the interface to the administrator module is used for
  retrieving the log information.
 This module also records the text editing done by the user.
 The log contains every edit made to the text
  output of the document image.
 This information is very useful for improving OCR
  engine performance, as it can reveal the most frequent errors
  made by the OCR itself at the character and word level.
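
A sketch of how logged edits could reveal frequent character-level errors: compare each OCR output with the user's corrected text and count the replaced spans. The log format, the sample entries and the `substitutions` helper are assumptions; Latin text is used only to keep the example easy to read, and the same approach applies to Unicode text in any script.

```python
from collections import Counter
from difflib import SequenceMatcher


def substitutions(ocr_text: str, corrected_text: str) -> Counter:
    """Count character spans the user replaced while correcting OCR output."""
    counts = Counter()
    matcher = SequenceMatcher(None, ocr_text, corrected_text, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            counts[(ocr_text[i1:i2], corrected_text[j1:j2])] += 1
    return counts


# Hypothetical log entries: (OCR output, text after the user's edits).
edit_log = [("he11o", "hello"), ("wor1d", "world"), ("c1ean", "clean")]

totals = Counter()
for ocr_text, corrected_text in edit_log:
    totals.update(substitutions(ocr_text, corrected_text))
```

Sorting `totals` by count (for example with `totals.most_common()`) gives a ranked list of confusions that could guide work on the OCR engine.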
         F. Output Generation module
 The module defines the format of the output text generated by
  the OCR engine and the text editing facilities to be provided.
 The output text should be in Unicode so that the output for all
  scripts is standardized and accessible everywhere.
 Text editing is provided by a rich text control in which the
  user edits the output with bold, italics, underline, coloring and
  other formatting.
 This control also provides a print function, so the user can send
  the output to a printer without saving it to local
  disk.
 The dictionary module can also be embedded into the rich text
  control.
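
The Unicode requirement can be illustrated with a short Python sketch that normalizes the OCR text and encodes it as UTF-8. The choice of NFC normalization and UTF-8 is an assumption for the example; the source only says that the output should support Unicode.

```python
import unicodedata


def encode_output(text: str) -> bytes:
    """Normalize to NFC and encode as UTF-8 so every script shares one format."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


def decode_output(data: bytes) -> str:
    """Decode UTF-8 bytes produced by encode_output."""
    return data.decode("utf-8")


sample = "हिन्दी தமிழ்"
round_trip = decode_output(encode_output(sample))
```

Normalizing before saving means that visually identical strings are stored as the same sequence of code points, which helps later searching and dictionary lookup.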

				