					Portable Audio Visual Book Reader
                  Team 2
          Facilitator: Selin Aviyente

                Jason Cooper
               Arthur Hallman
              Francis Okonkwo
               Jaeseung Shim
                 James Yang

              September 17, 2007

Executive Summary

The purpose of this document is to propose a practical solution for building a portable

audio visual book reader. The portable audio visual book reader, or MP3 player with

display, is designed to meet the needs of persons with reading disabilities. The display

will present the printed text in synchronization with the MP3 audio. With a budget of

$500, Design Team 2 will implement a feasible device that will be ready for

manufacture. This will be done through research and the construction of an MP3 player,

display, and the essential software. The project deadline is December 5, 2007.

                               Table of Contents
Executive Summary
Introduction
Background
Objective and Design Specifications
Conceptual Design
      Overview
      Design Components
                Audio
                MP3 Decoding
                Display
                MP3 Encasing
Risk Analysis
Project Management Plan
Budget
References


Introduction

    In today's society, MP3 players have become a necessity for many people. Technology

is developing rapidly, and MP3 players are one example of technological growth. Due to

increasing interest in MP3 players, Chrysler, the Midland, Michigan Rotary, and the Michigan

State University Resource Center for Persons with Disabilities (RCPD) are sponsoring

Michigan State University's ECE480 Design Team 2 with funds to build an MP3 player

that can transform into a learning device for people with reading disabilities.

    The purpose of this project is to build an MP3 player that is simple to use, useful to

people with learning disabilities, and distinctive in the ways it helps to educate and teach.

The MP3 player will also serve as a portable audio/visual book reader, similar to

other book readers on the market.

        Timing is a crucial component in this project. It is important to find

software that displays the right text on the screen of the MP3 player while that text is

being read aloud. The idea is to create a portable player that is similar to a karaoke

machine: a product that can display words and also synchronize the audio. The main

focus of this project is to design a portable visual audio player that can benefit people

with reading disabilities; however, there are some complications that include finding the

right software that can synchronize audio and text in one MP3 file for the project to be


        Applications such as Kurzweil 3000 have made books and other written text accessible to

individuals with reading disabilities. These applications, for all their value, confine their users to the

computers running the programs. The goal of this project is to provide a more flexible reading

experience for persons with these disabilities.

Background

       The MSU Resource Center for Persons with Disabilities (RCPD) continually works

to develop technologies that maximize the ability and opportunity for full participation

by persons with disabilities. Because of the reading hardships experienced by persons with

disabilities, MSU RCPD has requested a portable MP3 player capable of displaying

printed text in synchronization with the MP3 audio to alleviate the hardships experienced

by these persons.

       Technology currently exists that allows printed text to be scanned and read aloud by

software on a computer or device; the most notable examples are those developed by Kurzweil

Technologies. One such technology is the Kurzweil 3000 computer program, which uses

specialized optical character recognition (OCR) software to speak aloud text from

electronic or printed material in a synthesized voice while highlighting each word for

assistive reading. Another existing technology is the Kurzweil-National Federation of the

Blind Reader (K-NFB). The K-NFB reader is a portable reading device for the blind that

uses a digital camera, PDA, and OCR software to capture printed material and read the

text aloud with a synthesized voice.

       Formats such as synchronized lyrics (LRC files) and karaoke programs have

also been developed; these closely resemble our project in that song

audio is played in synchronization with the printed lyrics (text) on a display.

Objectives and Design Specifications

        The portable audio visual book reader will extend the functionality of the Kurzweil

3000 to a portable audio player. The Kurzweil 3000 voice synthesizer is to be used to generate an

audio file from a selected text and the display unit is to present the text in synchronization with

the audio. The project objectives have been accounted for with the following basic design


        1)   MP3 Decoder Circuit.

             The Kurzweil 3000 software exports the synthesized audio in MP3 format; therefore,

             an MP3 decoding circuit must be developed in order to process the audio. The MP3

             decoding circuit must include basic functions such as play, pause, search and volume


        2) Text Synchronization Algorithm

             There is no means to extend the text highlight function of the Kurzweil 3000 program;

             therefore, the team must come up with an algorithm to present the text in

             synchronization with the audio. Each audio file will be associated with a text file to

             ensure consecutive processing. The synchronized text is to highlight individual words

             as they are being read.

        3) Display

             A display unit such as a liquid crystal display (LCD) or organic light-emitting

             diode (OLED) display will be interfaced to the device to present the synchronized text as

             well as basic information such as the title, chapter, and page numbers.

        4) Storage Unit

             Audio and associated text files will be stored on a storage unit such as a flash card.

             Removable storage will be used to allow for easy expansion and computer


       5) Power

              Rechargeable Nickel Metal Hydride (NiMH) batteries will be used to power this


       This device is intended for consumer use, so careful consideration must be

given to the end-user experience. The following factors will serve as feasibility criteria for

our design.

       (1) Usability

       (2) Cost

       (3) Durability

       (4) Size

       (5) Power Consumption

Conceptual Design


Overview

       At its core, the audio-visual book reader will be a portable version of capabilities that

already exist. The best software package offering such capabilities is the Kurzweil

3000 software, which allows one to read books through audio and visual cues. These cues

include displaying the pages of the book while highlighting each word as it is read aloud.

This provides a powerful tool for persons with disabilities (those without may also find it

useful). In addition, the Kurzweil 3000 program supplies the user with an option to create

an audio mp3 file of an entire text. In our portable version, we will employ this audio file,

which will be loaded onto an external storage device such as a memory card. Most mp3

files consist of music or other audio clips, so each file has some text associated

with its audio content. The best way to visualize this is through

karaoke, which directly correlates the audio and text components of the mp3 file (or other

format). Our goal with this design is the very same as that of karaoke: associating and

synchronizing the spoken text with the corresponding mp3 file. Our design takes the text

associated with the audio file and time stamps it to enable proper synchronization with

the audio. This will allow us to display and highlight the proper text as it is being read

instead of an entire sentence or page of text, which may be confusing to the intended

user. The display for the text will be very simple and thus presented on an inexpensive

OLED screen mounted directly onto the player.

         Figure 1. System Block Diagram

Design Components


Audio

        To solve the audio aspect of our design, we turn to mp3 files. An mp3 is simply an

audio-specific file format. However, an mp3 file is not merely a digital representation of

an analog audio clip. As most sound engineers know, audio signals are quite rich

in content, so it would not be practical to store them uncompressed on a PC, let alone on a

portable device such as ours. To overcome this problem, the mp3 format encodes and compresses

the audio greatly. This in turn greatly reduces the amount of data contained in the file while

retaining most of the perceived quality of the recording. With its compressed size, the file is

easily transferred to a digital device such as ours, which allows for both a reduction in memory

usage and better portability and usability. Once the file is transferred to such a device, it can

be quickly decoded, allowing the user to hear the audio clip at close to its original quality.
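
The storage saving described above can be made concrete with a little arithmetic. The sketch below compares one hour of uncompressed audio with one hour of mp3 audio. The figures are assumptions for illustration only: CD-quality PCM (44.1 kHz, 16 bit, stereo) and a 128 kbit/s mp3. The bitrate that Kurzweil 3000 actually exports is not stated in this proposal.

```python
# Back-of-the-envelope storage comparison for one hour of audio.
# Assumed figures (not from the proposal): CD-quality PCM and a 128 kbit/s mp3.

def audio_bytes(bitrate_bps: int, seconds: int) -> int:
    """Size in bytes of a constant-bitrate stream of the given length."""
    return bitrate_bps * seconds // 8

PCM_BITRATE = 44_100 * 16 * 2   # sample rate * bit depth * channels, in bit/s
MP3_BITRATE = 128_000           # assumed mp3 bitrate, in bit/s
ONE_HOUR = 3600                 # seconds

pcm_size = audio_bytes(PCM_BITRATE, ONE_HOUR)
mp3_size = audio_bytes(MP3_BITRATE, ONE_HOUR)
ratio = pcm_size / mp3_size

print(f"PCM: {pcm_size / 1e6:.1f} MB, mp3: {mp3_size / 1e6:.1f} MB, ratio {ratio:.1f}:1")
```

Under these assumptions an hour of narration shrinks from roughly 635 MB to under 58 MB, which is what makes a small flash card a workable storage unit.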

        In order to apply such a powerful file format for our design application, we will

use the Kurzweil 3000 software. This software allows us to scan books or articles of our

choice into the Kurzweil 3000 application. From here, the text is transformed into a

format that allows for automated text to speech conversion. This entails synchronized

visual and audio components. The best feature of this software (at least from a design

standpoint) is its ability to create mp3 files of any book it has read. In addition, the features of

the file itself can be customized. For instance, the voices can be changed prior to any

reading and the output is then saved as an mp3 file. It is this very file that we will employ

in this design.

MP3 Decoding

The Kurzweil application provides us with an mp3 file, which will be utilized by our

device. However, this file is only an encoded representation of the audio clip; in its

encoded form it is useless, as it does not produce any sound. In order to play back the

audio files created by Kurzweil 3000, an mp3 decoder chip is needed to process the file.

Many mp3 decoder chips such as the VS101 and the STA101 can be used and are readily

available; however, they add complexity and cost. For instance, a

microcontroller such as the PIC processor is required to interface with the chip. In addition,

the mp3 file must be stored on a memory card. These components must be individually

tested, programmed, and then implemented using custom-ordered printed circuit boards (PCBs).

However, in order to simplify this process and substantially reduce the chance of error, we can

obtain the components for the mp3 player by purchasing them as a kit, which we can then modify

slightly to fit the design requirements.

                            Figure 2. mp3 kit – circuit board

The main such modification will come with text synchronization, as an mp3 file simply

provides audio. The synchronized text effect can be achieved with the use of LRC files.

LRC is simply a computer file format that synchronizes song lyrics with an audio file.

Thus, when an audio file is played with certain music players on a computer or on

modern digital audio players, the song lyrics are displayed. In order to achieve this

synchronization, the software or hardware that produces LRC files simply

places a time stamp at every word in the audio file. This allows the LRC processing

software to display a word at each time stamp. Earlier versions of the format placed a time

stamp only on each line of text; however, this displays lyrics by line instead of by word, which

would be insufficient for our design. An LRC file is thus simply a text file with time stamps

between words to control how they are displayed. An example of an LRC file is

displayed below:

 [mm:ss.xx]<mm:ss.xx>line 1 word 1 <mm:ss.xx>line 1 word 2 <mm:ss.xx>line 1 last word<mm:ss.xx>
 [mm:ss.xx]<mm:ss.xx>line 2 word 1 <mm:ss.xx>line 2 word 2 <mm:ss.xx>line 2 last word<mm:ss.xx>
 [mm:ss.xx]<mm:ss.xx>last line word 1 <mm:ss.xx>last line word 2 <mm:ss.xx>last line last word<mm:ss.xx>
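
To illustrate how a line in the word-level format above could be read by software, here is a minimal Python sketch. It is not the firmware of the mp3 kit. The function names (`parse_line`, `to_seconds`) and the sample sentence are hypothetical, and the sketch assumes every line carries one `[mm:ss.xx]` line tag followed by `<mm:ss.xx>` word tags exactly as in the example.

```python
import re

# One [mm:ss.xx] tag at the start of the line, then <mm:ss.xx> tags before each word.
LINE_TAG = re.compile(r"^\s*\[(\d+):(\d+(?:\.\d+)?)\]")
WORD_TAG = re.compile(r"<(\d+):(\d+(?:\.\d+)?)>")

def to_seconds(minutes: str, seconds: str) -> float:
    """Convert the mm and ss.xx parts of a tag to seconds."""
    return int(minutes) * 60 + float(seconds)

def parse_line(line: str):
    """Return (line_start, [(word_start, text), ...]) for one time-stamped line."""
    match = LINE_TAG.match(line)
    if match is None:
        raise ValueError("line does not start with a [mm:ss.xx] tag")
    line_start = to_seconds(*match.groups())
    # re.split with two capture groups yields:
    # [text_before_first_tag, mm, ss, text, mm, ss, text, ...]
    parts = WORD_TAG.split(line[match.end():])
    cues = []
    for i in range(1, len(parts), 3):
        text = parts[i + 2].strip()
        if text:  # the closing tag has no text after it and is skipped
            cues.append((to_seconds(parts[i], parts[i + 1]), text))
    return line_start, cues

sample = "[00:12.00]<00:12.00>The <00:12.40>quick <00:12.90>fox<00:13.50>"
line_start, cues = parse_line(sample)
print(line_start, cues)
```

The closing tag on each line marks when the last word ends; this sketch discards it, but a player could keep it to know when to clear the highlight.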

Formatting such as backgrounds and text colors can be applied to highlight individual

words. Software such as A2 Media Player Pro version 2.40 can be used to easily create

LRC files. In their raw form, these LRC files can be processed and synchronized using

a tagging format such as Lyrics3, which embeds the time-stamped text

directly into the mp3 file itself. This allows us to avoid additional .TXT or .PDF files,

which in turn allows the time-stamped text to be processed

by the mp3 decoder kit itself. This method thus requires no additional hardware, reducing

both the cost and the complexity of our design.


Display

A simple OLED display can be interfaced to the PIC processor to display the text from the time

stamp file. OLED displays provide a cost-efficient solution for our display. An OLED (organic

light-emitting diode) display is a flat panel display whose surface has a film

of organic compounds. These compounds are printed onto a flat surface in a matrix pattern

consisting of rows and columns, which results in pixels that can emit light of various colors and

shades. They are ideal for many applications, especially ours. Since they can be

printed using existing technologies and on virtually any substrate, they are very versatile,

making them well suited to small applications such as an mp3-type player.

                                                         Figure 3. OLED Structure

The main advantage of OLED displays is the way they operate. The OLED pixels themselves emit

the light, so they do not require a backlight, unlike their LCD counterparts. This

in turn reduces power consumption and boosts the overall efficiency and usability of our design.

OLED displays, like LCDs, are easily interfaced with the PIC processors found in the mp3

kit. This interface allows us to display not only the ‘title’ and ‘artist’ (in our case title and author)

of the audio file but also a word at each time stamp. Since a time stamp is given for each

word, each word is thus displayed independently.

                                     Figure 4. OLED in Practice
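
The per-word display described in this section comes down to one question asked repeatedly during playback: given the current playback time, which word should be shown? The sketch below is a host-side illustration of that lookup in Python, not code for the PIC processor. The function name `current_word` and the cue list are hypothetical, and the sketch assumes the cues are already sorted by start time.

```python
from bisect import bisect_right

def current_word(cues, playback_time):
    """Return the word whose time stamp most recently passed, or None.

    cues is a list of (start_seconds, word) pairs sorted by start_seconds.
    """
    starts = [start for start, _ in cues]
    # Index of the last cue that starts at or before playback_time.
    index = bisect_right(starts, playback_time) - 1
    return cues[index][1] if index >= 0 else None

cues = [(12.0, "The"), (12.4, "quick"), (12.9, "fox")]
for t in (11.0, 12.0, 12.5, 20.0):
    print(t, current_word(cues, t))
```

A binary search keeps the lookup fast even for a whole chapter of cues. On a small microcontroller the same effect could be had by keeping an index to the next cue and advancing it as playback time passes each time stamp.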

MP3 Casing

With each part independently implemented and then interfaced, some sort of casing or packaging

must be provided to ensure usability. The best option here is to use a simple plastic

molding that can be modified to fit all parts of our design. Moldable plastic can take any shape

and then becomes hard after it has cooled to a lower temperature. The moldable plastic comes in

the form of pellets, which become moldable after they are placed in 160 °F water. Once the plastic

cools, it becomes very hard and durable, which would allow us to protect the rest of our design. This

product is available for only $15 for every 250 grams. It is a very cheap and

simple solution for our housing design.

                           Figure 5. Moldable Plastic Pellets

                         Figure 6. Hardened Moldable Plastic

Risk Analysis

       Several important challenges arise when designing ways to synchronize the

audio with the text. One of these challenges is designing a way to automate the process

of encoding the time stamps into both the audio files and the LRC files. Manually

encoding time stamps into these two different files would prove to be an arduous, ineffective

task for the end user, which is why an automated process is needed. Another challenge is

creating the LRC file itself. The Kurzweil 3000 software has the ability to provide the

MP3 audio from processed readings but we currently have no effective method to obtain

the text format needed by the LRC files.
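
One possible starting point for automating the time stamps is sketched below in Python. It is a rough heuristic, not the method chosen by the team: it assumes the start and end time of each line of audio are already known and spreads that interval over the words in proportion to their length in characters. The function names (`format_tag`, `estimate_line`) are hypothetical, and a synthesized voice will not follow these estimates exactly.

```python
def format_tag(seconds: float) -> str:
    """Format a time in seconds as mm:ss.xx, rounded to hundredths of a second."""
    centiseconds = round(seconds * 100)
    minutes, centiseconds = divmod(centiseconds, 6000)
    return f"{minutes:02d}:{centiseconds // 100:02d}.{centiseconds % 100:02d}"

def estimate_line(words, start, end):
    """Build one time-stamped line, spreading [start, end] over words by length."""
    if not words:
        raise ValueError("at least one word is required")
    total_chars = sum(len(word) for word in words)
    span = end - start
    pieces = [f"[{format_tag(start)}]"]
    elapsed_chars = 0
    for word in words:
        # Assumption: speaking time is proportional to the characters read so far.
        word_start = start + span * elapsed_chars / total_chars
        pieces.append(f"<{format_tag(word_start)}>{word} ")
        elapsed_chars += len(word)
    pieces.append(f"<{format_tag(end)}>")
    return "".join(pieces)

print(estimate_line(["The", "quick", "fox"], 12.0, 14.2))
```

The output follows the word-level layout shown in the MP3 Decoding section, so it could be checked by ear against the audio before any effort is spent on a more accurate alignment method.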

Project Management Plan

       The team for this project consists of the following five electrical and computer

engineers: Jason Cooper – management, Arthur Hallman – presentation prep, Francis

Okonkwo – webmaster, James Yang – documentation prep, and Jaeseung Shim – lab



Budget

Our current budget for the project is provided by the ECE Department and is set at $500.

The team projects this amount to be sufficient to complete the project.

▪ LED Display
20x4 STN Negative Blue Transmissive White Edge LED Backlight LCD Module
Price = $30.98

▪ MP3 kit
Daisy open source MP3 player kit
Price = $114.95

Total = $145.93
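
The total above can be checked, and the remaining budget worked out, with a short calculation. The sketch below uses only the two prices and the $500 budget given in this section; the variable names are illustrative. Other costs mentioned in the proposal, such as the moldable plastic, are not included because they are not listed as budget items.

```python
from decimal import Decimal

# Prices as listed in the budget; Decimal avoids binary floating-point rounding.
BUDGET = Decimal("500.00")
items = {
    "20x4 LCD module": Decimal("30.98"),
    "Daisy open source MP3 player kit": Decimal("114.95"),
}

total = sum(items.values(), Decimal("0"))
remaining = BUDGET - total

print(f"Total = ${total}, remaining = ${remaining}")
```

The two listed items leave $354.07 of the $500 budget uncommitted.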

References

Kelly, Kevin. "Cool Tool: Shapelock." 20 December 2004. Date Accessed: 18 September 2007.

Stoffregen, Paul. "PJRC MP3 Player." 23 February 2005. Date Accessed: 18 September 2007.

Wikimedia Foundation, Inc. "OLED Screen." 4 April 2007. Date Accessed: 18 September 2007.

O’Reilly Media, Inc. "Daisy MP3 Player Kit." Date Accessed: 18 September 2007.

Crystalfontz America, Inc. "LCD Modules." Date Accessed: 17 September 2007.

Kurzweil Educational Systems, Inc. "Kurzweil 3000 – Solutions for Struggling Readers."
      Date Accessed: 16 September 2007.

Reading Technology, Inc. "The Kurzweil – National Federation of the Blind Reader."
      Date Accessed: 18 September 2007.