Most Recent Documents
Mobile Terminal And Menu Control Method Thereof - Patent 8150700
CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims priority to Korean Application No. 10-2008-0032842, filed in Korea on Apr. 8, 2008, the entire contents of which are hereby incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a mobile terminal and corresponding method that performs operations on the mobile terminal based on voice commands and prior operations performed on the mobile terminal.

2. Description of the Related Art

Mobile terminals now provide many additional services besides the basic call service. For example, users can now access the Internet, play games, watch videos, listen to music, capture images and videos, record audio files, etc. Mobile terminals also now provide broadcasting programs such that users can watch television shows, sporting programs, videos, etc. Thus, mobile terminals include sophisticated graphical user interfaces, or GUIs, that the user can use to access the various functions on the terminal. For example, the user can access a main menu and then select from one of a plurality of submenus such as an email submenu, a calling history submenu, an Internet access submenu, a pictures submenu, etc. Once the user selects a particular submenu, the mobile terminal provides yet another list of submenus or options that the user can select to perform a desired function.

However, the menu systems are formed in a tree diagram such that the user has to perform several intermediary functions in order to select a desired final function. In addition, because the terminal is small in size, the menu options are also small in size and difficult to see. Touching a particular menu option when the terminal includes a touch screen display also often results in the user simultaneously touching more than one menu item (because the menu items are displayed close together) or touching the wrong menu item.

SUMMARY OF THE INVENTION

Accordingly, one object of
Systems And Methods Of A Structured Grammar For A Speech Recognition Command System - Patent 8150699
BACKGROUND

1. Field

The present invention relates to a speech recognition command system and, more particularly, to a comprehensive, global speech recognition command system to control multiple software applications.

2. Description of the Related Art

Existing speech interfaces generally use fairly small, non-comprehensive sets of global commands, then augment these global command sets with custom sets of commands for specific programs. These program-specific commands are fairly difficult to maintain, and so they support only a limited number of programs. Additionally, existing speech interfaces often have different ways to indicate the same thing, such as through the use of synonyms. Having many different ways to say the same thing makes it difficult to remember, predict, and combine commands.

Thus, a need exists for a comprehensive, combinatorial, global speech recognition command system for a speech interface to control multiple software applications and enable control possible by keyboard, mouse, and other peripheral devices.

SUMMARY

Provided herein are systems and methods of a comprehensive, global speech recognition command system for a speech interface to control multiple software applications and enable everything possible by keyboard and mouse. In an aspect of the invention, a method for speech command control may comprise providing at least one vocabulary word, providing a set of structured grammar rules, creating at least one speech command from the at least one vocabulary word according to the structured grammar rules, and mapping an input function of a platform to the at least one speech command. In an embodiment, the input function may be at least one of a keystroke, a keyboard shortcut, a mouse action, and a combination of input functions. In an embodiment, multiple input functions may be mapped to a single speech command. In an embodiment, the method may further comprise issuing a speech command through an input device to control a platform application. In an e
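The mapping described in the summary above can be sketched as a simple lookup table. This is a minimal illustration of the idea, not the patent's implementation; the command phrases and key sequences are hypothetical assumptions.

```python
# Sketch: speech commands mapped to platform input functions.
# Every command phrase and key sequence here is illustrative only.
SPEECH_COMMAND_MAP = {
    "copy that": ["ctrl+c"],
    "paste that": ["ctrl+v"],
    "new window": ["ctrl+n"],
    # A single speech command may map to a combination of input functions:
    "duplicate line": ["home", "shift+end", "ctrl+c", "end", "enter", "ctrl+v"],
}

def input_functions_for(command: str) -> list[str]:
    """Return the sequence of input functions mapped to a speech command,
    or an empty list when the command is not in the vocabulary."""
    return SPEECH_COMMAND_MAP.get(command.lower(), [])
```

A dispatcher built this way stays global and combinatorial: one table serves every application, and new combinations are added as data rather than per-program code.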
Method And Apparatus For Embedding Spatial Information And Reproducing Embedded Signal For An Audio Signal - Patent 8150701
The present invention relates to a method of encoding and decoding an audio signal.

BACKGROUND ART

Recently, many efforts have been made to research and develop various coding schemes and methods for digital audio signals, and products associated with these coding schemes and methods are being manufactured. In addition, coding schemes for changing a mono or stereo audio signal into a multi-channel audio signal using spatial information of the multi-channel audio signal have been developed.

However, when an audio signal is stored on some recording media, an auxiliary data area for storing spatial information does not exist. In this case, only a mono or stereo audio signal is reproduced because only the mono or stereo audio signal is stored or transmitted; hence, the sound quality is monotonous. Moreover, when spatial information is stored or transmitted separately, there exists a problem of compatibility with players of general mono or stereo audio signals.

DISCLOSURE OF THE INVENTION

Accordingly, the present invention is directed to an apparatus for encoding and decoding an audio signal and a method thereof that substantially obviate one or more of the problems due to limitations and disadvantages of the related art.

An object of the present invention is to provide an apparatus for encoding and decoding an audio signal and a method thereof, by which compatibility with a player of a general mono or stereo audio signal can be provided in coding an audio signal.

Another object of the present invention is to provide an apparatus for encoding and decoding an audio signal and a method thereof, by which spatial information for a multi-channel audio signal can be stored or transmitted without the presence of an auxiliary data area.

Additional features and advantages of the present invention will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the present in
Stereo Audio Encoding Device, Stereo Audio Decoding Device, And Method Thereof - Patent 8150702
The present invention relates to a stereo speech coding apparatus, a stereo speech decoding apparatus, and methods used in conjunction with these apparatuses, used upon coding and decoding of stereo speech signals in mobile communications systems or in packet communications systems utilizing the Internet Protocol (IP).

BACKGROUND ART

In mobile communications systems and in packet communications systems utilizing IP, advancement in the rate of digital signal processing by DSPs (Digital Signal Processors) and enhancement of bandwidth have been making high bit rate transmissions possible. If the transmission rate continues increasing, bandwidth for transmitting a plurality of channels can be secured (i.e., wideband), so that, even in speech communications where monophonic technologies are popular, communications based on stereophonic technologies (i.e., stereo communications) are anticipated to become more popular. In wideband stereophonic communications, more natural sound environment-related information can be encoded, which, when played on headphones or speakers, evokes spatial images the listener is able to perceive.

As a technology for encoding spatial information included in stereo audio signals, there is binaural cue coding (BCC). In binaural cue coding, the coding end encodes a monaural signal that is generated by synthesizing the plurality of channel signals constituting a stereo audio signal, and calculates and encodes the cues between the channel signals (i.e., inter-channel cues). Inter-channel cues refer to side information that is used to predict each channel signal from a monaural signal, including inter-channel level difference (ILD), inter-channel time difference (ITD), and inter-channel correlation (ICC). The decoding end decodes the coding parameters of a monaural signal and acquires a decoded monaural signal, generates a reverberant signal of the decoded monaural signal, and reconstructs stereo audio signals using the decoded monaural signal and its reverberant signal
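The BCC encode step described above can be illustrated in miniature: downmix two channel frames to a monaural signal and compute one cue, the inter-channel level difference in dB. This is a toy sketch of the general technique, not the patent's method, and it omits ITD and ICC entirely.

```python
import math

def downmix_and_ild(left, right):
    """Synthesize a monaural frame from two channel frames and compute
    the inter-channel level difference (ILD) in dB between them.
    A deliberately simplified illustration of BCC's encode side."""
    # Monaural downmix: average the two channels sample by sample.
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    # ILD: ratio of per-frame channel powers, expressed in dB.
    p_left = sum(x * x for x in left)
    p_right = sum(x * x for x in right)
    ild_db = 10.0 * math.log10(p_left / p_right)
    return mono, ild_db
```

A decoder given only `mono` and `ild_db` can re-split the downmix into two channels with the original level ratio, which is exactly the kind of prediction from side information the passage describes.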
System And Method For Sending A Message Type Identifier Through An In-band Modem - Patent 8150686
U.S. patent application Ser. No. 12/477,561 claims priority to the following U.S. Provisional Application Nos.: 61/059,179, entitled "ROBUST SIGNAL FOR DATA TRANSMISSION OVER IN-BAND VOICE MODEM IN DIGITAL CELLULAR SYSTEMS," filed Jun. 5, 2008, and assigned to the assignee hereof and hereby expressly incorporated by reference herein; 61/087,923, entitled "SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMUNICATIONS OVER DIGITAL WIRELESS (OR CELLULAR) COMMUNICATION NETWORKS," filed Aug. 11, 2008, and assigned to the assignee hereof and hereby expressly incorporated by reference herein; No. 61/093,657, entitled "SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMUNICATIONS OVER DIGITAL WIRELESS (OR CELLULAR) COMMUNICATION NETWORKS," filed Sep. 2, 2008, and assigned to the assignee hereof and hereby expressly incorporated by reference herein; No. 61/122,997, entitled "SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMUNICATIONS OVER DIGITAL WIRELESS (OR CELLULAR) COMMUNICATION NETWORKS," filed Dec. 16, 2008, and assigned to the assignee hereof and hereby expressly incorporated by reference herein; No. 61/151,457, entitled "SYSTEM AND METHOD FOR PROVIDING GENERAL BI-DIRECTIONAL IN-BAND MODEM FUNCTIONALITY," filed Feb. 10, 2009, and assigned to the assignee hereof and hereby expressly incorporated by reference herein; and No. 61/166,904, entitled "SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMUNICATIONS OVER DIGITAL WIRELESS (OR CELLULAR) COMMUNICATION NETWORKS," filed Apr. 6, 2009, and assigned to the assignee hereof and hereby expressly incorporated by reference herein.

RELATED APPLICATIONS

Related co-pending U.S. patent applications include: "SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMUNICATIONS OVER DIGITAL WIRELESS COMMUNICATION NETWORKS," having U.S. patent application Ser. No. 12/477,544, filed Jun. 5, 2008, assigned to the assignee hereof, and expressly incorporated by reference herein; "SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMU
Voice Recognizing Apparatus, Voice Recognizing Method, Voice Recognizing Program, Interference Reducing Apparatus, Interference Reducing Method, And Interference Reducing Program - Patent 8150688
This application is the National Phase of PCT/JP2007/050157, filed Jan. 10, 2007, which claims priority to Japanese Application No. 2006-003650, filed Jan. 11, 2006, the disclosures of which are hereby incorporated by reference in their entirety.

TECHNICAL FIELD

The present invention relates to a voice recognizing apparatus, a voice recognizing method, a voice recognizing program, an interference reducing apparatus, an interference reducing method, and an interference reducing program. More particularly, the invention relates to a voice recognizing apparatus, a voice recognizing method, and a voice recognizing program for recognizing an input voice including a voice uttered by a user speaker and a voice uttered by an interference speaker other than the user speaker, and to an interference reducing apparatus, an interference reducing method, and an interference reducing program.

BACKGROUND ART

When recognizing a voice in a real environment, there is a problem such that when an utterance voice uttered by the user is influenced by an utterance voice uttered by another person (hereinbelow called an "interfering uttered voice") and by ambient noise, recognition precision deteriorates. As a method of reducing the influence of ambient noise, a method of emphasizing a voice by using a noise canceller and a microphone array is being studied, but it does not achieve complete elimination of noise.

One method of increasing recognition precision by reducing the influence of ambient noise is to suppress that influence by superimposing weak white noise on an input voice. In the case where the ambient noise is weaker than the uttered voice, whitening the ambient noise through white noise superimposition can reduce its influence. A conventional voice recognizing system using white noise is disclosed in, for example, patent document 1. The conventional voice recognizing system described in the document has a voice detector for calculating voice-likeness
Method And Apparatus For Reducing Access Delay In Discontinuous Transmission Packet Telephony System
The present disclosure is related to methods and devices for use in cell phones and other communication systems that use statistical multiplexing, wherein channels are dynamically allocated to carry each talkspurt. It is particularly directed to methods and devices for mitigating the effects of access delay in such communication systems.

BACKGROUND

In certain packet telephony systems, a terminal only transmits when voice activity is present. Such discontinuous transmission (DTX) packet telephony systems allow for greater system capacity, as compared with systems in which a channel is allocated to a transmitting terminal for the duration of the call, or session.

With reference to FIG. 1, in DTX systems, at the start of each talkspurt, the transmitting device 102, typically a wireless handset, requests a transmission channel from the base station 104. The base station 104, which uses statistical multiplexing for allocating channels, establishes a path via a network 106 and/or intermediate switches 108 to connect to the remote receiving device 110, which may be another handset, a conventional land-line phone, or the like.

FIG. 2 presents a block diagram of the principal functions of the transmitting device 102 and the base station 104 in a DTX system. A speaker's voice is received by an audio input port (AIP) 122, where the voice signal is digitally sampled at some frequency fs, typically fs = 8 kHz. The sampled signal is usually divided into frames of length 10 msec or so (i.e., 80 samples) prior to further processing. The frames are input to a voice activity detector (VAD) 124 and a speech encoder 126. As is known to those skilled in the art, in some devices the VAD 124 is integrated into the speech encoder 126, although this is not a requirement in prior art systems. In any event, the VAD 124 determines whether or not speech is present and, if so, sends an active signal to the handset's control interface 128. The handset's control interface 128 sends a traffic channel
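The framing and voice-activity decision described above can be sketched with a toy energy-based detector. Real VADs in DTX systems are far more elaborate (spectral features, hangover logic); the threshold here is an arbitrary assumption for illustration.

```python
def frame_is_active(frame, threshold=1e-3):
    """Toy voice-activity decision for one 10 msec frame
    (80 samples at fs = 8 kHz): mean energy against a fixed threshold."""
    energy = sum(s * s for s in frame) / len(frame)
    return energy > threshold

def vad(samples, frame_len=80, threshold=1e-3):
    """Divide a sampled signal into frames and flag the active (speech)
    frames, as the VAD 124 does before signaling the control interface."""
    return [frame_is_active(samples[i:i + frame_len], threshold)
            for i in range(0, len(samples) - frame_len + 1, frame_len)]
```

In a DTX terminal, a transition from inactive to active frames is what would trigger the channel request to the base station at the start of a talkspurt.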
Invoking Tapered Prompts In A Multimodal Application - Patent 8150698
1. Field of the Invention

The field of the invention is data processing, or, more specifically, methods, apparatus, and products for invoking tapered prompts in a multimodal application.

2. Description of Related Art

User interaction with applications running on small devices through a keyboard or stylus has become increasingly limited and cumbersome as those devices have become increasingly smaller. In particular, small handheld devices like mobile phones and PDAs serve many functions and contain sufficient processing power to support user interaction through multimodal access, that is, by interaction in non-voice modes as well as voice mode. Devices which support multimodal access combine multiple user input modes or channels in the same interaction, allowing a user to interact with the applications on the device simultaneously through multiple input modes or channels. The methods of input include speech recognition, keyboard, touch screen, stylus, mouse, handwriting, and others. Multimodal input often makes using a small device easier.

Multimodal applications are often formed by sets of markup documents served up by web servers for display on multimodal browsers. A `multimodal browser,` as the term is used in this specification, generally means a web browser capable of receiving multimodal input and interacting with users with multimodal output, where modes of the multimodal input and output include at least a speech mode. Multimodal browsers typically render web pages written in XHTML+Voice (`X+V`). X+V provides a markup language that enables users to interact with a multimodal application, often running on a server, through spoken dialog in addition to traditional means of input such as keyboard strokes and mouse pointer action. Visual markup tells a multimodal browser what the user interface should look like and how it is to behave when the user types, points, or clicks. Similarly, voice markup tells a multimodal browser what to do when the user speaks to it.
Presentation Of Written Works Based On Character Identities And Attributes - Patent 8150695
BACKGROUND

Automated methods for recognizing "named entities" (e.g., persons or places) in a body of text are known. The existing methods have been applied primarily to relatively short works, such as news reports, and to highly specialized scientific works such as biomedical texts. Further, these methods have generally been applied to extract and compile information from texts, not to enhance the reading experience.

Written works such as works of fiction often contain a large number of character identities. The character identities and their attributes affect comprehension, interpretation, and understanding of the work and therefore have a profound effect on the reading experience. While most printed copies of written works simply present the work statically in black ink on white paper, the concept of electronic rendering of written works provides an opportunity to customize the presentation of a written work based on the characters within, making the written work more engaging for a user. Existing electronic rendering systems, however, fail to provide such customization.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating an exemplary system according to an embodiment in which a user computing device communicates with a server computing device via a network.

FIG. 2 is a block diagram illustrating an exemplary user computing device.

FIG. 3 is a block diagram illustrating an exemplary server computing device.

FIG. 4 is an exemplary flow chart for presenting a written work.

FIG. 5 is an exemplary flow chart for presenting a written work based on one or more character identity attributes.

FIG. 6 is an example screen shot of a graphical interface for displaying written work information in a catalog.

FIG. 7 is an example screen shot of a graphical interface for displaying content for a written work.

FIG. 8 is an example screen shot of a graphical interface for displaying a script based on a written work.

FIG. 9 is an example screen shot of a graphica
Methods And Apparatus For Natural Spoken Language Speech Recognition - Patent 8150693
The present invention relates to speech recognition apparatus and methods, and in particular to speech recognition apparatus and methods for recognizing natural language spoken by persons, which is thereafter used for composing sentences and for creating text data.

BACKGROUND OF THE INVENTION

A statistical method using an acoustic model and a language model for speech recognition is well known, and has been featured in such publications as: "A Maximum Likelihood Approach to Continuous Speech Recognition," L. R. Bahl et al., IEEE Trans. Vol. PAMI-5, No. 2, March 1983; and "Word based approach to large-vocabulary continuous speech recognition for Japanese," Nishimura et al., Information Processing Institute Thesis, Vol. 40, No. 4, April 1999.

According to an overview of this method, a word sequence W is voiced as a generated sentence and is processed by an acoustic processor, and from the signal that is produced a feature value X is extracted. Then, using the feature value X and the word sequence W, assumed optimal recognition results W' are output in accordance with the following equation to form a sentence. That is, a word sequence such that, when the word sequence W is voiced, the product of the appearance probability P(X|W) of the feature value X and the appearance probability P(W) of the word sequence W is the maximum (argmax) is selected as the recognition results W':

W' = argmax_W P(W) P(X|W)

where P(W) is given by a language model, and P(X|W) is given by an acoustic model. In this equation, the acoustic model is employed to obtain the probability P(X|W), and words having a high probability are selected as proposed words for recognition. The language model is frequently used to provide an approximation of the probability P(W). For the conventional language model, normally, the closest word sequence is used as a history. An example is an N-gram model.
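The argmax selection described above can be sketched directly: given candidate word sequences with language-model and acoustic-model probabilities, pick the one maximizing their product. The candidate phrases and probability values below are made-up illustrations; real recognizers search enormous hypothesis spaces and work in the log domain, as this sketch does for numerical stability.

```python
import math

def recognize(hypotheses):
    """Select W' = argmax_W P(W) * P(X|W).

    `hypotheses` maps each candidate word sequence W to a pair
    (lm_prob, am_prob) = (P(W), P(X|W)). Scores are combined in the
    log domain, which preserves the argmax while avoiding underflow."""
    return max(hypotheses,
               key=lambda w: math.log(hypotheses[w][0])
                           + math.log(hypotheses[w][1]))

# Illustrative (fabricated) hypothesis scores:
hyps = {
    "recognize speech":    (0.02,  0.5),   # P(W)=0.02, P(X|W)=0.5  -> 0.010
    "wreck a nice beach":  (0.001, 0.6),   # P(W)=0.001, P(X|W)=0.6 -> 0.0006
}
best = recognize(hyps)
```

Here the language model overrules the slightly better acoustic fit, which is exactly the role the passage assigns to P(W).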
System And Method For Providing An Acoustic Grammar To Dynamically Sharpen Speech Interpretation - Patent 8150694
The invention is related generally to automated speech interpretation and, in particular, to enhancing the accuracy and performance of speech interpretation engines.

BACKGROUND OF THE INVENTION

The field of automated speech interpretation is in increasingly higher demand. One use of automated speech interpretation is to provide voice requests to electronic devices. This may enable a user to simply speak to an electronic device rather than manually inputting requests, or other information, through pressing buttons, uploading information, or other request input methods. Controlling various electronic devices through speech may enable the user to use the electronic devices more efficiently.

However, existing technology in the field of automated speech interpretation, such as standard speech engines, automatic speech recognition (ASR), and other systems for interpreting speech, is unable to process a speech signal in an efficient manner, often constructing large grammars that include a large number of items, nodes, and transitions, which is a concern particularly for large-list recognition in embedded applications. If the grammar for an embedded application grows too much, it may not fit within the constrained space of an embedded application. With limited CPU power, response time and performance are easily affected due to the significant time needed to compile and load the grammar. Response time is further degraded because the speech engine has to parse through a large number of transition states to come up with a recognition result. Even when the speech engine is able to recognize a word, the results are often unreliable because large grammars introduce greater risk of confusion between items as the size of the grammar increases. Existing techniques focus on reducing the size of a grammar tree by removing command variants or criteria items, but this approach strips functionality from the application.

In addition to the performance problems associated with speech re
Method And Apparatus For Recognizing A User Personality Trait Based On A Number Of Compound Words Used By The User - Patent 8150692
The present invention generally relates to speech recognition systems and, more particularly, to techniques for recognizing and reacting to user personality in accordance with a speech recognition system.

BACKGROUND OF THE INVENTION

It has been argued that users' positive or negative reaction to a speech user interface can be affected by the extent to which they "self-identify" with the persona (voice and human characteristics) of the system. It is generally agreed in the human-computer interaction literature that callers can recognize and react to the emotive content in a speech sample in speech recognition systems.

However, as a converse to the above phenomenon, the question is raised: can computers recognize and react to the emotive content of what a caller says in a speech user interface? The key problem in addressing this question has been how to develop an algorithm with enough "intelligence" to detect the emotion (or persona) of the caller and then adjust its dialog to respond accordingly.

One current solution to this problem is to capture the voice features (pitch/tone or intonation) of the user and run this information through a pitch-synthesis system to determine the user's emotion (or persona). One of the biggest problems with this approach is its inconclusiveness. This is based on the fact that the dimensions or resulting categories of emotion are based on matching pitch characteristics (loud, low, normal) with emotional values such as "happy" or "sad," as well as the indeterminate "neutral." The problem with using pitch for emotional determination is that emotional values cannot always be based on absolute values. For example, a user may be "happy" but speak in a "neutral" voice, or they may be sad and yet speak in a happy voice. In addition, it is not exactly clear in this existing approach what constitutes a "neutral" voice and how one would go about measuring this across a wide range of user population, demography, age, etc.

SUMMARY OF THE INVENTION

P
Method Of Providing Dynamic Speech Processing Services During Variable Network Connectivity - Patent 8150696
BACKGROUND

1. Field of the Disclosure

The present disclosure relates to the field of communications. More particularly, the present disclosure relates to providing dynamic speech processing services during variable network connectivity.

2. Background Information

Some client devices are connected to network servers. These network servers often have speech processing capabilities, performed by speech processors instantiated on the network servers. However, when the network connection is compromised and the connection to a network server is lost or impaired, client devices that are connected to the network servers are unable to perform speech processing.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an exemplary general computer system that includes a set of instructions for providing dynamic speech processing services during variable network connectivity;

FIG. 2 illustrates an exemplary network connection between a client device and a network server, according to an aspect of the present disclosure;

FIG. 3 is a flowchart depicting an exemplary process of providing dynamic speech processing services during variable network connectivity, according to an aspect of the present disclosure; and

FIG. 4 illustrates an exemplary client device, according to an aspect of the present disclosure.

DETAILED DESCRIPTION

In view of the foregoing, the present disclosure, through one or more of its various aspects, embodiments and/or specific features or sub-components, is thus intended to bring out one or more of the advantages as specifically noted below.

According to an aspect of the present disclosure, a client device for providing dynamic speech processing services during variable network connectivity with a network server includes a connection determiner that determines the level of network connectivity of the client device and the network server; a simplified speech processor that processes speech data that is initiated based on the determination from the connection determiner tha
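The connection-determiner routing described above can be sketched as a simple threshold decision between the server-side processor and the on-device simplified processor. The connectivity scale, threshold, and processor labels are illustrative assumptions, not details from the disclosure.

```python
def choose_processor(connectivity_level, threshold=0.5):
    """Route speech processing by measured network connectivity.

    `connectivity_level` is assumed here to be a 0.0-1.0 score from a
    connection determiner; at or above `threshold`, full processing is
    delegated to the network server, otherwise the client falls back
    to its simplified on-device speech processor."""
    if connectivity_level >= threshold:
        return "network"   # full speech processor on the server
    return "local"         # simplified on-device speech processor
```

The point of the fallback is continuity: speech processing degrades in capability rather than failing outright when the connection is lost or impaired.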
Autonomous Systems And Network Management Using Speech - Patent 8150697
BACKGROUND

1. Field of the Invention

The invention relates to the field of system administration and management and, more particularly, to performing such functions using a voice interface.

2. Description of the Related Art

Managing a network of computers and/or other devices can be a challenging task. Many different software programs are available, however, which help system administrators manage networks. Using network management software, an administrator can define the various components, resources, systems, and/or subsystems (hereafter systems) belonging to the network to be managed. Network management software also permits a system administrator to define tasks that may be performed on the managed system.

Presently, system administration follows the traditional model of command-line or console-based control. That is, an administrator interacts with a managed system via a conventional computer terminal, having a keyboard and display, which is communicatively linked to the managed system. The administration terminal frequently is located "on-premises" with the managed system. The system administrator can receive notifications and monitor the system by viewing messages on the administration terminal display. The system administrator also can provide instructions, queries, or other commands to the system by entering the appropriate information into a command line interface of the administration terminal display using the administration terminal keyboard.

Through the administration terminal, the system administrator can interact with the system to perform administrative, managerial, and maintenance functions. Legacy software components, for example components written in the C family of programming languages, typically are managed through Simple Network Management Protocol (SNMP) via the administration terminal. Other systems have come to use a resource management system for handling communications between the administration terminal and the system. For exampl
Distributed Dictation/transcription System - Patent 8150689
REFERENCE TO CO-PENDING APPLICATIONS FOR PATENT

None.

BACKGROUND

1. Field

The technology of the present application relates generally to dictation systems and, more particularly, to a distributed dictation system that is adapted to return a transcription of the dictation in real-time or near real-time.

2. Background

Originally, dictation was an exercise where one person spoke while another person transcribed what was spoken. With modern technology, dictation has advanced to the stage where voice recognition and speech-to-text technologies allow computers and processors to serve as the transcriber.

Current technology has resulted in essentially two styles of computer-based dictation and transcription. One style involves loading software on a machine to receive and transcribe the dictation, which is generally known as client side dictation. The machine transcribes the dictation in real-time or near real-time. The other style involves saving the dictation audio file and sending the dictation audio file to a centralized server, which is generally known as server side batch dictation. The centralized server transcribes the audio file and returns the transcription. Often the transcription is accomplished after hours, or the like, when the server has fewer processing demands.

As can be appreciated, the present computer-based dictation and transcription systems have drawbacks. One drawback of client side dictation is that the dictation and transcription are limited to a single or particular machine, sometimes referred to as a thick or heavy client, as most of the processing is accomplished at the local user's machine. Thus, unless the user has the particular machine available, the user cannot accomplish dictation. One drawback of server side batch dictation is that the transcript is not provided in real-time or near real-time. So, while server side batch dictation systems may use thin clients, the transcription is not provided in real-time or near real-time. Moreover, the re