Voice Browser Seminar Report

To stake a claim in this market, AOL bought Quack. The voice portal market is expanding almost everywhere, be it Europe, the US or Asia.

The Voice Web: The third level is the voice web. Many companies have set up forums for programmers who build voice sites, in order to increase interest in voice browsing and speech recognition.

These forums grow into voice webs, which in turn sprout voice auction sites and voice-based chat rooms.

Authoring a web page for a specific type of user agent or system configuration should never be a completely separate subject with arcane new techniques developed for each special need. Rather, it should be an application of the common set of Universally Accessible Design principles that belong in every web author's repertoire. With few exceptions, pages should never be designed for certain types or brands of browsers, but should instead be designed for all uses and potential uses of the information.

All web documents should be as accessible to voice browsers as they are to visual user agents. Studies by the HTML Writers Guild, and discussions with web authors, have shown that the primary obstacle to universal accessibility is ignorance. There are few cases where a conscious decision has been made to produce a generally inaccessible web page; rather, the author is simply unaware of the need to create accessible pages and of the techniques by which that is done.

Once enlightened, most web authors eagerly embrace the concept of universal accessibility, since the benefits are many and obvious. This section briefly lists some of the primary techniques of Universally Accessible Design as they relate to voice browsers, and offers ideas on how authors can apply them when designing web pages.

Aural rendering of a document is already commonly used by the blind and print-impaired communities. It combines speech synthesis and "auditory icons." Often such aural presentation is produced by converting the document to plain text and feeding it to a screen reader. This results in a less effective presentation than would be the case if the document structure were retained. Style sheet properties for aural presentation may be used together with visual properties (mixed media) or as an aural alternative to visual presentation.

The use of an aural style sheet or aural style sheet properties included in a general style sheet document allows the author to specify www. Combined with the media selector for media types, a well-crafted aural style sheet can greatly increase the accessibility of a web document in a voice browser. Further investigation in this area is encouraged, especially in the area of example aural style sheets and suggestions for authoring techniques.

In other words, by using features found in HTML 4. Judicious and ample use of meta-content within a document allows the author to not simply specify the content, but also suggest the meaning and relationship of that content in the context of the document. Voice browsers can then use that meta-information as appropriate for their presentation and structural needs.

Planned Abstraction: One use for meta-content information is the development of pages that are designed to be abstracted.

The typical document found on the web can be quite lengthy; finding information by listening to a web page read aloud takes longer than visually scanning it, especially when most web pages are designed for visual use. Thus, most voice browsers will provide a method for abstracting a page: presenting one or more outlines of the page's content based on a semantic interpretation of the document. H6 headers. There are any number of other options available to voice browser programmers for providing short, easily digestible versions of web content to the user.
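One way to picture such an abstraction is an outline built from a page's heading elements. The sketch below is only an illustration under stated assumptions: it uses Python's standard `html.parser` module, and the names `OutlineParser` and `abstract_page` are hypothetical, not taken from any voice browser described here.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for every H1-H6 heading in a page."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []    # finished headings as (level, text)
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments of the open heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        # Only text inside a heading contributes to the outline.
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            # Collapse runs of whitespace so the heading reads cleanly.
            text = " ".join("".join(self._parts).split())
            self.outline.append((self._level, text))
            self._level = None


def abstract_page(html):
    """Return the heading outline of an HTML document (hypothetical helper)."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline
```

A voice browser could read such an outline aloud first and let the user jump to a heading, which only works well when authors use heading elements for structure rather than for visual effect.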

This suggests that the web author should provide as much meta-content as possible and should use HTML elements carefully and as intended.

Alternative Content for Unsupported Media Types: The "poster child" for web accessibility is the ALT attribute, which allows alternative text to be specified for images; if a user agent cannot display the visual image, the ALT text can be used instead.

Widespread use of the ALT attribute by all sites on the Internet would likely double the accessibility of the World Wide Web with one simple change. Web authors who do not use ALT text correctly are seriously damaging the usability of the entire medium! For voice browsers, ALT text is vitally important, since images cannot be represented aurally at all.
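Missing ALT text is easy to detect mechanically. As a minimal sketch, assuming only Python's standard `html.parser` module, the following lists every image that has no ALT attribute; the names `AltAudit` and `images_without_alt` are hypothetical. An empty `alt=""` is treated as present, on the assumption that the author marked the image as decorative on purpose.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Record the src of every <img> element that lacks an alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attributes = dict(attrs)
        if "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no src)"))


def images_without_alt(html):
    """Return the src values of images a voice browser could not describe."""
    audit = AltAudit()
    audit.feed(html)
    audit.close()
    return audit.missing
```

Running such a check over a page before publishing gives an author a concrete list of images to fix.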

Especially when an image is used as part of a link, alternative content must be provided so that the voice browser can render the page in a manner useful to the user.

Such commands are arranged into objects known as menus. A grammar set is defined to recognise the speech commands. Some of these rules are for administrative control; the others are used to control navigation. The navigation commands are expected to be the most frequently used.

Navigation is supported in various ways: within the same page (intra-page navigation), to a new web page (inter-page navigation), through bookmarks, the history list or the document structure, or by following a hyperlink in the web page. All three versions of this command represent one action: moving between structures within a document. Another way to navigate to a specific target page is via dictation. Users dictate to the system by saying words www. Macros and shortcuts are also used to simplify the dictation process.
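To make the role of macros concrete, the sketch below turns a sequence of recognised words into a URL string. It is an illustration only: the macro vocabulary in `MACROS` and the function name `dictate` are assumptions made for this example, not the prototype's actual command set. Any word that is not a macro is taken literally, so a name can be spoken whole or spelled out one letter at a time.

```python
# Hypothetical macro table: spoken phrase -> text inserted into the URL.
MACROS = {
    "w w w": "www",
    "dot": ".",
    "dot com": ".com",
    "slash": "/",
    "colon": ":",
    "dash": "-",
}


def dictate(words, macros=MACROS):
    """Build a URL from recognised words, expanding macros greedily."""
    longest = max(len(phrase.split()) for phrase in macros)
    pieces = []
    i = 0
    while i < len(words):
        # Try the longest possible macro first, then shorter ones.
        for n in range(min(longest, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in macros:
                pieces.append(macros[phrase])
                i += n
                break
        else:
            # Not a macro: keep the word (or single spelled letter) as is.
            pieces.append(words[i])
            i += 1
    return "".join(pieces)
```

For example, `dictate("w w w dot example dot com".split())` yields `www.example.com`, with three macros replacing what would otherwise be many separate letters.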

The dictation menu also allows corrections to be made, what has been dictated so far to be reviewed, and the dictation session to be restarted. Output from the system is either synthesised speech or sounds used as auditory icons. When a page is to be read out (post translation), it is broken up and analysed one piece at a time. Two situations can occur: if the piece is a tag with an associated auditory icon, the icon is played; if the piece is simply text, it is synthesised into voice.
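The two cases above can be sketched as a small loop. This is a hedged illustration, not the prototype's code: the page is assumed to be already split into `("tag", name)` and `("text", content)` pieces, the sound file names in `AUDITORY_ICONS` are invented, and instead of driving a real audio device or speech synthesiser the function returns the list of actions it would perform.

```python
# Hypothetical mapping from tag name to the sound file used as its icon.
AUDITORY_ICONS = {
    "a": "door_creak.wav",
    "img": "camera_shutter.wav",
}


def render(pieces, icons=AUDITORY_ICONS):
    """Translate page pieces into ("play", sound) and ("speak", text) actions."""
    actions = []
    for kind, value in pieces:
        if kind == "tag" and value in icons:
            # A tag with an associated auditory icon: play the icon.
            actions.append(("play", icons[value]))
        elif kind == "text" and value.strip():
            # Plain text: hand it to the speech synthesiser.
            actions.append(("speak", value.strip()))
        # Tags without an icon and whitespace-only text produce no output.
    return actions
```

Keeping the icon table separate from the loop means the sounds can be changed, or made less frequent, without touching the rendering logic.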

Typical applications of auditory icons include the creaking of an opening door to represent an internal link (a link to an anchor within the same web page), a doorbell to represent an email address, and the click of a camera shutter to represent an image. When certain tags are encountered end of paragraph, end of list, end of table row, etc.

Three ActiveX controls were used in conjunction with the VB project.

The primary goal of the evaluation of the TeleBrowse prototype is to determine the usability and acceptance of the application as a voice-driven web-browsing tool. Measurement is done via the transformed data. The quality aspect is evaluated through the availability, integrity and usefulness of the data after the translation, and also the support of various structures in the original HTML page, e.

The experiment was carried out on two different groups of users who had similar characteristics, i. The major difference between these two groups was that subjects in the first group (G1) were normally sighted, while those in the second group (G2) were visually impaired or, to be precise, completely blind. In other words, subjects in G1 were familiar with typical visually oriented web browsers such as Microsoft Internet Explorer.

In this experiment, there were five people in G1 and four subjects in G2. Evaluation sessions were run on a one-to-one basis.

There was also a general discussion involving the group at the end of the individual experiments. No interaction occurred between subjects from differing groups. Each subject was briefed about the operation of the prototype, and the methods for interacting with it.

In the case of the subjects in G2, the sheet was printed using a Braille printing device. The sheet also contained a set of tasks the subjects had to complete using the program. Some of the tasks that subjects were required to complete included: starting the application, checking to see what the current homepage was set to, commencing reading of the web document once loaded, jumping to another location by dictating a URL address directly into the system www.

All tasks were first completed using the software running in emulation mode on the laptop computer. The speech recognition engine was not trained to adapt to any specific person. After completing the tasks and gaining a degree of experience with the system, subjects were given an opportunity to use it over the phone as they chose, completing the effect of a phone-based web-browsing tool.

After using the prototype for a sufficient amount of time (in most cases about twenty minutes to half an hour), each subject was asked to complete a questionnaire to record their experiences with the prototype. The questionnaire was arranged into two parts: the measurement of efficiency and integrity.

Some of these questions asked the participants to give numerical scores on a scale of 1 (very poor) to 7 (very good). Other questions were free-form, and subjects could provide their own comments. This gave us an initial and very broad insight into how subjects from each group responded to using the prototype in the experiments. In addition to the rated-response questions, the evaluation questionnaire contained a further thirteen free-form questions.

These free-form questions were designed to draw out any comments, problems, criticism, or general feedback from the test subjects. There were, on average, two such questions per section of the evaluation criteria.

Voice Recognition: This was one of the more poorly rated criteria. While both groups considered it a necessary technology for a phone browser, subjects suffered from its shortcomings, and it resulted in a loss of efficiency for most users.

Subjects from G2 responded more favourably than those from G1. Of particular concern was the dictation of URL addresses. This was noted as a shortcoming in the interface by every user from both groups.

The idea of having to spell out URL addresses one letter at a time and wait for confirmation of each letter was not well received.

Speech Synthesis: There was little or no problem with this sub-system. Subjects found the voice easy to understand and of suitable volume and pitch.

The major contrast between the two groups was the usage of the speed control feature. Subjects from G1 saw no reason to adjust the speed of the synthesised voice. They were content with the default normally paced speaking voice. However, subjects from G2 tended to change the speed to a much higher rate before doing anything else.

Navigation: The overall ability of intra-page and inter-page navigation using the system was rated favourably by both groups. The use of auditory icons to mark HTML structures was viewed by the G2 subjects as being superior to any similar screen reader marking scheme.

Subjects from G1 also appreciated the ability of the auditory icons to mark structures in a document quickly and simply, in a way that was natural and easy to remember. The idea of metaphorically matching the meaning of sounds with the structures they represented was well liked and accepted. There was little problem with remembering the mapping of sound to structure, especially after using the system for an extended amount of time.

A criticism of the auditory icons was that they appeared too frequently and could be seen as breaking up the flow of text unnecessarily.

A comment made by many subjects was that the prototype offered functionality similar and familiar to that of browsers they had previously used. The ability to follow links contained in documents was well liked. Using different auditory icons for the different types of links allowed subjects to know in advance whether a link would lead to a target within the same document or to another document.

This too was well liked. Again, the problem with dictating URL addresses was brought up.

Online Help: This section of the criteria did not rate well, owing to the lack of help associated with the commands and prompts used in the system. The need to refer to other supporting documentation for more detailed information should certainly be avoided, as that documentation would not be available in environments where a phone browser might be used.

The www. G2 was more willing to accept the level of integrity of information presented to them by the prototype than G1. They reported frustration when they were forced to listen to the content in sequential fashion.

Overall Impression: The subjects from both evaluation groups accepted the prototype as a viable method of browsing the web in the audio realm by phone. The efficiency of the product was quite highly regarded by most subjects. The system interface fared very well.

The only major problem was the dictation of URL addresses to the system.

The voice portal market is going to reach billions in just a few years. The Kelsey Group estimates that the voice browsing market will reach 6. Because these figures vary so much, the actual growth of the voice technology industry is anyone's guess. Navigating on a WAP device is very difficult because the user has to scroll through many lists.

Hands-free interaction enables easy communication between the user and the system. Version 2. Version 3.

What about HTML? HTML does not have tapered prompts, a grammar specifying alternative words that the user can speak in response to a question, or instructions to the text-to-speech synthesizer about how to say words and phrases.

Version 1. Parsing for the purpose of voice recognition is done when the page is accessed. May or may not produce voice feedback.

Limited Information Access: Useful information in limited domains, such as the weather in a city or stock updates. Audio feedback.

Spoken Dialog Systems: A client-server architecture is used, with a Java applet client connecting to a remote server. Examples include connecting to email servers.

Benefits: Voice is a very natural user interface, which speeds up browsing. Less space is required. Portable voice browsers can also be implemented. It is a practical interface for functionally blind users. Users can browse the web while keeping their hands and eyes free for other jobs.

Future: Voice browsing will become visual and multi-modal. It can be integrated into an OS and into every application.

Conclusions: Browser technology is changing very fast these days, and we are moving from the visual paradigm to the voice paradigm. The voice browser is the technology for entering this paradigm. A voice browser is a device that interprets voice input and generates voice output.




