 1.2 Design
 ==========
 
 Current Design
 ==============
 
 When implemented directly, the communication between all the
 applications and synthesizers is a mess.  For this reason, we wanted
 Speech Dispatcher to be a layer separating applications and
 synthesizers, so that applications wouldn't have to care about
 synthesizers and synthesizers wouldn't have to care about interaction
 with applications.
 
    We decided to implement Speech Dispatcher as a server that receives
 commands from applications over a protocol called 'SSIP', parses them
 if needed, and calls the appropriate functions of output modules, which
 communicate with the different synthesizers.  These output modules are
 implemented as plug-ins, so that support for a new synthesizer can be
 added simply by loading a new module.
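 
    For illustration, a minimal SSIP exchange could look roughly like
 this (the application and user names are made up, and the reply codes
 shown here are only indicative; see the SSIP documentation for the
 authoritative description):

      CLIENT: SET self CLIENT_NAME "joe:someapp:main"
      SERVER: 208 OK CLIENT NAME SET
      CLIENT: SPEAK
      SERVER: 230 OK RECEIVING DATA
      CLIENT: Hello, world!
      CLIENT: .
      SERVER: 225 OK MESSAGE QUEUED
      CLIENT: QUIT
      SERVER: 231 HAPPY HACKING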
 
    Each client (an application that wants to speak) opens a socket
 connection to Speech Dispatcher and calls functions like say(), stop(),
 and pause(), provided by a library implementing the protocol.  This
 shared library runs on the client side and sends SSIP commands to
 Speech Dispatcher over the socket.  When the messages arrive, Speech
 Dispatcher parses them, extracts the text that should be said, and puts
 it in one of several queues according to the priority of the message
 and other criteria.  It then decides when, with which parameters (set
 up by the client and the user), and on which synthesizer it will say
 the message.  These requests are handled by the output plug-ins (output
 modules) for the different hardware and software synthesizers, which
 finally say the text aloud.
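 
    As a sketch of what this looks like on the client side, assuming
 the Python bindings distributed with Speech Dispatcher (the 'speechd'
 module) are available, a client session could be written as follows
 (the client name is arbitrary):

      import speechd

      # Open a socket connection to Speech Dispatcher and identify
      # this client by name.
      client = speechd.SSIPClient('my-application')

      # Select one of the SSIP priorities for subsequent messages.
      client.set_priority(speechd.Priority.MESSAGE)

      # Queue a message; Speech Dispatcher decides when and on which
      # synthesizer it will be spoken.
      client.speak('Hello from Speech Dispatcher!')

      # stop(), pause() and resume() are available as well.
      client.close()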
 
 [Speech Dispatcher architecture]
 
    See also the detailed description of the client programming
 interfaces (⇒Client Programming) and the server programming
 documentation (⇒Server Programming).
 
 Future Design
 =============
 
 Speech Dispatcher currently mixes two important features: a common
 low-level interface to multiple speech synthesizers and message
 management (including priorities and history).  This became even more
 evident when we started thinking about handling messages intended for
 output on braille devices.  Such messages of course need to be
 synchronized with speech messages, and there is little reason why
 accessibility tools should send the same message twice for these two
 kinds of output, which blind people often use simultaneously.  Outside
 the world of accessibility, applications also want either to have full
 control over the sound (bypassing prioritization) or to only retrieve
 the synthesized data without playing it immediately.
 
    We eventually want to split Speech Dispatcher into two independent
 components: one providing a low-level interface to speech synthesis
 drivers, which we now call the TTS API Provider and which is already
 largely implemented in the Free(b)Soft project, and the other doing
 message management, called Message Dispatcher.  This will allow Message
 Dispatcher to output to braille devices as well, and it will allow the
 TTS API Provider to be used separately.
 
    From the implementation point of view, the opportunity to create a
 new design based on our previous experience allowed us to remove
 several bottlenecks in speed (responsiveness), ease of use, and ease of
 implementing extensions (particularly output modules for new
 synthesizers).  From the point of view of the architecture and the
 possibilities it opens for new development, we are entirely convinced
 that both the new design in general and the inner design of the new
 components are much better.
 
    While a good API and its implementation for braille already exist
 in the form of BrlAPI, the API for speech is still under development.
 The architecture diagram below shows how we imagine Message Dispatcher
 working in the future.
 
 [Speech Dispatcher architecture]
 
    References: <http://www.freebsoft.org/tts-api/>
 <http://www.freebsoft.org/tts-api-provider/>