Speech Signal Processing Toolkit (SPTK)
Version 3.2
November 14, 2008
README
The Speech Signal Processing Toolkit (SPTK) is a suite of speech
signal processing tools for UNIX environments. It includes, for
example, LPC analysis, PARCOR analysis, LSP analysis, PARCOR
synthesis filters, LSP synthesis filters, vector quantization
techniques, and extended versions of these.
SPTK was developed and has been used in the research group of
Prof. Satoshi Imai (he has retired) and Prof. Takao Kobayashi
(currently he is with Interdisciplinary Graduate School of
Science and Engineering, Tokyo Institute of Technology) at P&I
laboratory, Tokyo Institute of Technology. A subset of the tools
was chosen and arranged for distribution by Prof. Keiichi Tokuda
(currently he is with Department of Computer Science and
Engineering, Nagoya Institute of Technology) as a coordinator in
cooperation with Dr. Takashi Masuko (currently he is with
Corporate Research & Development Center, Toshiba Corp.),
Dr. Kazuhito Koishida (currently he is with Microsoft Research),
Dr. Shinji Sako (currently he is a Research Associate, Nagoya
Institute of Technology), Dr. Heiga Zen (currently he is with
Toshiba Europe Research Ltd. Cambridge Research Laboratory), and
other collaborators (see "Acknowledgments" and "Who we are" in
README).
The original source code was written by many people who took
part in the activities of the research group. Most of the
original source code in this distribution was written by Takao
Kobayashi (graph, data processing, FFT, sampling rate
conversion, etc.), Keiichi Tokuda (speech analysis, speech
synthesis, etc.), and Kazuhito Koishida (LSP, vector
quantization, etc.).
This version is accompanied by a Reference Manual. A small
User's Manual "Examples for using SPTK" is also attached.
****************************************************************
What's new
****************************************************************
The differences between version 3.1 and 3.2:
- Released under the New and Simplified BSD license
- Added the 'gmm', 'delta', and 'dct' commands
- Rearranged command options to improve consistency among
related commands, e.g., 'mgcep' and 'mglsa'
- Fixed bugs
- Added Windows support (except for 'xgr')
Please see the ChangeLog included in the release for details.
****************************************************************
Copying
****************************************************************
The Speech Signal Processing Toolkit (SPTK) version 3.2 is
released under the New and Simplified BSD license (see
http://www.opensource.org/). Using and distributing this
software and its documentation is free (without restriction
including without limitation the rights to use, copy, modify,
merge, publish, distribute, sublicense, and/or sell copies of
this work, and to permit persons to whom this work is furnished
to do so) subject to the conditions in the following license:
/* --------------------------------------------------------------- */
/* The Speech Signal Processing Toolkit (SPTK) */
/* developed by SPTK Working Group */
/* http://sp-tk.sourceforge.net/ */
/* --------------------------------------------------------------- */
/* */
/* Copyright (c) 1984-2007 Tokyo Institute of Technology */
/* Interdisciplinary Graduate School of */
/* Science and Engineering */
/* */
/* 1996-2008 Nagoya Institute of Technology */
/* Department of Computer Science */
/* */
/* All rights reserved. */
/* */
/* Redistribution and use in source and binary forms, with or */
/* without modification, are permitted provided that the following */
/* conditions are met: */
/* */
/* - Redistributions of source code must retain the above copyright */
/* notice, this list of conditions and the following disclaimer. */
/* - Redistributions in binary form must reproduce the above */
/* copyright notice, this list of conditions and the following */
/* disclaimer in the documentation and/or other materials provided */
/* with the distribution. */
/* - Neither the name of the SPTK working group nor the names of its */
/* contributors may be used to endorse or promote products derived */
/* from this software without specific prior written permission. */
/* */
/* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND */
/* CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, */
/* INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF */
/* MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE */
/* DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS */
/* BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, */
/* EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED */
/* TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, */
/* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON */
/* ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, */
/* OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY */
/* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE */
/* POSSIBILITY OF SUCH DAMAGE. */
/* --------------------------------------------------------------- */
Although this software is free, we offer no warranties and no
maintenance. We will continue to endeavor to fix bugs and answer
queries when we can, but we are not in a position to guarantee
it. We will consider consultancy if desired; please contact us
for details.
If you are using SPTK in a commercial environment, even though
no license is required, we would be grateful if you let us know,
as it helps us justify our work to our various sponsors. We also
strongly encourage you to
* refer to the use of SPTK in any publications that use SPTK
* report any bugs you find, with fixes where possible
****************************************************************
Environment
****************************************************************
We expect that all programs can be compiled and run on most
UNIX-like operating systems. For Windows operating systems, a
makefile for Microsoft Visual C++ is also included.
Note that some commands require the C shell (/bin/csh) because
they are implemented as C-shell scripts.
****************************************************************
Installation
****************************************************************
1) Type
% ./configure --help
and read the help messages.
2) To build and install all tools, type:
% ./configure --prefix=/usr/local/SPTK
% make
% make install
The X Window library is required only to compile the 'xgr'
command; all other tools can be compiled without it.
For Microsoft Visual C++, please use src/Makefile.mak. Note that
some SPTK commands that use a GUI and/or audio I/O are not
supported on the Windows operating system.
****************************************************************
Notice
****************************************************************
As new versions are released, the specifications of the Speech
Signal Processing Toolkit may change without notice.
****************************************************************
Bug report
****************************************************************
Bug reports, comments, and questions about the Speech Signal
Processing Toolkit are very welcome. Please submit them to the
Tracker of the SPTK SourceForge page:
http://sourceforge.net/projects/sp-tk/
This page also includes much information about SPTK
(e.g., "Examples for using Speech Signal Processing Toolkit").
****************************************************************
Notes
****************************************************************
Generic properties of the commands are summarized as follows:
i) Data is in float format, i.e., single-precision floating
point format. This can be changed to double format by
specifying a compile option.
ii) Data files have neither headers nor any other structure,
i.e., they are flat raw files.
iii) Commands generally read data from standard input and write
results to standard output.
iv) Commands write messages, including error messages, to
standard error rather than standard output.
v) Commands do not request interactive key input.
vi) Options are specified on the command line.
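Properties i) to iv) can be illustrated with a minimal Python
sketch that handles data in the same style. It is not part of
SPTK. The function names and the gain filter are hypothetical,
and native byte order is an assumption, since this README does
not state the byte order of the data files.

```python
import io
import struct
import sys


def read_values(stream, code="f"):
    """Read a headerless binary stream of floating-point values.

    code="f" is single precision (the default data type described above);
    code="d" would match a build compiled for double precision.
    Native byte order is assumed.
    """
    size = struct.calcsize("=" + code)   # 4 bytes for "f", 8 for "d"
    data = stream.read()
    count = len(data) // size            # ignore a trailing partial value
    return list(struct.unpack(f"={count}{code}", data[:count * size]))


def write_values(stream, values, code="f"):
    """Write values as raw floating-point data, with no header."""
    stream.write(struct.pack(f"={len(values)}{code}", *values))


def scale_filter(src, dst, gain, code="f"):
    """Hypothetical filter: multiply every input value by gain."""
    write_values(dst, [gain * x for x in read_values(src, code)], code)


# As a command-line filter, one would pass the binary standard streams
# and keep any messages on standard error, for example:
#   scale_filter(sys.stdin.buffer, sys.stdout.buffer, 0.5)
#   print("done", file=sys.stderr)
```

Because the files carry no header, the reader must know the data
type in advance; the number of values follows from the file size
divided by the size of one value.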
****************************************************************
Acknowledgments
****************************************************************
The following people have contributed to the development of SPTK
in various ways. It is their work that makes it all possible.
In no special order:
Takao Kobayashi
Keiichi Tokuda
Takashi Masuko
Chiyomi Miyajima
Masatsune Tamura
Takayoshi Yoshimura
Shinji Sako
Yoshihiko Nankaku
Fernando Gil Resende Junior
Toshihiko Kato
Gou Hirabayashi
Naohiro Isshiki
Noboru Miyazaki
Toshio Kanno
Kenji Chiba
Toshiaki Fukada
Satoshi Imai
Tadashi Kitamura
Heiga Zen
Toru Takahashi
Keiichiro Oura
and others.
****************************************************************
Who we are
****************************************************************
The SPTK working group is a voluntary group for developing the
Speech Signal Processing Toolkit. Current members are
Keiichi Tokuda (Coordinator) http://www.sp.nitech.ac.jp/~tokuda/
Takashi Masuko
Toru Takahashi http://winnie.kuis.kyoto-u.ac.jp/~tall/
Shinji Sako http://www.mmsp.nitech.ac.jp/~sako/
Yoshihiko Nankaku http://www.sp.nitech.ac.jp/~nankaku/
Heiga Zen
Junichi Yamagishi http://homepages.inf.ed.ac.uk/jyamagis/
Keiichiro Oura http://www.sp.nitech.ac.jp/~uratec/
Membership changes over time. If you want to contact the SPTK
working group, please use the Tracker of the SPTK SourceForge
page (http://sourceforge.net/projects/sp-tk/).
****************************************************************
Last modified: Nov 14, 2008