UCL Psychology Speech Group Dysfluency Database
Introduction
The Dysfluency database consists of recordings of 61 speakers
who have been studied by the UCL Psychology department speech group as
part of their research into stammering. The database is a unique
resource for researchers interested in the linguistic, phonetic and
acoustic properties of stammering. Some of the data has been
phonetically annotated, and aligned transcriptions are available in
a number of formats.
Classes of speakers
The participants in the recordings are divided into five classes, as follows:
- Class 5: participants for whom there are at least three recordings,
at age 8, age 10-12 and teenage. For some of these there are
additional recordings beyond teenage.
- Class 4: participants for whom there are recordings for age 10-12 and teenage. The lack of recordings at
age 8-10 reflects the fact that these children were not seen at clinic until they were aged 10+.
- Class 3: participants for whom there are recordings for age 8 and teenage, but not age 10-12 (often because the recording sessions
clashed with school or family obligations and could not be rescheduled).
- Class 2: participants for whom there are recordings for age 8 and age 10-12 but not for teenage.
This class also includes speakers who are still being monitored but who have not yet
reached teenage, and recordings of some speakers younger than 8.
- Class 1: participants for whom only one recording is available. This class includes speakers who
a) were older than the maximum age of our target group when they were first seen (i.e. only seen after they had reached teenage),
b) are in the target age ranges but were only available at one target
age because they live too far away from the laboratory, or
c) were in the required age range but with whom we have lost contact (most often because they have
moved home and have not notified us of their new address and telephone number).
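The class definitions above can be read as a lookup from the set of target ages at which a speaker was recorded. The following is a minimal sketch of that reading, not the procedure used by the speech group. The band labels, the function name `speaker_class` and the treatment of Class 1 as "recorded at exactly one target age" are assumptions made for illustration; recordings beyond teenage and recordings of speakers younger than 8 are not modelled.

```python
from typing import Iterable, Optional

# Hypothetical labels for the three target ages named in the class list.
AGE_8 = "8"
AGE_10_12 = "10-12"
TEENAGE = "teenage"

# Combinations of target ages that define Classes 5 to 2.
_RULES = {
    frozenset({AGE_8, AGE_10_12, TEENAGE}): 5,
    frozenset({AGE_10_12, TEENAGE}): 4,
    frozenset({AGE_8, TEENAGE}): 3,
    frozenset({AGE_8, AGE_10_12}): 2,
}


def speaker_class(recorded_ages: Iterable[str]) -> Optional[int]:
    """Return the class (1-5) implied by the target ages with recordings.

    Returns None when no target age is covered, since the class list
    does not describe that case.
    """
    ages = frozenset(recorded_ages)
    if ages in _RULES:
        return _RULES[ages]
    if len(ages) == 1:
        # Assumption: a single covered target age corresponds to Class 1.
        return 1
    return None


if __name__ == "__main__":
    print(speaker_class([AGE_10_12, TEENAGE]))
```

A speaker who is still being monitored and has so far been recorded at age 8 and age 10-12 falls into Class 2 under this sketch, which matches the description of that class.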
Content
Only a small part of the complete database is certified for distribution.
A summary of the files currently available on the DVD is shown in this table:
Class | # of participants | # of files | # orthographic transcription available | # phonetic transcription available | # aligned transcription
Class 5 | 5 | 7 | 1 | 0 | 3
Class 4 | 7 | 20 | 6 | 6 | 0
Class 3 | 1 | 3 | 0 | 0 | 0
Class 2 | 7 | 13 | 5 | 1 | 2
Class 1 | 41 | 95 | 19 | 17 | 11
Totals | 61 | 138 | 31 | 24 | 16
File format
Materials are included as WAV files at the original sampling rate
of 44.1 kHz. Audio data is also supplied in SFS (Speech Filing System)
format and in MP3. Transcription files are supplied in
text format and are also pre-loaded into the SFS files.
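As a quick sanity check on a WAV file taken from the distribution, the header can be inspected with Python's standard `wave` module. This is a minimal sketch: the function name `describe_wav` and the returned dictionary keys are illustrative choices, and the file path is supplied by the caller because the directory layout of the DVD is not described here. The number of channels and the sample width are simply reported, since the text above only states the sampling rate.

```python
import wave

# Sampling rate stated in the File format section.
EXPECTED_RATE_HZ = 44100


def describe_wav(path: str) -> dict:
    """Read the header of a PCM WAV file and summarise its basic properties."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": frames / rate if rate else 0.0,
            "matches_expected_rate": rate == EXPECTED_RATE_HZ,
        }
```

Calling `describe_wav` on a recording and checking `matches_expected_rate` shows whether a file has been resampled since it left the distribution.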
Reference
P. Howell & M. Huckvale, "Facilities to assist people to research into stammered speech",
Stammering Research, Volume 1, Issue 2, July 2004, pp. 130-242.
Acknowledgment
This database was produced as part of studies funded by the
Wellcome Trust.