About us

Valentina Schettino (Castellammare di Stabia, 1989) has been a PhD student in the Department of Literary, Linguistic and Comparative Studies at the University of Naples “L'Orientale” since November 2014. Her undergraduate studies at the University of Naples “Federico II” included courses in General Linguistics; German Linguistics, Language and Literature; English Linguistics, Language and Literature; Italian Linguistics and Literature; Germanic Philology; and Didactics. In 2014 she graduated with an experimental thesis in German Linguistics entitled “Reduktionsprozesse in der fließenden Rede: ein empirischer Vergleich zwischen Phonetik und Phonologie im Italienischen und Deutschen” [Reduction processes in connected speech: an empirical comparison between Phonetics and Phonology in Italian and German], under the supervision of Prof. Dr. Livio Gaeta. Her main research interests are:

Phonetics and Phonology of German, comparison with Italian
Prosodic Prominence
- Acoustic correlates, interaction and disentanglement
- Prosodic context (non-prominent syllables)
Prosody
- Local vs contextual intonational phenomena
- Acoustic, prosodic and phonetic aspects in German L1 and L2

About the corpus

Our database consists of 24 native speakers of German producing more than 9 hours of speech. The recordings were made in the recording studio of Bielefeld University. Most of the speakers in our corpus were students at Bielefeld University; they belonged to different faculties, but all of them were taking a course in Italian as a second language. The remaining participants were students of Italian at a high school in Bielefeld, and one informant taught Italian at the same school. The mean age was 25.6 years, with 9 men and 15 women. Different levels of fluency in Italian were represented: 14 speakers at level A, 8 at level B and 2 at level C (CEFR). We did not take diatopic variation into consideration. The interview comprised three tasks: first, participants were asked to read an Italian text; second, they were asked to comment on some images in Italian (this task was optional, depending on the participant's fluency level); finally, two participants were asked to play Tic-Tac-Toe together, both in German and in Italian. These tasks were grouped into two main phases: in the first, the speaker was alone in an isolated room for the reading and commenting tasks; in the second, two participants played together, which yielded spontaneous speech. Both phases were recorded with high-quality microphones in one of the anechoic chambers of the recording studio. A video camera also recorded the scene, which makes it possible to examine body gestures and thus broadens the range of uses of the corpus; moreover, the video can be used in some circumstances to disambiguate unclear meanings through non-verbal information.
As far as segmentation is concerned, we followed different criteria for the read and spontaneous productions, in order to reflect the characteristics of each speech type: the read part was segmented into clauses; the commentaries were segmented on the basis of the breath group; the dialogues were segmented into speech turns. The transcription procedure started from the orthographic level: an automatic transcription was provided by Cedat85, a leading speech processing company which operates both in the field of public administration and for private citizens, through the website «www.trascrivi.it». The lexical sequence was later enriched with an annotation of specific linguistic and extra-linguistic phenomena. Our recordings were transcribed according to the transcription rules used in the CLIPS project (Corpora e Lessici dell'Italiano Parlato e Scritto, cf. Savy & Cutugno 2009): we labeled all linguistic and extra-linguistic phenomena affecting communication, namely pauses, noises and overlapping productions (both within the same turn and between turns), together with the transcriber's comments about voice quality (cf. Savy 2005). For the labeling phase, we used Praat (cf. Boersma 2002; Boersma & Weenink 2010), a well-known software package for phonetic analysis. The labeling annotations are aligned with the corresponding audio files; the format of the annotations is «.TextGrid». The whole corpus was PoS-tagged with the TreeTagger tool (cf. Schmid 1995; Schmid et al. 2007). Given the central role played by the syllable in spoken production, we decided to enrich our corpus annotations and segmentation with an automatic syllabification tool: we used Prosomarker, developed by Origlia & Alfano (2012). This tool also supplies the user with some intonational information: specifically, it provides an automatic stylization of the pitch movements.
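
Since the annotations are distributed as «.TextGrid» files, users may want to read them outside Praat. The following is a minimal sketch, not part of the corpus tooling, that extracts the labeled intervals of each interval tier from a TextGrid in Praat's long text format. The function name `parse_interval_tiers`, the tier name `turns` and the sample content are hypothetical and are not taken from the corpus.

```python
import re

# Each tier starts with a line such as "item [1]:" (the bare "item []:" header has no digits).
ITEM_RE = re.compile(r'item \[\d+\]:')
NAME_RE = re.compile(r'name = "((?:[^"]|"")*)"')
# One labeled interval: start time, end time, label (Praat doubles any quote inside a label).
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = (\S+)\s*'
    r'xmax = (\S+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def parse_interval_tiers(textgrid_text):
    """Return {tier name: [(start, end, label), ...]} for every interval tier."""
    tiers = {}
    # The first chunk is the file header; every following chunk is one tier.
    for chunk in ITEM_RE.split(textgrid_text)[1:]:
        if 'class = "IntervalTier"' not in chunk:
            continue  # point tiers are skipped in this sketch
        name_match = NAME_RE.search(chunk)
        if name_match is None:
            continue
        tiers[name_match.group(1)] = [
            (float(start), float(end), label.replace('""', '"'))
            for start, end, label in INTERVAL_RE.findall(chunk)
        ]
    return tiers


# Hypothetical sample in the long text format (not real corpus data).
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 3.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "turns"
        xmin = 0
        xmax = 3.5
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 1.2
            text = "ciao"
        intervals [2]:
            xmin = 1.2
            xmax = 2.0
            text = ""
        intervals [3]:
            xmin = 2.0
            xmax = 3.5
            text = "dice ""si"""
'''

tiers = parse_interval_tiers(SAMPLE)
for start, end, label in tiers["turns"]:
    print(f"{start:.2f}-{end:.2f}: {label!r}")
```

When reading real files from disk, note that Praat may save TextGrid files in UTF-8 or UTF-16, so the encoding passed to `open` has to be chosen accordingly. The sketch does not handle Praat's short text format or binary files.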

Copyright Valentina Schettino 2016