Abstract
In spontaneous speech, words are normally uttered in continuous chains in which boundaries are blurred and segments can undergo changes, e.g., vowel centralization, consonant lenition, or deletion, due to structural linguistic factors or contingent sociolinguistic factors. However, such “reduction” processes are often studied by comparing connected speech realizations with isolated words in their expected phonological forms. This study investigates systematic patterns of phonetic variation in a dataset of spontaneous Italian discourse by considering the syllabification of the speech chain at both the phonological and the phonetic level. Moreover, since phonetic segmentation is based on human perception and thus implies a discretization of continuous productions, we employ unsupervised computational techniques, namely clustering algorithms, which allow patterns to emerge from the data without external (human) interference. The linguistic analysis shows systematic reduction processes related to syllabic structure and lexical stress. The computational representation highlights a convergence of syllables toward the non-marked structure in Italian (CV).