Abstract
In written text, discourse or coherence relations (Mann & Thompson 1988; Kehler 2002; Asher & Lascarides 2003; Miltsakaki et al. 2004) are a central means of logically connecting semantically related stretches of text. Formally, languages provide extensive sets of connectives that encode these semantic relations explicitly (Pander Maat & Sanders 2006). Although the use of such explicit cohesive devices is not necessarily correlated with coherence or text quality judgments (Crossley et al. 2016), their acquisition is an important stepping stone in the development of text competence. Thus, an in-depth analysis of the types and variety of connectives used at different stages of a writer’s school education could provide important empirical data for the training of textuality features and for writing assessment in L1 and L2 teaching practice for a given language.
In our contribution, we analyzed the quantity and repertoire of explicit connectives found in argumentative texts written by L1 and L2 speakers of Italian in the 3rd year of lower secondary school and, after four further years of training, in the 4th year of upper secondary school. In our analysis, we focused on explicit causal connectives as one important means of constructing coherence in argumentative texts, since they explicitly signal supporting reasons and anticipated consequences in order to convince an audience of a claim.
Our research questions are:
• Are there any common trends over time in students’ use of explicit connectives, with regard to quantity and repertoire?
• Are there any significant differences in the use of explicit connectives by L1 and L2 speakers at the same developmental stage?
To answer these questions, we automatically annotated the explicit causal connectives in a sample of 200 texts, evenly distributed across four conditions defined by language status (first/second language) and school level (lower/upper secondary school). All texts were gathered in the multilingual province of Bolzano/Bozen in Italy and originate from three different learner corpora. Argumentative texts by lower secondary school writers were randomly sampled from the L2 and L1 writers in the Italian sub-corpus of LEONIDE (Glaznieks et al. 2022). The texts by upper secondary school writers were drawn as a random sample from the Italian Kolipsi-2 corpus (L2 data, Glaznieks et al. 2021) and from data collected in the ITACA project (L1 data, https://itaca.eurac.edu/). The automatic annotation follows a dictionary-based approach relying on the Lexicon for Italian COnnectives (LICO) (Feltracco et al. 2016), a repository of Italian connectives aligned with the PDTB 3.0 (Webber et al. 2019). To analyze quantity, we observed both the number of causal connectives per text (normalized per 100 words to account for differences in text length) and the ratio of causal connectives to all connectives. Furthermore, we investigated the students’ repertoire of causal connectives qualitatively and quantitatively, extracting frequencies from a reference corpus (CORIS, Rossini Favaretti et al. 2002) to understand which kinds of connectives (low- or high-frequency) were present in the four groups. To measure differences in the repertoires, we calculated both the mean and the standard deviation of the reference-corpus frequencies of the connectives used in each group.
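The two quantity measures and the frequency-based repertoire measure described above can be illustrated with a minimal Python sketch. It is not the annotation pipeline used in the study: the mini-lexicons `CAUSAL` and `OTHER`, the function names, and the use of the population standard deviation are assumptions made purely for illustration, and LICO itself is far larger and differently structured.

```python
import re
from statistics import mean, pstdev

# Hypothetical mini-lexicons for illustration only (not the LICO resource).
CAUSAL = ["perché", "siccome", "quindi", "di conseguenza", "così"]
OTHER = ["ma", "però", "inoltre"]


def count_matches(text, lexicon):
    """Count case-insensitive whole-word matches of any lexicon entry."""
    # Longer entries come first so multiword connectives are matched as a unit.
    entries = sorted(lexicon, key=len, reverse=True)
    alternation = "|".join(re.escape(entry) for entry in entries)
    pattern = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)
    return len(pattern.findall(text))


def causal_stats(text):
    """Return causal connectives per 100 words and their share of all connectives."""
    n_words = len(re.findall(r"\w+", text))
    causal = count_matches(text, CAUSAL)
    all_connectives = causal + count_matches(text, OTHER)
    return {
        "n_words": n_words,
        "causal": causal,
        "per_100_words": 100 * causal / n_words if n_words else 0.0,
        "causal_ratio": causal / all_connectives if all_connectives else 0.0,
    }


def repertoire_profile(connectives_used, ref_freq):
    """Mean and standard deviation of reference-corpus frequencies for one group.

    `ref_freq` maps each connective to its frequency in a reference corpus;
    the population standard deviation is an assumption of this sketch.
    """
    freqs = [ref_freq[c] for c in connectives_used if c in ref_freq]
    return mean(freqs), pstdev(freqs)
```

A purely dictionary-based match of this kind cannot tell connective from non-connective uses of ambiguous forms such as per, so any real application would need additional disambiguation or manual checking.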
Our analysis, aided by linear regression models, shows that the number of causal connectives decreases significantly in the upper grades, independently of the L1/L2 variable. However, the category is not internally homogeneous: causal connectives of the result type (e.g., quindi, di conseguenza) display a remarkable relative growth in upper secondary school, suggesting that result relations are more complex and therefore learned later. Regarding the kind of connective used, older students in both groups use connectives that are significantly less frequent in the reference corpus than those used by younger students. Changes in variety over time exist only in the L1 group, in which new, rarer connectives (e.g., per via, siccome, cosicché) may emerge alongside the high-frequency ones typical of lower grades (e.g., per, perché, così, quindi). L2 students, instead, seem to use a narrower range of connectives with similar frequency (higher for lower grades and lower for upper grades). In general, the change in the use of causal connectives over time was similar for L1 and L2 students; the only significant difference between the two groups concerns the variety and average frequency of the connectives used in the upper grades. The results suggest that, for both L1 and L2 writers, the quantity and the variety of the connectives employed are in a tradeoff: while over time students may learn other strategies to express coherence relations, which would explain the decrease in the quantity of connectives, they also learn to use rarer connectives, presumably those found in formal, academic language.
References:
Asher, N. M., & Lascarides, A. (2003). Logics of Conversation. Cambridge: CUP.
Crossley, S. A., Kyle, K., & McNamara, D. S. (2016). The development and use of cohesive devices in L2 writing and their relations to judgments of essay quality. Journal of Second Language Writing, 32, 1-16.
Feltracco, A., Jezek, E., Magnini, B., & Stede, M. (2016). LICO: A Lexicon of Italian Connectives. In A. Corazza, S. Montemagni, & G. Semeraro (Eds.). Proceedings of the Third Italian Conference on Computational Linguistics CLiC-it 2016. Torino: Accademia University Press, 141-145.
Ferrari, A. (2014). Linguistica del testo. Principi, fenomeni, strutture. Roma: Carocci.
Glaznieks, A., Frey, J.-C., Nicolas, L., Abel, A., & Vettori, C. (2021). Kolipsi-2 Corpus v1.0, Eurac Research CLARIN Centre, http://hdl.handle.net/20.500.12124/30.
Glaznieks, A., Frey, J.-C., Stopfner, M., Zanasi, L., & Nicolas, L. (2022). LEONIDE: A longitudinal trilingual corpus of young learners of Italian, German and English. International Journal of Learner Corpus Research, 8(1), 97-120.
Kehler, A. (2002). Coherence, reference, and the theory of grammar. Stanford: CSLI Publications.
Mann, W., & Thompson, S. (1988). Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8, 243–281.
Miltsakaki, E., Prasad, R., Joshi, A., & Webber, B. (2004). The Penn Discourse Treebank. In M. T. Lino, M. F. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.). Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04). Lisbon: European Language Resources Association, 2237-2240.
Pander Maat, H., & Sanders, T. (2006). Connectives in Text. In K. Brown (Ed.). Encyclopedia of Language & Linguistics. Amsterdam: Elsevier, 33-41.
Rossini Favaretti, R., Tamburini, F., & De Santis, C. (2002). CORIS/CODIS: A corpus of written Italian based on a defined and a dynamic model. In A. Wilson, P. Rayson, & T. McEnery (Eds.). A Rainbow of Corpora: Corpus Linguistics and the Languages of the World. München: Lincom-Europa, 27-38.
Webber, B., Prasad, R., Lee, A., & Joshi, A. (2019). The Penn Discourse Treebank 3.0 Annotation Manual.