Ticklish trawling: The limits of corpus assisted meaning analysis

Miller, Donna Rose; Bayley, Paul; Bevitori, Cinzia; Fusari, Sabrina; Luporini, Antonella

Although our past corpus research has been based on the proposition that “A corpus […] is a treasury of acts of meaning which can be explored and interrogated from all illuminating angles, including in quantitative terms.” (Halliday 1996: 406), it is right to stop and take stock of just what we have done and what we have failed to do, and to take cognizance of what may have emerged as the limits of such interrogation. Perhaps, in short, it is time to engage more seriously with the assertion of Martin (2003: 177) that “It is texts that mean, through their sentences and the complex of logogenetic contingencies among them – they do not mean as a selection from, or a sum of, or worse, an average of, the meanings within the clause.” Or, one might add, within the conventional concordance line’s 9-word window. There have been many valuable recent attempts to reconcile corpus data and methods with a needed concern with how texts mean within specific cultural contexts (e.g., Partington, Morley & Haarman (eds.) 2003; Bayley (ed.) 2004; Morley & Bayley (eds.) 2009; Thompson & Hunston 2006), but can we in good faith say the same about a concern with how they mean in extended co-text? Have we, in other words, given too short shrift to the notion of logogenesis: “[…] the unfolding of the act of meaning itself: the instantial construction of meaning in the form of a text […] in which the potential for creating meaning is continually modified in the light of what has gone before […].” (Halliday and Matthiessen 1999: 18, our emphasis). Has the ‘woods vs. trees’ conflict, which Thompson and Hunston (2006: 3-4) elegantly suggest might be somehow circumvented, truly been adequately resolved? Do we need to accept that ‘high level’ semantic analysis can only be performed manually? Is the trade-off between volume and richness ever a judicious and advantageous one (Halliday and Matthiessen 2004: 48-49)?
These and related thorny issues will be tackled by the contributors to the colloquium as follows:

Donna R. Miller: Prologue
Miller introduces the colloquium issues outlined above, exemplifying these from her own series of ‘ticklish’ corpus-assisted, ‘register-idiosyncratic’ appraisal studies of US congressional speech (e.g. Miller & Johnson, to appear) and also engaging with what Thompson (under review) calls “the ‘Russian Doll’ dilemma”, particularly problematic for quantitative appraisal analysis.

Paul Bayley & Cinzia Bevitori: In search for meaning: what corpora can/cannot tell
Over the last decade or so there has been an ever-growing interest in the combination of corpus linguistics and discourse analysis (e.g. Baker 2006) and SFL studies (Thompson & Hunston (eds.) 2006). Procedures have been glossed through metaphors such as ‘trawling’ the corpus and ‘shunting’ back and forth between corpus and text. However, it still remains to be seen to what extent the use of very different approaches to the study of language can contribute to an understanding of how texts mean what they do in socially situated contexts (beyond the simple concept of ‘aboutness’). This presentation will reflect on what can be achieved through a reconciliation of corpus and discourse studies and what, perhaps, cannot, based on an ongoing study of a diachronic corpus of over 220 years of messages from US Presidents to Congress.

Sabrina Fusari: The potential and drawbacks of annotation: what taggers can/cannot do
Fusari elaborates on a previous study of the institutional and newspaper discourse of the EU sovereign debt crisis (presented at the 23rd ESFLCW), which was performed on an unannotated corpus. In this second step of her study, Fusari explores the potential offered by some SFL-specific annotation tools (UAM Corpus Tool and the Halliday Centre Tagger). The analysis focuses on Transitivity, with special attention paid to Relational, Mental and Behavioural processes, which the previous step of this study identified as playing a crucial role in the construction of ideology. The pros and cons of this “very expensive and time-consuming” but “particularly important” (Wu 2009: 142) procedure are evaluated, with a view to showing what results SFL annotation with a specific tool can deliver in comparison with fully manual methodologies.

Antonella Luporini: Grammatical metaphor in Business English and Italian – to what extent does the corpus assist the researcher?
Luporini selects results from her PhD thesis, in which she uses corpus methodologies to investigate the use of grammatical metaphor (Halliday and Matthiessen 2004: 586-658; Thompson 2004: 219-239) in construing the 2008 global crisis in the British and the Italian specialised press. Two ad hoc, tree-tagged corpora, specifically designed for this research and composed of articles from The Financial Times and Il Sole 24 Ore (approximately 350,000 and 650,000 words), are interrogated. Metaphor is a major challenge for corpus linguistic research: in the absence of automatic metaphor identification procedures that can match the level of accuracy of manual analysis, identification is time-consuming and subject to bias. Luporini will shed light on this issue by describing the ‘hybrid approach’ she adopted in analysing her data, whereby parts of the texts were selected for close reading, largely on the basis of automatic query results (obtained using The Sketch Engine: Kilgarriff et al. 2004), e.g. Word Sketches and collocational patterns.
