Evaluation of NLP systems
Presentation
Evaluation is nowadays a central question in terms of best practices for Natural Language Engineering. Evaluation campaigns are regularly conducted on a large variety of application domains and provide useful information on the behaviour of NLP systems in real situations of use. It is regrettable, however, that most of these evaluations favour immediate results (tuning on development and test data) and do not really explain the successes and failures of the assessed systems. This is why I pursue methodological work on how to assess NLP systems.
- Man-machine dialogue and speech understanding - Definition of several test methodologies (DCR, DEFI) for the predictive evaluation of speech understanding. Based on linguistically motivated test suites, these paradigms combine objective metrics with a detailed analysis of the behaviour of speech understanding systems. They inspired the evaluation paradigm used in the MEDIA/EVALDA (2002-2005) French-speaking evaluation campaign of speech understanding systems.
- Augmentative and alternative communication - Evaluation of AAC systems from the user's point of view, beyond standard but rather artificial metrics such as the keystroke saving rate (KSR); see the first sketch after this list.
- Emotion detection and annotation - Experimental studies of the inter-coder agreement metrics currently used in NLP for the emotion annotation of speech corpora, together with an analysis of the influence of this agreement on the results of evaluation campaigns in emotion detection. In particular, this work has highlighted some limitations of the Kappa statistic for NLP tasks; see the second sketch after this list.
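
The keystroke saving rate mentioned above is a simple ratio: the proportion of keystrokes that the word predictor spares compared with typing every character by hand. Below is a minimal Python sketch of the usual formula; the function name and the counts are illustrative assumptions, not taken from any of the systems cited on this page.

    def keystroke_saving_rate(keys_without_prediction, keys_with_prediction):
        # keys_without_prediction: keystrokes needed to type the text letter by letter.
        # keys_with_prediction: keystrokes actually used with the predictor,
        # including the keys needed to select the proposed words.
        if keys_without_prediction <= 0:
            raise ValueError("keys_without_prediction must be positive")
        saved = keys_without_prediction - keys_with_prediction
        return 100.0 * saved / keys_without_prediction

    # Hypothetical session: 1000 keystrokes by hand, 550 with prediction.
    print(keystroke_saving_rate(1000, 550))  # 45.0 (percent)

A high KSR does not by itself guarantee a faster or more comfortable interaction, which is precisely why a user-centred evaluation is needed alongside the metric.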
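
One limitation usually put forward for Cohen's kappa is its sensitivity to class prevalence: on a skewed corpus where one label dominates, two coders can agree on almost every item and still obtain a kappa close to zero. The following sketch computes Cohen's kappa from its standard definition, (p_o - p_e) / (1 - p_e); the two coders and their labels are invented for illustration.

    from collections import Counter

    def cohen_kappa(labels_a, labels_b):
        # Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is the observed
        # agreement and p_e the agreement expected by chance from the
        # coders' marginal label distributions.
        n = len(labels_a)
        p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        freq_a, freq_b = Counter(labels_a), Counter(labels_b)
        p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
        return (p_o - p_e) / (1 - p_e)  # undefined when p_e == 1 (single label used)

    # Invented skewed corpus: both coders agree on 96 'neutral' items and
    # disagree on the 4 items involving the rare 'emotion' label.
    a = ["neutral"] * 96 + ["emotion", "emotion", "neutral", "neutral"]
    b = ["neutral"] * 96 + ["neutral", "neutral", "emotion", "emotion"]
    print(cohen_kappa(a, b))  # raw agreement is 0.96, kappa is about -0.02

This prevalence effect is directly relevant to emotion annotation, where neutral segments typically dominate the corpus.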
Works and projects
Spoken language processing
- ESAC_IMC (Fondation Motrice, 2006-2007) - Survey of the behaviour of AAC systems with users suffering from additional language disabilities (dyslexia...).
- VOLTAIRE project (AFM, 2008-2009) - Integration of the Sibylle word predictor into the CVK / CiViKey freeware virtual keyboard; long-term evaluation of the keyboard with real users (PhD of Samuel Pouplin).
Some publications
- Jean-Yves ANTOINE, Marc LE TALLEC, Jeanne VILLANEAU (2011) Evaluation de la détection des émotions, des opinions ou des sentiments : dictature de la majorité ou respect de la diversité d'opinions ? (Evaluation of the detection of emotions, opinions or sentiments: dictatorship of the majority or respect for the diversity of opinions?) Proc. TALN'2011, Montpellier, France, July 2011. [HAL-00625727]

- Damien NOUVEL, Jean-Yves ANTOINE, Nathalie FRIBURGER, Denis MAUREL (2010) An analysis of the performances of the CasEN named entities detection system in the Ester2 evaluation campaign. Proc. 9th International Conference on Language Resources and Evaluation, LREC'2010, Valletta, Malta, May 2010. [HAL-00502370]
- Philippe BOISSIERE, Igor SCHADLE, Jean-Yves ANTOINE (2006) A methodological framework for writing assistance systems: applications to Sibylle and VITIPI systems. AMSE Journal on Modelling, Measurement & Control, Série C, Barcelona, Spain, Vol. 67, pp. 167-176.
- Laurence DEVILLERS, H. MAYNARD, P. PAROUBEK, S. ROSSET, J-Y. ANTOINE, F. BECHET, C. BOUSQUET, O. BONTRON, L. CHARNAY, K. CHOUKRI, K. McTAIT, L. ROMARY, M. VERGNES, N. VIGOUROUX (2004) The French MEDIA/EVALDA project: the evaluation of the understanding capability of Spoken Language Dialogue Systems. Proc. 4th International Conference on Language Resources and Evaluation, LREC'2004, Lisbon, Portugal.
- Jean-Yves ANTOINE, Caroline BOUSQUET-VERNHETTES, Jerome GOULIAN, Mohamed Zakaria KURDI, Sophie ROSSET, Nadine VIGOUROUX, Jeanne VILLANEAU (2002) Predictive and objective evaluation of speech understanding: the “challenge” evaluation campaign of the I3 speech workgroup of the French CNRS. Proc. 3rd International Conference on Language Resources and Evaluation, LREC'2002, Las Palmas de Gran Canaria, Spain, pp. 529-535.

- Jean-Yves ANTOINE, Jacques SIROUX, Jean CAELEN, Jeanne VILLANEAU, Jerome GOULIAN, Mohamed AHAFHAF (2000) Obtaining predictive results with an objective evaluation of spoken dialogue systems: experiments with the DCR assessment paradigm. Proc. 2nd International Conference on Language Resources and Evaluation, LREC'2000, Athens, Greece.