PAROLE PUBLIQUE: French corpus repository
PAROLE PUBLIQUE ("Free Speech")
is a repository website where you can freely download corpora of
spontaneous spoken French under a Creative Commons
licence. It is dedicated more specifically to pilot
corpus studies for Human-Machine Dialogue, Augmentative and
Alternative Communication (AAC) and Coreference Resolution, but our
corpora can of course serve
other purposes. Generally speaking, PAROLE PUBLIQUE provides the audio
tracks and orthographic transcripts of the recorded dialogues. Some
corpora are additionally annotated with anaphoric or coreference relations.
Finally, the TestAccord
data bank collects various annotation data sets that provide useful
data for experimental studies on inter-coder reliability,
regardless of the language studied.
If you are interested in our corpora, chances are good that you
can at least read French (congratulations!). This is why
I invite you to consult the French
pages of this website.
May 2014 - The Accueil_UBS and OTG corpora are now also available on the Speech and Language Data Repository.