Multilingual Natural Speech Technology
An Italian Broadcast News Corpus
This project aims to collect a multimedia corpus of radio broadcast news in Italian. The corpus will include the audio signal, its transcription, and documentation for users. The broadcast news will be acquired from the digital archive of Radio RAI, the major Italian broadcaster. The project thus aims to produce a new language resource starting from digital audio recordings.
Work Plan (internal use)
Transcription specifications (in Italian)
- Names: A-C - D-J - K-P - Q-Z - Italian telephone directory - La Repubblica 08/2000
- Search engines: Arianna - Altavista
- Geography: Italian comuni - world maps
- Politics: world political leaders
- Sport: tennis players' names - skiers' names - Formula One
- Arts: musicians/writers - actors/directors
- LDC conventions
- Transcriber tool
This page is maintained by Paolo Coletti.