ERCIM News No.26 - July 1996
The European Language Resources Association
by Khalid Choukri
The European Language Resources Association (ELRA) was established as a non-profit organization in Luxembourg in February 1995. The overall goal of ELRA is to provide a centralised organization for the validation and distribution of speech, text, and terminology resources and tools, and to promote their use within the European telematics R&TD community.
Language Resources (LRs) are universally acknowledged to be critical for the development of robust, broad-coverage, and cost-effective applications in all sectors of telematics, in particular those for written and spoken language. The cost of developing such resources is prohibitive, and due to the lack of sufficient co-ordination, existing LRs cannot be easily adapted for multiple users, thereby hindering the rapid deployment of new applications.
Market Situation
The LR area can be considered as three quite distinct fields, each covered by one of the three 'colleges' of ELRA: terminology, written resources, and spoken resources. There is a great deal of terminology work going on in all the main languages of Europe, both at a general level and in every major sector of industrial and commercial activity. But the work is to a large extent uncoordinated, and very little effort has been made to turn it into commercial products using common standards, a situation that ELRA intends to rectify.
In the written field, the collection of corpora has become important in recent years, and is beginning to be a commercial activity; much remains to be done to organise this activity systematically and to cover all languages and user domains. The production of written lexica is very expensive and although there are many toy systems, there are few commercial activities outside those of the major publishing houses; an important source of material is the work of the established national language centres.
Spoken resources have become a fully commercial product in the last few years as the speech processing field has reached technical and commercial maturity. The major telecommunications firms have moved in and it is here that the market for ELRA's distribution activities is, in the short run, at its most mature.
In all three fields, the LR project activities of the EU Language Engineering (LE) programmes are producing new products and standards which must be a prime target for ELRA distribution activity.
To achieve its objectives, ELRA has established a distribution unit (European Language Resources Distribution Agency - ELDA) as the infrastructure within ELRA for identifying, collecting, classifying, validating, distributing, and exploiting LRs. ELDA manages and oversees these activities. Additional activities include developing evaluation guidelines, serving as a broker between producers and users of LRs, and functioning as a central clearinghouse for information.
ELRA has appointed several panels of experts which will advise the ELRA Board on crucial aspects of its activities. The initial panels appointed by the Board are:
Panel for the identification and collection of LRs
Panel for the validation of LRs
Panel for the distribution of LRs
Panel for external relationships
Each panel consists of a core of ELRA members, selected to represent the expertise of the three colleges (speech, written, terminology), and is chaired by a convenor.
ELRA/ELDA has started addressing the fundamental organisational, technical, and economic problems which constitute the crucial barriers to the development of the market for LRs. For this purpose, ELRA is now working to:
constitute a catalogue of existing LRs and start negotiating with suppliers for the acquisition of an initial selected set of best-seller LRs for distribution
define a variety of viable contractual options for the suppliers and users of LRs
establish a pricing policy
study, with the assistance of a legal expert, practical methods and licence agreements to overcome problems related to intellectual property rights
establish co-operation links with permanent major suppliers of LRs
define a methodology for LR validation: whereas methods and tools for validating speech LRs have already been studied and tested to a certain extent, very little is known about methods for validating written and terminological LRs. ELRA will promote specific research on these issues, in synergy with LR projects launched in the 4th Framework programme of the European Commission
actively market and distribute the initial set of LRs that ELRA acquires.
The services provided by ELRA could range from the simple cataloguing and dissemination of information, through promotion and brokerage, to assisting producers with the documentation, validation, and normalization of their LRs, including their physical distribution.
Requirements
Because the field is relatively immature, one of the first priorities is to establish standards that facilitate reuse, performance, and interworking, as well as standards for quality control of the resources. The project of the Expert Advisory Group on Language Engineering Standards (EAGLES) and other LE projects (SPEECHDAT, PAROLE, INTERVAL) will be used as the basis for establishing these standards, but the role of ELRA is to ensure that the standards are applied, not least in quality control of resources.
Ownership rights are also a major problem in the field, with the associated problems of copyright and copying prevention. The project will analyse various possible solutions, suggest codes of conduct, and stipulate contracts which regulate the status of the LRs distributed.
Results of the work of the association can be measured by the number of members, by the number of LRs handled, and by the number and value of the LRs collected, validated, and disseminated. In a more qualitative, but perhaps in the long run more important sense, the success of ELRA will be judged by how it succeeds in raising the profile of LRs and LE throughout the EU. Results will also come from the stimulation it provides to the creation of LRs, and in particular in those fields where some social or other non-commercial incentive is provided for the creation and dissemination of LRs.
Please contact:
Khalid Choukri - ELRA
Tel: +33 1 45 86 53 00
E-mail: elra@calvanet.calvacom.fr