Project Practice of Corpus and Terminology Management

Project Background:

Volkswagen is a world-renowned automobile manufacturer with multiple models under its umbrella. Its translation demand is concentrated mainly in three languages: German, English, and Chinese.


Customer requirements:

We need to find a long-term translation service provider and hope that the translation quality is stable and reliable.

Project analysis:

Tang Neng Translation conducted an internal analysis based on the customer's needs and concluded that corpus and terminology are crucial to stable, reliable translation quality. This client has already paid close attention to archiving documents (both original and translated versions), so the prerequisite for supplementary corpus work is in place. However, the current problems are:
1) Most of what clients call a “corpus” is not a true corpus, but only paired bilingual documents that cannot be put to direct use in translation work. Their supposed “reference value” remains a vague wish that cannot be realized in practice;
2) A small portion of clients have accumulated language materials but have no dedicated personnel to manage them. Because translation suppliers have been replaced over time, the corpora delivered by each company come in different formats, and they often contain multiple translations of one sentence, multiple translations of one word, and mismatches between source content and target translation, all of which greatly reduce the practical value of the corpora;
3) Without a unified terminology library, each department of the company may translate terms in its own way, which causes confusion and lowers the quality of the company's published content.
As a result, Tang Neng Translation advised the client and offered corpus and terminology management services.
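The corpus problems listed above (multiple translations of one sentence, and mismatches between source and target) can be illustrated with a small audit script. This is only a minimal sketch, not the tooling used in the project: the function name `audit_corpus`, the `max_ratio` threshold, and the sample German-English pairs are all hypothetical, and a length-ratio test is just one rough heuristic for spotting misaligned pairs.

```python
from collections import defaultdict


def audit_corpus(pairs, max_ratio=3.0):
    """Flag two common problems in an aligned bilingual corpus.

    pairs: iterable of (source, target) strings.
    max_ratio: assumed limit on the character-length ratio between the
        two sides; pairs beyond it are treated as possibly misaligned.
    Returns {"conflicts": {source: [targets...]}, "suspect": [(src, tgt)...]}.
    """
    targets_by_source = defaultdict(set)
    suspect = []
    for source, target in pairs:
        src, tgt = source.strip(), target.strip()
        # An empty side or a very lopsided length hints at a mismatch.
        if not src or not tgt:
            suspect.append((source, target))
            continue
        ratio = len(src) / len(tgt)
        if ratio > max_ratio or ratio < 1 / max_ratio:
            suspect.append((source, target))
            continue
        # Group translations under a case-insensitive source key.
        targets_by_source[src.casefold()].add(tgt)
    # A source sentence with more than one distinct target is a conflict.
    conflicts = {
        src: sorted(tgts)
        for src, tgts in targets_by_source.items()
        if len(tgts) > 1
    }
    return {"conflicts": conflicts, "suspect": suspect}


if __name__ == "__main__":
    sample = [
        ("Motoröl wechseln", "Change the engine oil"),
        ("Motoröl wechseln", "Replace motor oil"),
        ("Bremsflüssigkeit prüfen", "Check the brake fluid"),
        ("Zündung ausschalten", ""),
    ]
    print(audit_corpus(sample))
```

Pairs reported under `conflicts` or `suspect` would still need review by a linguist; the script only narrows down where to look.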

Key points of the project:
Process the historical corpus and the non-corpus bilingual documents according to their different situations, evaluate the quality of the corpus assets, add or remove processing steps based on that quality, and close the gaps left by earlier work;

New incremental projects must strictly use CAT (computer-assisted translation) tools to accumulate and manage language materials and terminology, so that no new gaps are created.

Project reflections and effectiveness evaluation:
Effect:

1. In less than 4 months, Tang Neng Translation processed the bilingual historical documents using alignment tools and manual proofreading, and also organized the previously disorganized parts of the corpus. The result was a corpus of over 2 million words and a terminology database of several hundred entries, which laid a solid foundation for further work;

2. In new translation projects, the corpus and terminology were put to use immediately, improving quality and efficiency and delivering value;
3. New translation projects strictly use CAT tools, and corpus and terminology management continues on this foundation for the long term.

Reflections:

1. Lack of awareness and how to build it:
Few companies realize that language materials are also assets, because they have no unified department for managing documents and language materials. Each department has its own translation needs and selects translation service providers independently. As a result, the company not only lacks language materials and terminology, but even the archiving of bilingual documents becomes a problem: files are scattered in various places and versions are confused.
Volkswagen has a certain level of awareness: its bilingual documents are relatively well preserved, and it pays attention to timely archiving and proper storage. However, lacking familiarity with production processes and technical tools in the translation industry, and without a clear understanding of what a “corpus” actually is, the client assumed that bilingual documents could simply serve as reference material and had no concept of terminology management.
CAT tools have become a necessity in modern translation production. They store processed text as translation memories, so in later work repeated segments can be matched automatically, and a terminology library added to the CAT system allows inconsistencies in terminology to be detected automatically. For translation production, then, technical tools are essential, and so are language materials and terminology; none of them can be dispensed with. Only when they complement each other in production can the best quality be achieved.
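To make the two mechanisms in the paragraph above concrete, the following is a minimal sketch of a translation-memory lookup and a terminology consistency check. It is an illustration under stated assumptions, not how any particular CAT tool works: the function names `best_tm_match` and `find_term_violations`, the 0.75 threshold, and the sample English-German data are hypothetical, and the similarity score comes from `difflib.SequenceMatcher` in the Python standard library rather than from a real CAT matching engine.

```python
from difflib import SequenceMatcher


def best_tm_match(segment, memory, threshold=0.75):
    """Return (score, source, target) for the closest stored segment.

    memory: dict mapping stored source segments to their translations.
    Returns None when no stored segment reaches the assumed threshold.
    """
    best = None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


def find_term_violations(source, target, termbase):
    """List (source_term, approved_translation) pairs where the source
    term occurs in the source text but the approved translation is
    missing from the target text (simple substring comparison)."""
    violations = []
    for src_term, approved in termbase.items():
        in_source = src_term.lower() in source.lower()
        in_target = approved.lower() in target.lower()
        if in_source and not in_target:
            violations.append((src_term, approved))
    return violations


if __name__ == "__main__":
    memory = {"Check the brake fluid level.": "Bremsflüssigkeitsstand prüfen."}
    termbase = {"brake fluid": "Bremsflüssigkeit"}
    print(best_tm_match("Check the brake fluid level", memory))
    print(find_term_violations(
        "Check the brake fluid level.", "Bremsöl prüfen.", termbase))
```

Substring matching ignores inflection and word boundaries, so a production check would need language-aware matching; the sketch only shows why a stored memory and a term list make such checks possible at all.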
So, the first thing to address in managing language materials and terminology is awareness. Only by fully recognizing their necessity and importance will enterprises have the motivation to invest and fill this gap, turning language assets into real value. The investment is small, but the returns are large and long-term.

2. Methods and Execution

With awareness in place, what comes next? Many clients lack the time and professional skills to do this work themselves. Professional work is best left to professionals. Tang Neng Translation identified this hidden customer need through long-term translation service practice and launched its “Translation Technology Services” product, which includes “Corpus and Terminology Management”: an outsourced service that organizes and maintains corpora and terminology databases on the customer's behalf so that they are managed effectively.

Corpus and terminology work pays off more the earlier it is done. Enterprises should put it on the agenda as a matter of urgency, especially for technical and product-related documents, which are updated frequently, have high reuse value, and place high demands on consistent terminology.


Post time: Aug-09-2025