Experimenting with DeepL: trying to improve inter-segment term consistency
Thread poster: Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Mar 29, 2018

By default, DeepL will segment the input that you present to the system:

[Screenshot: split]

So, if you send these 4 related segments, they will be treated as 'not related', which can lead to inconsistent translation of terms (here: Gesamtanlage):

Maßzeichnung der Gesamtanlage
Gesamtanlage starten
Gesamtanlage nach Sicherheitsstopp starten
Beschickung der Gesamtanlage


[Screenshot]

This result matches the results of the 4 individual segments:

[Screenshots of the four individual results]

However, when you force DeepL to interpret the 4 segments as 1 segment, thus making them related, you'll get a better result ('Gesamtanlage' is translated consistently):

[Screenshot 2018-03-29 at 10.12.51]

(On a side note: This approach has the disadvantage that the chapter titles in the example are interpreted as instructions.)

Of course, the peepl at DeepL are working hard on improving inter-segment consistency (which is considered to be the Holy Grail of MT).

In the meantime, we could try to have CafeTran (CT) send several segments to DeepL at once and then re-segment the translated result before it goes into the TM.
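The idea of sending several segments at once and re-segmenting the result could be sketched roughly like this in Python. This is only a sketch under assumptions: `translate` is a placeholder for whatever actually sends text to DeepL, and the separator is hypothetical, since the post does not say how the segments have to be joined for DeepL to treat them as one unit.

```python
from typing import Callable, List


def translate_as_block(
    segments: List[str],
    translate: Callable[[str], str],
    sep: str = " ; ",  # hypothetical separator, not taken from the post
) -> List[str]:
    """Join segments, translate them in one call, and split the result again."""
    block = sep.join(segments)
    translated = translate(block)
    parts = [part.strip() for part in translated.split(sep.strip())]
    # If the engine dropped or added a separator, the result cannot be
    # mapped back to the source segments reliably, so refuse to guess.
    if len(parts) != len(segments):
        raise ValueError(
            f"expected {len(segments)} segments, got {len(parts)} after translation"
        )
    return parts
```

The count check matters: if the separator does not survive translation, it is safer to fall back to translating the segments one by one than to write misaligned pairs to the TM.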


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
English as the inter-language Mar 29, 2018

As long as DeepL makes this kind of 'mistake', we human translators have nothing to fear:

[Screenshot 2018-03-29 at 11.07.41]


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Chopping up your source text in chunks of 5000 characters Apr 5, 2018

Here are some nice macros to chop up your source text into chunks of 5000 characters, which is the maximum that the DeepL web interface accepts:

https://forum.keyboardmaestro.com/t/would-like-to-create-a-macro-chop-up-text-in-approx-400-word-segments/9946/7

One idea that comes to mind is to save these chunks of 5000 characters and the DeepL-generated translations as bitexts and import them into CafeTran Espresso 2018.

I'd have to investigate whether this approach would improve terminological consistency.
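For anyone without Keyboard Maestro, the chunking itself could also be done in a few lines of Python. This is a minimal sketch, not the linked macro: the function name `chunk_text` is made up for illustration, the 5000-character default comes from the limit mentioned above, and breaking at line ends is my own assumption about what is least harmful for the translation.

```python
from typing import List


def chunk_text(text: str, limit: int = 5000) -> List[str]:
    """Split text into chunks of at most `limit` characters, breaking at line ends."""
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit has to be cut mid-line.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when the next line would not fit any more.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Joining the chunks again gives back the original text unchanged, so nothing is lost between the source file and what is pasted into DeepL.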


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Example Apr 5, 2018

A Terminotix bitext looks like this:

<?xml version="1.0" encoding="utf-8"?>
<bitext version="1.2">
<meta>
<langsrc>ENG</langsrc>
<langtgt>DEU</langtgt>
</meta>
<segments>
<seg id="1">
<src>Source sentence one. Source sentence two. Source sentence three.</src>
<tgt>Target sentence one. Target sentence two. Target sentence three.</tgt>
</seg>
</segments>
</bitext>
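
A file with this layout could be generated from the saved chunks and their translations along these lines. It is only a sketch: the element and attribute names are copied from the sample above, the function name `build_bitext` is made up, and I have not checked the output against any official Terminotix specification.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_bitext(
    pairs: Iterable[Tuple[str, str]],
    langsrc: str = "ENG",
    langtgt: str = "DEU",
) -> str:
    """Build a bitext document that mirrors the sample layout shown above."""
    root = ET.Element("bitext", version="1.2")
    meta = ET.SubElement(root, "meta")
    ET.SubElement(meta, "langsrc").text = langsrc
    ET.SubElement(meta, "langtgt").text = langtgt
    segments = ET.SubElement(root, "segments")
    for number, (src, tgt) in enumerate(pairs, start=1):
        seg = ET.SubElement(segments, "seg", id=str(number))
        ET.SubElement(seg, "src").text = src
        ET.SubElement(seg, "tgt").text = tgt
    # ElementTree escapes &, < and > in the text for us.
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body
```

Each (source chunk, translated chunk) pair becomes one `seg`; the returned string should be written to disk as UTF-8 so that it matches the declaration.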

Import of a sample bitext:

[Two screenshots of the import]

Of course, it would be nice to have the segmentation adjusted automatically.
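
One naive way to adjust the segmentation automatically would be to split source and target into sentences and pair them up only when both sides contain the same number. The sketch below does just that; the sentence splitter is deliberately simple (it breaks after '.', '!' or '?' followed by whitespace) and will trip over abbreviations, and the function names are made up for illustration.

```python
import re
from typing import List, Tuple

# Break after sentence-final punctuation that is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Very rough sentence splitter; abbreviations are not handled."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def resegment(src: str, tgt: str) -> List[Tuple[str, str]]:
    """Pair up sentences one to one, or keep the pair whole if the counts differ."""
    src_parts = split_sentences(src)
    tgt_parts = split_sentences(tgt)
    if len(src_parts) != len(tgt_parts):
        # No safe 1:1 alignment, so leave the segment as it is.
        return [(src, tgt)]
    return list(zip(src_parts, tgt_parts))
```

Applied to the sample segment above, this would give three sentence pairs. When the MT engine merges or splits sentences, the counts differ and the pair is left untouched for manual alignment.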

[Edited at 2018-04-05 19:05 GMT]


 

