A Computer Journal For Translation Professionals
|
|
_________________________________________________
This edition of the Tool Box Journal provided to you by
|
|
________________________________________________
|
|
Issue 21-11-331
(the three hundred thirty-first edition)
|
|
Here is something to cheer you up and -- I don't know -- maybe marvel at how big the world is and how different its people are. I recently heard someone say it's a curse that people all seem to be alike (we all eat, drink, sleep, go to the bathroom, and have noses) but we turn out to be so very different. This is true for people around us, within our own culture, but, oh, so much more for people from a culture different than the one we're primarily familiar with.
What I'm going to share with you may be a good example of that. Most of you know that I've been working for a few years now on this crazy project called Translation Insights & Perspectives (TIPs), for which I collect specific examples from translations of the Christian Bible where something is gained in translation in a way that might be insightful for people who don't speak the language of the translation. If you don't know what I'm talking about, you might want to look at Source, the ATA Literary Division publication that just published an article about TIPs.
In the last year or so, I have been focusing more on non-textual examples of translation, including sign language, oral stories, song, and dance. This represents a shift happening within the world of Bible translation as well, where it's now widely recognized that there are different forms of language transmission, with textual being just one and not necessarily the most fitting for a particular situation or people.
In this video you can see the Old Testament book of Jonah translated into Southern Altai throat singing. Southern Altai is a Turkic language in Southern Siberia, spoken by about 55,000 people. Traditional throat singing has recently experienced a new surge in popularity in that part of the world and is widely embraced as the best way to tell stories. Amazingly, every word of the song in the video can be understood clearly by listeners who speak Southern Altai. If you spend a couple of minutes with the video, you'll see that there is a rather lengthy traditional introduction (which includes "let my great guttural singing be heard forever" -- a line that cracks me up every time I listen to it) and then drone footage of the area the language is spoken in as well as an artistic rendering of the story by a local artist. I really, really love this video -- not only because of the traditional great guttural singing boast, but because it's such a powerful translation that is recognizable even to us English speakers in its back-translated subtitles. (Plus, to the slight frustration of my wife, I have started to like and play the music.)
So much for the decidedly non-technical introduction to this edition of the Tool Box Journal . . ..
|
|
Contents
Windows 11
SightConsec: Use automatic speech recognition to improve your consecutive interpreting (Column by Josh Goldsmith)
A task workout routine for slow days (Column by Dorothee Racette)
Post-editing vs. the many other ways to work with machine translation -- the developers' view
New password for the Tool Box archive
The last word on the Tool Box Journal
|
|
With the Intento MT Hub for Localization, you get the best MT for every language pair at your fingertips.
Start the trial to do more in less time, and get your clients excited by both quality and speed.
- Get instant access to 40+ MT engines & 100500+ language pairs (no kidding!)
- Use intelligent AI to select the best-fit engine based on our benchmarks
- Work with 11 TMS/CAT (Fluency Now, Lingotek, Matecat, memoQ, Memsource, Smartcat, Smartling, Trados, Wordbee, Wordfast, and XTM Cloud)
|
|
Operating systems have become less and less important now that so much has moved to the cloud and so many browser-based applications don't care whether you run them on Windows, macOS/iOS, Linux, or Android. On the other hand, there are still plenty of applications that depend either completely or mostly on the operating system, including many of the translation environment tools we use. I say this, of course, because Windows 11, a major new operating system version, has officially been released, and you likely have already been prodded to upgrade or at least check whether your computer's hardware is compatible. (Assuming, of course, that you use Windows in the first place.)
I'm always excited to look at each new version of Windows. Not so much because of the new and widely touted features (most of which are really lame in this version, if you ask me), but to find out what new multilingual features are available -- such as newly translated versions, new kinds of keyboards, or new voice recognition languages. I was disappointed to find out that this version of Windows is localized into exactly the same number of languages as version 10, and I was not able to find anything relating to additional keyboard layouts.
But the "voice typing" options have gone from seven languages to . . . a LOT. Specifically, these languages are now supported: Bulgarian, Chinese (China, Hong Kong, Taiwan), Croatian, Czech, Danish, Dutch (Netherlands), English (Australia, Canada, India, New Zealand, UK, US), Estonian, Finnish, French (Canada and France), German, Gujarati, Hindi, Hungarian, Irish, Italian, Japanese, Korean, Latvian, Lithuanian, Maltese, Marathi, Norwegian, Polish, Portuguese (Brazil and Portugal), Romanian, Russian, Slovak, Slovenian, Spanish (Mexico and Spain), Swedish, Tamil, Telugu, Thai, Turkish, and Vietnamese.
To me, this is really great news. While it was possible to dictate in most of these languages on a Mac or with a rather convoluted system via a cell phone and an automatic transfer to a PC, things will become much easier for translators into the newly supported languages. To be clear, this refers to the Windows-internal "voice typing" (which, by the way, you install alongside the keyboard of the language in question), so it's not as advanced as Dragon voice recognition. This means that there are no customized commands and no training or incremental improvement, BUT it's still really quite good. I can personally speak only for English and German, but my sense is that most of these languages will be supported at more or less the same level of accuracy. My assumption is reinforced by the fact that punctuation is also now available in each of the respective languages rather than only in English (see the list of dictation-language-specific voice commands on the page I linked to above).
So, congrats to all of you Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Gujarati, Hindi, Hungarian, Irish, Japanese, Korean, Latvian, Lithuanian, Maltese, Marathi, Norwegian, Polish, Romanian, Russian, Slovak, Slovenian, Swedish, Tamil, Telugu, Thai, Turkish, and Vietnamese speakers!
My tips to those who have never tried voice recognition: Don't translate in single words but in longer fragments or even sentences. Use your regular voice rather than a special dictation voice. And be aware that there is a little learning curve.
|
|
The Tech-Savvy Interpreter 2.0 - SightConsec: Use automatic speech recognition to improve your consecutive interpreting (Column by Josh Goldsmith)
|
|
This month, I interviewed Lilia Pino Blouin -- a rockstar member of the techforword insiders community and New York-based freelance interpreter who works from Italian, French and Spanish into English and English into Italian -- about her experience with a new hybrid interpreting technique.
In SightConsec, the interpreter uses automatic speech recognition to transcribe speech in real time, then interprets from the transcription when the speaker pauses.
Read on to learn more about Lilia's experience with this exciting new technique -- or check out the video of our interview!
JOSH: How did you get interested in speech recognition?
LILIA: Thanks to you, obviously! I've always thought speech recognition was interesting, but didn't know how to apply it. Then I saw your automatic speech recognition course and figured it would be a great primer.
JOSH: You've experimented with different speech recognition tools. Which ones do you like most?
LILIA: For English, Otter.ai works shockingly well. I first tried it during the pandemic, for former New York Governor Cuomo's daily press conferences. I figured: "I'm home and have nothing else to do. Why don't I use this for practice?"
So I used the method from your course and set up Loopback on my MacBook Pro to transcribe what Cuomo was saying. I started practicing "SightConsec" and thought it was brilliant.
It transcribed everything. All the numbers. All the names. Complicated names of institutions. Even personal names.
Of course, it's not perfect, and it took a lot of getting used to, but it was awesome. You didn't have to train the tool at all! I just connected it up -- which takes about two minutes -- then hit a button and it started transcribing. It was perfect. I was so impressed.
JOSH: What do you use for your other languages?
LILIA: Otter.ai is limited to English. For my other languages, I tried several other options and settled on Web Captioner. You can set the language and even the variety, like French from Quebec or France. It works really well with my languages -- Italian, French, and Spanish -- with no training. Every now and then a word isn't accurate -- let's say 5%. But normally I would interpret with nothing. So having 95% versus nothing is great!
JOSH: When did you first try SightConsec?
LILIA: The first time I tried it was a book prize on Zoom. The event was very information-dense. I had my whole setup ready because I'd been using it for practice. I just gave it a try! I had Web Captioner on and started taking notes, but also sight translated the transcription. It was brilliant. It saved my life. I did it for the entire hour-long event, and it really saved the day.
JOSH: Did you take notes?
LILIA: Yes. I didn't really trust the technology. I knew it was reliable because I'd done it a million times in practice. But you never know. So I also took notes. At first, I thought I'd rely on my notes and just look at the text if I wasn't sure about something. But my notes were so much less accurate than the transcription!
I kept reverting to the transcription. When I realized the tech was working, I started taking very minimal notes -- just so I could say something if there was an epic fail.
I still take notes to be on the safe side. But at the end of the day, my notes are a little useless.
JOSH: What gear do you need?
LILIA: Just Loopback and my computer. I have the meeting on my MacBook and the transcription and reference material on a 27" external monitor. But you could watch the meeting and transcription on your computer screen. Of course, you use a headset -- and camera, because sometimes you'll be on video.
JOSH: So, you just route the sound into Web Captioner or Otter.ai and see the transcription in another window?
LILIA: Yes. Other than that, everything is normal. Setting this up is the easiest thing in the world. I'm really grateful for your help. Your course is very hands-on and includes a step-by-step guide that I followed to the letter. When it started to work, it seemed magical. I thought, "This can't be real." It was so accurate.
JOSH: What about confidentiality?
LILIA: I only use speech recognition for public meetings. Confidentiality is vital, so we can't use cloud-based transcription.
JOSH: Do you use two different tools when you work between two languages?
LILIA: Yes. I use Otter.ai and Web Captioner. Toggling them on and off would be too much work. When I speak Italian, Otter transcribes mumbo jumbo and vice versa. But that helps me find the right spot quickly. The only challenge is page scrolls. With long speeches, you have to go back to the beginning, which can take a while. But flipping through your notes in a physical meeting poses the same challenge.
JOSH: When you're listening to the speech, do you use a glossary or research anything you hear?
LILIA: At some events, I had my glossary open in another window. When I heard a technical term, I knew I could stop taking notes and look up my term.
What I really love about this technique is that it gives me time to think about solutions. In regular consecutive, I'm so stressed about having complete notes that I write more than I should. I'm terrified I won't remember things. When something is complicated, I flag it in my notes and think I'll have to come up with a really creative solution. But with SightConsec, I can already think about my translation. I feel I do a much better job.
JOSH: How do you discuss speech recognition with clients?
LILIA: If the event is public and open to anyone, I don’t necessarily feel the need to discuss this with the client. Clients hire me to interpret, and that’s what I do.
For example, when I use pen and paper, I don’t talk about my note-taking method with clients. Similarly, I don’t think I need to specify that I’m using speech recognition. I’m not hiding it from them. I just don’t think how I work makes any difference to them -- unless they feel more relaxed about being able to speak for long periods.
At the end of the day, what matters is the quality of my interpreting -- not the tech I use.
JOSH: Any final comments?
LILIA: If it were up to me, I would do this all the time. I have a very high-profile assignment next week, and I would give anything to be able to transcribe it!
JOSH: Thanks, Lilia! I hope many colleagues go out and try SightConsec -- and then report back and tell us how it's working for them!
Josh Goldsmith is a UN and EU accredited translator and interpreter working from Spanish, French, Italian, Portuguese and Catalan into English. A passionate educator, Josh splits his time between interpreting, researching and teaching through www.techforword.com, which empowers language professionals to make the most of technology.
|
|
Want to use automatic speech recognition to interpret better?
|
|
A task workout routine for slow days (Column by Dorothee Racette)
|
|
As a self-employed linguist, you probably are familiar with the inevitable days when your inbox shows no job requests or project announcements from clients. I have always wished for a better way to predict those days, but they still take me by surprise. "Slow days are my least productive days," a friend once told me. "I know I should be handling a dozen things I never have time for, and yet all I seem to accomplish is cruising social media and hoping that someone will contact me with new work."
We yearn for downtime when we are inundated with project work, but the truth is that a slow day -- that perfect opportunity to finally clean up, balance the books, and create marketing messages -- goes by mostly unused. The 2020 pandemic lockdown has made it clear that waiting for "things to calm down" is a fallacy. Most of us were probably much too worried to focus on big-picture projects.
What is the best way to make use of a slow day? Here is an approach based on the structure of a physical workout that produces good results for me. You can adjust the details to fit your preferences, but the following elements are key to avoid getting bogged down:
- Don't start the day by sitting at your computer
- Don't open any social media platforms unless you're posting content
- Don't check your email obsessively for new messages
The most effective way to tackle a slow day in business is to keep moving. You can play music, wear comfortable clothes, and take breaks as you go through the following phases:
Warmup (30 minutes or more)
The sluggishness we feel on slow days comes from a mixture of dread ("what if I never get work again") and the discomfort of not having a habitual routine to follow. You can break through this sense of disorientation with a few physical tasks such as decluttering your office space and addressing small maintenance tasks on your list. Make those ergonomic adjustments to your desk setup you always wanted, shake crumbs out of the keyboard, or sort through papers. Resist your impulse to sit down and work on computer tasks in the warm-up phase. Instead, assess your workspace and look for any small improvements you can make.
Light workout phase (in multiples of 10-15 minutes)
Now make a list of easy work-related tasks that take no more than 15 minutes to complete. Don't choose any work that will take hours to complete. If you like, you can organize tasks into different categories, such as
- Administrative -- file structures, profile names
- Software -- patches, updates, password changes
- Finance -- payment entries, account balances, forecasts
- T&I tools -- upgrades, cleanup
- Marketing -- set up a place to collect positive client feedback
Starting with the most doable pieces, work in time blocks and set a timer. Single-task as much as possible and work on making practical improvements to your work setup. If you notice that a task is ballooning, switch categories.
Run through several 15-minute blocks of task work, and then take a break.
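If you want to stay at the computer for the timer itself, the time-block idea can even be scripted. Here is a minimal Python sketch; the task names are made up, and the routine above suggests 15-minute blocks (the demo call uses 0 minutes only so it finishes instantly):

```python
import time

# A bare-bones sketch of the light-workout phase: run each small task
# in its own timed block, single-tasking until the timer runs out.
# Task names are placeholders; pass minutes=15 for the real routine.

def run_blocks(tasks, minutes=15, sleep=time.sleep):
    """Work through tasks one per block; return the completed list."""
    done = []
    for task in tasks:
        print(f"Block started: {task} ({minutes} min)")
        sleep(minutes * 60)  # stay on this one task until time is up
        done.append(task)
        print(f"Block done: {task}")
    print("All blocks finished -- take a break!")
    return done

# minutes=0 here only so the demo finishes instantly
run_blocks(["install software updates", "enter payments"], minutes=0)
```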
High-intensity workout phase (no more than 60 minutes at a time)
Pick ONE big task in an area that feels challenging for you. That could be working on a blog post, writing website copy, or identifying potential clients you want to contact. Be clear what you want to accomplish in the given time and don't allow disruptions to take you out of your focused effort.
After an hour, save your work, and leave your office for an extended break. Watch a conference session or TED talk in a different room, read a book, or go for a walk.
Cooldown phase (30 minutes)
Return to an easier task level and continue to make improvements that will boost your productivity on busy days in the future. The cooldown phase is the perfect time to watch short "how to" videos on YouTube and to explore shortcuts. For example, you can try out software functions you've never tested or set up a few hotkeys.
Here are a few slow-day lessons I've learned over the years:
- The next project is sure to come. There is no need to check your email every 10 minutes or to accept work at lower rates out of desperation.
- It's OK to get some rest.
- A single day's strategy work can significantly boost your business and income for the following months and years.
- If a slow day or a slow week leads to money worries, it may be necessary to take a hard look at your pricing practices. Running a successful freelance business should not mean living hand to mouth.
Dorothee Racette, CT has been a full-time freelance GER < > EN translator for over 25 years. She served as ATA President from 2011 to 2013. In 2014, she established her own coaching business, Take Back My Day, to help individuals and organizations solve problems related to workflow and time management. As a certified productivity coach (CPC), she now divides her time between translating and coaching. Her book Complete What You Started (2020) provides a blueprint for carrying big projects across the finish line. You can read her blog at takebackmyday.com/blog.
|
|
Productivity solutions on the go
Need a few quick actions to address a challenge with your business or productivity?
Try laser coaching with Dorothee -- one topic, twenty minutes of conversation, fresh insights.
|
|
Post-editing vs. the many other ways to work with machine translation -- the developers' view
|
|
One of the sessions I gave at the ATA conference a couple of weeks ago was subtitled, "Why It's So Important to Find the Right Words for What We Do." Most of you have read some of my lamentations on poorly chosen terms such as "localization" or "transcreation," especially when they are not applied to what they were initially intended to mean (in the case of "localization," a series of tasks that include translation but also functional alterations to a software product or website; in the case of "transcreation," techniques to adapt advertising copy for a new audience that might or might not include translation). In many cases, they are used instead to mean something like "translation+" or "high-quality translation." Using either term this way is plain wrong and hurts us all (because it cheapens what "translation" means).
My ATA talk began with these two examples of how we hurt ourselves by diluting our message and miscommunicating what we actually do (though we ironically and very predictably get upset when the media mixes up "translation" and "interpreting").
I then moved on to a second and even more important and timely aspect of the words we choose to describe what we do: the definition of "post-editing."
To give you an understanding of why I think it's important to reflect on the meaning behind the concept of "post-editing," here is an email that I sent to all the different translation environment tool providers last week:
I believe it's become increasingly clear that translators need to become more creative with their use of machine translation: not only in post-editing (i.e., correcting a machine translation from one engine per segment) but additionally or instead using MT as a resource alongside and in concert with other resources (termbases, TMs, corpora, dictionaries, style guides, etc.). I also believe that overall we have not really explored the many ways that translation data can be extracted from MT suggestions. Some of the more creative ways being used by existing CAT tools include simultaneously using several MT engines and harvesting fragments from them via AutoComplete, correcting TM matches with MT fragments, evaluating MT suggestions via TMs or termbases, and others. All of these features -- and potentially so many others -- are features of the translation environment tool rather than the machine translation, which in turn puts all of you in an interesting position.
So here are my questions for you:
- In which ways are you already supporting working with MT that go beyond the classic post-editing?
- And what are your plans to do that in the near future?
I really believe these are important questions that will not only help to make your tool more attractive for users but will help the industry as a whole move away from the silly paradigm "if it has to do with MT, it must be post-editing."
I assume that y'all get my point, right? We need to get away from viewing the use of machine translation as something that by default has to be "post-editing" and recognize that MT can -- and should -- be used in a wide variety of ways.
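To make one of those "wide variety of ways" concrete, here is a toy Python sketch -- emphatically not any tool's real API; the engine stubs, the TM, and all names are invented for illustration. It queries several stand-in MT engines and ranks their suggestions by similarity to the target side of the closest TM match, rather than post-editing a single engine's output:

```python
from difflib import SequenceMatcher

# Toy sketch: treat MT as one resource among several by ranking the
# output of multiple (stubbed) engines against the translation memory.

def similarity(a: str, b: str) -> float:
    """Crude 0..1 string similarity, standing in for real fuzzy matching."""
    return SequenceMatcher(None, a, b).ratio()

def rank_mt_suggestions(source, engines, tm):
    """Sort engine suggestions by closeness to the best TM match's target."""
    # pick the TM entry whose source side best matches the new source
    tm_source, tm_target = max(tm, key=lambda e: similarity(source, e[0]))
    suggestions = [(name, fn(source)) for name, fn in engines.items()]
    return sorted(suggestions,
                  key=lambda s: similarity(s[1], tm_target), reverse=True)

# hypothetical stand-ins for real engine calls and a real EN>DE TM
engines = {
    "engine_a": lambda s: "Das Dokument vor dem Schliessen speichern.",
    "engine_b": lambda s: "Speichern Sie das Dokument, bevor Sie schliessen.",
}
tm = [("Save the file before closing.",
       "Speichern Sie die Datei, bevor Sie schliessen.")]

for name, text in rank_mt_suggestions("Save the document before closing.",
                                      engines, tm):
    print(name, "->", text)
```

A real tool would of course use proper fuzzy-match scoring and live engine APIs; the point is only that the ranking logic lives in the translation environment, not in the MT engine.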
A number of vendors wrote back with responses, some more to the point than others, but I will let you be the judge of that. (If you don't see a vendor listed here, it is likely because they chose not to respond.)
Across (via Christian Weih-Sum)
Over the last year, we have followed the demand from our user base regarding MT, and that demand has mostly been
- to support automated synchronization of TMs and terms with the engines so that they can get better (depending on what the MT providers are able to do)
- to provide context for the MT, e.g., by not sending segments sequentially but by sending them as XLIFF, too (some MT providers, for instance, Omniscien, are asking us not to send segment after segment, but to send the whole file as XLIFF so that the MT may take context into account)
- to give feedback via post-editing distance reports
- to support various MTs to go for best-of-breed per project instead of one-size-fits-all
- to mark MT as creator / modifier of TUs
- add various features to the translation editor to make it easier for translators to work with MT output (filters, markings, QA features, etc.)
We will continue to implement feature wishes from our user base and proactively implement new developments from the MT community (should there be any).
memoQ (via Éva Nagy)
When it comes to using MT as a translator, most people usually think of machine translation post-editing first. However, there are myriad creative uses of MT that can help you achieve better results when translating, and memoQ offers several ways to work with machine translation which go above and beyond the classic post-editing.
On-the-fly translation with multiple MT engines
In addition to the classic use of machine translation post-editing (where you edit an MT-pre-translated text), memoQ also allows their users to work with MT suggestions on-the-fly, in combination with other resources.
The project manager of the translation workflow can assign one or more MT services to the project, and the translator will see suggestions from the MT service(s) under translation results, along with all other resources (TMs, LiveDocs corpora, TBs, etc.). It is up to the translator to decide for each specific segment which one to use and edit, if necessary.
Let memoQ's functionality choose the best MT engine for your project
If you're not sure which MT service provider to use, the Intento MT plugin in memoQ can even choose the best provider for your content. When you use this "smart routing" setting, the Intento plugin sends the source segments to the MT service it considers the best one for your content and language pair based on their regular evaluations.
Use your own glossary to enhance MT matches
With several MT plugins (e.g., Google Advanced, Amazon, DeepL), memoQ enables their users to upload their glossaries to improve the terminology used by the specific MT services, making the translation and the terminology more accurate for the specific project.
As for future developments, memoQ keeps investigating their users' needs and suggestions to see exactly what functions they need during their localization projects. The memoQ team are continuously talking to users as well as MT providers, investigating options to send metadata, segment context, project resource info to MT services along with the translatable segment to allow MT services to provide more accurate translations.
Memsource (via David Čaněk)
Making post-editing more efficient has been our mission since our early days. In fact, we introduced the "post-editing analysis" as early as 2011. There are a number of post-editing features. Some of the more interesting and not as widely available ones are:
- Non-translatable segments: Identifying entire non-translatable segments and automatically marking them non-translatable
- MT Auto-select: Selecting the most optimal MT engine for a translation job, based on source language, target language, and the job's domain (e.g., automotive, IT, life sciences, etc.)
- MT Quality Score: Providing a score at segment level for MT output (Machine Translation Quality Estimation -- MTQE)
- Post-edit only low quality: Intelligent post-editing workflow routing just low-quality MT output to post-editors
And here are some of the features that are coming up:
- MT Flat Fee: our post-editing features, including the actual MT service, will soon be available for a monthly flat fee in Memsource
- MT glossaries: seamless integration of customer glossaries with MT engines that support the feature
- Seamless MT customization: this is very exciting, but we would prefer not to publish too many details yet.
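As an aside on the first feature in Memsource's list above, here is a back-of-the-envelope sketch of how a tool might flag entire non-translatable segments. The real detection is certainly more sophisticated; these regex patterns are deliberately crude and purely illustrative:

```python
import re

# Crude illustration of flagging whole segments as non-translatable:
# segments that are only numbers/punctuation, bare URLs, or
# all-caps codes need no human translation at all.

NON_TRANSLATABLE = re.compile(
    r"^[\W\d_]+$"         # only digits/punctuation, e.g. "3.14" or "--"
    r"|^https?://\S+$"    # bare URLs
    r"|^[A-Z0-9_-]+$"     # codes like "ISO-9001" or "SKU_42"
)

def is_non_translatable(segment: str) -> bool:
    """True if the whole segment can be passed through untouched."""
    return bool(NON_TRANSLATABLE.match(segment.strip()))

for seg in ["3.14", "https://example.com/path", "ISO-9001", "Save the file."]:
    print(seg, "->", is_non_translatable(seg))
```

In a production tool such rules would be tuned per language and per project, since a pattern this blunt will also catch things like single capital letters.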
OmegaT (via Jean-Christophe Helary)
It's pretty easy for OmegaT to create/use an MT-based TM (not specific to OmegaT, except for the ease of TM management), to add a penalty to it, and to ensure that the translator is *visually* aware that the inserted match comes from MT. So, using MT as a *reference* (not specifically for PE) is already available in OmegaT; we also already support multiple engines simultaneously.
I personally use MT externally all the time to speed up string entry, but it's not really PE (most of the time). Basically, I align a TM with the source and use the result as TMX.
Depending on the way you use the MT data (as TMX or as "glossary"), you can process it as fuzzies, or as "glossary items" with autocompletion, etc.
I'm not aware that OmegaT has plans to have super fancy things regarding MT, but since we don't have any official roadmap, there may be something that's brewing somewhere that I'm not aware of.
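The "align and reuse as TMX" route Jean-Christophe describes can be sketched in a few lines. The header attributes below are pared down to the TMX 1.4 required minimum, and the en/fr pair and tool name are just examples, not anything OmegaT prescribes:

```python
import xml.etree.ElementTree as ET

# Minimal sketch: turn aligned (source, MT) pairs into a bare-bones
# TMX file that a CAT tool such as OmegaT can load as a penalized
# reference TM.

def pairs_to_tmx(pairs, srclang="en", tgtlang="fr"):
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "sketch", "creationtoolversion": "0.1",
        "segtype": "sentence", "o-tmf": "none", "adminlang": "en",
        "srclang": srclang, "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((srclang, src), (tgtlang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")

print(pairs_to_tmx([("Hello", "Bonjour")]))
```

Save the returned string with a .tmx extension and attach it to the project with a match penalty, so MT-derived matches stay visibly second-class next to human-translated TM content.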
Star Transit (via Judith Klein)
Star Transit uses a system called TM-validated MT, where an MT suggestion is compared with the target part of a very high fuzzy match. If the MT suggestion differs only in the parts where the source differs (in comparison to the fuzzy match from the TM), those differing parts are highlighted in the MT suggestion, allowing the translator to focus on just those parts.
A similar system is used with the termbase. If a source term is found in the termbase, the system checks whether the MT suggestion uses the correct translation according to the termbase or a disallowed one. Either will be highlighted accordingly, so the translator knows that the correct term is used or that a blacklisted term needs to be corrected. (Note that Star Transit also supports morphology rules in 15 European languages, which helps with this recognition.)
And a new feature (which at this point is only implemented in the TextShuttle MT system but will be implemented with other systems, including DeepL and Star's own MT system as well) automatically switches terminology in MT suggestions according to the attached termbase.
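At its core, the TM-validated comparison described above is a diff between the MT suggestion and the fuzzy match's target. Here is a rough, character-level Python sketch; Star Transit's actual implementation is of course more involved, and the [brackets] merely stand in for real on-screen highlighting:

```python
from difflib import SequenceMatcher

# Rough sketch of TM-validated MT: mark only the spans where the MT
# suggestion differs from the fuzzy match's target, so the translator
# can focus on those parts.

def highlight_differences(mt, fuzzy_target):
    """Return mt with the parts that differ from fuzzy_target bracketed."""
    out = []
    sm = SequenceMatcher(None, fuzzy_target, mt)
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        chunk = mt[j1:j2]
        if not chunk:
            continue  # pure deletion: nothing of mt to show
        out.append(chunk if op == "equal" else "[" + chunk + "]")
    return "".join(out)

print(highlight_differences(
    "Press the red button to stop the engine.",    # MT suggestion
    "Press the green button to stop the engine.",  # fuzzy match target
))
```

A production version would diff on words rather than characters and first check that the differences line up with the differences on the source side, as described above.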
Text United (via Marek Piorkowski)
It is a fascinating topic, indeed.
I do not agree that the paradigm is 'stupid,' though.
I think orchestrating an MT engine with human post-editing is a science of its own, and the human intervention can go beyond just editing the MT-ed segments.
At TextUnited, we focus on extracting as much data as possible from the edits and funneling it back to the MT, QA, and workflow systems.
Trados (via Daniel Brockmann)
Trados Studio at last count offers access to 60+ providers of machine translation. As you often say, the app store/open platform approach is something that we have always been keen on, and this is a nice proof point. At the same time, it also shows how fragmented the MT market has become, not least after the rise of NMT, which has provided the quality progress that was needed for MT to become a realistic productivity helper for professional translators.
We have also decided to provide free capped NMT with Language Weaver (formerly SDL Machine Translation) and see thousands of users using it every day -- it's actually the most used MT in Studio today. DeepL and other providers are also very prominent, obviously -- it's very important to have the choice of the optimum engine for the job at hand.
Why have so many providers gravitated to our platform over the years? We have the platform that is probably easiest to plug into -- see the previous point. We are doing the same in the cloud, where we have the beginnings of an app store as well, with well-known MT plug-ins already available.
I think we are providing all the features you mentioned:
- Interactive MT usage including segment by segment or fragment by fragment
- I would say that fragment/AutoSuggest was more meaningful for SMT than NMT, as NMT tends to be more fluent so it's less important to be able to delete a bad translation and just use the useful fragments -- but users can still do this of course if it's better for their particular scenario.
- Many options around applying MT/TM interactively -- leaving segment empty by default and use fragments, or start with a segment and then use productivity tools for efficient review
- Batch MT usage (classic PEMT use case)
- Fuzzy match repair (MT fragments correcting fuzzy matches)
- Using several engines at the same time
- Using apps to evaluate several MT engines and pick the most appropriate one -- such as with this tool
- Some of the apps also go very far in supporting NMT beyond the basics -- this tool is an interesting example
Our plans include these:
- With Trados in the cloud, we have introduced the concept of "translation engine" where you have just one 'engine' that combines NMT, TM, terminology, rather than having to specify three separately.
- This is an evolution away from looking at the three main productivity tools separately, with a somewhat fragmented UX, toward helping users configure the best 'engine' for the job at hand by combining the best NMT model(s), TM(s), and termbase(s) in one go.
- This also holds interesting potential for the future when it comes to always providing the 'best possible match' for every segment and boosting productivity a bit more than today.
- MT quality estimation is also an interesting area to explore -- as this might help the translator/reviewer in the loop to focus most on those segments which might need deeper review than others.
- Adaptive Machine Translation (based on NMT rather than SMT) is also a key area that we are exploring in Language Weaver + Trados as a result. It's obviously already possible to train models, but shifting this to becoming more real-time will be interesting.
- Having said all this -- and I think this is also key -- it's important to understand that while review times may be lower than translation-from-scratch times, it is not good to squeeze rates even further by assuming a kind of "quasi-zero review time." Translators need time to review, and this needs to be reflected in the rates as best as possible.
- Overall, with NMT becoming better, translation work will keep shifting away from translation from scratch toward reviewing more-or-less 'reasonable' initial MT suggestions. I hesitate to call this PEMT, as that term for me is quite wedded to SMT; with NMT the cognitive task shifts more toward reviewing fluent translations -- which often makes errors such as terminology mistakes or faux sens harder to spot. So while things are becoming more efficient overall, it's key to grant enough time (and payment!) for this shifting task!
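The "translation engine" concept described in the plans above -- one engine combining TM, terminology, and NMT rather than three separate lookups -- can be illustrated as a minimal cascade: an exact TM hit wins, then a fuzzy TM hit above a threshold, then MT as the fallback. This is a sketch of the general pattern only, not Trados's implementation; all names are invented, and real systems score fuzzy matches far more carefully than a character-level ratio.

```python
import difflib

def translate_segment(source, tm, mt_translate, fuzzy_threshold=0.75):
    """Toy 'translation engine': cascade from TM exact match to
    fuzzy TM match to machine translation.

    tm            -- dict mapping source segments to target segments
    mt_translate  -- callable standing in for an MT engine
    Returns (translation, origin_label)."""
    if source in tm:                      # exact TM hit
        return tm[source], "TM-100%"
    best_score, best_target = 0.0, None
    for tm_source, tm_target in tm.items():
        score = difflib.SequenceMatcher(None, source, tm_source).ratio()
        if score > best_score:
            best_score, best_target = score, tm_target
    if best_score >= fuzzy_threshold:     # fuzzy hit above threshold
        return best_target, "TM-fuzzy-{:.0%}".format(best_score)
    return mt_translate(source), "MT"     # fall back to MT
```

In a real engine the fuzzy branch and the MT branch would also be combined (fuzzy match repair), and a quality-estimation score could decide which suggestion to surface as the 'best possible match'.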
Wordfast (via Yves Champollion)
Wordfast translators can use the "sub-segment from MT" feature. When post-editing MT would take too long and the translator instead translates from scratch, AutoSuggest will still suggest phrases, chunks, sub-segments, and even terms salvaged from the MT suggestions. For what it's worth.
Sub-segments are extracted from a cache that fills up as segments are translated (the cache is cleared between documents), and having multiple MT sources enhances the effectiveness of that device.
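The sub-segment cache Yves describes can be pictured as an n-gram index built from incoming MT suggestions, matched against whatever the translator has started typing. This is a naive sketch under assumed mechanics -- the class and method names are invented, and Wordfast's actual data structures are not documented here.

```python
class SubSegmentCache:
    """Toy AutoSuggest cache: index word n-grams from MT suggestions,
    then complete what the translator has begun typing.
    Cleared between documents, as in Wordfast's description."""

    def __init__(self, max_ngram=4):
        self.max_ngram = max_ngram
        self.ngrams = set()

    def add_mt_suggestion(self, text):
        # store every 2..max_ngram word sub-segment of the MT output
        words = text.split()
        for n in range(2, self.max_ngram + 1):
            for i in range(len(words) - n + 1):
                self.ngrams.add(" ".join(words[i:i + n]))

    def suggest(self, typed_prefix):
        # return cached sub-segments that start with the typed text
        return sorted(g for g in self.ngrams if g.startswith(typed_prefix))
```

Feeding in several MT suggestions per segment (multiple engines) simply enlarges the index, which is why multiple MT sources enhance the device.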
Similarly, Wordfast (Classic only) can build a mini-glossary of terms fetched from the MT source (house = maison). Once there is an equivalence between source and target terms, the target terms are expanded using MS Word's thesaurus to include the nearest synonyms and/or secondary meanings, which yields a rich AutoSuggest feature. To be totally honest, the value of such a device lies more in typing economy (keystrokes saved) than in lexicography. But there it is. In creative translation -- transcreation -- the device may be of greater help, because the rich array of suggestions can stimulate the transcreator's mind.
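One crude way to arrive at equivalences like house = maison from accumulating segment pairs is co-occurrence counting: a target word that keeps appearing whenever a given source word does is probably its translation. The sketch below is only an illustration of that statistical idea -- it is not Wordfast's algorithm, and the thesaurus-expansion step is omitted entirely.

```python
from collections import Counter, defaultdict

def build_mini_glossary(segment_pairs, min_count=2):
    """Toy term extraction: for each source word, pick the target word
    it most often co-occurs with across (source, target) segment pairs.
    Naive by design -- frequent function words would need filtering."""
    cooc = defaultdict(Counter)
    for src, tgt in segment_pairs:
        for s in set(src.lower().split()):
            for t in set(tgt.lower().split()):
                cooc[s][t] += 1
    glossary = {}
    for s, counts in cooc.items():
        t, n = counts.most_common(1)[0]
        if n >= min_count:       # require repeated evidence
            glossary[s] = t
    return glossary
```

Given pairs like ("the house", "la maison") and ("a big house", "une grande maison"), "maison" co-occurs with "house" twice while every other candidate appears once, so the glossary picks it up.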
We have no "auto-assembling" scheme, as Déjà Vu offered in a past era. The idea is brilliant, but in real use the results are too scant.
XTM Cloud (via Andrzej Zydroń)
We are working in partnership with a leading NMT provider on a new and really exciting initiative, 'AI-enhanced TM', which uses the power of NMT to 'complete' fuzzy matches. The full release will come in Q1 2022; a limited beta version of this functionality was released as part of XTM 12.8 and has proved very useful for the users involved.
In addition, we are working on advanced NLP/AI technology to complement and analyze matching and take it down to the phrase level.
|
|
New password for the Tool Box archive
|
|
Subscribers to the Premium version of the Tool Box Journal have access to an archive of Premium journals going back to 2007.
You can support the Tool Box Journal and be subscribed to the Premium edition right here.
Thanks!
|
|
The last word on the Tool Box Journal
|
|
If you would like to promote this electronic journal by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box Journal. Just paste the code you find here into the HTML of your webpage, and a little icon linking to my website will be displayed on that page.
If you are subscribed to this journal with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.
If you are interested in reprinting one of the articles in this journal for promotional purposes, please contact me for information about pricing.
© 2021 International Writers' Group