The 286th Tool Box - Premium Edition

A computer journal for translation professionals

Issue 18-4-286
(the two hundred eighty-sixth edition)

Contents

1. What Exactly Is a Technical Freelance Translator?

2. Equipping Yourself

3. The Tech-Savvy Interpreter: It Finally Happened . . . AI Tried to Replace Conference Interpreters

4. Quo Vadis, Content?

5. This 'n' That

6. New Password for the Tool Box Archive

The Last Word on the Tool Box Journal

"Some problems lend themselves more easily to A.I. solutions than others"

Katie Botkin of MultiLingual pointed to a fascinating part of Mark Zuckerberg's statement in his U.S. Senate hearing that perfectly demonstrates the nature of language and the intricacies of translation. Here's what he said:

"Some problems lend themselves more easily to A.I. solutions than others. So hate speech is one of the hardest, because determining if something is hate speech is very linguistically nuanced, right?

"It's -- you need to understand, you know, what is a slur and what -- whether something is hateful not just in English, but the majority of people on Facebook use it in languages that are different across the world. (...)

"Hate speech -- I am optimistic that, over a 5- to 10-year period, we will have A.I. tools that can get into some of the nuances -- the linguistic nuances of different types of content to be more accurate in flagging things for our systems.

"But, today, we're just not there on that. So a lot of this is still reactive. People flag it to us. We have people look at it. We have policies to try to make it as not subjective as possible. But, until we get it more automated, there is a higher error rate than I'm happy with."

Sounds a lot like he's talking about machine translation, doesn't it? And, yes, I couldn't agree more with his idea that AI is a long way from "understanding" language -- though I'm very skeptical about the (infamous) "5- to 10-year period."

ADVERTISEMENT

Imagine all the world's best dictionaries at your fingertips!

For a fixed monthly fee, on all your devices, integrated in your daily work applications!

WordFinder Unlimited - One service, 5 applications and more than 260 dictionaries.

Click here to find out 10 things you need to know about WordFinder Unlimited

1. What Exactly Is a Technical Freelance Translator?

I was asked some time back to write a book chapter about freelance translators and translation technology. Not surprisingly, I started by defining a "freelance translator" in this context. Here's what I came up with:

"According to Wikipedia, a 'freelancer' is 'a person who is self-employed and is not necessarily committed to a particular employer long-term. (...) The term freelancing is most common in culture and creative industries [such as] music, writing, acting, computer programming, web design, translating and illustrating, film and video production, and other forms of piece work which some cultural theorists consider as central to the cognitive-cultural economy.'

"With translators listed directly in the middle of groups identified as typical freelancers, we need to further narrow the distinction between literary and technical translators. 'Technical translation' is defined according to Sofer (The Global Translator's Handbook. Lanham: Taylor Trade Publishing, 2012, 20) 'by asking, does the subject being translated require a specialized vocabulary, or is the language non-specialized?' A sampling of areas in which technical translators are active includes aerospace, automotive, business/finance, chemistry, civil engineering, computers, electrical/electronic engineering, environment, law, medicine, military, nautical, patents, social sciences, and telecommunications (ibid., 67f.).

"The diversity of fields for technical freelance translators is reflected in other areas of diversity as well.

"First, there is a wide array of commitment to the task of technical translation, ranging from voluntary, occasional (paid), and full-time translators. In the context of this contribution, we will consider only technical translators who make a substantial part or all of their livelihood by performing translation for one or -- more typically -- many clients. These clients could be translation agencies that subcontract to individual freelance translators or direct clients who hire freelance translators without a mediating actor. End clients may range from large international organizations to individuals who need to have personal documents translated.

"Second, the most natural area of diversity originates in the many different language combinations. Both source and target languages differ greatly in how they are supported by technologies. This includes

access to dictionaries and/or corpora
spell- and grammar-checking
input methods (including voice recognition)
morphology recognition
machine translation
the applicability of technologies that rely on parameters such as space-based word delimiters or fuzzy term recognition in languages with no traditional word boundaries or no inflection
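As a small aside on that last item, here is a minimal Python sketch of why a technology that assumes space-based word delimiters behaves so differently across languages. The function name and the two sample sentences are assumptions made up for this illustration; they are not taken from any particular tool.

```python
def space_delimited_word_count(text: str) -> int:
    """Count words the naive way: split the text on whitespace."""
    return len(text.split())


# Hypothetical sample sentences with roughly the same meaning.
english = "The translator checks the terminology"
japanese = "翻訳者は用語を確認します"  # written without spaces between words

print(space_delimited_word_count(english))   # 5
print(space_delimited_word_count(japanese))  # 1
```

Anything built on that kind of count (word-based pricing, fuzzy term matching, segment statistics) needs a language-specific segmenter before it is of any use for a language like Japanese.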

"Third, there tends to be a correlation between the translated languages and the location of the translator. In turn, the location has an impact on the access to various kinds of technologies, from limitations to online resources applied by service providers or political control or simply prohibitive costs.

"And finally, the nature of each translator's specialization also results in differing technology requirements, including potential limitations of using certain technologies that may not match security protocols or regulations or a particular high (or low) appreciation of very specific terminology with its corresponding technology requirements.

"Given all this, the following observations are by necessity generalizations about the members of this diverse community."

Is that how you would define a (professional, technical) freelance translator? I'd be eager to hear some feedback.

ADVERTISEMENT

With GT4T, the ultimate online reference tool, you can use Google Translate, DeepL and many others without ever having to leave your work environment.

GT4T works in all programs. Get ready to be amazed.

Learn more and download at gt4t.net.

2. Equipping Yourself

I had an interesting talk with Deepinder ("Deep") Singh last week. Deep, a veteran localization product and localization manager at Dell, partnered with Prasoon Rana, who has had a long career at SAP, to form Prudle Labs. After reading their website, you -- like me -- might still have a hard time actually understanding what Prudle Labs is all about, but I think it's worth a look, especially if you are a small or mid-size language service provider who knows that there is a very significant gap between what you can offer on the high-end technology spectrum in comparison to your much larger competitors and/or what your client might expect. You might have your translation processes nailed down, have adequate linguistic quality assurance processes, and even use a good supporting tool set. But when it comes to the internationalization of some applications, localization of an app, or the implementation of technical quality assurance, you might not feel as confident. This is where a company like Prudle Labs might just be helpful.

The Prudle founders looked at their combined experience at Dell and SAP (two of the largest translation and localization buyers and practitioners), identified the typical weaknesses (corporate speech: "pain points") in their processes, and built a chain of products and processes that specifically address those and everything else in the process. Now, Prudle Labs would be happy to work directly with end clients to do the whole shebang (i.e., lifecycle from internationalization to localization/translation, quality assurance, and product launch). After talking with Deep, however, I realized that they -- and comparable companies -- might fit much more seamlessly into an existing infrastructure by supplementing the technical capabilities of the myriad translation agencies out there that quite frankly sit out a lot of opportunities because they don't think they have the expertise.

Last week I sent out a tweet that garnered remarkable Twitter-tention:

"Just spoke with a translation agency that's successfully been in business for 20 years and that has never really used translation technology of any kind. Not as an example to follow but to show that the world of translation is very diverse, more than we sometimes think."

I was not talking about an agency that translates only high-end marketing or other highly creative, non-repetitive materials; this agency actually offers technical translation. They'd never gotten around to implementing translation technology and had been doing fine. Could they have been more successful with a robust set of technology? Probably, maybe even likely so. The point is this: They've been doing okay (and probably had very happy contracting translators who kept the benefits of using technology all to themselves), and they're not the only ones. In fact, there's a broad range of translation companies with every possible combination of technical readiness, but only very, very few are technically equipped (either in experience or equipment) with everything that's out there. A company like Prudle Labs offers the possibility to focus on the areas in which you are strong and find experienced partners for the rest.

ADVERTISEMENT

Across Quick Tutorials

Are you new to the Across Translator Edition? If so, take a look at our new YouTube channel. The channel features various tips and tricks to help you get started. Go to across.net/youtube.

3. The Tech-Savvy Interpreter: It Finally Happened...AI Tried to Replace Conference Interpreters (Column by Barry Slaughter Olsen)

Tencent's AI Interpreting Fail

At this year's Boao Forum in China's Hainan Province, Chinese internet giant Tencent rolled the AI dice and unveiled its speech-to-speech translation system with great fanfare. According to Harry Dai, Vice-Dean at the Graduate Institute of Interpretation and Translation at Shanghai Foreign Studies University, the announcement sent shockwaves through the professional interpreting community in China in the lead-up to the Forum. In the press, there were claims that the system achieved 97% accuracy. Interpreters were worried. (Watch Harry's presentation at the 22nd SCIC Universities Conference here, starting at 5:43:46.)

Then, as reported by the South China Morning Post, the speech-to-speech translation system "made an error-filled debut" and "spouted gibberish" displayed on a screen in the conference venue and on a special WeChat app. Upon hearing the news, nervous interpreters were quick to express their relief and even gloat about the tech disaster on social media. (Check out the Slator story on the topic to see some examples of the gibberish produced and interpreters' snarky responses to the debacle.)

I've been monitoring the mainstream and tech press for articles about technology and interpreting for years, but I'd never seen a bomb drop like this one before -- the use of speech-to-speech translation at a high-level international forum to replace highly trained conference interpreters. That took some serious bravado or serious ignorance about what simultaneous interpretation requires to be done successfully, or perhaps a bit of both. But I can't help but wonder where Tencent acquired so much faith in their system to accept such a high-stakes debut on the international stage. All other demos I have seen from the likes of Google and Microsoft were carefully orchestrated and short. They were crafted in an effort to create that "wow factor" that makes everyone think that the technology can do more than it really can when the conversation goes beyond simple greetings and pleasantries.

Interpreters everywhere were quick to point out that this massive failure of AI applied to language interpretation is evidence that we are not going to be replaced anytime soon. And they are right. However, this story is noteworthy for another reason -- end user expectations. Tencent surely expected its technology to work. Many tech analysts and investors expect AI to replace interpreters. And more and more audiences expect to be able to have easy access to interpreting services, human or otherwise, in an increasing number of settings. Although the story of a major AI-powered speech-to-speech translation fail should calm our fears of being replaced, it should also motivate us to find new, better, and more convenient ways to provide our service. Stubbornly putting all our professional eggs in a basket that hasn't changed much in 70 years is a bad idea. We need to diversify.

Microsoft, Google, IBM, Facebook, Amazon, Alibaba, Tencent, Baidu, and many other technology companies are actively developing and marketing neural machine translation platforms that can be connected to speech recognition and speech synthesis programs. To be sure, there will be more attempts -- and more failures -- when it comes to speech-to-speech translation. My hope is that they will be matched by more attempts -- some successful, others not -- to make high-quality human interpretation available in new ways and in new settings. In an ever-changing multilingual communications landscape, our relevance as a profession depends on it.

Do you have a question about a specific technology? Or would you like to learn more about a specific interpreting platform, interpreter console or supporting technology? Send us an email at [email protected].

ADVERTISEMENT

Interpreting Technologies Alliance (ITA)

The confluence of globalization and technology is leading the private sector to embrace interpreting in new and exciting ways. Learn about the companies that are driving that change. Visit www.itaglobal.org.

4. Quo Vadis, Content?

Two years ago, in edition 261 of the Tool Box Journal, I wrote this about ContentQuo:

"MultiQA, the quality assurance and terminology tool that I highly praised in June of 2013, had a rather rocky road but one that might still eventually lead to greener pastures.

"Briefly, it really was never widely used except internally by ITI, the Russian company that developed it. To put it mildly, that's something I simply don't understand. While it was not a super-easy tool to use, it excelled at areas where it virtually had no competition, especially in morphology-based terminology recognition.

"Anyway, the result is that it's not available anymore to the general public. That's a bummer. But it has been taken over by ContentQuo, a startup company led by Kirill Soloviev. Kirill's vision is to offer a kind of quality assurance that is more holistic than just looking at individual linguistic parameters, instead taking into account key performance indicators and user response through web analytics as well. You can see some more on that in this presentation."

Just this week I touched base with Kirill, who is finally about to launch a product by the end of this month. Why did it take so long? Well, partly because he really had to rethink his concept. Did it make sense to look at localization from a results-oriented perspective (Kirill called it "outcome-based localization")? Absolutely! After all, if there are fewer complaints about a translated support database, better response rates with localized email campaigns, or higher response rates in apps, this could and should be seen as a result of successful localization. (Of course, the opposite is true as well.) Why did this concept not work (at least yet)? Because data tends to be siloed in organizations. With no data to access, there are no insights about success.

So, Kirill went back to the drawing board and came up with a different product with more "mundane" features (his words, not mine).

ContentQuo is still a product with great ambitions, but rather than having to rely on data that is difficult or impossible to access, it uses data generated through its own interface.

It's a TQM (Translation Quality Management) platform that offers an interface for the review of translated bilingual files (XLIFF in all kinds of flavors, a number of current and legacy translation-environment-tool-specific formats, but also CSV files and TMX and TBX files). The reviewer can apply the quality metrics of the DQF-MQM error typology, which can be tailored to the needs of the customer and applied to any part of the segment. (The latter fixes a shortcoming in some other implementations I've seen that allow the reviewer to assign an error only to the complete segment.)
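To make the difference between segment-level and sub-segment error assignment concrete, here is a minimal Python sketch of an annotation that points at a character range within the target text. This is purely illustrative and not ContentQuo's data model: the `ErrorAnnotation` class, its field names, the category label, and the sample sentence are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ErrorAnnotation:
    """A reviewer's finding attached to part of a target segment (hypothetical model)."""
    start: int       # index of the first flagged character in the target text
    end: int         # index one past the last flagged character
    category: str    # illustrative label, loosely in the spirit of an error typology
    severity: str    # e.g. "minor", "major", "critical"
    comment: str = ""


def flagged_text(target: str, annotation: ErrorAnnotation) -> str:
    """Return only the part of the segment the reviewer actually marked."""
    return target[annotation.start:annotation.end]


target = "Press the buton to start the engine."
start = target.index("buton")
note = ErrorAnnotation(start, start + len("buton"), "Fluency/Spelling", "minor", "typo")

print(flagged_text(target, note))  # buton
```

A segment-level-only implementation would, in effect, force `start` to 0 and `end` to the length of the whole segment, which tells the translator far less about what went wrong.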

If the file is not only assessed (by applying error categories, severity grades, and comments) but (optionally) also edited, the changes are displayed in an MS Word-like tracked changes format; if so desired, it can also be saved and pushed back to the originating location or system. (This last step presupposes that the entire file is being reviewed in the ContentQuo environment rather than a sample, which is more likely to be the default option.)
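For readers curious about what sits behind a tracked-changes view, here is a minimal Python sketch that compares a translation before and after review, word by word, using the standard library's `difflib`. It is an assumption-laden illustration, not how ContentQuo does it; the `tracked_changes` function and the `[-...-]` / `{+...+}` markup are made up for this example.

```python
import difflib


def tracked_changes(original: str, edited: str) -> str:
    """Mark deleted words as [-...-] and inserted words as {+...+}."""
    old_words = original.split()
    new_words = edited.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged words are passed through as they are.
            pieces.append(" ".join(old_words[i1:i2]))
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


print(tracked_changes("Press the buton to start", "Press the button to start"))
# Press the [-buton-] {+button+} to start
```

A real review interface would also have to deal with inline tags and with languages that do not separate words with spaces, which this sketch ignores.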

And why push back to a "system"? The idea is that ContentQuo will (eventually) connect with and to third-party translation management systems (Plunet, XTRF, WorldServer, etc.), which will then allow for a direct connection between the quality management interface that ContentQuo provides and the location where the translation management stores the translated files. This will be done via API connectors that will be developed by client demand and then made available to everyone (the goal is to have two to four of those by the end of this year).

Another area where connectors are to be developed is with automated quality assurance tools, such as Xbench and Verifika, so they can be brought into the process as well. The morphological abilities of the features that were part of MultiQA will be brought into ContentQuo in that manner, as well.

Applying the quality metrics -- whether automated or manual -- results in an overall quality rate expressed as a percentage (every error equates with a deduction from the ideal 100% grade). It's up to the client's quality manager (or perhaps more likely, the localization manager) to determine what is considered a non-acceptable failure rate. The results of the QA processes are displayed first by language and then by translator/project in a portal that also allows for quick access to the actual review interface. Translators (and reviewers and arbiters -- who are called into situations of possible disputes between translator and reviewer) have their own access to the system where they can see what evaluations they have received (and why) and also have the possibility to respond. The interaction between the different parties can either be anonymized or attached to actual identities.
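The idea of deducting from an ideal 100% can be sketched in a few lines of Python. To be clear, I don't know the formula ContentQuo actually uses; the penalty weights per severity, the normalization by word count, and the pass threshold below are all hypothetical values chosen for the example.

```python
from typing import Iterable

# Hypothetical penalty points per severity grade (not from ContentQuo).
SEVERITY_PENALTY = {"minor": 1, "major": 5, "critical": 10}


def quality_score(word_count: int, error_severities: Iterable[str]) -> float:
    """Start at 100% and deduct penalty points relative to the words reviewed."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    penalty = sum(SEVERITY_PENALTY[severity] for severity in error_severities)
    # Never report a score below zero, however many errors were found.
    return max(0.0, 100.0 * (1 - penalty / word_count))


def passes(score: float, threshold: float) -> bool:
    """The threshold is whatever the quality manager considers acceptable."""
    return score >= threshold


score = quality_score(1000, ["minor", "minor", "major"])
print(f"{score:.1f}%")       # 99.3%
print(passes(score, 98.0))   # True
```

Whatever the real weights are, the interesting decision stays with the client: the same 99.3% could be a pass for a support article and a fail for a drug label.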

All of this is hosted in the cloud -- or can be hosted onsite if so desired. ContentQuo is a Lithuanian company, its development team is located in Russia, and the cloud is hosted in Germany. The pricing for this obviously SaaS-based offering starts at 99 euro a month (with educational and nonprofit discounts) and relates to the number of words processed through the system.

Since clients might need some outside expertise to determine certain details, like what part of the DQF-MQM metrics to apply or what fail rate to execute, Kirill from ContentQuo is happy to refer to a group of third-party quality experts who can aid that process.

And who are the clients? While Kirill is presently focusing mostly on larger enterprise clients, it could (and, he hopes, will) eventually include LSPs and translation buyers of all sizes.

When I talked to Kirill, I failed to discuss what a fantastic tool this would be as part of a platform for translators where they could openly show their expertise and quality ratings for specific kinds of projects or industries. Anyone willing to tackle that?

ADVERTISEMENT

SDL Passolo 2018 is here -- a new experience for software localization

The demand for localizing software -- in particular from the gaming industry -- is rising every year.

SDL Passolo 2018 is the visual translation environment that helps translators meet that demand by simplifying the localization process whilst, most importantly, keeping quality high.

Learn about the key features in SDL Passolo 2018

5. This 'n' That

The 2018 meeting of the European Association for Machine Translation (EAMT) will be held in Alacant/Alicante, Spain, on 28-30 May 2018, and will have a translators' track for the first time. I've been told that Alicante is not a bad place to be at the end of May....

xl8.link, the link shortener just for translators and interpreters, is open again on a new platform. We previously had problems with phishing attacks, so we hope this change will deter any evil phishers. Either way, any URL you've already shortened or will shorten will continue to work.

Microsoft has released a shiny and highly accessible new Writing Style Guide for English. This is obviously a highly important resource for anyone translating Windows- or otherwise MS-based software and related data into English. There are also style guides for other languages, but unfortunately they're not nearly as comprehensive and easily navigable. (And the English Apple counterpart can be found here -- thanks, Andrea Bernard, for the tip.)

Amazon Translate has now officially opened up access to its neural machine translation engine in English <> Arabic, Chinese (Simplified), French, German, Portuguese, and Spanish. At this point it appears that it's primarily geared toward larger service providers rather than individual translators, and that might be a good thing because it looks like Amazon has not yet caught up with the latest developments in data privacy. If you look under "Privacy" right here, you can see that data submitted through Amazon Translate is used for purposes other than your own, which is so "last month" (when Microsoft finally joined Google and DeepL in no longer doing this as long as the data comes through an API connection).

One more thing about MT. GT4T, the little tool that gives you access to a large number of MT engines, has added access to DeepL Pro. It had already provided access to DeepL for some time, but that was through the non-API access and was therefore not confidential. Another reason why this is relevant is that, as I reported in the last Tool Box Journal, DeepL is asking for a 20-euro-a-month minimum payment for the use of the translation engine that otherwise uses a character-based payment schedule. Since this might not be very cost-effective for some, access through GT4T might make more sense. This works because GT4T essentially acts as a reseller for DeepL and can afford to include the DeepL data that its users are using through its licensing fee. (Note that DeepL Pro only works if you have a character-based license with GT4T rather than a time-based license -- because the latter might "bankrupt" GT4T, as Dallas Cao, the developer, remarked to me.)

The only drawback I see to using MT through GT4T's interface is its lack of adequately deep integration into the translation tool environment. Once again, my greatest benefit from MT is not the MT'ed suggestion of the whole segment but the more granular data that I can choose to use or not, ideally by the system suggesting these fragments based on what I have already entered. If GT4T could only provide that!

Dallas?

ADVERTISEMENT

memoQ 8.4 & memoQfest

Get first-hand information on memoQ 8.4 at our 10th international conference, memoQfest 2018, which will take place on 30 May - 1 June in Budapest, Hungary. Attend a hands-on workshop, meet with developers, exchange ideas with attendees and the memoQ team.

Get your ticket at www.memoqfest.com by 4 May 2018!

6. New Password for the Tool Box Archive

As a subscriber to the Premium version of this journal you have access to an archive of Premium journals going back to 2007.

You can access the archive right here. This month the user name is toolbox and the password is lowtide.

New user names and passwords will be announced in future journals.

The Last Word on the Tool Box Journal

If you would like to promote this journal by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box Journal. Just paste the code you find here into the HTML code of your webpage, and the little icon shown on that page will be displayed with a link to my website.

If you are subscribed to this journal with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.

Should you be interested in reprinting one of the articles in this journal for promotional purposes, please contact me for information about pricing.