Tool Box Logo

 A computer journal for translation professionals


Issue 17-4-273
(the two hundred seventy-third edition)
Contents
1. Behaving Badly Like the Silicon Valley Big Boys
2. Google Translate's Neural Machine Translation . . . (Premium Edition)
3. The Tech-Savvy Interpreter: ZipDX -- Cutting-Edge Technology Providing Remote Simultaneous Interpreting for Old-School Conference Calls
4. TM Marketplace
5. This 'n' That
6. New Password for the Tool Box Archive
The Last Word on the Tool Box Journal
Incidental (Easter) Art
Cross with Bird


Happy Easter.
ADVERTISEMENT

Not yet registered for memoQfest?

memoQfest, Kilgray's 9th conference on translation technology, returns to Budapest, Hungary, on 7-9 June with thought-provoking presentations from key industry personalities and great networking opportunities.

Visit www.memoQfest.com, learn more about the conference program, and register today!  

1. Behaving Badly Like the Silicon Valley Big Boys

Such was the tweet of Barry Slaughter Olsen, the faithful author of the Tech-Savvy Interpreter column for the Tool Box Journal since November of 2015 (!). What did it refer to? The same thing that others on Twitter have described as a fight between David and Goliath, as unfair, as really scummy patent trolling, and as something that will retard important research (I cut out the more R-rated references).

All these comments refer to a lawsuit that SDL brought against Lilt on April 4, reported on by Slator a few days ago. I recommend that you read the Slator article, which has valuable information on what the lawsuit specifically addresses, though I don't happen to agree with Slator's implied conclusion that SDL was somehow aggravated by Lilt's claims of increased productivity. While the lawsuit actually cites some of those numbers, these were based on an actual case study and really are not up for debate (unless you do another case study and show differing results).

Instead, I think what is happening is much scarier and more worrisome. SDL is under new leadership. Their new CEO Adolfo Hernandez has worked for companies like IBM, Sun Microsystems, and Alcatel-Lucent (now Nokia), and their even newer "Chief Product Officer" Jim Saunders has worked for companies like Apple, AOL, Netscape, SAP Business Objects, and others. What distinguishes the experience of both men from the rest of us is clearly that they are software executives rather than folks who have had any exposure to the world of translation.

I'm afraid that what Barry says in his tweet is true: These are practices that might be acceptable in Silicon Valley, but I have a tough time fitting them into our context.

First, I don't believe for a second that the complaints have any merit, and I don't think SDL really believes they do either. (They -- correctly so -- did believe an infringement had taken place a couple of years ago, when Translated used SDL-owned file filters for their MateCat solution without any license. But rather than suing -- and Translated would surely have been a richer target than Lilt -- SDL just let them know they needed to stop, and so they did and developed their own filters.) Instead, I believe that this is simply a commonplace practice for folks like Hernandez and Saunders; they are well aware that the crazy patent system in the US allows very broadly defined processes to be patented, giving them a perfect playground for these legalized protectionist practices.

About four years ago (in issue 220), I wrote about a very productive standalone product developed by Linguee that allowed the retrieval of context-sensitive translations. Linguee never released that product in the US because their lawyers deemed it too risky in light of a patent for "Recognition and translation system and method" owned by MT and dictionary pariah Babylon.

In the end, Linguee completely dropped the product. Of course it's hard to say whether it would have survived if it had been distributed in the US, but it serves as yet another story that demonstrates how patent litigation (in this case only potential litigation) stops innovation.

Again, I don't think SDL is likely to win the suit. But is that really their end goal? My sense is that SDL is trying to make a very aggressive point to discourage any competitor -- not just Lilt -- from continuing to explore options in MT technology that responds to user input. Never mind that Lilt released their technology an entire year before SDL released theirs -- but who wants to fight against a bully with pockets full of money? In fact, Hernandez seems to feel fairly certain of this new legal protectionist strategy since he just bought more than $300,000 worth of SDL stock.

One Twitter user referred to an interesting parallel: the story of Nuance and Vlingo. Both are Boston companies and both were developing speech recognition technology. Nuance was (and still is) run by CEO Paul Ricci, who very aggressively grew his business by acquiring competitors or, where that was not possible, suing them for patent infringements. He wasn't particularly successful with his suits as far as winning them in court but, as in the case of Vlingo, the battles so exhausted his competitors that they finally agreed to be acquired.

Now, I neither think nor hope that the same will happen here, but as a user I want to see translation technology companies investing in technology rather than in legal battles and lawyers. I want to see companies like Lilt and the many other translation technology companies of the future blossom and create solutions that help you and me as translators. Surely we agree that one way to make machine translation more useful is for it to be adaptive to user input. If only one company is allowed to develop such solutions, progress will be severely limited if not stopped.

Clearly, my soapbox alone will not convince Messrs. Hernandez and Saunders to cease and desist. But I strongly believe this kind of business practice should not be part of the world of translation. So what can we do? I can think of a number of things, but a good starting point might be to let the folks from SDL whom you know or will meet in the course of this year know how much you disapprove. 

ADVERTISEMENT

Get a firsthand look at Wordfast Pro 5: sign up for the Wordfast Forward user conference today!

Tool Box readers can take advantage of a special discount on the conference registration fee. Use the following promo code and save €68: toolbox-reader. 

2. Google Translate's Neural Machine Translation . . . (Premium Edition)

. . . is now available right in your translation environment tool. In December of last year I reported on Google's ongoing transition to neural machine translation (NMT), which for some language combinations was already available through their web interface. If retrieved through the API (which is what happens when it's used through a third-party tool like a translation environment tool), it was possible to get to the NMT data, but only through a complicated application process, and it was also unclear what it would eventually cost compared to the "old" statistical machine translation (SMT), which was still the default when retrieved through the API. Without much fanfare, however, Google has decided not to maintain two different engines per language, so you'll now get NMT through the API -- whether you want it or not. Of course, this applies only to the languages for which NMT is available, including English <> French, German, Spanish, Portuguese, Chinese, Japanese, Korean, Turkish, Russian, Hindi, Vietnamese, Arabic, and Hebrew, but more will follow soon.
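
In case you're curious about what your tool actually sends to Google when it retrieves a suggestion, here is a minimal sketch of a direct call to the v2 REST endpoint. This is my own illustration rather than anything Google or your tool vendor publishes in this form; the API key is a placeholder, and your translation environment tool handles all of this behind the scenes anyway:

    # Minimal sketch of a Google Translate API v2 request (illustrative only).
    # Requires the "requests" package; the API key below is a placeholder.
    import requests

    API_KEY = "YOUR-GOOGLE-API-KEY"  # obtained through the Google API Console

    response = requests.get(
        "https://translation.googleapis.com/language/translate/v2",
        params={
            "key": API_KEY,
            "q": "The quick brown fox jumps over the lazy dog.",
            "source": "en",
            "target": "de",
            "format": "text",
        },
    )
    response.raise_for_status()
    # The translated segment sits under data > translations > translatedText.
    print(response.json()["data"]["translations"][0]["translatedText"])

The point is simply that the choice of engine now happens on Google's side: the request looks the same as before, but for the language pairs listed above the segment that comes back is produced by the NMT engine.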

Also, at the end of March, Microsoft changed the subscription model for its Microsoft Translator API. You might notice that within your tool of choice you either get error messages or simply no data back from that service. To fix that, you will have to apply for a new API key through the Azure Portal. You can find information on that right here.
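
For those who like to see what's under the hood, here is a rough sketch of how a tool exchanges the new Azure subscription key for the short-lived access token that the Translator API expects. It is based on Microsoft's public documentation, so treat the details as an assumption rather than gospel, and the key is of course a placeholder:

    # Rough sketch of the Azure key-to-token exchange for Microsoft Translator
    # (illustrative only; the subscription key below is a placeholder).
    import requests

    AZURE_KEY = "YOUR-AZURE-SUBSCRIPTION-KEY"  # created in the Azure Portal

    # The key is traded for an access token valid for roughly ten minutes.
    token = requests.post(
        "https://api.cognitive.microsoft.com/sts/v1.0/issueToken",
        headers={"Ocp-Apim-Subscription-Key": AZURE_KEY},
    ).text

    # The token is then sent with every translation request.
    result = requests.get(
        "https://api.microsofttranslator.com/v2/Http.svc/Translate",
        params={"text": "Hello world", "from": "en", "to": "de"},
        headers={"Authorization": "Bearer " + token},
    )
    print(result.text)  # a small XML snippet containing the translation

If your tool suddenly returns nothing from Microsoft Translator, chances are it is still sending the old credentials to the old authentication service and failing at the first of these two steps.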

Just like Google, Microsoft has also started to use NMT (between English, Arabic, Chinese, French, German, Italian, Japanese, Portuguese, Russian, and Spanish), but is not offering those engines through its API at this point.

I would find it interesting to compare the actual usefulness for translators of NMT versus SMT when using MT as a resource to auto-suggest subsegments -- the process that I think is in almost all cases the most effective use of MT for translators. It's true that NMT results typically read more fluently than SMT-translated texts (see here for a demonstration of that), but that's not the point. If you're using MT'ed segments for the purpose of possibly reusing smaller fragments within those segments, overall fluency doesn't make much difference, but the fragments do. And since in SMT these come mostly from professionally translated data, the fragments are often good. They might not be an appropriate choice for the translation you're working on at the moment, but in and of themselves they are likely to be good. And that's not necessarily the case with NMT, despite its greater fluency. I'd be really interested to see some studies in this area. Or maybe even some experiences you've had?

Needless to say, the typical warnings about confidentiality when using these services still apply -- and, if you do find the use of either Google's or Microsoft's engines helpful as an additional translation resource, you'll need to verify which engine you're allowed to use for which client.

This past week I was reminded, though, of how pervasively cloud-based all aspects of our work have become, from email services to social media to even working in MS Word. The latest (preview) edition of MS Office 365 offers this option:

MS Office Intelligence

While the service is still disabled by default at this point, Word, PowerPoint, and Outlook will remind you again and again how "helpful" it would be -- meaning that many of us will activate it at some point. Creepy? Not really. Something to be aware of? Absolutely -- if only to be able to show clients what kind of measures you do or do not take to assure confidentiality. 

ADVERTISEMENT

Protemos Translation Business Management System

Clients and vendors, project management, invoices and payments, business reports...

Now integrated with SmartCAT. Free for freelancers, 3-month trial for agencies. Try it now!

3. The Tech-Savvy Interpreter: ZipDX -- Cutting-Edge Technology Providing Remote Simultaneous Interpreting for Old-School Conference Calls (Column by Barry Slaughter Olsen)

* Be sure to check out this month's Tech-Savvy Interpreter video premiering on April 18.

**Full disclosure: In 2011, I helped design, test and launch the ZipDX Multilingual platform. Therefore, this article could never be a completely unbiased review. In fact, since I am so close to this technology, I have hesitated to write about ZipDX in the Tech-Savvy Interpreter in the past, although it is arguably the oldest remote simultaneous interpretation platform on the market today. So, don't take my words at face value. I encourage you to verify all the assertions made in this month's column.

Whenever I learn about a new interpreting delivery platform (IDP), one of the first questions I ask is, 'What's your main use case?' Any interpreting platform that boasts the ability to provide any kind of interpreting, in any language, anytime and anywhere, makes me wonder if they have done their homework and have a clearly defined use case that they have designed their platform around. ZipDX understood the importance of focusing on a specific use case from the very beginning, and their IDP shows it.

The Company

ZipDX was founded in 2007 by telecommunications veteran David Frankel as a next-generation audio conferencing company. That's right, audio, plain and simple. And there's a reason for that. Despite all the Internet-based communication technologies available today, the phone is still the preferred method the world over for voice communications. Large parts of the globe still do not have dependable broadband Internet connections to use the many VoIP and WebRTC services currently available. What is more, the telephone is still the only truly universal communications network where anyone can pick up their phone and call any other phone on the planet.

The multilingual side of ZipDX started in 2011, when the company was approached by the International Telecommunication Union (ITU) to use ZipDX for remote participation by delegates in its Geneva-based meetings in the six official UN languages. ZipDX's multi-channel audio platform was paired with Adobe Connect to make remote participation possible. That is an entirely different use case from the one I want to focus on in this column, but it is what led to the creation of the remote simultaneous interpretation platform I'm writing about today.

The Use Case

Remote simultaneous interpretation for virtual meetings, particularly conference calls, is where the ZipDX Multilingual platform really shines. Think of a conference call where the participants can connect from anywhere by phone. They are connected by either conference phone, landline or cell phone. Participants may also connect by VoIP using a soft phone on a computer. Simultaneous interpreters connect to the call using a computer with a broadband Internet connection and a USB headset. ZipDX provides a multi-channel audio bridge that allows callers to listen to and speak in their preferred language, just as if they were participating in a multilingual meeting at the United Nations or the European Union, for example. If you'd like a more detailed explanation of the IDP design, you can go here.

These virtual meetings usually last anywhere from 30 minutes to two hours. They are not designed to replace face-to-face meetings. Adding remote simultaneous interpretation to them extends language services to a meeting format (conference calls) where it was not technically feasible in the past. Although these calls are much shorter than a normal full day of conference interpreting, they tend to be more frequent and often take place regularly (e.g., once a week, once a month, or quarterly). So, interpreters may actually end up interpreting for the same participants and the same client frequently, and that familiarity makes the task easier over time. As more multilingual communication goes virtual, it only makes sense that interpreting services follow; otherwise, the profession will not evolve and adapt to new forms of communication.

Wideband Audio (aka HD Voice)

Audio quality is an important part of any meeting with simultaneous interpretation, whether face-to-face or virtual. The quality of the interpretation (and the sanity of the interpreter) depends on it. The ZipDX platform can send and receive wideband audio (often referred to as HD Voice). Traditional telephones are limited to a frequency range of 300 Hz to 3.4 kHz, called narrowband. HD Voice expands that range in both directions, to 50 Hz and 7 kHz, respectively. This is a huge improvement and much closer to the range of the human voice (75 Hz to 14 kHz). As a side note, the fundamental frequency of an average man's speaking voice lies between 85 Hz and 155 Hz, and an average woman's between 165 Hz and 255 Hz. Some interpreters maintain that HD Voice is still unacceptable, as it cannot capture the full range of what humans can hear (theoretically 20 Hz to 20 kHz). My own experience and that of other interpreters who have worked with HD Voice for several years now would seem to prove otherwise. Regardless, HD Voice is a huge improvement over traditional narrowband audio.

That does not mean that all connections on a conference call will have HD Voice quality. If a participant connects with a traditional landline, the audio will be narrowband. And if a caller connects with a low-quality cell phone in an airport, well...you get the picture. 

With Great Power Comes Great...Complexity

The ZipDX Multilingual platform is powerful and has many features that help fine-tune the participants' and the interpreters' experience, like individual volume control for each connection, hard mute and soft mute of audio, multi-channel recording, a waiting room feature, one-way glass for focus groups, and a slew of other features. I don't want to get too far into the tall grass by explaining each feature, so I'll limit this first look to the two key interfaces for the interpreter: the Dashboard and the ZipLine Virtual Interpreter Console. Neither is flashy, but both are useful and easy to use.

Dashboard

A few minutes before an interpreted conference call begins, interpreters use an Internet browser to log into and view the call dashboard. The dashboard provides a wealth of information about the call: the names of call invitees, who has connected, what language channel each participant is listening to, and how they are connected to the conference bridge (i.e., by cell phone, landline, or VoIP). It includes an activity window showing the name of each participant as he or she talks, as well as a multi-tab chat window that interpreters use while working as a back channel for communication among themselves and with the conference moderator or host.

The dashboard provides broad situational awareness to the interpreters as they interpret without any visuals of the call participants. Keep in mind that the call participants do not see each other either, as they are connected by phone. The dashboard also has a feature that indicates when a participant connected via VoIP over the Internet has connectivity problems like a slow or unstable connection, which helps isolate and identify problems so they can be addressed.  

ZipLine Virtual Interpreter Console

ZipDX console

 

The ZipLine 3.0 Virtual Interpreter Console is designed to mimic the interpreter console found in a typical conference interpreting booth. It has been designed for simplicity and ease of use. It uses a simple color scheme to indicate when interpreting is on or off and the mic is open or muted (Red: off, muted / Green: on, unmuted). It also includes clear buttons to switch directions when working in a bilingual booth. ZipLine can also be configured to work with a traditional 'pure booth' setup with relay.

While bilingual conference calls are the most common, multilingual calls do happen frequently, some with nine languages or more. In theory, the ZipDX platform can handle interpretation in and out of 48 different languages on one conference call. However, the complexity of providing remote interpreting for such a virtual meeting is mind-boggling.

ZipLine is based on WebRTC technology and requires that the interpreter use the Google Chrome browser.

Websharing, Video and Other Bells and Whistles

ZipDX has a suite of other features that make it a powerful collaboration tool. Like many other online collaboration platforms, it has screen sharing capabilities and can handle up to three simultaneous video streams from participants' webcams. If there are more than three participants connected via video, the system prioritizes the video of the three most recent speakers. Of course, all these additional features require both interpreters and meeting participants to be in front of their computers and connect to the platform through a web browser. If meeting participants actually use all these features, this substantially changes the original use case.

As a side note, using all of these features is great for teaching simultaneous interpreting online, but I have to leave that use case for a future column.   

If I Had My Druthers...

One of the drawbacks of the current ZipDX configuration is that the dashboard, the ZipLine virtual interpreter console and the Webshare window (screen sharing and video feeds) are all separate. When interpreting for a conference call, you have to keep track of all the windows you are using and size them so they fit on the screen. With the dashboard and ZipLine, one screen is enough, but if you need to see the Webshare or video too, it's better to have two screens.

In the future, I'd like to see the interpreter console incorporated into the dashboard so all it takes is one click to open both. For single-screen setups, it would be good to have an 'always on top' option so that the interpreter console doesn't get lost behind other open windows.

Do you have a question about a specific technology? Or would you like to learn more about a specific interpreting platform, interpreter console or supporting technology? Send us an email at [email protected].

ADVERTISEMENT

Talk Business Anywhere with Cadence

Cadence streams real-time interpretation into business meetings and live events. Are you an interpreter? Have you been asked to interpret remotely? From now until May 1, interpreters can use the Cadence platform for free!  

Go to www.cadencetranslate.com to learn more. 

4. TM Marketplace

Twelve years ago, I partnered with Donna Parrish from MultiLingual to launch TM Marketplace, a company built around the idea of creating connections between owners of translation memory data and users who could put that data to new and productive use. It was an innovative idea, and we did our homework as far as the legal structuring of the business and the amount of energy we would need to invest (lots) were concerned. Unfortunately, we failed to read the market correctly, or the current state of technology for that matter. At that point, TMs were used in very one-dimensional ways (essentially just as containers of perfect and fuzzy matches, and only in the context of translation environment or CAT tools), and there was even greater uncertainty than there is today about questions like the proprietary rights to and the value of data. Our plan didn't work out, and we closed the business a few years down the road.

Others did pick up some of our ideas, in particular TAUS, and you can benefit from that up to the present day by using the terminology services of the TAUS Data Cloud.

More recently, a company in China has come up with their own offering, and it might just work for them. The company is called TMXMall (the website as of now is only in Chinese, but there will be an English version in "two or three months"), and I conversed a bit with their CEO Zhang Jing.

TMXMall is essentially a collection of services and products that all revolve around translation memories:

  • A "smart aligner" that is able to provide better results because it is run with a large corpus in the background which is used to verify alignments
  • A TM exchange platform where one can retrieve two translation units for every translation unit uploaded (for the uploaded data, they offer an anonymization service where any identifying phrase can be replaced)
  • A TM management and verification system that assesses and improves the quality of TM data (again with the help of the corpus)
  • API and plugins to a few translation environment tools, including Trados and memoQ, to access the corpus
  • A TM marketplace (hurts a little bit to write that...) where users can upload documents, search for matches, and purchase them
  • An (upcoming) peer-to-peer system where users can sell and trade data with each other

Data sold on the TM marketplace costs approximately $1.50 per 1,000 words for a 100% match, dropping in several stages to about $0.45 per 1,000 words for a 75-84% match. The money goes to the owner of the data (if it is owned by a third party), with TMXMall taking a 10-20% transaction fee.
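
To put those numbers into perspective, here is a quick back-of-the-envelope calculation. It is my own illustration and uses only the two price points and the commission range Jing mentioned; the intermediate tiers and the exact fee vary:

    # Illustrative payout calculation for a hypothetical TM sale on TMXMall.
    PRICE_PER_1000_WORDS = {"100%": 1.50, "75-84%": 0.45}  # USD, per Jing

    words_sold = 10_000   # hypothetical sale of exact matches
    commission = 0.15     # TMXMall keeps 10-20%; 15% used here as a midpoint

    gross = words_sold / 1000 * PRICE_PER_1000_WORDS["100%"]
    payout = gross * (1 - commission)
    print(f"Gross: ${gross:.2f}, owner receives roughly ${payout:.2f}")
    # Gross: $15.00, owner receives roughly $12.75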

Here are some usage numbers that Jing supplied me with: There are about 30,000 users in total, generating 500,000 API calls every day. Here's what I find most amazing about these numbers: They are substantial by any measure, and yet I can almost bet that -- unless you're a Chinese translator -- you haven't even heard of this service, let alone realized that it is processing some real data. It seems a little bit like a case of parallel universes.

The service is used not only by individual translators but by translation companies and, as you won't be surprised to hear, machine translation companies that use the data to train their engines.

The vast majority of the data is in English<>Chinese, but there is also data available in Chinese<>Japanese, Korean, French, German, Hindi, and Spanish.

ADVERTISEMENT

SDL Trados Roadshows coming to a city near you. Save the date! 

Join us as we explore the exciting and groundbreaking innovations in SDL Trados Studio 2017, the newly launched SDL Studio GroupShare 2017, and SDL MultiTerm 2017. Learn more about the new translation productivity features, AdaptiveMT and upLIFT technology, and network with peers.

Register today » 

5. This 'n' That

In the last issue of the Tool Box Journal, I pointed you to an article in the New York Times about how lawyers are confronted by new technology, including artificial intelligence, and how they are dealing with it. It was uncanny to read how similar the concerns in that industry are to ours. This past week I stumbled across an article in the New Yorker that essentially runs through the same scenario for the medical profession. Both are very worthwhile reads for the sake of your own sanity as a translator!

 

I often mention Twitter in the Journal, and I always feel a little weird about it because I know it's not something that everybody feels as positive about as I do -- and for many good reasons. Still, for me it's a good source of information, and I try to curate as much of that information in my own Twitter feed as I can. Of course, it's possible to simply go to the feed of anyone posting on Twitter and check it there without being a member of Twitter or following that account. That's a little tedious, though. It used to be possible to subscribe to a Twitter feed without "officially following" it or even owning an account on Twitter by subscribing to an RSS feed of that Twitter user and reading the tweets with whatever tool you use to read RSS feeds (I use MS Outlook). Twitter itself disabled that feature a few years ago, but some clever programmer came up with his own solution so that you can once again subscribe to an RSS feed of whatever Twitter user you would like. You can also create an RSS feed for just one hashtag if you'd like to follow a specific conference or event.
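
If you want to automate this a little further, reading such a feed programmatically is trivial. Here is a minimal sketch using Python's feedparser library; the feed URL is a placeholder, so substitute whatever address the RSS service generates for the account or hashtag you want to follow:

    # Minimal sketch of reading a Twitter-to-RSS feed (illustrative only).
    # Requires "pip install feedparser"; the URL below is a placeholder.
    import feedparser

    FEED_URL = "https://example.com/twitter-rss/some_account"  # placeholder

    feed = feedparser.parse(FEED_URL)
    for entry in feed.entries[:10]:
        # Most feeds expose at least a title (the tweet text) and a link.
        print(entry.get("published", ""), "-", entry.title)
        print(entry.link)

But of course the whole point of the RSS approach is that you don't have to program anything: a reader like MS Outlook will happily poll the feed for you.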

Would you like to know what I use it for? Since Twitter has become the number one megaphone in US politics, I follow accounts that I don't want to grace with my presence as an additional follower (which might wrongly be taken as a sign of support). By using this roundabout method through RSS, I still get all of their (often aggravating) tweets right in my inbox. I'm not sure that I'm better off for it, but at least I know what to watch out for...

ADVERTISEMENT

Leave the office 20 minutes earlier today!

Using MindReader for Outlook, you can write e-mails more quickly and more consistently.

Watch the short video for more information on MindReader for Outlook functionality and usage: https://www.youtube.com/watch?v=YAPLHSvVrBc.

6. New Password for the Tool Box Archive
As a subscriber to the Premium version of this journal you have access to an archive of Premium journals going back to 2007.
You can access the archive right here. This month the user name is toolbox and the password is RainSunStorm.
New user names and passwords will be announced in future journals.
The Last Word on the Tool Box Journal
If you would like to promote this journal by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box Journal. Just paste the code you find here into the HTML code of your webpage, and the little icon you see on that page will be displayed with a link to my website.
If you are subscribed to this journal with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.
Should you be interested in reprinting one of the articles in this journal for promotional purposes, please contact me for information about pricing.
© 2017 International Writers' Group