Knowledge & Information Technology
No. 277 - 1 December 2020
Searching through Legacy Documents
Jim Reavis, co-founder and CEO of the Cloud Security Alliance, recently asked how one can search through legacy documents: "How would we go back and tag 11+ years of content? It seems that some structured documents could be tagged automatically, but not sure what we would do with webinars, videos, podcasts without going back and listening to them all." With audio and video increasingly used to deliver content, the question is pertinent. There are solutions out there, and while they may not be perfect, they are evolving rapidly. For starters, look at the Wikipedia article on audio mining.

Indexing videos will be more challenging. In an audio file, most of the content of interest is probably speech; once transcribed automatically, it becomes searchable. Video is likely to contain speech, to which the same process can be applied, and text, which can be identified through optical character recognition. But what about objects, scenes, people, etc.? Ideally, I should be able to search for "Paris" and see in the results a video in which the Eiffel Tower appears, even if the name "Paris" is neither said nor written. We know that there are AI-based tools that recognize monuments and other objects (see Google Lens and Blippar). Integrating all this together into a service that can index a video file in a useful manner is probably just around the corner -- and if you're aware of an existing implementation, please let us know!
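To make the idea concrete, here is a minimal sketch of the indexing step only. It assumes that speech transcription, optical character recognition and object recognition have already been run by other tools and have produced timestamped text segments. The file names, the sample segments and the LABEL_EXPANSIONS table (which maps a recognized landmark to related search terms) are all hypothetical, invented for illustration; a real service would draw them from actual recognizers and a knowledge base.

```python
import re
from collections import defaultdict

# Hypothetical output of upstream tools: for each media file, a list of
# (start time in seconds, source of the text, extracted text).
LIBRARY = {
    "webinar-2014-03.mp4": [
        (12.0, "speech", "Welcome to our webinar on cloud security controls"),
        (95.5, "ocr", "Agenda: identity, encryption, audit"),
    ],
    "travel-vlog.mp4": [
        (3.0, "speech", "We finally arrived this morning"),
        (41.0, "vision", "Eiffel Tower"),
    ],
}

# Hypothetical mapping from a recognized object to related search terms,
# so that a landmark can be found by the name of its city.
LABEL_EXPANSIONS = {"eiffel tower": ["paris"]}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(library):
    """Build an inverted index: term -> set of (media id, start, source)."""
    index = defaultdict(set)
    for media_id, segments in library.items():
        for start, source, text in segments:
            terms = tokenize(text)
            if source == "vision":
                # Add related terms for recognized objects.
                for label, related in LABEL_EXPANSIONS.items():
                    if label in text.lower():
                        terms.extend(related)
            for term in terms:
                index[term].add((media_id, start, source))
    return index


def search(index, query):
    """Return sorted hits from media files that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    hits_per_term = [index.get(term, set()) for term in terms]
    # Keep only media files in which all query terms occur somewhere.
    common = set.intersection(*[{hit[0] for hit in hits} for hits in hits_per_term])
    matching = {hit for hits in hits_per_term for hit in hits if hit[0] in common}
    return sorted(matching)


index = build_index(LIBRARY)
print(search(index, "Paris"))
print(search(index, "cloud security"))
```

Each hit carries a timestamp and a source, so a search result can point to the moment in the recording where the term was spoken, written on screen, or recognized visually. The hard part, of course, remains the recognizers that feed the index.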
Responsible Use of Artificial Intelligence
Prof. Roberto Zicari, from the Big Data Lab at Goethe University in Frankfurt, is offering a new online training course on the responsible use of AI. This is an in-house course -- taught to a team from a single company at a time. There are four modules of 90 minutes each (50 minutes of lecture, 40 minutes of discussion):
  • Trustworthy AI
  • Legal relevance of ethical rules on AI
  • Technical privacy
  • Implications of GDPR for information systems and platforms
See more details here, including how to contact the author for pricing information.
Answering "I don't know what I need, but give me a solution"
A person who shall remain nameless recently asked the following question on a professional forum: "I am in search of third-party IT providers for my medium-sized company. We are not yet in the cloud, but would be interested in moving. Our current IT provider just doesn't seem to be meeting expectations and I think it is time I start looking at other options. Does anyone have any recommendations?"

We've heard this kind of cart-before-the-horse question a few hundred times before. Two members of the forum, including yours truly, tried to steer the requester in the right direction. Between us, we said:
  • It's almost impossible to make a recommendation without knowing the specifics.
  • In fact, as a consultant working in this area among others, I'd find it improper to start naming providers without having looked at requirements.
  • Has your organization put out an RFP with one of the primary objectives being the ability to execute a lift and shift or the ability to optimize your current infrastructure for the cloud?
  • Even if you don't issue a formal RFP in all its bureaucratic glory, you need at least some process: establish requirements and selection criteria; make a list of potential providers; get information from their sales teams (using a third party if you want to be shielded from the endless sales calls afterward); narrow the list down to a few finalists; do a serious comparison with a small team; then work with your purchasing department and/or lawyer to review the contract. Cloud service agreements have a lot of fine print that creates specific risks with respect to data protection, cessation of service, etc.
  • Read the advice you can find in the free white papers from the OMG's Cloud Working Group. In particular, look for the Practical Guide to Cloud Computing, the guide on cloud migration, and the Practical Guide to Cloud Service Agreements.
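The "serious comparison" step in the list above can be as simple as a weighted scoring matrix. The sketch below is one common way to do it, not a method prescribed by the forum replies or the OMG guides; the criteria, weights, provider names and scores are all hypothetical placeholders that your own requirements process would replace.

```python
# Hypothetical selection criteria with weights that sum to 1.0.
CRITERIA_WEIGHTS = {
    "migration_support": 0.4,
    "contract_terms": 0.3,
    "cost": 0.3,
}

# Hypothetical scores from the evaluation team, on a 1 (poor) to 5 (excellent) scale.
FINALISTS = {
    "Provider A": {"migration_support": 4, "contract_terms": 3, "cost": 2},
    "Provider B": {"migration_support": 3, "contract_terms": 4, "cost": 4},
    "Provider C": {"migration_support": 5, "contract_terms": 2, "cost": 3},
}


def weighted_score(scores, weights):
    """Combine per-criterion scores into one number using the weights."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank(finalists, weights):
    """Return provider names ordered from highest to lowest weighted score."""
    return sorted(
        finalists,
        key=lambda name: weighted_score(finalists[name], weights),
        reverse=True,
    )


for name in rank(FINALISTS, CRITERIA_WEIGHTS):
    print(f"{name}: {weighted_score(FINALISTS[name], CRITERIA_WEIGHTS):.2f}")
```

The numbers matter less than the discipline: agreeing on criteria and weights before looking at providers is exactly what keeps the horse in front of the cart.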
Words Matter -- ACM Tackles Diversity and Inclusion in Language

The Association for Computing Machinery is wading into the sometimes controversial topic of updating the language we use in the computer professions in order to avoid phrases that may imply gender or racial biases. The initial advice only covers three pairs of terms (master/slave, blacklist/whitelist, blackhat/whitehat) and the use of gender-neutral pronouns, but more is to come and suggestions are encouraged. Read the announcement.
Seen Recently...
"There’s no machine understanding without shared semantics, and no shared semantics without standards."
-- Alan Morrison, from PricewaterhouseCoopers, in an interview by Teodora Petkova,
quoted by Charlie Hoffman in his newsletter on Digital Financial Reporting