As previously mentioned, machine learning tools need ongoing updates, especially to their data, to keep the models current and meaningful. New tools will be acquired and older tools can be retired, although decommissioning an AI tool will not happen as often as it does for typical software built on a defined set of rules or logic. AI tools are more data driven, and the algorithm used to create a model is stable and less susceptible to technology or environmental changes; it is the data, not the algorithm, that changes.
The amount and frequency of support activities will depend largely
on the strategic decisions made in acquiring the tools. One major
concern will be the location of the data being used by the AI tools and
whether the vendor or cloud provider stores a copy of the data. That has implications for privacy and security, which are discussed in a later chapter. Allowing vendors access to your data is dangerous because, even while keeping the data itself private, they can use it to build a model of their own that is then used by a different organization.
There is a third option that is a blend of make and buy. With this selection, the decision is about how much of a vendor's services should be purchased and how much work will remain within the organization. A growing number of vendor-based IT resources and services, as well as several cloud-based providers, now include machine learning capability as part of their offerings. It is best to gain knowledge about these solutions, preferably
from someone who has experience with each one, because the
descriptive language and marketing content tend to simplify yet
overstate the actual capability.
AWS cloud. Amazon Web Services includes machine learning capability and pre-trained services, such as computer vision, language processing, and forecasting. The platform gives developers the ability to quickly build and deploy machine learning models with a workflow that covers data labeling and data preparation as well as tuning models to optimize them for deployment. That's mainly a marketing description because the work is more complex than it sounds.
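As a rough, hypothetical sketch of what calling a pre-trained service can look like (this example is mine, not part of the AWS description above, and the region and sample text are assumptions), Amazon Comprehend, one of the language-processing services, can be invoked from Python with the boto3 library:

    import boto3

    # Hypothetical sketch: score the sentiment of a project status note with
    # Amazon Comprehend, a pre-trained AWS language service. Assumes AWS
    # credentials are already configured; the region is only an example.
    comprehend = boto3.client("comprehend", region_name="us-east-1")

    response = comprehend.detect_sentiment(
        Text="The vendor delivered the reporting module two weeks late.",
        LanguageCode="en",
    )

    print(response["Sentiment"])       # e.g. NEGATIVE
    print(response["SentimentScore"])  # confidence score for each label

Even in a toy call like this, the real work is deciding which project text to feed the service, how to prepare it, and what to do with the scores, which is exactly the complexity the marketing copy glosses over.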
Heroku. This is a cloud-based platform for building applications and is
frequently used by start-ups for the free sandbox. It has the ability to
scale with the business and supports numerous programming languages.
This is a primary selection for my researchers to host their software.
Google AutoML. Google offers a suite of tools to perform machine
learning, which allows developers with limited expertise to build and
train models. AutoML includes a labeling service as well as support for
data cleansing, which results in high-quality data for the algorithms.
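As another hedged sketch (the project ID, model ID, and text below are placeholders invented for illustration), a model trained through AutoML Natural Language can be queried with the google-cloud-automl Python client roughly as follows:

    from google.cloud import automl

    # Hypothetical sketch: request a prediction from a text-classification
    # model previously trained with AutoML. The project and model IDs are
    # placeholders; "us-central1" is the usual AutoML region.
    project_id = "my-project-id"      # placeholder
    model_id = "TCN1234567890"        # placeholder

    prediction_client = automl.PredictionServiceClient()
    model_full_id = automl.AutoMlClient.model_path(project_id, "us-central1", model_id)

    snippet = automl.TextSnippet(
        content="Steel prices increased 8% this quarter.",  # invented example
        mime_type="text/plain",
    )
    payload = automl.ExamplePayload(text_snippet=snippet)

    response = prediction_client.predict(name=model_full_id, payload=payload)
    for result in response.payload:
        print(result.display_name, result.classification.score)

The labeling and data-cleansing steps mentioned above happen long before this call; the prediction itself is the easy part.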
Twilio Autopilot. This service allows users to quickly build chatbots that work across a variety of online and mobile apps. Once again, it
sounds good, but the reality is more complex when attempting to build
applications for project management.
THE RISKS OF IMPLEMENTATION
Acquiring and implementing AI tools is a great opportunity, but as with all changes, there are risks. The first risk is security and privacy.
Will the people who have access to the model results also have
access to the training data? In some cases, this is a good thing. On
the other hand, for sensitive data, restrictions need to be considered
and any loopholes must be closed.
A related threat is data infiltration. Think of this as a malicious virus intended to corrupt a system, or something like ransomware, where the owner of an infected system is asked to pay a fee to unlock databases. An attack on training data can have disastrous consequences, chiefly a model that recommends the exact opposite of what a machine learning algorithm would normally produce with good data. I don't know why
some people are malicious, but it happens. Let’s say a country is in
the process of launching an extremely valuable satellite into space. A
dissident group hacks into the launch system database and adds fake
data to the machine learning training datasets. The launch fails and
the dissidents expose their work so that they can take credit and gain
publicity. Organizations need to secure this data against attack, much as security efforts are already underway at many organizations for other systems. The difference is that attacks on machine learning datasets are likely to be less noticeable even as they produce bad results.
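A minimal sketch, using invented data rather than anything from a real system, shows why a poisoned training set is so dangerous: a model trained on labels an attacker has inverted will confidently recommend the exact opposite of what the clean model would.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical illustration of training-data poisoning.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 2))             # two invented readiness indicators
    y = (X[:, 0] + X[:, 1] > 0).astype(int)   # clean labels: 1 = "go", 0 = "no-go"

    clean_model = LogisticRegression().fit(X, y)

    # An attacker who can write to the training set inverts every label.
    y_poisoned = 1 - y
    poisoned_model = LogisticRegression().fit(X, y_poisoned)

    case = np.array([[1.5, 1.5]])              # a clearly favorable case
    print(clean_model.predict(case))           # [1] -- the expected "go"
    print(poisoned_model.predict(case))        # [0] -- the exact opposite decision

A subtler attacker would flip only a small, targeted slice of the labels, which is much harder to notice; that is the kind of quiet corruption described above.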
A second risk is biased data, something already mentioned
elsewhere. There needs to be a commonsense assessment of
machine learning results in order to validate the outcome. There are
times when a machine learning result is accurate based on the data
but the data itself is not representative of the current environment.
Another risk is poor extrapolation of data or incorrectly interpreting
statistics. It has been suggested that a budget allocation be set aside
for auditing or validating machine learning results.[17] These issues will
be downplayed by vendors, and in the early days of AI, the
occurrences of these types of risk are low. That will change as AI
becomes more pervasive, and the biggest risk is that you have a
problem in the AI tools that goes completely unnoticed. There are
always risks with new technology, and the most reasonable response
is to become knowledgeable about how to manage or eliminate the risks
the same way that we reduce or eliminate the probability and impact in