
THE IMPORTANCE OF DATA

The importance of data cannot be overstated: when data is managed properly, machine learning results deliver enormous value to the organization as well as to each project. Data is objective rather than judgmental. You need data to train an algorithm, and data raises many problems when working with a machine learning tool. Data wrangling is a term that refers to the work of analyzing raw data and preparing it into a format or structure that a machine learning tool can use. Because roughly 80 percent of the time spent creating a machine learning algorithm is committed to data work, it is clearly an essential step.

For most organizations, data is not clean. Data fields contain typos or improperly capitalized words, formats vary, the meaning of a field is not always clear, and the contents do not follow a consistent pattern. Field formats differ across databases and sometimes within the same database. For example, a date field can be dd/mm/yy, mm/dd/yy, mm/week, yyyy/mm/dd, or any other possible permutation. Two data fields may actually mean the same thing, while a single data field may carry two meanings.
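As a minimal sketch of what normalizing those date formats can look like in practice (the column name and the list of formats are hypothetical), the snippet below uses Python's pandas library to coerce several formats into one:

```python
import pandas as pd

# Hypothetical raw dates pulled from different project files,
# each written in a different format.
raw = pd.DataFrame({"start_date": ["03/12/21", "2021-12-03", "12/03/21"]})

def parse_date(value):
    # Try each known format in turn; unparseable values become NaT
    # so they can be flagged for review instead of silently kept.
    for fmt in ("%d/%m/%y", "%Y-%m-%d", "%m/%d/%y"):
        try:
            return pd.to_datetime(value, format=fmt)
        except ValueError:
            continue
    return pd.NaT

# Note: an ambiguous value such as 12/03/21 matches the first format
# that fits, so source-specific rules are still needed for those cases.
raw["start_date"] = raw["start_date"].map(parse_date)
print(raw)
```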

During one of my projects, there was a data field for an owner’s name, but it occasionally contained three names because there were three owners. And, of course, there is usually a column containing a data field that is supposed to hold a numeric value but is blank; machine learning algorithms generally cannot handle blank data fields. For project management data, several files will likely need to be combined or joined in some way so that the total data requirement is met. Another consideration for project data is the variety of file formats. The scope document is normally in text or PDF format, the quality metrics might be in a spreadsheet, and the schedule is in any number of formats, such as MS Project, Primavera, LiquidPlanner, or Asana. Therefore, NLP must be used to convert the words or phrases into usable data, which adds another layer of complexity.
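A sketch of the first two chores, assuming hypothetical table and column names: blank numeric fields are filled in before two project files are joined on a shared task ID.

```python
import pandas as pd

# Hypothetical exports: a task list and recorded effort, keyed by task_id.
tasks = pd.DataFrame({
    "task_id": [1, 2, 3],
    "name": ["design", "build", "test"],
    "budget_hours": [40.0, None, 30.0],   # one blank numeric field
})
effort = pd.DataFrame({
    "task_id": [1, 2],
    "actual_hours": [55.0, 110.0],
})

# Blank numeric cells arrive as NaN; fill them with the column median
# so the learning algorithm never sees an empty field.
tasks["budget_hours"] = tasks["budget_hours"].fillna(tasks["budget_hours"].median())

# Join the files so one row holds everything known about a task.
combined = tasks.merge(effort, on="task_id", how="left")
print(combined)
```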

Fortunately, there are software tools capable of scanning databases and finding unstructured data. There are also vendors that perform this task, although it is messy work and requires collaboration with the organization to clean the data. The data cleansing vendors I work with are normally hired as part of a data migration effort, typically when upgrading to a new software solution, but they are also available for software deployments such as AI tools. It is easy to misjudge when the work to produce a structured dataset is complete. Organizations might retain a lot of historical project data, but the data needs to be processed to determine whether there are any outliers that simply do not belong or gaps where data is missing. The purpose is to take the raw data and transform it into data that a machine learning algorithm can use to develop the patterns that form the model.
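As an illustrative sketch of that screening step (the column name and values are hypothetical), the snippet below flags outliers with a simple interquartile-range rule and counts the missing values per column:

```python
import pandas as pd

# Hypothetical historical project data with one numeric column.
df = pd.DataFrame({"duration_days": [12, 14, 13, 15, 400, 11, None]})

# Interquartile-range rule: values far outside the middle 50 percent
# are flagged as outliers that simply do not belong.
q1, q3 = df["duration_days"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["duration_days"] < q1 - 1.5 * iqr) |
              (df["duration_days"] > q3 + 1.5 * iqr)]

print("Outliers:\n", outliers)
print("Missing values per column:\n", df.isna().sum())
```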

Data mining can be used in an organization to determine why projects fail and at what point in the process or schedule they failed. The root cause can then be traced back to a document or process in the project that needs to be corrected. From this data, a machine learning model is built and used to evaluate both new projects and projects that are in the execution stage. Some of my clients ask about acquiring performance data from projects. What they mean is the ability to determine the accuracy of task estimates based on the amount of effort that was recorded to complete the tasks. When bidding on a new project contract, this analysis can result in more accurate financial submissions and, it is hoped, more wins. Some organizations are amazed by the ability to take vast amounts of historical project information and apply it to the next project. In fact, an organization that manages to collect the most information about a specific client will gain a considerable competitive advantage. This may already happen now, but in the future the value of acquiring and using the data for AI tools will make this an insurmountable advantage.
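A minimal sketch of that estimate-accuracy calculation, assuming hypothetical task records: each task's recorded effort is compared with its original estimate to produce a simple overrun ratio that can feed the next bid.

```python
import pandas as pd

# Hypothetical closed tasks with original estimates and recorded effort.
tasks = pd.DataFrame({
    "task": ["design", "build", "test"],
    "estimated_hours": [40, 120, 30],
    "actual_hours": [55, 110, 45],
})

# A ratio above 1 means the task ran over its estimate; the average
# ratio can serve as a historical correction factor for new bids.
tasks["overrun_ratio"] = tasks["actual_hours"] / tasks["estimated_hours"]
print(tasks)
print("Average overrun factor:", round(tasks["overrun_ratio"].mean(), 2))
```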

In a project study from 2019, researchers reported that for long-term, exploratory-type projects, project managers often suffered from what the researchers termed uncertainty blindness: they lose track of project history and are unsure of how to manage uncertainty as the project progresses.

Humans are fallible, and AI tools that have access to historical data can prevent these types of concerns in projects. The AI tools become subject matter experts in nearly every subject, assuming they have sufficient data. The project closing stage therefore becomes more important for data retention and for ensuring structured data formats. To support supervised learning, the project manager and PMO must think about adding labels to datasets throughout the project, as sketched below. A cumulative benefit can occur: tools may start off less accurate, but as more data and labeled datasets are added, they become steadily more accurate and more effective.
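As an illustrative sketch (the fields, labels, and model choice are all hypothetical), the snippet below shows what a labeled project record might look like, with a small supervised model trained to predict the outcome label the project manager assigned at closing:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical project history: each closed project is labeled with
# its outcome so a supervised algorithm can learn to predict it.
history = pd.DataFrame({
    "budget_variance_pct": [5, 32, -2, 18],
    "schedule_slip_days":  [3, 45,  0, 20],
    "scope_changes":       [1,  7,  0,  4],
    "outcome":             ["success", "failed", "success", "at_risk"],
})

# Features are what the PMO recorded; the label is the closing-stage outcome.
X = history.drop(columns="outcome")
y = history["outcome"]

model = DecisionTreeClassifier().fit(X, y)

# Score a new in-flight project against the labeled history.
new_project = pd.DataFrame({"budget_variance_pct": [25],
                            "schedule_slip_days":  [30],
                            "scope_changes":       [5]})
print(model.predict(new_project))
```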

Although projects are unique by definition, they have similar components. Consider a project to land a colony on another planet. This is certainly unique and challenging, so what historical data can be used to feed AI tools? While projects are unique, the more grandiose ones normally build on existing accomplishments. There have already been projects to land people on the moon, projects to send orbiters to other planets, and even projects on Earth that attempt to simulate life on another planet. This means that some of the data already exists, and there is hope that data from previous projects will still yield accurate machine learning results.
