AI in Project Management

WHERE WE ARE AND HOW WE CAN MOVE FORWARD

In his course AI for Project Managers, online at Project-Management-Prepcast.com, Cornelius Fichtner presents a detailed and comprehensive overview of how Project Managers can use today’s Artificial Intelligence systems to make their work easier and more efficient. A key challenge, however, is understanding what AI can do and being able to communicate that to various stakeholders, including project sponsors, team members, and people who are affected by the project and the system.

Fichtner provides that needed understanding of how AI works today and how to use ChatGPT and ChatGPT-enabled tools. The 15-hour course is like a one-credit seminar on AI for an MBA or other graduate program, taught by a lecturer who clearly knows and loves his subject. It is also worth 14.75 PDUs toward PMP re-certification.

Fichtner divides his course into seven (7) sections:

  1. “The AI PM Revolution,”
  2. “Fundamentals,”
  3. “Ethics, Integrity & Transparency,”
  4. “Data,”
  5. “Generative AI,”
  6. “The Future: AI-Driven PM,” and
  7. “AI in Action: Expert Insights.”

“The AI PM Revolution” discusses how AI will change Project Management as a profession, and how Project Managers will adapt.

Then, he presents the “Fundamentals,” including:

  • “AI Basics,”
  • “Machine Learning,”
  • “Natural Language Processing,”
  • “Large Language Models.”

This is logically followed by a series of videos on Ethics, including “Responsible AI”; a series on Data, focused on data quality and data literacy; and a series on Generative AI.

Next, “The Future: AI-Driven PM” includes:

  • “Becoming an AI-Powered Project Manager,” and
  • “Developing the AI-Powered Project Team,” parts 1 and 2.

The course concludes with “AI in Action: Expert Insights,” which includes:

  • “Beyond Theory – Real-Life AI Solutions in PM,”
  • “Data Literacy for Project Managers in the Age of AI,”
  • “AI in Project Management: Promise vs. Practice.”

This course provides a terrific starting point for project managers who want or need to start using AI or who want or need to deploy AI systems. Fichtner tells us where we are, where we can go from here, and how we might move forward. Early adopters, he says, will have an edge over people who don’t understand or use AI capabilities.

For people who want to know how we got here, additional background on AI can be found in “Big Data” by Viktor Mayer-Schonberger and Kenneth Cukier (Harper Business, 2014); “Big Data @ Work” by Thomas Davenport (Harvard Business Review Press, 2014); and “Minds Over Matter” by Jeffrey Rothfeder (Simon & Schuster, 1985).

As incredible as AI is today, and as powerful, capable, and valuable as it will become in the future, AI systems seem to be as uncurious and non-goal-oriented as they were when Rothfeder was interviewing Edward Feigenbaum, Roger Schank, and other pioneers in the 1980s. However, the big questions today are philosophical and ethical, not technical. Who owns the system? What are the questions which the system owners want to investigate? What are the problems they want to solve? As Fichtner points out, the biases of the trainers will become the biases of the AIs. We need to understand the biases in the systems we use and in the systems we create. We need to know the sources of data on which AIs have been trained.

For example, in mid-July 2023, I witnessed a trivial “AI Hallucination.” I asked ChatGPT, “Who wrote ‘Sympathy for the Devil’?”

It answered “Bob Dylan.”

I corrected it, typing in, “No, ‘Sympathy for the Devil’ was written by Mick Jagger and Keith Richards of the Rolling Stones.”

ChatGPT responded with what a teacher might call “a lame excuse,” writing back, “Mick Jagger and Keith Richards were trying to write in the style of Bob Dylan.  I am sorry. I am still learning.”

Parenthetically, the fact that ChatGPT recognized similarities between Bob Dylan’s lyrics and those of Mick Jagger and Keith Richards illustrates that it can learn by analogy, which, in and of itself, is amazing, and is also an indication of intelligence.

On Sept. 9, 2024, I asked ChatGPT how it corrects errors – how it learns. It responded, in part,

“If I provide an incorrect response, and you correct me, … Here’s how I would typically handle the situation:

  1. Acknowledge the Mistake: I would recognize that I gave the wrong answer and acknowledge your correction.
  2. Provide the Correct Answer: I would follow up with the correct information, incorporating your correction to ensure the conversation proceeds based on accurate details.
  3. Learn from the Error?: While I can’t “learn” from an individual conversation (I don’t retain data or memories across sessions), in that specific moment, I can adapt my response based on your correction and provide the correct information going forward.

Errors happen, and I rely on feedback during the conversation to correct misunderstandings in real time.”

It also said part of the process is:

“Check Against Known Data: When you offer a correction, I cross-reference it against the knowledge I was trained on. If your correction contradicts the information I “know” to be accurate, I wouldn’t automatically assume the new information is correct.”

More than simply an AI Hallucination, this is an example of ChatGPT extending its knowledge, another demonstration of intelligence.

This demonstrates both ChatGPT’s ability to “learn” and OpenAI’s commitment to transparency. It also suggests that Fichtner’s videos on Ethics, Responsible AI, and Data Bias may be the most important in the course.


Lawrence J. Furman, MBA, PMP, is currently exploring leadership and innovation in “Adventures in Project Management,” which he plans to publish in 2025.  He did not use ChatGPT, Bard, or any other LLM in writing or editing this essay.

Big Data: AI Solutions to Intractable Problems (Part II)

Classic Big Data Analytics: Exploding Manholes and Fires in Illegal Conversions

Exploding Manholes

Consolidated Edison, “Con Ed,” an investor-owned utility in New York City, has about 250,000 manholes in the City, about 50,000 of which are in Manhattan. The manhole covers are cast iron, a few inches thick, and 24 inches in diameter. Each weighs up to 300 pounds. Every year a few hundred catch fire, and some explode up into the air. In 2007, Con Ed went to the statisticians at Columbia University and asked them to identify which manholes were more likely to catch fire and explode, to try to find patterns in order to manage the problem.

Fires in Illegal Conversions

An illegal conversion is a one-family house or apartment that houses three or four families or a community of 12 to 20 people. As noted in Part I of this series, New York City firefighters are 15 times more likely to die in fires in illegal conversions than in other types of residences. The FDNY knew that they could save lives by finding illegal conversions. But how? Using gut feelings and following their instincts, inspectors had a 13% success rate in finding illegal conversions.

Finding the Needles in the Haystacks

Cynthia Rudin, a statistician turned data scientist, led the team for Con Ed. They looked at the “trouble tickets” and found a mess. The term “service box” appeared in 38 different variants, including S, SB, S/B, S.B., SBX, S Bx, S Box, Serv Box, SERV/BOX, and Service Box. Database application designers should enforce quality control when and where the data are entered into the database. With 38 different terms for the same data point, Con Ed’s data had no integrity.
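
As an illustration of the kind of cleanup such a project requires, here is a minimal Python sketch, assuming a hypothetical list of “service box” variants like those above; the pattern, canonical label, and sample data are invented for illustration and are not Con Ed’s or the Columbia team’s actual code.

    import re

    # Hypothetical variants of "service box" found in trouble tickets.
    CANONICAL = "SERVICE_BOX"
    VARIANT = re.compile(r"^\s*s(?:erv(?:ice)?)?[\s./]*b?(?:o?x)?\.?\s*$", re.IGNORECASE)

    def normalize_term(raw: str) -> str:
        """Map any recognized 'service box' spelling to one canonical label."""
        if VARIANT.match(raw):
            return CANONICAL
        return raw.strip().upper()

    if __name__ == "__main__":
        samples = ["S", "SB", "S/B", "S.B.", "SBX", "S Bx", "S Box",
                   "Serv Box", "SERV/BOX", "Service Box", "Transformer"]
        for term in samples:
            print(f"{term!r:>15} -> {normalize_term(term)}")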

But Cynthia and her team found 106 data points that they believed were reasonable predictors of a major manhole disaster. They condensed this to a smaller set of the strongest signals. By 2009 they were ready. They predicted problem spots. The top 10% of manholes on their list accounted for 44% of the manholes that ended up with severe incidents.
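
To make that “top 10% captured 44%” result concrete, here is a minimal sketch, assuming invented risk scores and incident labels, of how such a ranking can be evaluated: score every manhole, sort by predicted risk, and measure what share of actual incidents falls in the top decile. This illustrates the evaluation idea only, not the Columbia team’s model.

    from typing import List, Tuple

    def top_decile_capture(scored: List[Tuple[float, bool]]) -> float:
        """scored: (risk_score, had_severe_incident) per manhole.
        Returns the fraction of all incidents captured by the top 10% of scores."""
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        cutoff = max(1, len(ranked) // 10)  # top 10% of the ranked list
        incidents_total = sum(1 for _, hit in ranked if hit)
        incidents_in_top = sum(1 for _, hit in ranked[:cutoff] if hit)
        return incidents_in_top / incidents_total if incidents_total else 0.0

    if __name__ == "__main__":
        # Toy data: higher scores loosely correlated with incidents.
        toy = [(0.9, True), (0.8, False), (0.7, True), (0.6, False), (0.5, False),
               (0.4, False), (0.3, True), (0.2, False), (0.1, False), (0.05, False)]
        print(f"Top-decile capture: {top_decile_capture(toy):.0%}")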

Mike Flowers, hired by Mayor Bloomberg as New York City’s first head of Data Analytics, tackled the problem of tracking down illegal conversions. He and his team looked at records from the Department of Buildings, Housing Preservation and Development, tax records, and the NYPD. They looked at 911 calls, hospitalization data, and reports of rodent infestations. They looked at construction and renovation permits and at Buildings violations, because permits indicate careful and diligent homeowners while violations indicate issues. It took two years, but by 2011 they were ready. The methods were inexact, but the amount of data – on every residential property in New York City – compensated for the imperfections. The success rate in finding illegal conversions went from 13% to 70%.
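
A drastically simplified, hypothetical sketch of that approach: join signals from several agency datasets on a common property identifier and compute a crude risk score, with permits lowering the score and violations, 911 calls, rodent reports, and tax delinquency raising it. The identifiers, field names, and weights below are assumptions for illustration, not the city’s actual data or model.

    # Hypothetical per-property signals, keyed by a shared lot identifier.
    tax_delinquent  = {"lot-001": True,  "lot-002": False, "lot-003": True}
    calls_911       = {"lot-001": 4,     "lot-002": 0,     "lot-003": 2}
    rodent_reports  = {"lot-001": 1,     "lot-002": 0,     "lot-003": 3}
    open_violations = {"lot-001": 3,     "lot-002": 0,     "lot-003": 5}
    recent_permits  = {"lot-001": 0,     "lot-002": 2,     "lot-003": 0}

    def risk_score(lot: str) -> float:
        """Naive weighted score: complaints and violations raise risk;
        permits (a sign of a careful, diligent owner) lower it."""
        return (2.0 * tax_delinquent.get(lot, False)
                + 0.5 * calls_911.get(lot, 0)
                + 0.5 * rodent_reports.get(lot, 0)
                + 1.0 * open_violations.get(lot, 0)
                - 1.5 * recent_permits.get(lot, 0))

    # Inspect the highest-scoring properties first.
    for lot in sorted(tax_delinquent, key=risk_score, reverse=True):
        print(f"{lot}: score {risk_score(lot):.1f}")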

A House of Cards

In its early days, Netflix subscribers would submit the titles of films that they wanted to watch, and Netflix would mail a subscriber up to three (3) DVDs or Blu-Ray discs at a time. After watching the movies, subscribers would mail the discs back, and Netflix would then mail the next set of three movies.

Given the state of the art back then, we can assume that Netflix used a relational database like DB2, MS SQL Server, or Oracle, and that the database contained information about subscribers and about their stock of DVDs. The subscriber information was likely name, address, phone number, email, films watched, films out for watching, and films they wanted to watch. The information on movies may have been title, stars, director, genre, year made, number of copies on hand, and number of copies with subscribers. After a while they probably started mining the database to determine the most popular films and trends.
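
Since the paragraph above is already speculative, here is an equally speculative sketch of what those tables might look like, expressed with Python’s built-in sqlite3 module; the table and column names are assumptions drawn from the description above, not Netflix’s actual design.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE subscriber (
        subscriber_id INTEGER PRIMARY KEY,
        name TEXT, address TEXT, phone TEXT, email TEXT
    );
    CREATE TABLE film (
        film_id INTEGER PRIMARY KEY,
        title TEXT, stars TEXT, director TEXT, genre TEXT, year_made INTEGER,
        copies_on_hand INTEGER, copies_with_subscribers INTEGER
    );
    -- Which films a subscriber has watched, has out, or wants to watch.
    CREATE TABLE queue_entry (
        subscriber_id INTEGER REFERENCES subscriber(subscriber_id),
        film_id INTEGER REFERENCES film(film_id),
        status TEXT CHECK (status IN ('watched', 'out', 'wanted'))
    );
    """)

    # "Mining the database": the most requested titles.
    most_requested = conn.execute("""
        SELECT f.title, COUNT(*) AS requests
        FROM queue_entry q JOIN film f ON f.film_id = q.film_id
        WHERE q.status = 'wanted'
        GROUP BY f.title ORDER BY requests DESC LIMIT 10
    """).fetchall()
    print(most_requested)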

In 2007 Netflix started streaming with its “Watch Now” service. To optimize this service they needed to store multiple copies of each movie in different data centers near their subscribers. Today Netflix is estimated to store 1,100 to 1,200 replicas1 of each film, in different formats and resolutions, such as 4K and 1080p, in order to stream to different screens, from TVs to computers, tablets, and phones, across the world.

In 2011, Netflix took a big leap forward by leveraging information about the films and television shows subscribers watched to suggest what they might want to watch, asking questions like:

If Joe likes “Terminator,” and “Rambo,” would he like “Rocky”?

If Bob likes “Spy Game” and “Lara Croft” would he like “Mr. and Mrs. Smith”?

And how can we make money producing our own films?

Based on the correlated viewing habits of their subscribers, Netflix determined that ninety percent of the 12 million people who liked the film “The American President” and the television series “The West Wing” would watch one episode of “House of Cards” and, if it was compelling, most would watch additional episodes. On February 1, 2013, Netflix premiered “House of Cards.” Their analysis was correct.
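
The underlying idea is item-to-item correlation: subscribers who watched two titles disproportionately watch a third. Here is a minimal, hypothetical sketch of that co-occurrence logic; the titles and viewing histories are invented, and real recommenders are far more sophisticated.

    from collections import Counter

    # Hypothetical viewing histories, one set of titles per subscriber.
    histories = [
        {"The American President", "The West Wing", "House of Cards"},
        {"The American President", "The West Wing"},
        {"Terminator", "Rambo", "Rocky"},
        {"The West Wing", "House of Cards"},
    ]

    def recommend(seed, histories):
        """Rank other titles by how often the same subscribers watched them with the seed."""
        scores = Counter()
        for watched in histories:
            if seed in watched:
                for title in watched:
                    if title != seed:
                        scores[title] += 1
        return [title for title, _ in scores.most_common()]

    print(recommend("The West Wing", histories))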

Search, Retail, and Social Media

Unlike AltaVista, Ask Jeeves, Yahoo, and its other competitors, Google approached Internet search from a data scientist’s perspective. Google’s engineers reasoned that just as the value of a scientific paper is reflected in the number of other papers in which it is cited, the popularity of a website is reflected in the number of other websites that link to it. This worked; however, an unintended consequence paved the way for exaggerated, fictitious, or sensationalized stories designed to influence elections and political referenda. There are likely to be more searches for “Clinton and Monica” than “Clinton leads NATO against ethnic cleansing in Yugoslavia.”
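
A minimal sketch of that citation-style reasoning: rank pages by how many other pages link to them. This is a deliberate simplification (Google’s actual PageRank also weights each link by the rank of the page it comes from), and the page names below are invented.

    from collections import Counter

    # Hypothetical link graph: each page maps to the pages it links to.
    links = {
        "news-site.example": ["paper-a.example", "paper-b.example"],
        "blog.example":      ["paper-a.example"],
        "forum.example":     ["paper-a.example", "news-site.example"],
    }

    # Simplest ranking: count inbound links, like counting citations of a paper.
    inbound = Counter(target for targets in links.values() for target in targets)
    for page, count in inbound.most_common():
        print(f"{page}: {count} inbound links")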

Other examples of Big Data include Amazon using information on things people are looking for, purchasing, or streaming to recommend things they might want to buy or stream. Facebook and LinkedIn use information such as where users live, work, studied, and, of course, who they know, to identify other people they might know.

Correlating All These Data

While American Express probably evaluates terabytes2 of data each year, Cynthia Rudin at Columbia and Mike Flowers at the City of New York probably worked with much less data. However, they worked with multiple collections of data from different sources; with data that was not neatly organized in the tables of rows and columns of relational databases managed by DB2, Oracle, or SQL Server.

However, Rudin and her team didn’t do a conventional statistical analysis on a sample. They didn’t analyze 1% or 2% of the 51,000 manholes in Manhattan and 250,000 manholes in the five boroughs. They looked at the entire dataset. The sample size was 100%. In statistical terms, “N = all.”

Similarly, Flowers didn’t look at 1% or 2% of the roughly 4.4 million single-family and two-family homes3 in NYC. He and his team looked at data on 900,000 units. Their sample size was roughly 20.45%.

This is a fundamental change in technique, made possible by high-capacity, inexpensive disk drives; high-speed, low-cost computer memory; and graphics processing units, GPUs, which are designed for the simultaneous, or parallel, execution of simple processes on large amounts of data.

As Viktor Mayer-Schonberger and Kenneth Cukier hammer home, science has been built by looking for causation, using statistical samples of events to understand how and why they happen. Statistics are built on a representative sample. But things are different when your sample size is 100%. At that point you can look for correlations.

That’s what Cynthia Rudin and Mike Flowers did. They didn’t look for causation. They looked for correlation. And they found it. Netflix had a high degree of confidence that 11 million people would watch one or more episodes. Amazon and Apple realized that people who bought songs or CDs by certain artists would buy songs or stream music by other artists performing in the same genre. American Express looks at purchases that make sense, and looks out for transactions that are out of sync. You cannot use your credit card in stores in Houston or San Francisco when you are in L.A. Correlation, not causation.

NOTES:

  1. Netflix Architecture. https://www.geeksforgeeks.org/system-design-netflix-a-complete-architecture/ ↩︎
  2. A “Petabyte” is one thousand terabytes, one million gigabytes, one billion megabytes, or one trillion kilobytes. For reference, one page of single-spaced typed text may be four kilobytes, and an image taken with a two- to four-megapixel digital camera, from 2010, will be roughly one megabyte. A petabyte, therefore, could hold roughly one billion such images or 250 billion pages of text. ↩︎
  3. New York Housing Statistics, Info Please, viewed Sept 15, 2023. https://www.infoplease.com/us/census/new-york/housing-statistics ↩︎

Big Data: AI Solutions to Intractable Problems (Part I)

AI has become an overnight sensation. But like other overnight sensations, it has taken years to get there: 73 years since 1950, when Alan Turing described the Imitation Game, better known as the Turing Test:

“A computer would be said to be intelligent if and when a human evaluator, after a natural language conversation with the computer, would not be able to tell whether he or she was talking to another person or a machine.”

Microsoft has invested billions of dollars in OpenAI and its ChatGPT Large Language Model, LLM. Microsoft is embedding ChatGPT into its search engine and building and releasing “Copilots” for Word, Excel, and other software. Google announced “Bard,” its own LLM. Abnormal Security, CrowdStrike, Egress, Riskified, and others are building AI into cybersecurity tools. Apple announced that it is embedding AI within the iPhone, iPad, and Mac. NVIDIA, which makes the chips and servers used in AI, was briefly the world’s largest company by market capitalization, leapfrogging above Microsoft and Apple, only to fall back to $2.71 Trillion.

Today the relevant question is not, “Are these systems ‘intelligent’?” For the C-Suite, the question is, “What problems can we use AI to solve?” And for project managers the question is, “How do we plan and execute projects to leverage AI?”

Just as Project Managers in Information Technology have to understand networks, virtualization, and “The Cloud” in order to manage projects implementing or leveraging those technologies, and need to understand Waterfall and Agile for software engineering and infrastructure projects, we need to understand AI and Big Data in order to build and incorporate AI tools and manage Big Data projects. We don’t need to know the technical differences between Central Processing Units1, CPUs, like the Intel Xeon and AMD EPYC, and Graphics Processing Units2, GPUs, like the Nvidia RTX, but we do need to know that GPUs are used in Big Data and Machine Learning systems because they are designed for parallel processing: the simultaneous execution of large numbers of simple tasks, such as rendering bitmaps or comparing data points, such as representations of faces in an image-recognition system.
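
To illustrate that data-parallel style of work without requiring a GPU, here is a small Python/NumPy sketch: the same elementwise comparison written as a sequential loop and as a single vectorized operation over the whole array. GPU frameworks apply the same “simple task repeated over many data points” pattern across thousands of cores; the scores and threshold below are invented.

    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical similarity scores between one probe face and a million stored faces.
    scores = rng.random(1_000_000)
    threshold = 0.999

    # Sequential style: examine one candidate at a time, as a single CPU core would.
    matches_loop = [i for i, s in enumerate(scores) if s > threshold]

    # Data-parallel style: one comparison applied to every element at once.
    matches_vec = np.nonzero(scores > threshold)[0]

    assert list(matches_vec) == matches_loop
    print(f"{len(matches_vec)} candidate matches above {threshold}")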

The team members need to understand how to think analytically. They need to be creative, disciplined, and flexible enough to change their focus from causation to correlation, and to recognize patterns. It helps to have an understanding of statistics and the scientific method.

The simplest correlation works beautifully. Suppose you buy a ticket to fly from New York to LA. Two hours before the flight, you pay for a taxi and then, a few minutes later, baggage fees. Subsequently, you buy a book or magazine, a sandwich, and coffee in the airport. Hours later, after the flight, you pay for a taxi in LA and check into your hotel. The credit card company has a high degree of confidence that these transactions are legitimate. It knows you were traveling from New York to LA, knows you buy stuff in airports and use taxis to travel to and from airports. But if during the flight, or an hour after the plane lands in California, your card is used to attempt to buy something else on the east coast, or in any location far from LA, the credit card company will have a high degree of confidence that those charges are fraudulent. It will deny them or contact you.
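
A toy sketch of that reasoning: given a timestamped, geolocated stream of card charges, flag any charge that would require implausibly fast travel from the previous one. Real issuers use far richer AI models, as the American Express example below shows; the coordinates, speed limit, and transactions here are invented.

    from math import radians, sin, cos, asin, sqrt

    def distance_miles(a, b):
        """Great-circle distance between two (lat, lon) points, in miles."""
        lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
        h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 3959 * 2 * asin(sqrt(h))

    MAX_MPH = 600  # roughly the speed of a commercial flight

    def flag_implausible(transactions):
        """transactions: list of (hour, (lat, lon)) sorted by time.
        Yields the index of any charge implying faster-than-flight travel."""
        for i in range(1, len(transactions)):
            (t0, loc0), (t1, loc1) = transactions[i - 1], transactions[i]
            hours = max(t1 - t0, 1e-6)
            if distance_miles(loc0, loc1) / hours > MAX_MPH:
                yield i

    txns = [
        (0.0, (40.64, -73.78)),   # taxi and baggage fees near JFK
        (1.0, (40.64, -73.78)),   # coffee and a book at the airport
        (8.0, (33.94, -118.41)),  # taxi at LAX after the flight
        (8.5, (40.75, -73.99)),   # a charge back on the east coast 30 minutes later
    ]
    print(list(flag_implausible(txns)))  # -> [3], the east-coast charge is flagged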

American Express customers currently use their 114 million cards to purchase $1.2 Trillion of goods and services annually in 105 countries. That’s a tremendous amount of money, a staggering number of transactions, and an unbelievable number of opportunities for fraud. Amex was an early adopter of Big Data. In 2010, they began consolidating the knowledge gained in 150 different fraud detection models into one global model. They rolled this out in 2014 and made it their only model in 2015. Today, American Express’ fraud detection model delivers the best fraud detection and prevention statistics in the industry.

According to Anjali Dewan, VP of Risk Management at Amex:

“100% of our [fraud detection] models are AI-powered…. American Express for 13 years has come out as the lowest in the fraud space, and not by a little, but by half.” 3

Put another way, American Express has an army of virtual fraud investigators working around the globe and around the clock.

In “Big Data4,” Viktor Mayer-Schonberger and Kenneth Cukier explain how Consolidated Edison used “Big Data” to identify manholes likely to catch fire and explode, sending 300-pound manhole covers into the sky, and how the City of New York wrestled with the fact that firefighters are 15 times more likely to die fighting fires in illegal residential conversions. The correlation was obvious. Prevent fires in illegal conversions; save firefighters’ lives. In “Big Data @ Work5,” Thomas Davenport describes how Netflix used “Big Data” to know that “House of Cards” would be successful.

Davenport, Mayer-Schonberger, and Cukier describe the characteristics of “Data Scientists” – scientists who can code, hackers who understand the scientific method, who know how to develop and test a hypothesis, understand statistics, think independently, challenge assumptions, and find correlations, some of which may be non-obvious.

They distinguish “Big Data” from data warehouses and other decision support systems. It’s not simply the volume of data but, more importantly, the fact that data warehouses are internally derived and highly structured databases. Where data warehouses may be hundreds of gigabytes or terabytes managed in a relational database, a “big data” system may look at wildly unstructured data, multiply structured data, or complex data, some of which may come from external sources.

For example, telephone call records for a wireless carrier, transactions in a credit card processing system, or multiple years’ accounting data would be contained in internal databases and managed by a relational database engine, an RDBMS. All the records are similar and highly structured: tuples6 of data with attributes like name, account number, phone number, and address. SQL, Structured Query Language, which grew out of the relational model Edgar F. Codd described at IBM in 19707, is used to query these databases, whether managed by IBM DB2, Microsoft SQL Server, Oracle, or another database engine. These are internally managed proprietary databases.

Big Data, on the other hand, is unstructured or partially structured and may contain data that is external to the enterprise. Medical data, for example, will contain patient data, which can be managed within relational models, but will also contain diagnostic information, in the form of images, videos, hand-written notes, and the results of various tests, and these data may come from various sources.
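
As a small illustration of that contrast, here is a hypothetical example in Python: a fixed, uniform patient row of the kind a relational table holds, next to a nested, uneven record that mixes structured fields, free text, and references to images from several sources. All field names and values are invented.

    # Relational-style patient row: fixed, uniform attributes that fit an RDBMS table.
    patient_row = ("P-1047", "Jane Doe", "1980-05-02", "555-0100")

    # Semi-structured "big data" record for the same patient: nested, uneven,
    # and drawn from several sources (imaging, labs, clinicians' notes).
    patient_record = {
        "patient_id": "P-1047",
        "imaging": [{"type": "MRI", "file": "mri_2024_03_11.dcm", "source": "radiology"}],
        "labs": {"a1c": 5.9, "lipid_panel": {"ldl": 130, "hdl": 55}},
        "notes": [
            "Patient reports intermittent headaches.",
            "Handwritten intake form scanned; see attachment.",
        ],
    }

    # The same question ("what do we know about P-1047?") is a one-row lookup in the
    # relational world, but a traversal of nested, varied structures here.
    print(patient_row[1], "-", len(patient_record["notes"]), "clinical notes on file")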

Notes:

  1. CPUs are designed with a small number of cores, each with a relatively large amount of cache RAM, in order to run multiple applications, each with varying amounts of data, such as executing applications on a workstation or operating systems on a virtualization host. ↩︎
  2. GPUs are designed with a large number of cores, each with a relatively small amount of cache RAM, in order to simultaneously and repetitively execute the same process on a large number of data points. ↩︎
  3. Machine Learning Helps Payment Services Detect Fraud, AmericanExpress.com; How Amex Uses AI To Automate 8 Billion Risk Decisions (And Achieve 50% Less Fraud), John Koetsier, Forbes.com, Sept 21, 2020. ↩︎
  4. Mayer-Schonberger, Viktor, and Kenneth Cukier, Big Data, © 2013 ↩︎
  5. Davenport, Thomas, Big Data @ Work, © 2014, Harvard Business School Publishing Corp. ↩︎
  6. In database theory, a tuple is a single row in a table or index of a relational database. The term comes from mathematics, where a tuple is an ordered list of mathematical objects, e.g., integers in a relation, such as coordinates in a Cartesian plane. ↩︎
  7. IBM Archives, “Edgar F. Codd,” https://www.ibm.com/ibm/history/exhibits/builders/builders_codd.html ↩︎