CHAPTER 4
Core AI Techniques
How does one even start to capture the processes that lead to what we recognize as “intelligent” behavior? More appropriately for our definition of AI, how does one begin to program machines so that they can make decisions instead of us?
As we already discussed in Chapter 3, there are two broad approaches. We can either develop explicit models that govern the behavior of our system or we can attempt to discover those models by analyzing data and looking for patterns. In this chapter we provide a very high-level overview of what the main techniques are. Understanding the thinking behind these core techniques enables you to reason about the pros and cons of different approaches and the needs they might place on your products and teams.
Artificial Intelligence is a fast-paced field and it feels like there are “new” things every week. This can make someone think that it is pointless trying to “catch up”; that it is best to leave everything to the experts. While it may be true that there are constant developments in the field, and setting yourself a goal of being always up to date is a losing battle, it is equally true that the core concepts have been around for decades. Whether it is semantic knowledge modeling or artificial neural networks, the ideas have been around since the 1960s. It is the specific algorithms and approaches that have evolved in the meantime. Having a clear understanding of the overarching concepts is what will provide the most value in the long term and will allow you to reason about the appropriate strategic direction to take.
Model-Driven Techniques
Model-driven techniques are a celebration of human ingenuity. It is us looking at the world and inside our brains, identifying the core pieces that are important and connecting them in a way that allows us to make predictions. Because they are the result of our own mind elaborating on concepts, we can also fully understand them and explain them to others, something that is incredibly valuable. When a model-driven technique is correct, it is often the most efficient path through a problem, as it captures just what it requires and allows us to build on solid foundations. Think of the core rules of thermodynamics, chemistry’s periodic table, or the three laws of biology: incredibly powerful statements that govern large swaths of how nature behaves.
The model-driven techniques we review in the following have been actively used in applications for decades now. They are the techniques that have allowed companies to optimize the coordination of their transport fleets, led to better search results on the Web, improved manufacturing processes, helped cure more people, and so much more. They are not the techniques that people typically refer to when talking about this current moment of AI renaissance, but they are nevertheless crucial in building sophisticated applications that can efficiently and robustly make decisions.
In looking at model-driven techniques, we will examine three core aspects: how to represent information, how to reason over it, and finally how to plan.
Knowledge Representation
The goal of knowledge representation is to provide us with tools and techniques to describe data in a way that allows us to more easily manipulate it with computers. That refers both to single items (e.g., how to describe a stand-alone document) as well as the relationship between items (e.g., how does a specific document relate to a project).
No matter what organizational context you operate in, you undoubtedly produce documents about meetings, proposals for clients, project reports, and so on. Typically, you will have some sort of document management system where all this data gets stored. Maybe it is as simple as a shared hard drive on a local network or a shared Dropbox or Google Drive environment. Now, imagine that all the documents were dropped in the same folder, called “OUR STUFF” and were all named in inconsistent ways:
The AI-Powered Workplace 43
– OUR STUFF
  • sales-rep.doc
  • client-abc-preso.pdf
  • my-cool-thoughts-on-stuff.pdf
  • projections.xls
  • …
As the documents grow to hundreds and then thousands, a significant amount of organizational effort would be going into simply searching through this unmanageable folder to find something.
Hopefully, in your organization things look a bit more like this:
– Documents
  • Sales
    • Client ABC
      • Presentations
      • Offers
      • Final Contract
  • Project Work
    • Client ABC
      • Project Reports
      • Deliverables
Simply by managing folder structure and imposing some rules around where documents should go, you have introduced knowledge representation to your team. This simple hierarchical structure makes it easier for people and machines to find information.
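Even this simple hierarchy is something software can exploit. As a minimal sketch (the folder names below mirror the hypothetical example above), a nested dictionary can represent the tree, and a few lines of Python can locate any item by walking it:

```python
def find_path(tree, name, path=()):
    """Depth-first search for `name` in a nested-dict folder tree.

    Returns the list of folder names leading to it, or None if absent.
    """
    for key, subtree in tree.items():
        if key == name:
            return list(path) + [key]
        if isinstance(subtree, dict):
            found = find_path(subtree, name, path + (key,))
            if found:
                return found
    return None

documents = {
    "Documents": {
        "Sales": {"Client ABC": {"Presentations": {}, "Offers": {}, "Final Contract": {}}},
        "Project Work": {"Client ABC": {"Project Reports": {}, "Deliverables": {}}},
    }
}

print(find_path(documents, "Final Contract"))
# ['Documents', 'Sales', 'Client ABC', 'Final Contract']
```

The structure itself is the knowledge: because the hierarchy is explicit, the answer to “where do contracts live?” can be computed rather than remembered.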
■ In general, knowledge representation is the effort to identify an appropriate model to capture what we know about the world, together with means to manipulate that model in order to infer new things.
44 Chapter 4 | Core AI Techniques
Let us consider another simple example. Suppose you are the HR department of a very large organization and you receive hundreds of CVs daily from people with all sorts of skills. You would like to be able to automatically categorize those CVs based on the skills that people mention, so that you can contact the appropriate subject matter experts within the organization who would need to evaluate them. You have five high-level groups: front-end engineers (people who specialize in building the user interfaces and visual aspect of digital tools), back-end engineers (people who specialize in data management, algorithms, and systems integration), project managers, quality assurance and testing, and site reliability engineers (the ones who make sure all systems run smoothly).
However, you have a challenge. The terms people use in their CVs to describe these skills keep changing and, especially, the technologies that are related to these skills keep evolving. New programming languages, frameworks, and so on are constantly being introduced. How can you automate the process of sorting through CVs in an appropriate fashion?
You get together with your team and decide that you are going to build a tool that will capture terms that are used to describe these skills and relate them to your high-level groups. The subject matter experts across the organization will be able to use the tool to enter keywords that they are interested in and then software will refer to that “knowledge” to sort through CVs. Following is an example of the type of information captured:
– Front-end engineer
  • Core Skills
    • JavaScript
    • HTML
    • CSS
  • Frameworks
    • React
    • Vue.js
    • Node.js
  • General Skills
    • Version Control
    • Testing / Debugging
Congratulations, you have just built a rudimentary knowledge graph1 or ontology! An ontology is a more structured description of the elements that make up a certain domain, together with relationships that connect those elements together and a way to reason about the implications of certain connections.
Ontologies capture our understanding of the world and they can come in many different forms, from simple thesauri to hierarchical taxonomies like the preceding example to far more sophisticated networks of interconnected entities. Knowledge representation and knowledge reasoning techniques focus on formalizing the way we build and describe things such as ontologies, so that we can capture increasingly sophisticated types of information in a way that remains tractable for machines to reason over. The aim is to enable us, given a set of facts, to infer additional knowledge. If I know it is furry, and it has four legs, and it makes a purring sound, can I assume it is a cat? A good ontology should be able to tell you that there is actually a range of animals (very likely all feline) that would fit that description, so you cannot simply assume it is a cat.
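To make the CV-sorting idea concrete, here is a minimal sketch of how such a taxonomy could drive the matching. The keyword sets are invented for illustration and this is plain keyword lookup, not a real ontology language (production systems would use something like RDF/OWL and proper reasoners):

```python
# Hypothetical taxonomy: high-level group -> keywords entered by subject matter experts.
SKILL_TAXONOMY = {
    "front-end engineer": {"javascript", "html", "css", "react", "vue.js"},
    "back-end engineer": {"sql", "databases", "apis", "microservices"},
    "site reliability engineer": {"monitoring", "kubernetes", "on-call"},
}

def categorize_cv(text):
    """Return the high-level groups whose keywords appear in the CV text."""
    words = set(text.lower().split())
    return sorted(
        group for group, keywords in SKILL_TAXONOMY.items()
        if words & keywords  # at least one keyword matches
    )

print(categorize_cv("CV: 6 years JavaScript React CSS"))  # ['front-end engineer']
```

The crucial point is that when a new framework appears, the experts update the taxonomy rather than the code.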
We now have a rich and sophisticated toolset to work with ontologies at scale, and applications can be found across a variety of fields from medicine to e-commerce. Ontologies can be created and populated “manually” by subject matter experts, but they can also be created automatically using data-driven techniques to identify and extract the relevant entities and relationships and can then be further curated by experts.
Ultimately, any sufficiently complex automated system—whether through formal means or through informal, ad hoc implementations—will end up representing knowledge in a way that machines can manipulate. As such, knowledge representation and management becomes a core technique for most applications.
Logic
Logic is at the heart of everything we do with computers. The building blocks of the processors at the core of our machines are logic gates that combine in a variety of ways to give rise to the complex behavior we need. When we program, we typically use predicate logic to define what should happen. Consider the following one-line statement:
1 In recent years, especially after Google launched what it calls its Knowledge Graph, people started referring to various forms of ontologies as knowledge graphs. The Google Knowledge Graph is what powers the answers you get to the side of Google search results in a standout box. Following an analysis of your query, if they are able to pinpoint a specific reply to your question (as opposed to a search result) they will provide that. Since then, the term knowledge graph has been creeping into literature, and you might find ontology mentioned in more formal settings and knowledge graphs elsewhere. Ultimately, it all points to the same end result: a structured representation of information that machines can reason over.
“If temperature reading is over 25 degrees Celsius, switch off heating.”
This is a simple program using predicate logic to determine how to deal with heating in a certain environment. Now, imagine that in order to switch off something, say a component in a nuclear plant system, you would have to consider hundreds of statements (or propositions) that need to be satisfied and not just a single one. Further, suppose that those statements are not simple yes/no answers, that they in turn can kick off other processes, and that the order in which things take place is also important. How can you systematically step through the reasoning process and come to an outcome that allows you to take an action? This is what logic systems are all about.
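In code, the one-line rule is a single conditional; the many-proposition version becomes a set of predicates that must all hold before the action is taken. A minimal sketch (the sensor names are hypothetical):

```python
def should_switch_off(readings):
    """Act only when every proposition about the system holds."""
    rules = [
        lambda r: r["temperature_c"] > 25,    # the original one-line rule
        lambda r: r["coolant_pressure_ok"],   # plus further propositions
        lambda r: not r["maintenance_mode"],
    ]
    return all(rule(readings) for rule in rules)

print(should_switch_off(
    {"temperature_c": 30, "coolant_pressure_ok": True, "maintenance_mode": False}
))  # True
```

Real logic systems go much further than a conjunction of checks: they handle ordering, chained consequences, and proofs about why a conclusion follows. But the shape, deriving an action from a set of propositions, is the same.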
■ Logics, in the broadest sense, concern themselves with providing appropriate formal structures to reason about different situations in the world.
There are different forms of logic that tackle different aspects of reasoning about the world. For example, epistemic logic tries to tackle the problem of what is known, and especially what is known among a group of agents that are sharing statements and beliefs. Temporal logic helps us reason about temporal events and the nature of time. It is the type of logic a machine will need to be able to reason about statements such as “My alarm will ring for 30 seconds unless I stop it earlier and will start again after 5 minutes if I press the snooze button.” Deontic logic tackles the problems of what is appropriate, expected, and permitted. There are numerous other formal logic systems and combinations of them trying to capture and codify all the different aspects of life and what we as humans so effortlessly handle every day.
Logics will play a big part of providing the types of behaviors we will come to expect from more self-directed or autonomous software. Consider the following example. A user visits the web site of a car manufacturer and engages an automated conversational agent (a chatbot) to determine what sort of car they should purchase. The user, prompted by the conversational agent, might provide information along the lines of: they like to go outdoors, they have a large family, they have a pet, and so on. The conversational agent in return starts offering some options of possible cars. There’s nothing particularly strange so far. The differentiation, however, comes when the prospective client rejects a proposed choice. Our logic-powered agent is able to ask why that choice was rejected. For example, the user might say: “Because, I don’t think it will be able to fit all our equipment for a trip to the mountains.” A simple agent would not be able to counter that argument, and would simply move on. An agent that uses logic, though, might be able to offer a counterargument. Something like: “Well, did you consider that you can fold down the back seats or add a roof rack and carry large equipment that way?” For an automated agent to achieve this, it needs knowledge of how a car behaves, coupled with logic that will describe the effects of actions. This way it can deduce that folding seats creates more space, which is a valid counterargument to present to the user. Software that is able to offer facts and counterarguments can become a much more active assistant for us, not only helping us complete a task but also offering choices on how to complete the task.
Logics have another key role to play. As automation takes over more aspects of our lives, we will have to be able to offer more concrete assurances that automated systems will behave in certain ways and that we can trust the decisions they make. Logic and model-based reasoning, in general, will play a large part in helping us ensure that systems are safe and that they can be trusted.
Planning
With knowledge representation we can describe our world and, using logics, we can reason about it. That is great, but how do we set in motion actions that will help us achieve a goal? This is where planning comes into play.
An easy way to conceptualize planning is to think of an actual robot trying to work out how to solve a problem. Imagine a robot that has a number of different capabilities such as moving backward and forward (or even sideways), jumping, going up stairs, picking things up and moving them around, and so on. These are the actions it can perform to change its state in the world, using sensors to understand what state the world is in.
Now, imagine that the robot is told that it has to move a chair from one point in a building to another. This represents its goal, the desirable state of affairs. It can’t just start aimlessly doing things until the chair is where it is supposed to be (which would be quite entertaining but not very useful). Instead, the robot needs to formulate a plan, potentially with intermediate goals and the actions that will achieve those intermediate goals until it completes the final goal. A plan would look something like this:
1. Locate chair
2. Move to chair
3. Pick up chair
4. Move chair to desired location
5. Place chair in desired location
Planning software would have to be able to come up with this plan, monitor its progress, and replan as things change, such as someone moving the chair from its original location or something getting in the way of reaching a destination.
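A minimal sketch of such a planner: states are (robot location, chair location, holding-the-chair) tuples, actions are functions that transform a state or return None when inapplicable, and breadth-first search finds the shortest action sequence. Real planners use far richer action languages (STRIPS, PDDL) and cleverer search, but the shape is the same:

```python
from collections import deque

# Toy domain: rooms are "door", "A", "B"; a state is (robot, chair, holding).
ACTIONS = {
    "move to chair": lambda s: (s[1], s[1], False) if not s[2] and s[0] != s[1] else None,
    "pick up chair": lambda s: (s[0], s[1], True) if s[0] == s[1] and not s[2] else None,
    "move to goal room": lambda s: ("B", "B" if s[2] else s[1], s[2]) if s[0] != "B" else None,
    "place chair": lambda s: (s[0], s[1], False) if s[2] else None,
}

def plan(start, goal):
    """Breadth-first search over states; returns the shortest list of actions."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if state == goal:
            return steps
        for name, act in ACTIONS.items():
            nxt = act(state)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, steps + [name]))
    return None

# Robot starts at the door, chair is in room A; goal: chair placed in room B.
print(plan(("door", "A", False), ("B", "B", False)))
# ['move to chair', 'pick up chair', 'move to goal room', 'place chair']
```

Replanning falls out naturally: if someone moves the chair, you simply call `plan` again from the newly observed state.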
■ Planning is the process of identifying what actions, and sequences of actions, will enable automated software to achieve a specific goal given its current context.
Anyone who has ever had the “joy” of having to schedule work, or plan activities or a course of action, is painfully aware of what a daunting task this can be as the number of activities increases, interdependencies emerge, and you are constantly having to replan. Planning techniques allow teams to handle large pieces of work and ensure adherence to constraints across thousands of individual items with complex constraint reasoning and hundreds or thousands of rules. Commercial software using these techniques plays a key role in building bridges, launching spacecraft, and producing airplanes.
Data-Driven Techniques
Discussions about data and what it can enable occupy the overwhelming majority of thinking in the AI techniques space. Visionary statements abound of how every action can be measured, stored, and then used to predict our desires, needs, and intentions and influence our next action. Sometimes the use of data feels almost child-like in its naïve simplicity. You liked a picture of a friend riding a bicycle? Here are ads so that you can buy a bicycle yourself! Visited a site that sells shoes? We shall “retarget” you and inundate you with ads from that very same site for the next few weeks (quite often even if you actually already bought those shoes from that very same site!). Have you walked enough steps today? If not, we might need to give you some gentle “nudges” tomorrow so you can catch up.
It’s easy to be cynical about the data age (especially if, like me, you tend to be cynical about most things!). However, it is important to not underestimate just how important data-driven techniques are for us now and in the future. While model-driven techniques can give us certainty and safety and demonstrate how human intuition can cut through the noise and focus on just what is really important with fundamental rules, data-driven techniques release us from the limitations of what our own mind can discover and give us the superpower of being able to create something without having had prior knowledge of how to create it. We build machines that explore and create for us. Perhaps within one of these machines there will eventually be large portions of the answers we so desperately need to fix our climate and heal our bodies.
In considering data-driven techniques I decided to avoid going through a long list of all the various architectures and approaches that are, anyway, in constant evolution. The Web is awash with information, and if one wants to delve more deeply, they can easily find a lot of great examples. Instead, what I want to highlight are the three core approaches and their relative differences.
We will look at how we can discover models through machine learning using supervised, unsupervised, and reinforcement learning. The one exception to this rule is a brief look at artificial neural networks and deep learning. Strictly speaking, they could be categorized under supervised or unsupervised learning but since they are so often referred to, it is worth addressing them directly.
Supervised Learning
The bulk of applied machine learning is currently focused on supervised learning techniques. Supervised learning attempts to build a model using data that is already labeled. The “supervision” consists of referring to those labels in order to indicate to an algorithm whether its prediction was correct or not. Through an iterative process during which the algorithm adjusts its decision-making process, you hope to arrive at a final model that will act as a reliable predictor.
■ Supervised learning refers to techniques wherein algorithms use annotated data (i.e., data with the “correct” desired answer already provided). In training, the responses are supervised, and the algorithm is informed on whether it got the right answer. This information is used to adjust the model.
Let’s work through an example to highlight the key phases of a supervised learning process. Assume that you are tasked with the problem of renewing the company document store. After several mergers, software upgrades, and personnel changes your document store is in a mess. You know there is valuable historical data there, but you cannot sort through documents appropriately. You decide that a first useful step would be to classify those documents along broad categories that would make sense for everyone (e.g., sales documents, project reports, team evaluations).
The phases you typically would need to go through in a supervised learning process are
• Gather and prepare data.
• Choose an appropriate machine learning algorithm and fine-tune it.
• Use the resulting model from the previous phase to predict.
Let’s consider each phase in turn.
Gathering and Preparing Data
You’ve kicked up enough dust and leaned on enough people to get all the documents in a single place. Never underestimate just how complicated it can be to simply get to the data. Departmental processes, internal politics, fear of regulation, and so many other factors can easily spell an end to your automation dreams before you even get started. It is always worth carefully planning for this phase before committing other resources. There is nothing quite as inefficient as having a highly skilled machine learning specialist or data scientist sitting around while you need to have yet another meeting to determine who you need to talk with to get access to data. You are one of the lucky ones, however. Your data is all there. All the documents are ready to be classified.
In a supervised learning scenario you need to select a part of your data and annotate it appropriately so that you can use it in training. You need to identify key features (title, summary, author, department, date, word frequency,2 etc.) that can help determine the type of document, and then you classify your data with the correct answer or target variable. Please note that each one of these decisions carries with it a complex set of considerations. Have you selected an appropriately representative subset of data? If not, then your model is not going to behave correctly with the entire dataset. You may have introduced a number of different biases, since your model is going to favor data similar to the one that was used to train it. Assuming the data is representative, have you selected an appropriate set of features to focus on? Choice of a wrong feature can once again lead to unwanted bias. The machine learning community as a whole is developing best practices in order to guard against some of these issues, but there is no fail-safe approach. It requires patience and experience, and you need to truly embrace failure as learning—you are exploring an unknown world and using software to help you craft a model that makes sense of it. Like any explorer and scientist, you need to embrace the inherent risk that comes with it. Of course, the payoff at the end of the process is huge. Having successfully automated a hard task, you give your organization a marked competitive advantage.
2 Please note that I am simplifying here considerably. Typically, for text classification word frequencies are the key feature, and the way you represent these frequencies (as mathematical vectors rather than actual text or sums) is quite sophisticated and is a field of study within natural language processing in its own right.
Choosing a Machine Learning Algorithm, Training, and Fine-Tuning
With data in hand, your next task is to determine what machine learning algorithm you should use to help you build a prediction engine. As we’ve already mentioned, there is a wide range of choices coming from mathematics, statistics, or computer science, and new approaches and architectures are invented at a breathtaking pace. It is the task of the machine learning expert to identify what would be the most fruitful or promising approach given your specific problem and type of data. Once more, you need to keep in mind that there is no simple answer or simple set of steps to arrive at an answer. Choosing and refining a machine learning architecture is a process of experimentation. For our text classification problem, solutions can range from something as “simple” as a Naive Bayes classifier to complex convolutional neural network architectures or to something novel that is created exclusively for your dataset. A good rule of thumb is to try the simpler approaches first and only move up in terms of complexity if you are convinced that the additional effort is justified given the potential benefit gains. The typical question is whether a hypothetical additional 2% boost in performance will justify the cost it will take to get there.
Training is the process of feeding the algorithm data, allowing it to adjust its various “weights” as it searches for a “combination” that will enable it to provide correct predictions. Annotated data is typically split between a training set (i.e., what will actually be used to develop a model) and a test set, which will be used to validate the model. Even here you can see how important it is to properly distribute your annotated data between the training set and the test set so that they are each a good representation of the mix of data that your model needs to be able to handle.
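The steps above can be sketched in miniature. This toy Naive Bayes text classifier (word counts with Laplace smoothing) shows the train/test split and the training phase; the documents and labels are invented for illustration, and a real project would reach for a library such as scikit-learn rather than hand-rolling this:

```python
import math
import random
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (text, label). Returns class counts, per-class word counts, vocabulary."""
    priors = Counter(label for _, label in docs)
    words_per_class = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        words_per_class[label].update(tokens)
        vocab.update(tokens)
    return priors, words_per_class, vocab

def predict(text, priors, words_per_class, vocab):
    """Pick the label with the highest (log) probability given the text."""
    total = sum(priors.values())
    best_label, best_score = None, float("-inf")
    for label, prior in priors.items():
        score = math.log(prior / total)
        denom = sum(words_per_class[label].values()) + len(vocab)  # Laplace smoothing
        for token in text.lower().split():
            score += math.log((words_per_class[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented annotated documents.
docs = [
    ("quarterly sales revenue targets", "sales"),
    ("client sales pipeline forecast", "sales"),
    ("project milestone deliverable report", "project report"),
    ("weekly project status report", "project report"),
]
random.Random(0).shuffle(docs)            # shuffle before splitting
train_set, test_set = docs[:3], docs[3:]  # hold out data to validate on

priors, wpc, vocab = train(train_set)
print(predict("sales revenue forecast", priors, wpc, vocab))  # sales
```

Even in this toy, the split matters: the held-out document is the only honest evidence that the model generalizes beyond what it memorized.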
Fine-tuning the model (or parameter tuning as it is often called) is the process of adjusting factors that affect how the machine learning algorithm behaves, to identify what that might do to the results. You might change how many times you pass the training data through, or how significant a wrong prediction at any given point is considered and how much that should impact the change of parameters. Once more, we need to accept that this is a process of experimentation and you need to constantly be reassessing how much more effort it is worth investing in the overall process.
Predicting
Finally, with a working model now discovered through machine learning, we are ready to deploy it and do prediction on completely new data. There are two types of prediction that machine learning models tend to do. On the one hand, you have classification tasks, such as what we would need in order to classify our documents. The input document would be assigned a category based on what the model believes the document is discussing. On the other hand, you have what are termed regression tasks. A typical regression task is to predict the value of a specific item given some input characteristics, such as to predict the value of a house given its location, size, configuration, what other features it has, and so on.
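A regression model in its simplest form is a line fitted to past examples. A minimal sketch with a single input feature follows; the numbers are made up, and real house-price models use many features and more robust estimators:

```python
def fit_line(sizes, prices):
    """Ordinary least squares for one feature: price = slope * size + intercept."""
    n = len(sizes)
    mean_x, mean_y = sum(sizes) / n, sum(prices) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) \
        / sum((x - mean_x) ** 2 for x in sizes)
    return slope, mean_y - slope * mean_x

# Floor area in square meters -> sale price in thousands (invented data).
slope, intercept = fit_line([50, 70, 90, 110], [150, 210, 270, 330])
print(slope * 80 + intercept)  # 240.0, predicted price for an unseen 80 m² house
```

Classification returns a category; regression, as here, returns a number on a continuous scale. Everything else about the supervised workflow (splitting, training, validating) stays the same.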
Unsupervised Learning
What happens if you don’t have any labels for your data? Well, to start with there are some things that you will simply not be able to teach an algorithm. You can’t teach something what a cat is without actually showing it a cat. Having said that, there is a lot that algorithms can do to uncover potential correlations or groupings in our data that can teach us something.
■ Unsupervised learning analyzes data to uncover possible groupings or associations, without the need of any annotated data.
A typical application is to use it to segment or cluster datasets in closely related groups. Unsupervised learning can, for example, be applied to cus- tomer purchase data to identify if your customer base can be split into groups that can provide you with some insight about that group. Something along the lines of “customers who purchase product A tend to purchase product C as well” or “customers who purchase product D all tend to come from a specific geographical area.”
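A minimal k-means sketch illustrates the idea. Each point is a customer described by two invented features (say, orders per year and average basket value); the algorithm alternates between assigning points to the nearest centroid and recomputing centroids, with no labels anywhere in sight:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on tuples of numbers; returns final centroids and clusters."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            nearest = min(range(k), key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster.
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# (orders per year, average basket value) — invented customer data.
customers = [(2, 15), (3, 18), (2, 20), (24, 90), (26, 85), (25, 95)]
centroids, clusters = kmeans(customers, k=2)
```

On this data the two centroids end up near the “occasional, small-basket” group and the “frequent, big-basket” group; it is then up to a human to decide what, if anything, those segments mean for the business.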
Unsupervised learning is also, at times, used in combination with supervised learning. Under appropriate conditions, a model can be generated using training data that is a mixture of both labeled and unlabeled data. Simplistically, you can consider unsupervised learning to be doing some of the potential classification for us and that is then mixed with supervised learning. Although, in general, this should be considered a relatively risky and unreliable strategy, there is a promising growing body of research about it. In the coming years unsupervised learning will start playing an increasingly important role, as machine learning engineers are constantly faced with the problem of having a lot of data in general but not enough labeled data.
Reinforcement Learning
Finally, we come to reinforcement learning, the fun cousin in this trio of machine learning approaches. Reinforcement learning is the closest to how one would intuitively think of training and learning in nature.
When we are training our pet to do something such as coming when called or sitting when instructed to do so, we don’t present it with lots of correct and wrong examples of what coming or sitting looks like! Instead, what we do is try to coax the pet into doing what we would like it to do and once it does it, reward the pet heavily. This rewarding reinforces that this is the correct behavior. We keep repeating the process until the pet clearly associates the specific command like “Come, Max!” to the reward and ultimately the desired behavior. Similarly, and hopefully very thoughtfully, when a wrong behavior is identified we punish the pet (ideally with not much more than the use of a firm voice or a sharp look). This teaches the pet what the undesired behaviors are, because they will lead to punishments and not rewards.
These are the principles that reinforcement learning takes into the digital world. The usual setup is that some sort of environment is defined in which an agent can act, and the environment designer provides punishments or rewards when desired states are achieved. There is a wide variety of approaches researchers can take in training an agent, such as constantly providing feedback or simply providing a reward (or a punishment) at the end of a game. For example, you can have an agent playing chess, teach it nothing about how chess actually works (other than how it can move the pieces) and only provide feedback at the end of a game. The fact that the agent can run through millions of games means that eventually it might just stumble on an interesting chess strategy that leads to winning even though it started out with no knowledge of the game.
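The loop is easy to sketch. Below, a hypothetical agent in a five-cell corridor learns, from the reward alone, that moving right reaches the goal. This is tabular Q-learning, one of the simplest reinforcement learning algorithms, and the environment is invented for illustration:

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values for a corridor where only the rightmost cell gives reward."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n_states) for a in (-1, 1)}

    def greedy(s):
        best = max(q[(s, a)] for a in (-1, 1))
        return rng.choice([a for a in (-1, 1) if q[(s, a)] == best])  # random tie-break

    for _ in range(episodes):
        s = 0
        for _ in range(200):  # step cap per episode
            a = rng.choice((-1, 1)) if rng.random() < epsilon else greedy(s)
            s2 = min(max(s + a, 0), n_states - 1)
            reward = 1.0 if s2 == n_states - 1 else 0.0  # reward only at the goal
            target = reward + gamma * max(q[(s2, b)] for b in (-1, 1))
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
            if s == n_states - 1:
                break
    return q

q = q_learning()
print([1 if q[(s, 1)] > q[(s, -1)] else -1 for s in range(4)])
# the learned greedy action per non-goal state (1 = move right)
```

Notice that nobody labels individual moves as right or wrong; the value of early moves is inferred, backward through the table, from the single reward at the end.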
The big moment for reinforcement learning came when Google managed to build a system, AlphaGo, that defeated the world champions in Go. Go is considered a much harder problem to solve than chess, since there are many more states that the agent can find itself in, making constant calculations and searching for the next optimal move an almost intractable problem. As such, after IBM’s Deep Blue conquered chess, many AI researchers turned their sights on Go. Ultimately, the team at Google DeepMind won. They used a combination of supervised learning and reinforcement learning to train deep neural networks alongside novel search strategies3 to deliver the winning approach. Interestingly, this combination of techniques meant that AlphaGo was able to be much more “strategic” than Deep Blue, which relied on more brute force techniques of evaluating all possible outcomes of a game from a specific position. In addition, AlphaGo discovered the correct ways to play through supervised and reinforcement learning, rather than having more explicit evaluation functions provided to it.
3 David Silver et al., “Mastering the game of Go with deep neural networks and tree search,” Nature 529 (2016): 484–489.
Reinforcement learning is still very fertile ground for artificial intelligence research and there is much for us to discover. While winning at games such as Go is about going after the grand challenges of AI research, there are very practical applications across a number of industries from robotics to manufacturing, transport, trading, and more.
Deep Learning and Artificial Neural Networks
Deep learning (DL) and artificial neural networks (ANNs) are terms that are mentioned heavily within the context of AI, so it is worth providing some clarity here as to exactly where they fit and what they are. To start with, let us clarify that ANNs are a way to achieve mostly supervised or unsupervised learning. There are several other ways to achieve that, but ANNs are the most exciting area of development and the source of much progress in recent years.
The fundamental premise of ANNs is that the way to reach a decision is by feeding data into a network of “neurons” connected in layers.4 Each neuron accepts an input, which it processes via an activation function associated with that input (an equation that, given a number of inputs, will give us an output) and will then fire off a subsequent input to the neuron or neurons it is connected with in the next layer along a weighted path. See Figure 4-1.
Figure 4-1. Single artificial neuron
4 I placed the word neurons in quotes because it is important to remember that these artificial neurons have very little to do with how neurons in our brain work. While brain neurons may have been the source of inspiration for artificial neurons, we now know enough about how the brain works to be sure that the functioning of ANNs bears little resemblance to the functioning of the brain.
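To make Figure 4-1 concrete, a single artificial neuron can be sketched in a few lines of Python. This is an illustrative toy, not code from any particular library; the sigmoid activation and the specific weights are arbitrary choices.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, then an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid activation squashes z into (0, 1)

# Three inputs arriving along weighted connections (all values made up):
print(neuron([0.5, -1.0, 2.0], [0.8, 0.2, 0.1], bias=0.0))  # ≈ 0.599
```

The output, a number between 0 and 1, is what gets fired off to the neurons of the next layer.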
The AI-Powered Workplace 55
Each layer, broadly, specializes in identifying some feature of the input informa- tion and that information is fed forward to subsequent layers. There may be any number of neurons and layers internally, but it will all eventually lead to an output layer where the final set of neurons that gets activated will provide us with the answer. See Figure 4-2.
Figure 4-2. Artificial neural network with multiple hidden layers
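The feed-forward flow of Figure 4-2 can be sketched as follows. This is a toy with hand-picked, entirely arbitrary weights for a hypothetical 2-3-1 network, not a realistic trained model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: every neuron processes every input."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

# Arbitrary weights for a 2-3-1 network:
# 2 inputs, one hidden layer of 3 neurons, 1 output neuron.
hidden_w = [[0.5, -0.6], [0.1, 0.9], [-0.3, 0.4]]
hidden_b = [0.0, 0.1, -0.2]
output_w = [[1.0, -1.0, 0.5]]
output_b = [0.0]

x = [0.7, 0.2]                       # the raw input data
h = layer(x, hidden_w, hidden_b)     # hidden layer extracts intermediate features
y = layer(h, output_w, output_b)     # output layer combines them into an answer
print(y)                             # a single value between 0 and 1
```

Deeper networks simply chain more `layer` calls; each layer's outputs become the next layer's inputs.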
When we train an ANN, we use a training algorithm to adjust the parameters (the weights and biases) associated with the activation function of each neuron, as well as with the connections between neurons, until the final layer starts producing the desired results. How these parameters change after each training cycle, and how it all leads to a good result in the end, is what DL experts focus on.
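A heavily simplified sketch of that training loop, for the smallest possible "network" of a single neuron, might look as follows. Real deep networks apply the same idea across many layers via backpropagation; the data set, labels, and learning rate here are invented purely for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Invented training set: the label is 1 when the two inputs sum to more than 1.
data = [((0.0, 0.2), 0), ((0.9, 0.9), 1), ((0.3, 0.4), 0),
        ((1.0, 0.5), 1), ((0.1, 0.7), 0), ((0.8, 0.6), 1)]

random.seed(1)
w = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]
b = 0.0
lr = 1.0  # learning rate: how big a nudge each training step applies

for epoch in range(5000):
    for (x1, x2), target in data:
        out = sigmoid(w[0] * x1 + w[1] * x2 + b)   # forward pass
        delta = (out - target) * out * (1 - out)   # error times sigmoid gradient
        # gradient descent: nudge each parameter to reduce the error
        w[0] -= lr * delta * x1
        w[1] -= lr * delta * x2
        b -= lr * delta

# Once trained, the neuron's rounded outputs should match the labels.
preds = [round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for (x1, x2), _ in data]
print(preds)  # [0, 1, 0, 1, 0, 1] once training has succeeded
```

The loop never encodes the rule "sum greater than 1" anywhere; the repeated small nudges to the weights and bias discover it from the examples, which is exactly what training an ANN means at scale.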
DL is a catch-all term for techniques that use ANNs, typically in heavily multilayered architectures where layers can be connected both forward and backward and where multiple architectures can combine to form a whole.
The main advantage of DL techniques is that they significantly reduce the need to identify what features one should input into the ANN in order to train it. For example, if you are trying to train a model to correctly recognize a face, you might start by decomposing objects in an image into basic geometric shapes and input that information into an ANN. With DL, it is the network itself that will do the work. We just input all the raw data: the value of every single pixel in the image. The ANN will extract its own features based on its architecture and the training data, with each layer "learning" a feature and the subsequent layers aggregating those features into higher-level concepts.
However, and here is the catch, we need a lot of data in order for appropriate features to be discovered. Furthermore, the resulting network is very opaque to us. We do not know exactly what it decided to "focus" on in order to classify an image or part of an image as a cat rather than a dog. In fact, AI literature is littered with fun examples of how ANNs can get it wrong or focus on just a very small number of features that lead to very brittle solutions. This is why input data is very important. For example, suppose you want to distinguish between different objects, say cars and bicycles. If all the pictures of cars you show your ANN are of cars in a city, whereas the bicycles are in the countryside, your ANN is likely not going to work if you show it a car in the countryside. It is just as likely to use the appearance of multiple trees or lots of green as an indicator that something is a bicycle as it is to use features of the object itself.
The key is to remember that although ANNs may impress us with their results, they have no semantic understanding of the data they are processing. They are simply looking for any patterns that they can use to classify input data one way or another. We quite often mistakenly attribute meaning to results and assume that our ANN has discovered something relevant to what we asked. We should never assume that. Instead, we need to thoroughly test ANNs with an appropriate variety of data, and put governance in place to limit the impact of wrong automated decision-making, so that there is enough confidence that the overall system will work within satisfactory parameters.
From Techniques to Capabilities
In this chapter we reviewed the core artificial intelligence techniques that allow us to develop specific capabilities. These are the building blocks, the techniques that emerge out of research labs and can then be combined and applied to give us complete systems.
The most important takeaways are:
1. The core problem is that of finding a model that allows us to describe and predict what will happen in a given scenario so we can enable decision-making.
2. We should not place limitations in terms of where that model can come from. We can explicitly design it (model-driven) but we can also use data to help us discover it (data-driven).
3. Although this is a fast-paced field, the core concepts do not change that quickly. Even ANNs, which are viewed as cutting-edge, have been around for decades. Having a basic understanding of the key principles behind different techniques helps when picking a tool or discussing potential solutions with a team.
4. Keep an open mind about how techniques can combine to lead to a final result. When evaluating potential tools for your own problems, don’t be distracted by discussions about the purity or authenticity of one approach vs. another. Focus instead on the quality of the final capability that you get.
As we will see in the next chapter and in subsequent sections, a complete application is always the combination of a number of different techniques.
C H A P T E R 5
Core AI Capabilities
In the previous chapter we saw that there are lots of different techniques we can use and combine to model aspects of intelligent behavior. On their own, however, they will not get us far. These techniques only have value in as much as they allow us to do something specific and clearly identifiable: transcribing speech to text, classifying a document, or recognizing objects in an image. To achieve these tasks, we typically need to combine techniques into capabilities.
AI capabilities represent a concrete thing we can do to better understand the world and affect change within it. They are analogous to the human senses. Humans can see, hear, smell, touch, and taste. Each one of these senses involves a number of subsystems (techniques) that combine to provide the final result.
Take the ability to see, as an example. Light passes through the cornea and lens of our eyes to form an image on the photoreceptors. From there, the signal travels via the optic nerve to the primary visual cortex in our brain. Information there gets processed again, and is eventually mapped to specific concepts. We employ different techniques for collecting light, transforming it, and processing the results in support of a single capability: sight.