Intelligent Automation (IA), also known as hyperautomation, is a set of technologies and methods for automating the work of white-collar professionals and knowledge workers. Here, we present a framework for explaining its power in terms of four main capabilities — Vision, Execution, Language, and Thinking & Learning — and how they enable business transformations with people and business goals at their center.
Vision
Computer vision is a fast-moving area of technology in which new breakthroughs arrive regularly. Coupled with deep learning, it allows a computer to make sense of what it sees and to make intelligent guesses about any missing visual information, much as the human brain does for the eyes.
Applications of computer vision in a physical environment include recognizing objects (e.g., a robot navigating its physical environment) and interpreting signs and road markings (e.g., a self-driving car). But it has even more applications in a digital environment.
Computer vision is used for intelligent character recognition (ICR), a more advanced descendant of optical character recognition (OCR). ICR can be used to digitize documents, such as invoices, contracts, or IDs, and extract and interpret the information in them.
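To make the "extract and interpret" step more concrete, here is a minimal sketch of what can happen once a document has been digitized into text. It assumes the character recognition has already run and only shows how fields might be pulled out of the result. The field names, the sample invoice layout, and the `extract_fields` function are illustrative assumptions, not part of any particular ICR product.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns for one invoice layout; real documents vary widely.
INVOICE_NO = re.compile(r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE)
# \b keeps "Subtotal" from being mistaken for "Total".
TOTAL = re.compile(r"\bTotal\s*:?\s*\$?\s*([0-9][0-9,]*\.[0-9]{2})", re.IGNORECASE)


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Pull the invoice number and total out of already-digitized text."""
    number = INVOICE_NO.search(text)
    total = TOTAL.search(text)
    return {
        "invoice_number": number.group(1) if number else None,
        # Drop thousands separators so the value is easy to convert later.
        "total": total.group(1).replace(",", "") if total else None,
    }


sample = "ACME Corp\nInvoice No: INV-2041\nSubtotal: $1,100.00\nTotal: $1,250.00\n"
print(extract_fields(sample))
```

In practice, ICR products tend to rely on learned models rather than fixed patterns like these, which is what lets them cope with layouts they have not seen before.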
Computer vision can automate the analysis of images and videos. This has a vast range of applications. It can automate medical diagnostics to improve outcomes and free up doctors’ time. It can provide retail store automation, such as Amazon Go, where cameras determine which items a customer has picked up and the store bills the customer accordingly, without the need for human checkout assistance. It can also be used in business process documentation, which is usually a lengthy and resource-intensive process: the software detects the applications and objects a computer user interacts with and creates a flowchart of the process, complete with screenshots.
Finally, computer vision can be used for biometrics, with applications for identification, access control, and surveillance.
Execution
Execution is about accomplishing tasks in digital environments. This can include clicking on buttons, typing text, logging in and out of systems, preparing reports, and sending emails.
The execution capability acts as a glue to connect other capabilities together in a streamlined way. It can, for example, collect sales data using the Vision or Language capabilities, automatically convey the data to the Thinking & Learning capability for analysis, and compile and send out reports on the findings with the help of the Language capability.
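The "glue" role described above can be sketched as a small pipeline in which each capability is a function and Execution simply chains them. This is a simplified illustration under our own assumptions: the function names, the `SalesRecord` type, and the "region,amount" input format are hypothetical stand-ins for real components.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SalesRecord:
    region: str
    amount: float


def collect_sales_data(raw_lines: List[str]) -> List[SalesRecord]:
    """Stand-in for the Vision or Language step: parse 'region,amount' lines."""
    records = []
    for line in raw_lines:
        region, amount = line.split(",")
        records.append(SalesRecord(region.strip(), float(amount)))
    return records


def analyze(records: List[SalesRecord]) -> Dict[str, float]:
    """Stand-in for Thinking & Learning: total sales per region."""
    totals: Dict[str, float] = {}
    for record in records:
        totals[record.region] = totals.get(record.region, 0.0) + record.amount
    return totals


def write_report(totals: Dict[str, float]) -> str:
    """Stand-in for the Language step: turn the findings into readable text."""
    lines = [f"{region}: {total:.2f}" for region, total in sorted(totals.items())]
    return "Sales report\n" + "\n".join(lines)


def run_pipeline(raw_lines: List[str]) -> str:
    """Execution: chain the steps so nobody has to hand data between them."""
    return write_report(analyze(collect_sales_data(raw_lines)))


print(run_pipeline(["North, 100", "South, 50.5", "North, 20"]))
```

The point of the sketch is the last function: once the individual steps exist, connecting them is what turns three separate tools into one automated process.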
The key technologies supporting the Execution capability are smart workflow, low-code platforms, and robotic process automation (RPA). Smart workflow platforms help automate predefined standard processes. If existing processes are not yet documented, this can be achieved using IA-powered business process documentation, as discussed in the previous section. Low-code platforms allow business users without coding skills to build automations themselves. RPA, which is the most powerful of these technologies, is used to automate tasks a human can do on a computer, such as opening applications, clicking menu items, entering text, or copying and pasting. An RPA tool can learn a task by recording the actions of a human user and then replaying them automatically to save time.
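The record-and-replay idea behind RPA can be illustrated with a short sketch. It is a simulation under stated assumptions: the `Recorder` class, the `replay` function, and the handlers are hypothetical, and the handlers only describe each step instead of driving a real screen as an RPA product would.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[str, str]  # (action name, argument)


class Recorder:
    """Captures each user action as a (name, argument) pair, in order."""

    def __init__(self) -> None:
        self.actions: List[Action] = []

    def record(self, name: str, argument: str) -> None:
        self.actions.append((name, argument))


def replay(actions: List[Action], handlers: Dict[str, Callable[[str], str]]) -> List[str]:
    """Run the recorded actions in order and return a log of what was done."""
    return [handlers[name](argument) for name, argument in actions]


# Simulated handlers (assumption): they report the step instead of performing it.
SIMULATED_HANDLERS: Dict[str, Callable[[str], str]] = {
    "open": lambda app: f"opened {app}",
    "click": lambda item: f"clicked {item}",
    "type": lambda text: f"typed {text!r}",
}

recorder = Recorder()
recorder.record("open", "spreadsheet")
recorder.record("click", "File > Export")
recorder.record("type", "q3-report.csv")

for entry in replay(recorder.actions, SIMULATED_HANDLERS):
    print(entry)
```

Because the recording is just an ordered list of steps, the same list can be replayed as often as needed, which is where the time saving comes from.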
Language
The Language capability enables machines to read, write, listen, speak, and interpret the meaning of natural human language. It’s used to extract useful information from unstructured documents, to categorize text (e.g., spam filters), and to perform sentiment analysis. It enables text-to-speech, speech-to-text, and predictive text keyboards. It also powers chatbots, such as ANZ Bank’s Jamie, which onboards new clients and guides them through the bank’s services, and Google Duplex, which can book restaurant tables and hair appointments over the phone. Finally, it powers machine translation, such as Google Translate, which is used by 500 million people each day to translate between more than 100 languages.
Natural language processing (NLP) used to be coded as a set of rules, but nowadays, it uses deep learning to read large amounts of text and notice correlations and patterns — similar to how humans learn languages.
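The difference between coding rules and learning from examples can be shown with a toy text categorizer. Note the substitution: instead of deep learning, the sketch uses a much simpler statistical method (word counts with naive Bayes scoring and add-one smoothing), chosen only because it fits in a few lines. The training messages, labels, and function names are invented for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def rule_based_is_spam(text: str) -> bool:
    """Hand-written rule: flag any message containing the word 'free'."""
    return "free" in text.lower().split()


def train(examples: List[Tuple[str, str]]) -> Dict[str, Counter]:
    """Learn from examples: count how often each word appears under each label."""
    counts: Dict[str, Counter] = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def classify(text: str, counts: Dict[str, Counter]) -> str:
    """Pick the label whose word counts best explain the message."""
    vocabulary = set()
    for word_counts in counts.values():
        vocabulary.update(word_counts)
    best_label, best_score = "", -math.inf
    for label in sorted(counts):
        word_counts = counts[label]
        total = sum(word_counts.values())
        # Add-one smoothing keeps unseen words from zeroing out a label.
        score = sum(
            math.log((word_counts[word] + 1) / (total + len(vocabulary)))
            for word in text.lower().split()
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


EXAMPLES = [
    ("win a prize now", "spam"),
    ("claim your prize today", "spam"),
    ("meeting moved to monday", "legitimate"),
    ("lunch on monday", "legitimate"),
]
model = train(EXAMPLES)
print(rule_based_is_spam("prize waiting for you"))  # the fixed rule misses this message
print(classify("prize waiting for you", model))
```

The fixed rule only catches what its author anticipated, while the learned model picks up on "prize" because that word appeared in the spam examples. Deep learning applies the same principle at a far larger scale and with far richer patterns.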
Thinking & Learning
The Thinking & Learning capability is about analyzing data, discovering insights, making predictions, and supporting decision-making. It can work autonomously, triggering automated process activities, or it can be used to augment human knowledge workers by providing them with insights to guide their decisions and actions.
The key technology behind this capability is machine learning, primarily deep learning, the newest and most powerful component of machine learning. Deep learning uses neural networks with multiple layers, an architecture inspired by how the human brain works. Each layer processes and interprets the data at a different level. The network learns autonomously, from large amounts of training data, to spot patterns and correlations without being explicitly taught or programmed with any rules. It excels when faced with complex, unstructured data with numerous features, so it’s used for image classification, natural language processing, and speech recognition.
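To show what "multiple layers" means in practice, here is a minimal sketch of data passing through a two-layer network. One important assumption: the weights below are picked by hand so the example stays small and predictable, whereas a real deep learning system would learn them from training data. The task (output 1 when exactly one of two inputs is 1) and all names are illustrative.

```python
from typing import List


def relu(values: List[float]) -> List[float]:
    """Activation: keep positive values, replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def dense(inputs: List[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """One layer: each output is a weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hand-picked weights (assumption): a trained network would learn these.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]


def forward(x: List[float]) -> float:
    """Pass the input through the hidden layer, then the output layer."""
    hidden = relu(dense(x, HIDDEN_WEIGHTS, HIDDEN_BIASES))
    return dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


for pair in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(pair, forward(pair))
```

The hidden layer turns the raw inputs into intermediate signals (how many inputs are on, and whether both are on), and the output layer combines those signals into the answer. A single layer cannot solve this particular task, which is a small illustration of why depth matters.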
The Thinking & Learning capability also covers data management. It acquires, validates, cleans, and stores the data needed for machine learning and provides data visualizations to help guide human decision-makers.
The impact of IA capabilities on your business
Our aim is for this conceptual framework to equip you with the information you need for selecting vendors, choosing technologies for your IA journey, and fitting them into your existing organizational and IT landscape.
Beyond the value each of these four capabilities can deliver individually, their combination delivers more impact than the sum of its parts. Combining the technologies broadens the scope of automation from isolated tasks to continuous, automated, and touchless end-to-end processes.