Research: 1st Quantifiable Method to Identify/Prioritize Candidates for RPA

By Carlos Alvarenga
IRPA AI Expert Contributor & Advisor



If you have not heard of RPA, you will soon

New research presents the first quantifiable method to identify and prioritize the best candidates for Robotic Process Automation (RPA)

If you spend time with an enterprise technology strategist today, you will soon hear about the promise of something called Robotic Process Automation (RPA). Depending on whom you speak with, RPA is, at least, a great set of new technologies or, at most, a revolution in enterprise operations. What is certain is that adoption is growing quickly. The RPA market was valued at USD 1.57 billion in 2020 and its projected compound annual growth rate (CAGR) is 32.8% from 2021 to 2028. Indeed, it is the fastest-growing category of enterprise software in the world.

In a basic sense, RPA is simply the latest generation of software for automation. The focus of RPA efforts in most companies is on processes (or parts of processes) that require repetitive task sequences that can be mapped and encoded in software. Much as automated machines take over tasks on an assembly floor, the automated software “robots” of RPA take over tasks previously performed by human process workers. For example, consider the onboarding of a new hire into a company, which requires many repetitive steps as well as the recording of data in several systems. In a traditional model, a clerk moves a new hire through this process, recording data where needed over the space of several hours or even days. With RPA, however, a software robot can complete the same procedures in minutes.

The software itself falls into three categories. Desktop RPA applications are relatively easy to configure and run. Like “macros” in a spreadsheet, this form of RPA typically runs when a human user decides it should. The software robot operates simplistically with no goal beyond completing the assigned tasks. Think of this category as “basic” RPA: it does its job as simply and efficiently as possible. 

The second category is one we might term “advanced” RPA, because it is designed to take on more complex tasks and often incorporates some basic form of Artificial Intelligence (AI) or Machine Learning (ML) capabilities. This category of software again is often configured by end-users and may be focused on processes that cross functional or organizational boundaries. 

The third category is software that companies can use to make their own RPA applications. Like Software Development Kits (SDKs), this category of “Intelligent RPA” technology requires significant technical expertise to build and execute. Consequently, this last category can include much more sophisticated AI engines and techniques than the first two categories and can tackle enterprise-level problems that involve automating a series of interdependent processes that are complex to design and run.

Given the level of interest in RPA, it is remarkable that it has been the subject of very little formal research. Though it is early in the development of RPA tools, there are enough implementations in the field to begin to draw conclusions about the technology, especially about the optimal settings for its use. This lack of research is especially unfortunate because, when an organization moves into the third category of RPA, perhaps the most crucial choice it makes is the selection of the right processes for automation.

Because the wrong target for RPA can doom efforts, practitioners receive a lot of advice about their selection. Unfortunately, technology vendors and consultants often simplify the selection process with generic advice that can apply to almost any technology project in the last thirty years. The excerpt below is a good example:

First, you need to assemble a cross-functional team of people from business and IT departments who can garner executive buy-in and lay the foundation of your automation journey. This team should audit the processes where your employees spend most of their time or where the most money is being spent; they should also look at the processes that cause the most frustration for customers.

This lack of a systematic, quantitative RPA process selection model inspired a pair of researchers, Johannes Viehhauser and Maria Dörr of the Technical University of Munich, to develop one. Their work attempts to improve on the simple “rule of thumb” advice practitioners usually receive and was developed after extensive analysis of documented RPA projects and structured interviews with RPA professionals. As the authors note, “the existing research on process selection in RPA projects lacks robust, generalizable, and quantifiable selection criteria to identify suitable RPA processes.” Thus, they raise the following important question: How can organizations systematically identify and prioritize the most suitable process candidates for automation with RPA?

The Methodology

To answer the question posed by their study, the authors first completed an exhaustive review of RPA projects documented in the research literature. In all, they analyzed 24 case studies, the findings of which confirm not only the importance of process selection but also the absence of a detailed and universal approach for prioritizing RPA processes. Indeed, the authors found that within the case studies, process selection was often based on simple, qualitative criteria that varied inconsistently across projects. In no case did they come across a standardized, repeatable framework for future projects to employ.

In addition to a literature review, the authors conducted thirteen interviews with experts from RPA software providers (eight interviews) and RPA integrators (five interviews). RPA software providers, such as Automation Anywhere, Blue Prism, or UiPath, were asked to contribute insights into the latest technologies, their requirements for application, and approaches for process selection. RPA integrators, such as FourNxt, Macros Reply, or Roboyo, were selected to contribute an application-driven perspective as well as experiences about implementation challenges.

Though all software companies provided process identification models, these varied by vendor. A common theme, however, was the concept of a feasibility analysis with which to gauge process suitability for RPA. Unfortunately, even though high-level approaches were presented, almost all RPA providers and integrators lacked objective criteria and assessment models for the selection of process candidates. As a general rule, complexity serves as the main way to assess process suitability. However, the authors note that “the operative assessment is, for the most part, not based on measurable criteria, but rather on subjective evaluations.” These subjective criteria include process complexity and the degree of standardization. Other criteria include the degree of human judgment needed to run the process, the availability of structured data input, the number of interfaces, volumes, and repetitiveness. The good news is that the expert interviews, again, validated the importance of picking the right targets for RPA.

Development of a Process Suitability Model

After integrating both the objective lessons learned from the documented cases and the subjective input of the experts, the authors propose an RPA process selection model that follows three phases (as shown in Figure 1 below).


Figure 1: Approach for process selection in RPA projects (Source: Authors)

Phase 1: Objective definition

The proposed approach starts with the definition of the objectives for automation. Strategic decisions should be made about the preferred automation technology in consideration of a company’s goals. Once the strategic goals are set, objectives also have to be defined at an operational level. Clearly defined objectives are important because “they guide the overall selection process and serve as an appropriate baseline to evaluate the project success.”

Phase 2: Process preselection

This second phase consists of process identification, data collection, and process prioritization. This preselection can be carried out via workshops, suggestions by operational employees, or innovative methodologies such as process mining, which is a technique for using event data to generate process insights and improvement actions. Process preselection is followed by data collection to obtain reliable data points as a basis for the process prioritization model. Once a full data set is complete, the next step is to prioritize those process candidates that seem best suited to the application of RPA.

The research finds that the following six factors, presented from most to least important, should be analyzed in phase 2:

  1. Standardization: Processes with a high degree of standardization reduce the implementation effort, increase the speed of implementation, and raise the overall probability of project success, which makes them the most promising RPA candidates.
  2. Volume: A high volume of processes in terms of execution and execution time is identified as the second most important process selection criterion, since the automation of high-volume processes helps to maximize the benefits of RPA, leverages the highest potential for cost reduction, and thus provides the strongest economic argument for an RPA deployment.
  3. Automation Rate: Processes with a high share of manual activities offer greater and faster economic benefits compared to processes with a high degree of automation.
  4. Maturity: Mature processes possess high stability, low probability of exceptions, and predictable outcomes. The lower the chance of potential future process changes, the lower the overall risk of adjustments or even the failure of an entire RPA project.
  5. Digitization Level: Processes are more suitable for RPA if the data required to run them are readily available in digital format.
  6. Rate of Failure: Finally, processes with a high rate of failure are identified as suitable RPA candidates because automating them reduces costs of quality and rework and therefore strengthens the overall business case.
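
To make the prioritization step concrete, the six factors can be combined into a single suitability score per process. The Python sketch below is a minimal illustration, not the authors' implementation: it assumes a simple weighted sum, every criterion scored on a 0 to 1 scale where higher means more suitable, and placeholder weights that only respect the most-to-least-important ordering given above. The weights, process names, scores, and function names are all hypothetical.

```python
# Hypothetical weights: they follow the ordering of the six factors only.
# The values actually derived in the research are NOT reproduced here.
WEIGHTS = {
    "standardization": 0.30,
    "volume": 0.25,
    "manual_share": 0.15,   # share of manual activity (low automation rate scores high)
    "maturity": 0.12,
    "digitization": 0.10,
    "failure_rate": 0.08,   # a high failure rate scores high (more to gain)
}

def suitability(scores):
    """Weighted sum of criterion scores, each expected in the range [0, 1]."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

def rank(candidates):
    """Return process names ordered from most to least suitable."""
    return sorted(candidates, key=lambda name: suitability(candidates[name]), reverse=True)

# Made-up example processes and scores, for illustration only.
CANDIDATES = {
    "invoice_matching": {
        "standardization": 0.9, "volume": 0.8, "manual_share": 0.7,
        "maturity": 0.9, "digitization": 0.8, "failure_rate": 0.4,
    },
    "vendor_onboarding": {
        "standardization": 0.5, "volume": 0.4, "manual_share": 0.9,
        "maturity": 0.5, "digitization": 0.6, "failure_rate": 0.6,
    },
    "ad_hoc_reporting": {
        "standardization": 0.2, "volume": 0.3, "manual_share": 0.8,
        "maturity": 0.3, "digitization": 0.7, "failure_rate": 0.2,
    },
}

for name in rank(CANDIDATES):
    print(f"{name}: {suitability(CANDIDATES[name]):.3f}")
```

A weighted sum is only one plausible way to aggregate the criteria; its appeal is that each process's score can be traced back to individual factor scores when the ranking is reviewed with process owners.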

For their model, the authors applied a sophisticated weighting regime based on a further set of 134 interviews with RPA developers, analysts, software developers, and researchers. The surveys covered the general descriptors of the participants (e.g., the type of company for which they worked, position held, experience with RPA, etc.), as well as a pairwise comparison of the perceived importance of each criterion when selecting RPA processes. Based on this final effort, the model assigned the highest weighting to high standardization, followed by high transaction volumes. The specific weightings of all model criteria are presented in Figure 2 below.


Figure 2: Process selection criteria and derived factor weights, a.k.a., eigenvectors (Source: Authors)
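
Deriving weights from pairwise comparisons and reporting them as an eigenvector resembles the Analytic Hierarchy Process, in which the weights are the normalized principal eigenvector of a reciprocal comparison matrix. The Python sketch below illustrates that general technique with power iteration. It is an assumption about the style of calculation, not a reproduction of the authors' procedure, and the three-criterion matrix is invented for illustration.

```python
def ahp_weights(matrix, max_iterations=1000, tolerance=1e-12):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration, normalized so the weights sum to 1."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(max_iterations):
        # Multiply the matrix by the current weight vector.
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        updated = [value / total for value in product]
        # Stop once the weights no longer change meaningfully.
        if max(abs(a - b) for a, b in zip(updated, weights)) < tolerance:
            return updated
        weights = updated
    return weights

# Hypothetical judgments: entry [i][j] states how many times more important
# criterion i is than criterion j, and entry [j][i] is its reciprocal.
CRITERIA = ["standardization", "volume", "maturity"]
COMPARISONS = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]

for name, weight in zip(CRITERIA, ahp_weights(COMPARISONS)):
    print(f"{name}: {weight:.3f}")
```

Because the example judgments are perfectly consistent, the resulting weights are simply 4/7, 2/7, and 1/7. Real survey responses are rarely this consistent, which is why this family of methods is usually paired with a consistency check before the weights are used.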

Phase 3: Selection

In the third and final phase, the selection addresses the most promising process candidates from phase two and is supplemented by an economic analysis. The authors’ model integrates the selection criteria with financial metrics that suggest the value that automating the selected processes will create.

To test their model, the authors applied it to an RPA selection process in the accounting department of an international technology company. Data for 102 sub-processes with 792 activities were collected and analyzed. As shown in Figure 3 below, the model scored the target processes and selected the month-end closing sequence as the optimal RPA choice for the first implementation. The model’s conclusion was validated not only by internal experts inside the company but also by a panel of external experts the authors convened to review the model’s recommendations. Overall, the model not only provided an objective assessment of how suitable the processes were to RPA conversion, it also supported and enhanced the subsequent Return on Investment analysis conducted by the company’s finance team.


Figure 3: RPA target processes and suitability values determined with authors’ model (Source: Authors)


This study provides the first mathematical model that technologists can use to evaluate RPA process candidates. The model is based on extensive empirical research across firms, vendors, and functions. Combined with the anecdotal guidance from the case study reviews, this effort provides useful guidance to leaders planning RPA initiatives at this early stage of the technology’s development.

Of course, as with any first study of its kind, there are some limitations to the work. For example, the selection criteria chosen are general in nature and not specific to any industry. Additionally, the research looks at all processes ex ante and does not consider the possibility of process changes during the RPA deployment. Another challenge is that deployment teams need a sound understanding of, and good data about, target processes in order to fully adopt the authors’ approach. While these and other limitations are present, when one compares the authors’ methodology with the alternatives typically presented by vendors and consultants, the rigor and value of their research are clear.

For other RPA researchers, this effort provides empirical confirmation of selection criteria gleaned from the pre-existing literature and case studies. For technologists in the field, this study yields important practical implications by providing a universal selection methodology and reliable indicators of a project’s future success. Lastly, the quantification of process suitability allows business leaders to select the most promising process candidates and therefore increases the overall probability that RPA investments will bring tangible value to their organizations.

In closing, it is easy to see that future enhancements to the approach presented in this research could make the authors’ methodology a key part of any serious enterprise RPA effort. If one accepts analyst figures, the value of their approach is magnified by the fact that over half of all large companies are already experimenting with RPA. Indeed, with post-pandemic labor shortages present in many industries, RPA use will only expand in the next few years, and predictions that it will be near-universal by 2030 may well turn out to be true. These facts make the analytical rigor this team brings to the challenges of RPA deployment all the more welcome and useful.


The Research

Viehhauser J., Doerr M. (2021) Digging for Gold in RPA Projects – A Quantifiable Method to Identify and Prioritize Suitable RPA Process Candidates. In: La Rosa M., Sadiq S., Teniente E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham. https://doi.org/10.1007/978-3-030-79382-1_19
