Real-Time Workforce Scheduling in Multi-Process Environment

tarix	27.06.2016
ölçüsü	70.5 Kb.

Real-Time Workforce Scheduling

in Multi-Process Environment
Mike Marin, IBM, mikemarin@us.ibm.com

Michael Masin, IBM, michaelm@il.ibm.com

Segev Wasserkrug, IBM, segevw@il.ibm.com

Sergey Zeltyn, IBM, sergeyz@il.ibm.com

Dagan Gilat, IBM, dagang@il.ibm.com
Abstract

This research is motivated by a practical problem in real-time workforce scheduling. Scheduling is performed in a challenging environment that involves multiple instances of different business processes, general process structure that enables sequential and parallel step processing, stochastic service times and random routing. Resources are assumed multi-skilled and can have different service rates. For each process, a non-decreasing penalty function is defined. This function depends on end-to-end time of a process instance and provides relation between operational and business metrics. Based on several theoretical insights, borrowed from other application areas, such as call centers and jobshops, we design a decentralized dispatching rule that enables real-time implementation in large-scale business process environments. Given a set of idle resources and a set of waiting steps from all active process instances, a service index for each feasible resource/step combination is computed. Then the maximal service index is determined and the corresponding resource/step assignment is performed. The service index is computed as the product of several components, where each component corresponds to a specific scheduling principle. Our scheduling rule is tested using a broad set of practical examples, including those based on real-life scenarios. It implies significantly lower penalty costs than alternative well-established dispatching policies. In addition, we check that all components of the service index contribute to minimization of the penalty costs.

1. Motivation and Problem Formulation

Large service enterprises such as banks and insurance companies must accomplish a large number of concurrent business processes. Examples include managing a loan approval process in banks or issuing a new insurance policy in insurance companies. Typically, business processes comprise a set of pre-defined steps and routes connecting them, where routing can be stochastic and some steps can be processed in parallel. Knowledge workers are required to complete these steps and processes, where each worker (herein referred to as a resource) has a specific set of skills and roles.

In such settings, at each point in time, there are usually a large number of concurrent process instances in progress, where each process instance belongs to one of several process types (e.g., short-term loans or mortgages in a bank). In many cases, service level agreements or service level objectives (SLA/SLOs) are defined on these processes. Typical forms of such SLA/SLOs are deadlines on the sojourn time (i.e., end-to-end handling time) of the process instances or their parts, with predefined penalties on SLA/SLOs violations. For example, in the insurance domain, the processing of an insurance claim must be completed within some predetermined time, e.g., thirty business days.

Since there are many concurrent process instances at different stages of completion, each instance may have a different amount of time until the SLA/SLO is violated, and each instance step may require different skills, it becomes a challenge to assign resources to the individual steps. While such environments have similarities to manufacturing environments, there are significant differences which make business process environments more challenging. For example, multiple resource skills—in the sense that a resource can process steps of different types—are not typical in a manufacturing environment. Stochastic routing between process steps constitutes another important source of differences.

In this work, we present a real-time dispatching rule that assigns available workforce to the steps of process instances, with the goal of optimizing the global objective of improving the SLA/SLOs, subject to resource availability and process flow constraints. This rule is based on several theoretical principles that are known for other domains and correspond to more simple models. We validate our approach via simulation experiments comparing it with several well-established alternatives.

2. Model Description

Our model is composed of three main components, described in detail below: a mathematical model of the business process, a model of the resources, and a penalty function and optimality criterion that formally describe the SLA/SLO.

Mathematical model of the business process. Process models of the types described above are usually specified using a syntax that is suitable for execution in platforms known as Business Process Management (BPM) engines. These platforms manage many executing instances simultaneously and keep track of information on the state of these instances. We developed a convenient mathematical model of the business process for developing the dispatching rule. This mathematical model is consistent with the standard models used in the BPM engines.

In our model, each process consists of a finite number of steps, where the processing-time distribution of each step is known and has a finite mean. Two basic processes constitute the main building blocks of a process with a general structure:

Sequentially Executable (SE) process includes a number of steps that are processed one at a time. The first step of an SE process is chosen according to an initial probability distribution. In general, after completion of the current step, the next step (or process termination) is determined randomly according to a routing probability matrix of the process.
Basic Parallel (BP) process includes a number of steps that can be processed simultaneously. The set of steps that must be completed is determined according to some probabilistic model. A BP process is completed upon termination of all the steps that must be performed.

A general process is defined using these two basic building blocks and the following recursive mechanism. First, a main sub-process, either SE or BP, is defined. Its components are steps or other sub-processes (child processes). Each child process of the main process can include steps or child processes of a lower level, and so on. This recursive structure enables incorporation of very general business processes.

Model of the resources. Each resource has a set of skills that determine which steps a resource can process. In addition, an acceleration factor is determined for each feasible set of process, resource, and step. If the acceleration rate is equal to one, the processing-time distribution of a step is taken from the process definition above. Otherwise, the processing time accelerates (or slows down) by the corresponding factor.

Penalty functions and optimality criterion. For each process, the SLA/SLO needs to be modeled. The SLA/SLOs are modeled by defining a non-decreasing, piecewise-continuous penalty function. Negative values of this function are also feasible, corresponding to a bonus for early completion. The penalty per instance is incurred according to the penalty function of the corresponding process and instance end-to-end (sojourn) time. The objective function, subject to minimization, is the sum of the sojourn time penalties for all instances per time unit.

Such penalty functions are a valid way to model the SLA/SLOs, as in many cases, business processes have one or several deadlines, which are the natural time points for a change in the penalty function. For example, given a ten-day deadline, the penalty can be zero for an end-to-end time that is shorter than ten days, a two-unit penalty is incurred for each process that does not meet the deadline, and an additional one-unit penalty is incurred for each day of tardiness over ten days.

3. Real-Time Dispatching Rule Description

Due to the level of complexity of process models, it seems hard to derive a theoretically optimal scheduling rule. Moreover, since our scheduling algorithm should provide reasonable performance in different settings (for example, neither system size nor load regime are known in advance), it is not practical to pursue an approach that is oriented towards a specific performance regime or system size (for example, fluid or heavy-traffic models). Therefore, we searched for guiding theoretical principles in different domains and built upon these principles to design an efficient and flexible work assignment policy in our setting.

The work assignment policy we created is based upon a dispatching rule approach. We implemented this approach as follows: given a set of idle resources and a set of waiting steps from all active process instances, the Service Index (SI) for each feasible resource/step combination is computed. Then the resource/step assignment is performed, based on the optimal value of this SI. For example, when a resource becomes available, the resource is assigned the step whose SI is the highest for this resource.

A central concept to the SI calculation is the Expected Residual Processing Time (ERPT). The ERPT of a process instance is equal to the mean time required to complete an instance, given that no delays are encountered by the remaining steps of the instance. Therefore, we developed an algorithm for ERPT approximation that also provides exact ERPT value for some special cases.

In the main implementation of our algorithm, the SI is computed as the product of several components. Below we briefly describe them.

Penalty Index takes into account the penalties that are incurred if additional delays are encountered by the process instance. In essence, we provide the multi-stage generalization of derivative of the penalty function used in the scheduling rules for multi-type and multi-skill queues, as introduced in the papers of Van Mieghem, and Mandelbaum & Stolyar. In contrast to the single-station case, the derivative should not be estimated at the current sojourn time of the instance; rather, we estimate it at the predicted sojourn time in the “case without delays” using the ERPT concept above.
ERPT Index implements the following principle: instances that are closer to completion should have higher priority. It implies a decrease in the number of instances that currently reside in the system. This index is the normalized inverse of ERPT of the instance. The normalization is performed with respect to the expected processing time of the process, given no delays are encountered. ERPT index generalizes the service-rate index in many classical scheduling rules. In addition, similar rules are used in jobshops and stochastic processing networks.
Parallel Processing Index is included in the SI calculation if a resource can potentially process several steps of the same instance that belong to different parallel branches of the process. This index is based on the longest processing time rule that is optimal in some parallel processing setups. In our case, the expected processing times of parallel process branches are used in computations instead of processing step times.
Service Rate Index is equal to the resource/step acceleration factor mentioned above. It gives higher priority to “faster” resources.
Customer Preferences Index is included in the SI calculation if a manager of the system has her own preferences concerning resource/step assignment. For example, if a resource has a rare and important skill, low values of the index are assigned to steps that do not need this skill. We do not use this index in the validation experiments, described below.

4. Algorithm Validation

We performed extensive simulation experiments to validate our scheduling algorithm and fine-tune its parameters. The simulation scenarios were based on real-life business processes or specifically designed to test the algorithm's performance for different process structures. Both heavy-load regimes with high resource utilization and less loaded settings were considered for each scenario. We performed comparison of our scheduling rule with a number of well-established scheduling policies from the jobshop and queueing fields. For example, we considered the Shortest Processing Time first rule (SPT) and two versions of the First Come First Served disciplines: the step with the longest waiting time is served first (FCFSS) and the step that belongs to the process instance with the earliest arrival time is served first (FCFSI). For each scenario we created a policy that tried to mimic a Static Policy (SP) designed by an experienced human. In all cases, our scheduling rule significantly outperformed all alternatives. The results show that each component of the SI significantly improves algorithm performance in most—if not all—of the test examples.

The main results of three case studies appear in Table 1 below. Case 1 is based on an actual mortgage loan sequential process; three customer types (VIP, High Importance, and Regular) with different cost functions were considered. There were no parallel activities and resources with different service rates in this study, so the parallel and service rate indices could not be validated. Case 2 builds upon Case 1; an additional loan process is added, and parallel steps and resources with different service rates are incorporated. Case 3 provides an example of a process with a multi-level parallel structure.

Table 1 compares eight scheduling policies with our SI. The first four columns of Table 1 correspond to the results of the product-form SI where one of the co-products is omitted. For example, SI/Pen corresponds to the SI without the penalty index. The last four columns correspond to the alternative dispatching rules, described above. The numbers in the table show the increase in the penalty costs per unit of time with respect to the SI. For example, the SPT policy implies 150% higher penalties than SI in the heavy-load setting of Case 1.

Table 1: Penalty increase with respect to SI

Case study

SI/Pen

SI/ERPT

SI/Par

SI/Rate

SPT

FCFSI

FCFSS

SP

Case 1. Heavy load

60%

78%

---

---

150%

62%

190%

47%

Case 1. Light load

24%

59%

---

---

61%

25%

96%

36%

Case 2. Heavy load

93%

0.1%

3.3%

32%

67%

54%

61%

46%

Case 2. Light load

17%

1.6%

12%

97%

36%

103%

113%

115%

Case 3. Heavy load

20%

39%

5%

7%

39%

36%

107%

44%

Case 3. Light load

3%

12%

12%

22%

20%

34%

58%

47%

Remark. All values in the table—except those printed in italics—are statistically significant with a 95% confidence level.

Our experiments demonstrate the following: First, assigning steps to human workers according to the SI results in much lower penalty costs than all the available alternatives. The typical improvement with respect to the “best alternative” is 20-50%. (See the last four columns of Table 1.) Second, all SI components are important. Note that our scheduling rule is decentralized in the sense that it only needs the operational data of the resource and the instances that directly participate in the scheduling decision. Therefore, it enables real-time implementation in large-scale business process environments.

5. Conclusions and Challenges for Future Research

We created a dispatching rule built upon principles from manufacturing and call centers, and applied it to complex business process environments. Through extensive experimentation, we have shown that our rule provides significant benefits in managing the workforce in such environments, while significantly reducing SLA penalties.

We are currently pursuing several research directions that expand the presented work. First, we are testing the introduction of an additional work balance component of the SI, based on Winq dispatching rule used in jobshops. Second, we have promising initial results on other types of SLA/SLOs, such as “x% of process should be completed within time y”. Third, our approach can be generalized to SLA/SLOs on completion times of process parts.