The Optimal Scheduling of Parallel Jobs With Dependencies and Starting Times
In a data warehouse system that runs approximately 12,000 jobs every day, there are bound to be some crashes. How should this be dealt with?
Introduction
In our data warehouse system, approximately 12,000 jobs must run every day. These jobs may have constraints such as starting times and predecessors (see Figure 1), and some jobs crash due to data quality, connection, or system issues. More importantly, because many new jobs are created every week, optimization is necessary. For this reason, our study tries to minimize the total duration of the scheduled jobs (the makespan). This is a version of the job shop scheduling problem, whose objective is to schedule jobs on parallel machines so as to minimize the makespan while respecting the constraints, using the average durations as processing times.
The problem comprises a set of jobs (J) and a set of machines (M). Each job has only one operation. Some jobs have one or more predecessors, meaning they must wait until all of their predecessor jobs have executed successfully. Another constraint is the starting time: a job with a starting-time constraint must wait until that time has passed (see Table 1). All jobs must be executed by the end of the plan, and each machine can execute only one job at a time.
| Job Name | Predecessors | Average Duration (Min.) | Starting Time |
|---|---|---|---|
| RTX_EXTR_SDP_BONUS_ADJUSTMENT_DELTA | SCH_RTX-EXTR_START, RTX_EXTR_SDP_BONUS_ADJUSTMENT | 0.65 | |
| PRS_LOAD_SERVICE_CATEGORY_TP | PRS_XFRM_SERVICE_CATEGORY_TP | 0.08 | |
| SCH_EOM_RUN_DUMMY | | 0.01 | 07:30 |

Table 1: A portion of the job table.
Figure 1: A portion of a job's predecessor graph.
The job shop scheduling (JSS) problem can be solved by exact methods, as summarized in Table 2. Small instances can be solved by these methods in a reasonable amount of time; however, because the JSS problem is NP-hard, the computational time grows exponentially as the problem size increases. Consequently, the problem is commonly tackled with heuristic and metaheuristic algorithms such as Simulated Annealing (SA), Genetic Algorithms (GA), Tabu Search (TS), Ant Colony Optimization (ACO), Neural Networks (NN), the Shifting Bottleneck Procedure, Guided Local Search, the parallel Greedy Randomized Adaptive Search Procedure (GRASP), and Constraint Propagation.
| Reference | Solution Strategy | Year |
|---|---|---|
| Applegate and Cook | Branch and Bound | 1991 |
| Carlier and Pinson | Branch and Bound | 1989 |

Table 2: Exact solution methods for the JSS problem.
We focus on two metaheuristics: Variable Neighborhood Search (VNS) and the Genetic Algorithm (GA). VNS maintains multiple neighborhood structures; its basic idea is to change the neighborhood structure systematically within a local search so that the search does not get trapped in a local minimum. GAs use the basic Darwinian mechanism of survival of the fittest and exploit information in the solution population to generate new solutions with better performance.
In this technical report, we present an algorithm based on the philosophy of the VNS and GA algorithms to solve the problem. First, each job is converted into the time domain so that it has a start time derived from its dependencies: starting at 00:00, each machine executes its jobs in dependency order, and each job's start time is obtained by adding the average durations of its predecessors (we call this the job-ordered time space). Our approach changes the neighborhood structure using the job-ordered time space to perform the shake step of the VNS algorithm, and evaluates solutions using the job-ordered time space in the GA.
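The time-domain conversion described above can be sketched as follows. This is a minimal Python illustration, not the production PL/SQL; the job names and the minute-based clock (minutes past 00:00) are assumptions.

```python
def to_time_domain(jobs):
    """jobs: {name: (duration_min, [predecessor names], earliest_start_min or None)}.
    Returns {name: start_minute}, assuming the plan starts at minute 0.
    A job starts when all its predecessors have finished, and no earlier
    than its fixed starting time, if it has one."""
    start = {}

    def start_of(name):
        if name in start:
            return start[name]
        dur, preds, earliest = jobs[name]
        # Ready when the latest predecessor finishes (start + duration).
        ready = max((start_of(p) + jobs[p][0] for p in preds), default=0.0)
        if earliest is not None:
            ready = max(ready, earliest)
        start[name] = ready
        return ready

    for name in jobs:
        start_of(name)
    return start

# Hypothetical jobs: C depends on A and B and may not start before 07:30.
jobs = {
    "A": (0.65, [], None),
    "B": (0.08, ["A"], None),
    "C": (0.01, ["A", "B"], 450.0),  # 07:30 expressed in minutes
}
print(to_time_domain(jobs))  # {'A': 0.0, 'B': 0.65, 'C': 450.0}
```

Note how C's dependency-based ready time (0.73) is overridden by its fixed starting time, exactly the behavior the starting-time constraint requires.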
Background
Preprocessing, Initial Solution, and Problem Representation
There are J jobs to be scheduled and a set M of machines. Schedules are represented by job name and stored in a database table (see Table 1). Each job has one row in the table, with its predecessors held in the predecessors column as a comma-separated list; preprocessing splits this list and finds all dependent jobs backward (see Table 3).
| Job Name | Predecessor | Average Dur. (Min.) |
|---|---|---|
| PAR_XFRM_PARTY_IND_ORG | PENNA_EXTR_LOCATIONS_DELTA | 30.48 |
| PAR_XFRM_PARTY_IND_ORG | BIRLINK_EXTR_CORP_CORE_POSITIONS_DELTA | 30.48 |
| PAR_XFRM_PARTY_IND_ORG | PRM_EXTR_PRM_CORPORATE_DELTA | 30.48 |

Table 3: Pre-processed jobs (one row per job-predecessor pair).
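The preprocessing step can be sketched in Python (the actual implementation is in SQL/PL/SQL; the job names below are illustrative): split each comma-separated predecessor list, then walk the dependency graph backward to collect every transitive predecessor.

```python
def split_predecessors(rows):
    """rows: {job_name: "PRED_A,PRED_B"} -> {job_name: [predecessor names]}."""
    return {job: [p.strip() for p in preds.split(",") if p.strip()]
            for job, preds in rows.items()}

def all_predecessors(job, direct):
    """Collect every job that must finish before `job` (backward closure)."""
    seen = set()
    stack = list(direct.get(job, []))
    while stack:
        pred = stack.pop()
        if pred not in seen:
            seen.add(pred)
            stack.extend(direct.get(pred, []))
    return seen

# Hypothetical rows mimicking the layout of Table 1.
raw = {"JOB_C": "JOB_B", "JOB_B": "JOB_A", "JOB_A": ""}
direct = split_predecessors(raw)
print(sorted(all_predecessors("JOB_C", direct)))  # ['JOB_A', 'JOB_B']
```

After this closure is computed, each (job, predecessor) pair can be stored as its own row, which is what Table 3 shows.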
After the preprocessing step, the initial solution is created using a greedy algorithm, shown in Figure 2. Candidate solutions to the problem are then generated from this initial solution (see Table 4).
Figure 2: The pseudo-code of the initial solution’s algorithm.
Job Name |
Machine Code |
Average Duration (Min.) |
Starting Time |
SCH_YTS-EXTR_START |
1 |
0,01 |
01.12.2013 02:07:00 |
YTS_EXTR_YTHUKUKBUROSU |
2 |
0,05 |
01.12.2013 03:18:02 |
YTS_EXTR_YTHUKUKBUROSU_DELTA |
3 |
0,23 |
01.12.2013 04:11:50 |
Table 4: Portion of the initial solution table.
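Figure 2's pseudo-code is not reproduced here; the following is a sketch of one plausible greedy construction in that spirit: repeatedly take a ready job (all predecessors finished) and place it on the machine that frees up first. The job data is an assumption for illustration.

```python
import heapq

def greedy_schedule(jobs, n_machines):
    """jobs: {name: (duration, [predecessors], earliest_start)}.
    Greedily assigns each ready job to the machine that becomes free first.
    Returns {name: (machine_id, start_time)}."""
    indegree = {name: len(preds) for name, (_, preds, _) in jobs.items()}
    dependents = {name: [] for name in jobs}
    for name, (_, preds, _) in jobs.items():
        for p in preds:
            dependents[p].append(name)
    machines = [(0.0, m) for m in range(n_machines)]  # (free_at, machine_id)
    heapq.heapify(machines)
    ready = [n for n, d in indegree.items() if d == 0]
    finish, plan = {}, {}
    while ready:
        name = ready.pop(0)
        dur, preds, earliest = jobs[name]
        free_at, m = heapq.heappop(machines)
        # Start when the machine is free, the fixed start has passed,
        # and all predecessors have finished.
        start = max([free_at, earliest] + [finish[p] for p in preds])
        finish[name] = start + dur
        plan[name] = (m, start)
        heapq.heappush(machines, (finish[name], m))
        for d in dependents[name]:
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)
    return plan

jobs = {"A": (2.0, [], 0.0), "B": (1.0, ["A"], 0.0), "C": (3.0, [], 0.0)}
# A on machine 0 at t=0, C on machine 1 at t=0, B on machine 0 at t=2.
print(greedy_schedule(jobs, 2))
```

The resulting (machine, start time) pairs are what a row of Table 4 records.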
Let's focus on minimizing the fitness function (the makespan of the schedule), which is formulated in Equation 1.
Jobs: J = {J_1, J_2, ..., J_n}.
Machines: M = {M_1, M_2, ..., M_m}.
Processing time of job j on machine i: P_ij, for i = 1, 2, ..., m and j = 1, 2, ..., n.
Duration of machine i (its busy span): D_i = max_j {C_ij} - min_j {S_ij}, where S_ij and C_ij are the start and completion times of the jobs executed on machine i.
Fitness function: M_max = max {D_i | i = 1, 2, ..., m} (Equation 1).
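Equation 1 can be computed directly from a schedule. The sketch below (an illustration, not the authors' PL/SQL) takes each machine's span from its earliest job start to its latest job finish and returns the maximum span.

```python
def fitness(schedule):
    """schedule: list of (machine, start, duration) tuples.
    Returns the fitness of Equation 1: the largest busy span D_i
    over all machines."""
    spans = {}
    for machine, start, dur in schedule:
        lo, hi = spans.get(machine, (float("inf"), float("-inf")))
        spans[machine] = (min(lo, start), max(hi, start + dur))
    return max(hi - lo for lo, hi in spans.values())

sched = [(1, 0.0, 2.0), (1, 5.0, 1.0), (2, 0.0, 3.0)]
print(fitness(sched))  # 6.0: machine 1 starts at 0 and finishes at 6
```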
Variable Neighborhood Search (VNS)
Neighborhood Structures
The neighborhood structure is one of the most significant elements for performance: it determines which parts of the search space are explored when looking for a better solution. In our neighborhood structure, two random jobs are selected and, if the constraints permit, they are swapped.
For instance, let's say we have five different neighborhoods, and assume that the jobs YTS_EXTR_YTHUKUKBUROSU and YTS_EXTR_YTHUKUKBUROSU_DELTA can be swapped. Their starting times are exchanged, and consequently the starting times of their dependent jobs shift by the difference in the two jobs' average durations (see Table 5).
| Job Name | Machine Code | Average Dur. (Min.) | Starting Time (Before) | Starting Time (After) |
|---|---|---|---|---|
| SCH_YTS-EXTR_START | 1 | 0.01 | 01.12.2013 02:07:00 | 01.12.2013 02:07:00 |
| YTS_EXTR_YTHUKUKBUROSU | 2 | 0.05 | 01.12.2013 03:18:02 | 01.12.2013 04:11:50 |
| YTS_EXTR_YTHUKUKBUROSU_DELTA | 3 | 0.23 | 01.12.2013 04:11:50 | 01.12.2013 03:18:02 |

Table 5: Exchange example.
Although other efficient neighborhood structures are reported in the literature, we preferred this method for its simplicity and ease of use.
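The feasibility check behind the swap move can be sketched as follows (a Python illustration with hypothetical job names; the real check runs in PL/SQL): a swap is legal only if neither job transitively depends on the other and neither would start before its fixed starting time.

```python
def can_swap(a, b, all_preds, earliest, start):
    """A swap is legal if neither job (transitively) depends on the other
    and neither job would start before its fixed starting time."""
    if a in all_preds.get(b, set()) or b in all_preds.get(a, set()):
        return False
    if start[b] < earliest.get(a, 0.0) or start[a] < earliest.get(b, 0.0):
        return False
    return True

def exchange(start, a, b):
    """Exchange the two jobs' start times, as in Table 5."""
    new = dict(start)
    new[a], new[b] = new[b], new[a]
    return new

start = {"JOB_A": 10.0, "JOB_B": 20.0}
# Dependent jobs cannot be swapped; independent jobs can.
print(can_swap("JOB_A", "JOB_B", {"JOB_B": {"JOB_A"}}, {}, start))  # False
print(can_swap("JOB_A", "JOB_B", {}, {}, start))                    # True
print(exchange(start, "JOB_A", "JOB_B"))  # {'JOB_A': 20.0, 'JOB_B': 10.0}
```

In the full procedure, the start times of the two jobs' dependents would then be shifted by the duration difference, as Table 5 illustrates.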
Variable Neighborhood Search Algorithm
The VNS algorithm starts from an initial solution drawn from the whole search space. It then iteratively modifies the incumbent solution via two main functions, shake and local search: local search looks for an improved solution within the current neighborhood, while shake diversifies the search by jumping to another neighborhood. We used the classical VNS algorithm, which consists of the following steps:
Initialization: Find an initial solution x.
Repeat the following steps until the stopping condition is met:
Shake: Generate a random starting solution x' in the current neighborhood of x.
Local Search: Apply a local search from x' using the neighborhood structure; if the result x'' improves on x, set x ← x''.
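The classical VNS loop above can be sketched as follows. This is a generic Python skeleton with a toy 1-D demonstration, not the authors' scheduler; in the real algorithm the neighborhoods would swap jobs in the job-ordered time space.

```python
import random

def local_search(x, neighborhood, fitness, rng, tries=20):
    """First-improvement local search inside one neighborhood."""
    fx = fitness(x)
    for _ in range(tries):
        y = neighborhood(x, rng)
        fy = fitness(y)
        if fy < fx:
            x, fx = y, fy
    return x

def vns(x0, neighborhoods, fitness, max_iters=100, seed=0):
    """Classical VNS: shake in neighborhood k, run local search, then
    either accept the improvement (and reset k) or advance to the next
    neighborhood."""
    rng = random.Random(seed)
    x, fx = x0, fitness(x0)
    for _ in range(max_iters):
        k = 0
        while k < len(neighborhoods):
            x1 = neighborhoods[k](x, rng)                       # shake
            x2 = local_search(x1, neighborhoods[k], fitness, rng)
            if fitness(x2) < fx:
                x, fx = x2, fitness(x2)
                k = 0
            else:
                k += 1
    return x, fx

# Toy demonstration: minimize |s - 5| with a step neighborhood.
step = lambda s, rng: s + rng.choice([-1, 1])
best, best_f = vns(0, [step], lambda s: abs(s - 5))
print(best, best_f)
```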
Genetic Algorithm (GA)
Encoding
The proposed GA uses a permutation representation: each individual in the population is a permutation of the jobs. For each parent, the initial solution is evaluated using this job order. Each gene is labeled with the job's name, and the solution is decoded from these names. For instance, assume there are four jobs: A, B, C, and D. One solution may then be "A-B-C-D."
Selection
The selection operator forms a mating pool consisting of above-average chromosomes of the population. The mating pool is used to generate good offspring chromosomes via the crossover and mutation operators. Roulette-wheel selection is applied here for parent selection: it selects one chromosome from the population according to the probability distribution over chromosomes. The process is repeated at the beginning of each generation.
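Roulette-wheel selection for a minimization objective can be sketched as follows (a Python illustration with assumed weights, not the authors' PL/SQL): since a lower makespan is better, each chromosome's slice of the wheel is made proportional to its distance from the worst fitness in the population.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Roulette-wheel selection for minimization: a chromosome's slice
    of the wheel grows as its makespan shrinks. The small epsilon keeps
    the worst chromosome selectable."""
    worst = max(fitnesses)
    weights = [worst - f + 1e-9 for f in fitnesses]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for chrom, w in zip(population, weights):
        acc += w
        if acc >= r:
            return chrom
    return population[-1]

rng = random.Random(0)
pop, fits = ["A-B-C-D", "B-A-C-D"], [700.0, 800.0]
picks = [roulette_select(pop, fits, rng) for _ in range(1000)]
print(picks.count("A-B-C-D"))  # the lower-makespan parent dominates
```

Mapping fitness to weights this way is one common choice for minimization problems; rank-based weighting is an equally valid alternative.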
Crossover and Mutation
We don't use crossover because most of the jobs have predecessors or starting times: if crossover were applied, every starting time would have to be recomputed and every predecessor checked, and the offspring would violate the constraints most of the time. Instead, swap mutation is applied: it selects two random jobs and, if swapping them is feasible under the constraints, swaps them; otherwise, it does nothing.
Stopping Criteria
Our study used the generational GA model, in which each new generation is created through mutation. The stopping criterion is a maximum number of generations: generations are created until this maximum is reached.
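The generational loop without crossover can be sketched as follows. This is a toy Python illustration, not the PL/SQL implementation: the uniform parent pick stands in for roulette-wheel selection, the permutation fitness is invented for the demo, and the scheduling constraint checks on the swap are omitted.

```python
import random

def swap_mutate(perm, rng):
    """Swap mutation: exchange two random positions (the scheduling
    constraint checks are omitted in this toy sketch)."""
    i, j = rng.sample(range(len(perm)), 2)
    child = list(perm)
    child[i], child[j] = child[j], child[i]
    return child

def generational_ga(init_pop, fitness, generations=100, seed=0):
    """Generational GA without crossover: each generation is built by
    picking parents and applying swap mutation; the best individual
    ever seen is tracked across generations."""
    rng = random.Random(seed)
    pop = [list(p) for p in init_pop]
    best = min(pop, key=fitness)
    for _ in range(generations):
        # Parents are picked uniformly here; the report uses roulette wheel.
        pop = [swap_mutate(rng.choice(pop), rng) for _ in pop]
        cand = min(pop, key=fitness)
        if fitness(cand) < fitness(best):
            best = cand
    return best

# Toy fitness: number of positions where the permutation is unsorted.
def toy_fitness(p):
    return sum(1 for a, b in zip(p, sorted(p)) if a != b)

print(toy_fitness(generational_ga([[3, 1, 2, 0]] * 10, toy_fitness)))
```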
Experimental Results
The optimization algorithms are implemented and tested in Structured Query Language (SQL) and Procedural Language/SQL (PL/SQL). Experiments were performed on an Oracle Exadata V2 machine with the Oracle 11g database. Each algorithm was run 20 times, with a stopping condition of 100 iterations for VNS and 100 generations for GA. Parameter settings are shown in Table 6. Results are reported as computational (CPU) time, the average fitness value (minimum makespan), the standard deviation of the fitness value, and a t-test (see Tables 7-8 and Figures 3-4). Since the optimal solution of this problem is unknown, the average relative percentage error (ARPE) and relative percentage error (RPE) cannot be computed.
| Factor | Parameter Setting |
|---|---|
| Population Size | 10 |
| Mutation Probability | 1.25 |

Table 6: Parameter settings for GA.
The following tables show the experimental results.
| | VNS | GA |
|---|---|---|
| Best | 686.8 | 743.05 |
| Average | 763.51 | 776.97 |
| Std. Dev. | 20.79 | 22.81 |
| T-Test | 0.0091 | |

Table 7: Fitness statistics over 20 runs (the t-test compares the VNS and GA samples).

| Algorithm | CPU Time (microseconds) |
|---|---|
| GA | 6486114 |
| VNS | 81824560 |

Table 8: CPU times.
| Trial | VNS | GA |
|---|---|---|
| 1 | 777.02 | 749.98 |
| 2 | 771.03 | 748.68 |
| 3 | 771.42 | 795.47 |
| 4 | 771.2 | 791.8 |
| 5 | 770.33 | 763.9 |
| 6 | 775.07 | 811.2 |
| 7 | 766.35 | 778.18 |
| 8 | 755.77 | 743.65 |
| 9 | 755.77 | 791.8 |
| 10 | 777.02 | 820.88 |
| 11 | 777.02 | 771.47 |
| 12 | 755.7 | 771.92 |
| 13 | 768.52 | 785.7 |
| 14 | 742.7 | 784.02 |
| 15 | 748.23 | 800.97 |
| 16 | 774.13 | 762.28 |
| 17 | 776.8 | 784.03 |
| 18 | 686.8 | 743.05 |
| 19 | 776.2 | 791.8 |
| 20 | 773.13 | 748.68 |
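As a sanity check, the reported best and average values can be reproduced directly from the 20 trials above (a small Python verification; the standard deviation and t-test columns are not recomputed here).

```python
from statistics import mean

# The 20 trial results for each algorithm, copied from the table above.
vns = [777.02, 771.03, 771.42, 771.2, 770.33, 775.07, 766.35, 755.77,
       755.77, 777.02, 777.02, 755.7, 768.52, 742.7, 748.23, 774.13,
       776.8, 686.8, 776.2, 773.13]
ga = [749.98, 748.68, 795.47, 791.8, 763.9, 811.2, 778.18, 743.65,
     791.8, 820.88, 771.47, 771.92, 785.7, 784.02, 800.97, 762.28,
     784.03, 743.05, 791.8, 748.68]

# Averages match the reported 763.51 and 776.97; the minima match the
# reported best values 686.8 and 743.05.
print(round(mean(vns), 2), round(mean(ga), 2), min(vns), min(ga))
```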
Figure 3: Progress plot for VNS (fitness function vs. iteration) for one run.
Figure 4: Progress plot for GA (fitness function vs. generation) for one run.
Conclusion
We examined the VNS algorithm and the GA for a job shop scheduling problem with dependency and starting-time constraints, and both implementations performed well. GA uses less CPU time than VNS, indicating that GA is faster in this implementation, while VNS achieves both a better average and a better best fitness value than GA.