International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1865

ISSN 2229-5518

A REVIEW ON SCHEDULING ALGORITHMS FOR RESOURCE MANAGEMENT IN DATA GRIDS

Dr. C. Chandrasekar Assistant Professor, Department of Computer Science, Sree Narayanaguru College of Arts and Science, Coimbatore.

ABSTRACT

V. Manuprasad
Research Scholar, Department of Computer Science,
Sree Narayanaguru College of Arts and Science, Coimbatore.
A very promising attractive computing model is Grid Computing. To provide high performance computing, grids aims to combine the geographically distributed, power of heterogeneous, multiple domain spanning computational resources. An efficient and effective scheduling system is important for achieving the capable potentials of grids. In grid environment, scheduling systems do not work for traditional distributed environments due to classes of environments which are completely dissimilar. Over various domains, making scheduling decisions involving resources is the process known as Grid Resource Scheduling. Across the management, grid scheduling must manipulate large-scale resources so there is high complication involved in it than the local resource scheduling. In this survey, discussion are made on grid scheduling algorithm with a view from different points which are objective function, QoS constraints, static vs. dynamic, strategies dealing with dynamic behavior of resources, application models, adaption and so on. This survey mainly focused a review of the subject from scheduling algorithm viewpoint rather than covering the entire grid scheduling area.

Keywords: Grid computing, scheduling algorithms, grid architecture

1. INTRODUCTION

A Grid is a system of high diversity, which is rendered by various applications, middleware components, and resources. The brief overview of the grid is given below

1.1 Grid Computing

Grid scheduling is a process responsible for assigning user’s jobs onto available grid resources. The main scope of this process is to maximization of optimization criteria such as flow time, machine usage and fairness or to promote the non trivial QoS (Quality of Service). When there is a dynamic change in the grid environment, the scheduling process should be able to react efficiently on this change and it should be fast and flexible. This scheduling process will also include multiple administrative domains searching for using single machine or for using multiple resources by scheduling single job at a multiple or single sites. The job may be anything when there is need for resource – to a set of applications, from a bandwidth application, to an application. The term resource is used to mean anything which can be disk space, machine, QoS network and so on, for scheduling. In general, for ease of use, in this chapter we refer to resources in terms associated with compute resources; however, nothing about the approach is limited in this way. The figure 1 shows the simple grid.

IJSER © 2015 http://www.ijser.org

International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1866

ISSN 2229-5518

Figure 1: A simple grid

1.2 Grid Scheduling Architecture

The scheduler phase is shown in Figure 2. There are three main phases in grid scheduling: resource discovery, which generates a potential resources list and gathers information about those resources, resource selection, selecting a good set from those resources and job execution, where cleanup and file staging are included.

Resource Discovery: The first phase, i.e. resource discovery. In the grid information system, the resource information’s are stored; to express application requirements this phase needed a standard way with respect to stored information. So, to describe the system attributes there is need for an agreed upon scheme to understand the value mean of different systems.

Resource selection: Resource selection usually occurs after resource discovery phase. While the first phase filters out unwanted resources, this phase should determine from this large list the best set of resource(s) chosen to map the application. This selection requires gathering dynamic information in detail from resources. To rank the resources this information should be used and it also allows the scheduler for choosing the one which should have high performance while executing the application. This selection will be complex for multi-component parallel applications while it will be simple for sequential job. This mechanism is considered inflexible and will not suitable for autonomous nature o grid resources.

Job Execution: The last phase of the scheduling architecture is job execution. In this phase, job preparation requires intermediary steps like advance reservation, file staging, etc., so it is very complex phase. When there is no sufficient progress in the job, a user may reschedule the job by returning to step 4 in figure 4. In multiple sites, rescheduling is harder for executing the parallel job. So, dynamic scheduling is required in that case.

IJSER © 2015 http://www.ijser.org

International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1867

ISSN 2229-5518

Figure 2: Architecture of a general grid scheduler

1.3 Grid Scheduling Algorithms

In this section, we provide a survey of scheduling algorithms in grid computing,

1.3.1 Taxonomy of Grid scheduling Algorithms

Since Grid is a special kind of systems, scheduling algorithms in grid fall into a subset of the following taxonomy.
1. Local vs. Global
2. Static vs. Dynamic
3. Optimal vs. Suboptimal
4. Approximate vs. Heuristic
5. Distributed vs. Centralized
6. Cooperative vs. Non-cooperative
A Hierarchical taxonomy for scheduling algorithms is shown in figure 3. Branches covered by Grid scheduling algorithms up to date are denoted in italics. Examples of each covered branch are shown at the leaves.

IJSER © 2015 http://www.ijser.org

International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1868

ISSN 2229-5518

Scheduling algorithm in distributed system

Local

Global


Static

Dynamic

Optimal

Sub-optimal

Physically

Distributed

Physically

Distributed

Approxi mate

Heuristic

Coopera tive

Non- cooperative

Optimal

Sub-optimal

Optimal

Sub-optimal

Optimal

Sub-optimal

Optimal

Figure 3: Hierarchical taxonomy for scheduling algorithms

1.3.2 Objective Functions

Sub-optimal

There are two major parties in grid computing which are resource consumers and providers. Resource consumers will submit various applications while resource providers share the resources. The object function will present the incentives in scheduling. Application centric and resource centric are the two categories in the objective function. The objective function is shown in figure 4.
Objective Functions

Application Centric Resource Centric

Figure 4: Objective function

IJSER © 2015 http://www.ijser.org

International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1869

ISSN 2229-5518

1.3.3 Adaptive Scheduling

According to the previous, current and future resource status, an adaptive solution in which the parameters and algorithm used is to make scheduling decisions when dynamically changes. In grid computing, the demand for scheduling adaptation comes from three points: the heterogeneity of candidate resources, the dynamism of resource performance, and the diversity of applications, as Figure 5 shows. Correspondent with these three points, there are three kinds of examples also. They are resource adaptation, dynamic performance adaptation and application adaptation.
Adaptive Scheduling

Resource Adaptation
Dynamic Performance
Adaptation
Application Adaptation

Figure 5: Taxonomy for adaptive scheduling in grid computing

2. RELATED WOKS

Using heuristic and meta-heuristic approaches Xhafa et al (2010) surveyed on computational model for their resolution and grid scheduling problem. Xhafa et al ‘s work shows the heuristic and meta- heuristic approaches usefulness for designing the efficient grid schedulers and reveals the scheduling problem complexity in grids when compared to parallel and distributed systems scheduling. For grid scheduling problem Xu et al (2011) proposed the Chemical Reaction Optimization (CRO) algorithm. This CRO is inspired between the molecules in chemical reaction by the interactions which is population based meta heuristic. In order to address the security requirements, the grid scheduling problem is formalized by Kołodziej et al (2011) as non-cooperative non zero sum game of the grid users. Evaluation of proposed model is done under large-scale, heterogeneity and dynamics conditions using grid simulator.
Sun et al (2010) proposed a Priority based Task Scheduling Algorithm (P-TSA). In this method, according to the priority tasks are scheduled. Users can easily access the resource without knowing physical location transparently in the grid environment. Scheduling algorithm classification is given by Vivekanandan (2011) which are applicable to grid environment in distributed computing. Workflow scheduling is an important area in grids which allows the users in an atomic way to process large scale. Workflow scheduling support is presented by Hirales-Carbajal et al (2010) as an extension for Teikoku Grid Scheduling Framework (tGSF). To minimize the turnaround time in heterogeneous environment Singh et al (2011) presented a scheduling algorithm. This algorithm is used in dynamic environment where the job arrives at different time interval which is based on greedy method.
In a smart grid for sharing different level of information Caron et al (2010) study Demand Response (DR) problems. To achieve the aggregate load profile suitable for utilities, a dynamic pricing scheme is introduced and studied the ability to get an ideal flat profile which depends on the information they share. Lee et al (2011) mainly focuses on computing grids. In this paper, propose a hierarchical framework and a job scheduling algorithm called Hierarchical Load Balanced Algorithm (HLBA) for Grid environment. A new scheduling algorithm is introduced by Delavar et al (2010) for independent task according to genetic algorithm and studied all the parameters in the grid. It is very efficient and dependable when compared to previous algorithm.
Taken from leading computational center, Mehmood et al (2010) proposed a grid scheduling algorithm using real workload traces. The performance of the proposed algorithm is evaluated using the real workload traces. A statistical analysis of workload traces is also included for presenting the nature

IJSER © 2015 http://www.ijser.org

International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1870

ISSN 2229-5518

behavior of jobs. An Adaptive Scoring Job Scheduling algorithm is proposed by Chang et al (2012) for the grid environment. By composing the computing intensive and data intensive jobs, it will decrease the time completion of submitted jobs. Ma et al (2011) proposed a new computing platform of distributed heterogeneous which aims to achieve the resource sharing and collaborative computing. A learning automata based job scheduling algorithms is introduced by Torkestani and Javad (2012) for grids. According to the grid constraints, workload on each grid node is proportional capacity and varies with time. A new task scheduling algorithm is proposed by Entezari et al (2010) for assigning the tasks to the grid and minimizing the tasks total makespan.

COMPARATIVE STUDY

AUTHOR

DESCRIPTION

PROS AND CONS

Xhafa et al

(2010)

Using heuristic and meta-heuristic

approaches they surveyed on computational model for their resolution and grid scheduling problem. Xhafa et al ‘s work shows the heuristic and meta-heuristic approaches usefulness for designing the efficient grid schedulers and reveals the scheduling problem complexity in grids when compared to parallel and distributed systems scheduling.

The survey provides the complexity of the scheduling problem and shows the approach usefulness for the efficient grid schedulers design.

Xu et al (2011)

For grid scheduling problem they

proposed the Chemical Reaction Optimization (CRO) algorithm. This CRO is inspired between the molecules in chemical reaction by the interactions which is population based meta heuristic.

Using the CRO power in solving the grid scheduling problem is the main benefit. But development of some heuristic components dedicated to the grid scheduling problem is to be implemented.

Kołodziej et al

(2011)

In order to address the security

requirements, the grid scheduling problem is formalized as non- cooperative non zero sum game of the grid users. Evaluation of proposed model is done under large-scale, heterogeneity and dynamics conditions using grid simulator.

This method is more resilient for the Grid users to pay some additional scheduling cost due to verification of the security conditions instead of taking a risk and allocating them at un-trustful resources.

Sun et al (2010).

They proposed a Priority based Task

Scheduling Algorithm (P-TSA). In this method, according to the priority tasks are scheduled. Users can easily access the resource without knowing physical location transparently in the grid environment.

Tasks are scheduled according to the priority order firstly which is the main advantage. But this method is not so effective.

Vivekanandan

(2011)

Scheduling algorithm classification is given which are applicable to grid environment in distributed computing.

Ii is observed that this algorithm most

suitable for scheduling the tasks in the grid environment. But it takes longer execution time

Hirales-Carbajal

Workflow scheduling is an important

This work also includes a usage case

IJSER © 2015 http://www.ijser.org

International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1871

ISSN 2229-5518

et al (2010)

area in grids which allows the users in

an atomic way to process large scale. Workflow scheduling support is presented as an extension for Teikoku Grid Scheduling Framework (tGSF).

scenario, which illustrates how this

extension can be used for quantitative experimental study.

Singh et al

(2011)

To minimize the turnaround time in

heterogeneous environment they

presented a scheduling algorithm. This algorithm is used in dynamic environment where the job arrives at different time interval which is based on greedy method.

The main advantage is minimized the average turnaround time of all submitted jobs in a time slot. But it is not cost effective

Caron et al

(2010)

In a smart grid for sharing different

level of information study Demand

Response (DR) problems. To achieve the aggregate load profile suitable for utilities, a dynamic pricing scheme is introduced and studied the ability to get an ideal flat profile which depends on the information they share.

This method provides distributed stochastic strategies that successfully exploit this information to improve the overall load profile in grid environment. Time consumption is high

Lee et al (2011)

They mainly focuses on computing

grids. In this paper, propose a hierarchical framework and a job scheduling algorithm called Hierarchical Load Balanced Algorithm (HLBA) for Grid environment.

The main benefits are system load balancing and makespan minimization.

Delavar et al

(2010)

A new scheduling algorithm is

introduced for independent task

according to genetic algorithm and studied all the parameters in the grid. It is very efficient and dependable when compared to previous algorithm.

The main advantage of this work is reducing the repeating of the generations for reaching higher speed and also considering the communications costs with maintaining the fitness efficiency.

Mehmood et al

(2010)

Taken from leading computational

center, they proposed a grid scheduling algorithm using real workload traces. The performance of the proposed algorithm is evaluated using the real workload traces. A statistical analysis of workload traces is also included for presenting the nature behavior of jobs.

This proposed scheduling algorithms support true scalability, that is, they maintain an efficient approach when increasing the number of CPUs or nodes which is the main benefit.

Chang et al

(2012)

An Adaptive Scoring Job Scheduling

algorithm is proposed for the grid

environment. By composing the computing intensive and data intensive jobs, it will decrease the time completion of submitted jobs.

The algorithm proposed in this research for the grid environment will decrease the completion time of submitted jobs, which may compose of computing- intensive jobs and data-intensive jobs

Ma et al (2011)

They proposed a new computing

platform of distributed heterogeneous which aims to achieve the resource sharing and collaborative computing.

By using this algorithm the performance

of execution is improved. But unable to adapt with the dynamicity of the resources and the network conditions

IJSER © 2015 http://www.ijser.org

International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1872

ISSN 2229-5518

Torkestani and

Javad (2012)

A learning automata based job

scheduling algorithms is introduced for grids. According to the grid constraints, workload on each grid node is proportional capacity and varies with time.

By proposing a learning automata-based job scheduling algorithm for grids, makespan is reduced, flow time is reduced, and load balancing is done effectively.

Entezari et al

(2010)

A new task scheduling algorithm is

proposed for assigning the tasks to the grid and minimizing the tasks total makespan.

The tasks which are assigned to the grid resources with minimum makespan of the tasks which the major benefit.

The above table shows the pros and cons of several literatures, different authors give various ideas for scheduling in grid. But there is no effective method for scheduling in grid environment. This motivates the new approach for determining the scheduling algorithm in grids.

3. PROBLEMS AND DIRECTIONS

A grid computing is a combination of software and hardware infrastructure which provides pervasive, dependable, inexpensive and consistent access to high-end computational capabilities. There are serious challenges in grid nature i.e. handling of wide spread resources, different organization control, heterogeneity of hardware and software. Simulations are done on many scientific problems where practical test are difficult. But there are no standard developed in grid for simulation, so this simulation is also challenge for grid. Although, simulation models which are developed for traditional hardware systems is not appropriate for modern systems. So, to rectify all the above challenges, one must analyze the challenges current difficulty in using, deploying, promoting and developing grid computing. There are also challenges in grid like cost, availability, managing grid, control protocols and much more.

4. CONCLUSION

The scheduling in grid computing is discussed in this survey, mainly from the aspect of scheduling algorithms. In grid environments, new challenges is still making many researches to do projects as it is an interesting topic even though there are intensive studies in scheduling for parallel and distributed systems. To implement the scheduling easy access is needed and if the data availability is high and resources are in demand, security is needed. The primary challenges concerned in grid computing scenario are identified in this survey that is dynamism, data separation, heterogeneity and computation. This survey also find that the evolution of grid infrastructures, e.g., supports for complex application models such as DAG, resource information services and job migration frameworks, provides an opportunity to implement sophisticated scheduling algorithms. There is need for developing the algorithm for complex scheduling in grid computing.

REFERENCES

1. Xhafa, Fatos, and Ajith Abraham. "Computational models and heuristic methods for Grid scheduling problems." Future generation computer systems26, no. 4 (2010): 608-621.
2. Xu, Jin, Albert YS Lam, and Victor OK Li. "Chemical reaction optimization for task scheduling in
grid computing." Parallel and Distributed Systems, IEEE Transactions on 22, no. 10 (2011): 1624-
1631.
3. Kołodziej, Joanna, and FatosXhafa. "Meeting security and user behavior requirements in Grid
scheduling." Simulation Modelling Practice and Theory 19, no. 1 (2011): 213-226.

IJSER © 2015 http://www.ijser.org

International Journal of Scientific & Engineering Research, Volume 6, Issue 5, May-2015 1873

ISSN 2229-5518

4. Sun, Weifeng, Yudan Zhu, Zhiyuan Su, Dong Jiao, and Mingchu Li. "A priority-based task scheduling algorithm in grid." In Parallel Architectures, Algorithms and Programming (PAAP), 2010
Third International Symposium on, pp. 311-315. IEEE, 2010.
5. Vivekanandan, K. "A Study on Scheduling in Grid Environment." InInternational Journal on
Computer Science and Engineering (IJCSE). 2011.
6. Hirales-Carbajal, Adan, Andrei Tchernykh, Thomas Roblitz, and RaminYahyapour. "A grid simulation framework to study advance scheduling strategies for complex workflow applications." In Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on, pp. 1-8. IEEE, 2010.
7. Singh, Saumitra, and Krishna Kant. "Greedy grid scheduling algorithm in dynamic job submission
environment." In Emerging Trends in Electrical and Computer Technology (ICETECT), 2011
International Conference on, pp. 933-936. IEEE, 2011.
8. Caron, Stéphane, and George Kesidis. "Incentive-based energy consumption scheduling algorithms
for the smart grid." In Smart grid communications (SmartGridComm), 2010 First IEEE international conference on, pp. 391-396. IEEE, 2010.
9. Lee, Yun-Han, SeivenLeu, and Ruay-Shiung Chang. "Improving job scheduling algorithms in a grid environment." Future Generation Computer Systems 27, no. 8 (2011): 991-998.
10. Delavar, A. Ghorbannia, Mohsen Nejadkheirallah, and Mehdi Motalleb. "A new scheduling algorithm for dynamic task and fault tolerant in heterogeneous grid systems using Genetic Algorithm." In Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference on, vol. 9, pp. 408-412. IEEE, 2010.
11. Mehmood Shah, Syed Nasir, Ahmad Kamil Bin Mahmood, and Alan Oxley. "Analysis and evaluation of grid scheduling algorithms using real workload traces." In Proceedings of the International Conference on Management of Emergent Digital EcoSystems, pp. 234-239. ACM, 2010.
12. Chang, Ruay-Shiung, Chih-Yuan Lin, and Chun-Fu Lin. "An adaptive scoring job scheduling algorithm for grid computing." Information Sciences 207 (2012): 79-89.
13. Ma, Tinghuai, Qiaoqiao Yan, Wenjie Liu, Donghai Guan, and Sungyoung Lee. "Grid task scheduling:
algorithm review." IETE Technical Review 28, no. 2 (2011): 158-167.
14. Torkestani, JavadAkbari. "A new approach to the job scheduling problem in computational grids." Cluster Computing 15, no. 3 (2012): 201-210.
15. Entezari-Maleki, Reza, and Ali Movaghar. "A genetic-based scheduling algorithm to minimize the makespan of the grid applications." In Grid and Distributed Computing, Control and Automation, pp.
22-31. Springer Berlin Heidelberg, 2010.

IJSER © 2015 http://www.ijser.org