rvGAHP - Push-based job submission using reverse SSH connections

Scott Callaghan, Gideon Juve, Karan Vahi, Philip J. Maechling, Thomas H. Jordan, & Ewa Deelman

Submitted August 2017, SCEC Contribution #7540

Computational science researchers running large-scale scientific workflow applications often want to run their workflows on the largest available compute systems to reduce makespan. Workflow tools used in distributed, heterogeneous, high-performance computing environments typically rely on either a push-based or a pull-based approach for resource provisioning from these compute systems. However, many large clusters have moved to two-factor authentication for job submission, making traditional automated push-based job submission impossible. Pull-based approaches, such as pilot jobs, may lead to increased complexity and a reduction in node-hour efficiency. In this paper, we describe a new, efficient approach based on Condor-G called reverse GAHP (rvGAHP) that allows us to push jobs using reverse SSH submissions with better efficiency than pull-based methods. We successfully used this approach to perform a large probabilistic seismic hazard analysis study using SCEC's CyberShake workflow in March 2017 on Titan at Oak Ridge National Laboratory.

Key Words
scientific workflows, remote job submission, resource provisioning, seismic hazard analysis

Citation
Callaghan, S., Juve, G., Vahi, K., Maechling, P. J., Jordan, T. H., & Deelman, E. (2017, 08). rvGAHP - Push-based job submission using reverse SSH connections. Oral Presentation at International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2017). doi: 10.1145/3150994.3151003.


Related Projects & Working Groups
CyberShake