Thursday, June 30, 2011

LIGO Needs Your Opportunistic Cycles

OSG Resource Providers,

The LIGO VO (http://www.ligo.caltech.edu/) is looking for cycles. LIGO is currently running on approximately one-third of the OSG resources and is ready to expand to more. Please check whether your resource is enabled to allow LIGO VO jobs, and if it is not, please consider enabling it. A description of the application follows below. LIGO has a strong application group that will work closely with resource providers to get the Einstein@OSG application running. If you'd like to help, please enable the LIGO VO and contact goc@opensciencegrid.org to let us know you'd like LIGO to run at your site. We are more than happy to assist with enabling the VO or getting the application working on your site.

Thank you in advance for your participation,
Rob Quick
OSG Operations Coordinator
and
Robert Engel
LIGO Application Developer

About E@OSG
-----------

Einstein@OSG is software written to allow BOINC projects such as Einstein@Home to be executed in Grid environments. Initially developed at the Max Planck Institute for Gravitational Physics in Germany, it runs today on the German D-Grid and the Open Science Grid. The software consists of a server that schedules and submits jobs to grid resources and a client that monitors and executes the Einstein@Home client. Einstein@OSG has completed hundreds of millions of CPU hours on a vast number of heterogeneous grid resources, mainly by utilizing available idle cycles.


E@OSG Execution on the OSG
--------------------------

The client supports the following important features:

* ability to run on heterogeneous clusters providing i386 and x86_64 worker node architectures
* execution in $OSG_WN_TMP to avoid utilizing the network file system
* checkpointing on external SRM resources as well as network file systems such as NFS
* full support for eviction of jobs, automatic suspend and restart of the job
* automatic error detection and recovery of jobs
* support for GPU acceleration when available
* fine-grained control over the client run time (minutes to days) and the scheduler (1 to 1000s of jobs)

These features allow LIGO to run jobs on all compute elements currently available to it:

https://twiki.grid.iu.edu/bin/view/Main/RobertEngel#EinsteinHome_at_the_Open_Science
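The eviction and checkpointing features listed above can be illustrated with a minimal Python sketch. This is not the Einstein@OSG client code: the `CHECKPOINT_DIR` environment variable, the `checkpoint.json` file name, and all function names are assumptions made up for this example. It only shows the general pattern of saving progress after each unit of work, stopping cleanly when the batch system sends SIGTERM, and resuming from the last checkpoint on restart.

```python
import json
import os
import signal
from pathlib import Path
from typing import Callable


def load_checkpoint(path: Path) -> int:
    """Return the next step to run, or 0 if no checkpoint exists yet."""
    if path.exists():
        return json.loads(path.read_text())["next_step"]
    return 0


def save_checkpoint(path: Path, next_step: int) -> None:
    """Write to a temporary file first, then rename it over the old
    checkpoint, so an eviction in the middle of a write cannot corrupt it."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"next_step": next_step}))
    os.replace(tmp, path)


def run(total_steps: int, checkpoint: Path, should_stop: Callable[[], bool]) -> int:
    """Resume from the checkpoint and work until done or asked to stop.

    Returns the index of the next step that still has to run.
    """
    step = load_checkpoint(checkpoint)
    while step < total_steps and not should_stop():
        # Placeholder for one unit of real work.
        step += 1
        save_checkpoint(checkpoint, step)
    return step


def main() -> None:
    """Hypothetical entry point: stop on SIGTERM, resume on the next start."""
    evicted = {"flag": False}
    signal.signal(signal.SIGTERM, lambda signum, frame: evicted.update(flag=True))
    # CHECKPOINT_DIR is an assumed variable name, not an OSG convention.
    checkpoint = Path(os.environ.get("CHECKPOINT_DIR", ".")) / "checkpoint.json"
    done = run(10, checkpoint, lambda: evicted["flag"])
    print(f"stopped at step {done} of 10")
```

Calling `main()` a second time after an eviction would pick up at the saved step rather than starting over, which is the behavior the feature list describes in general terms.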

Benefits of running E@OSG on your Site
--------------------------------------

E@OSG runs continuously on many OSG resources, mainly utilizing idle cycles that might otherwise be wasted. Our automatic error detection also informs us quickly about errors on grid resources, such as:

* lost mounts on worker nodes
* misconfigured firewalls on worker nodes
* failing GRAM and GridFTP services
* incorrectly set permissions on SRM and NFS directories

By opening GOC tickets, we can quickly inform the resource provider about problems on the resource that might otherwise go unnoticed.