1st Workshop on Large-Scale Computing (LASCO'08)

June 23rd 2008 Boston, MA, USA

LASCO'08 - Programme

The detailed program and invited speaker bios are available on the USENIX LASCO'08 website.

8:30 Invited Speakers

Performance and Forgiveness

XtreemOS: a Linux-based Operating System for Large Scale Dynamic Grids

10:30 Coffee break

11:00 Regular Papers

XOS-SSH: A Lightweight User-Centric Tool to Support Remote Execution in Virtual Organizations

  • An Qin, Haiyan Yu, Chengchun Shu, and Bing Xu, Institute of Computing Technology, Chinese Academy of Sciences
    Presentation of this paper has been cancelled due to unforeseen circumstances.

Improving Scalability and Fault Tolerance in an Application Management Infrastructure

  • Nikolay Topilski, University of California, San Diego; Jeannie Albrecht, Williams College; Amin Vahdat, University of California, San Diego (presentation slides)

The XtreemOS JScheduler: Using Self-Scheduling Techniques in Large Computing Architectures


12:15 Lunch

13:45 Regular Papers

A Multi-Site Virtual Cluster System for Wide Area Networks

  • Takahiro Hirofuchi and Takeshi Yokoi, National Institute of Advanced Industrial Science and Technology (AIST); Tadashi Ebara, National Institute of Advanced Industrial Science and Technology (AIST) and Mathematical Science Advanced Technology Laboratory Co., Ltd.; Yusuke Tanimura, Hirotaka Ogawa, Hidetomo Nakada, Yoshio Tanaka, and Satoshi Sekiguchi, National Institute of Advanced Industrial Science and Technology (AIST)

A Comparative Experimental Study of Parallel File Systems for Large-Scale Data Processing

  • Zoe Sebepou, Kostas Magoutis, Manolis Marazakis, and Angelos Bilas, Institute of Computer Science (ICS), Foundation for Research and Technology—Hellas (FORTH) (presentation slides)

Striping without Sacrifices: Maintaining POSIX Semantics in a Parallel File System

  • Jan Stender, Björn Kolbeck, and Felix Hupfeld, Zuse Institute Berlin (ZIB); Eugenio Cesario, Institute for High Performance Computing and Networking of the National Research Council of Italy (ICAR-CNR); Erich Focht and Matthias Hess, NEC HPC Europe GmbH; Jesús Malo and Jonathan Martí, Barcelona Supercomputing Center (BSC) (presentation slides)

15:00 Coffee break

15:30 Invited Speakers

Large Scale in What Dimension?

  • Miron Livny, University of Wisconsin—Madison

Computing systems are complex entities with a diverse set of dimensions. The scale of such a system in each of these dimensions is likely to have a profound impact on how the system is designed, developed, deployed, maintained, supported, and evolved. Today, we find systems that support a large number of users deployed at a large number of sites. These systems support a large suite of applications, consist of a large number of software components, are developed by a large community, manage a large number of computer and storage elements, operate over a large range of physical distances, and/or evolve over a large number of versions and releases. In the more than two decades that we have been working on the Condor distributed resource management system, we have experienced a dramatic change in its scale in all of these dimensions. We will discuss what we learned from dealing with these changes, and what we do to prepare our project for the never-ending stream of scale changes.

Miron Livny received a BSc degree in Physics and Mathematics in 1975 from the Hebrew University and MSc and PhD degrees in Computer Science from the Weizmann Institute of Science in 1978 and 1984, respectively. Since 1983 he has been on the Computer Sciences Department faculty at the University of Wisconsin—Madison, where he is currently a Professor of Computer Sciences, the director of the Center for High Throughput Computing, and leader of the Condor project.

Dr. Livny's research focuses on distributed processing and data management systems and data visualization environments. His recent work includes the Condor distributed resource management system, the DEVise data visualization and exploration environment, and the BMRB repository for data from NMR spectroscopy. (presentation slides)

Experiences in Developing Lightweight Systems Software for Massively Parallel Systems

  • Arthur Maccabe, University of New Mexico

The goal of lightweight system software is to get out of the way of the application while providing isolation between applications. Resource management is, for the most part, left to the application. Lightweight approaches have been used successfully in some of the largest systems, including the first Teraflop system (ASCI/Red at Sandia National Laboratories), the Cray XT3, and IBM's Blue Gene system. This talk considers our experience developing lightweight systems software for massively parallel systems and contrasts the lightweight approach with other approaches, including Linux, hypervisors, and microkernels.

Barney Maccabe received his BS in Mathematics from the University of Arizona and his MS and PhD degrees in Information and Computer Sciences from the Georgia Institute of Technology. He currently serves as the Interim CIO for the University of New Mexico. Professor Maccabe has held a faculty appointment in the Computer Science department at UNM since 1982. From 2003 to 2007, he served as director of UNM's Center for High Performance Computing. Professor Maccabe's research focuses on scalable systems software. He was a principal architect of a series of "lightweight" operating systems: SUNMOS for the Intel Paragon, Puma/Cougar for the Intel Tflop, and most recently Catamount for the Cray XT3. In addition to developing system software for MPP systems, Professor Maccabe has projects applying lightweight approaches to large-scale sensor networks, high-performance I/O systems, and virtual machine monitors. (presentation slides)