Introduction

Here's the dream:

Computers have revolutionized research, and that revolution is only beginning. Every day, scientists and engineers all over the world use them to study things that are too big, too small, too fast, too slow, too expensive, too dangerous, or just too hard to study any other way.

Now here's the reality:

Every day, scientists and engineers all over the world waste time wrestling with computers. Tasks that should take a few moments take hours or days, and many things never work at all. And even when things do work, most scientists have no idea how reliable their results are.

Most of the pain that researchers feel stems from not knowing how to develop software systematically, how to tell if their programs are working correctly, how to share their work with others (except by mailing files to one another), or how to keep track of what they've done. This sorry state of affairs persists for four reasons:

  • No room, no time. Everybody's curriculum is full—there's simply not space to add more about computing without dropping something else.
  • No standards. Reviewers and granting agencies don't check whether software is correct, ask how long it took to write, or count it toward tenure, so there's no incentive for scientists to do better.
  • The blind leading the blind. Senior researchers can't teach the next generation how to do things that they don't know how to do themselves.
  • The cult of big iron. Attention and funding mostly go to things that politicians and university presidents can brag about on opening day, rather than to the basic skills that almost everyone uses.

Our goal is to show scientists and engineers how to do more in less time and with less pain. Our lessons have been used by more than four thousand learners in over a hundred two-day workshops since the spring of 2010. Here's how they can help:

  • If you've ever overwritten the wrong file, we'll show you how to use version control.
  • If you've ever spent hours typing the same commands over and over again, we'll show you how to automate those tasks using simple scripts.
  • If you've ever spent an afternoon trying to figure out what the program you wrote last week actually does, we'll show you how to break your code into modules that you can read, debug, and improve piece by piece.
  • If you've ever spent days copying and pasting data in text files and spreadsheets, we'll show you how a database can do the work for you.
  • If you need millions of CPU hours for your project without adding to its cost, we'll show you how to get them through grid computing.
  • If you don't know what the Open Science Grid, grid computing, or distributed high-throughput computing are, we'll show you.

About the Open Science Grid

The OSG provides common service and support for resource providers and scientific institutions using a distributed fabric of high throughput computational services. The OSG does not own resources but provides software and services to users and resource providers alike to enable the opportunistic usage and sharing of resources. To find out more, please visit the Open Science Grid consortium page.

OSG Connect

The OSG Connect service provides a login service to the Open Science Grid infrastructure. It includes a job submission server based on HTCondor and a local scratch data service, Stash, and is integrated with the Globus family of services for data transfer and publication. User documentation, including examples of running scientific applications on the OSG, is available in the OSG "ConnectBook".

OSG Acknowledgement

The Open Science Grid is jointly funded by

Whenever you make use of Open Science Grid resources, services or tools, we would be grateful to have you acknowledge OSG in your presentations and publications.

Acknowledgement section:

For example, you can add the following to your acknowledgement section:

  • This research was done using resources provided by the Open Science Grid.

Citations:

We recommend the following references:

  • Pordes, R. et al. (2007). "The Open Science Grid", J. Phys. Conf. Ser. 78, 012057. doi:10.1088/1742-6596/78/1/012057.

  • Sfiligoi, I., Bradley, D. C., Holzman, B., Mhashilkar, P., Padhi, S. and Wurthwein, F. (2009). "The Pilot Way to Grid Resources Using glideinWMS", 2009 WRI World Congress on Computer Science and Information Engineering, Vol. 2, pp. 428–432. doi:10.1109/CSIE.2009.950.

About Software Carpentry

Software Carpentry is an open source project. Our instructors are volunteers, and all of our lessons are freely available under the Creative Commons - Attribution License, so you can re-use and re-mix them however you want so long as you cite us as the original source.

Like all volunteer projects, Software Carpentry needs your help to grow. If you find a bug, please file a report in our GitHub repo. If you would like to host a workshop, please get in touch; if you'd like to teach, we run an instructor training course; and if you'd like to write lessons or exercises, please let us know.

To find out more, please visit http://software-carpentry.org, or read these papers or our most popular blog posts.

Acknowledgments

Software Carpentry has been made possible by the generous support of:

SWC thanks Brent Gorda for helping to run the first version of the SWC course.

This book is dedicated to Betty Jennings, Betty Snyder, Fran Bilas, Kay McNulty, Marlyn Wescoff, and Ruth Lichterman,
the original programmers of the ENIAC.