GridSpace 2.7.21 User Guide

Welcome to GridSpace! This document is meant as a user guide for first-time users, providing basic information which will enable you to begin using the GridSpace platform in your research environment.

GridSpace enables users to develop and execute virtual experiments on the underlying computational and storage resources. If you are a scientist involved in computationally-demanding research activities then GridSpace may just be the right answer for you! Read on to find out how to begin using GridSpace in support of your research.

This document is divided into the following sections:

  1. A (very) brief introduction to GridSpace
  2. What you will need
  3. Accessing the Experiment Workbench
  4. Your first experiment
  5. Basic Workbench features
  6. Advanced Workbench features

A (very) brief introduction to GridSpace

The GridSpace platform was designed with simplicity of use in mind. We have extensively analyzed the ways in which scientists conduct their day-to-day research activities and tried to come up with a framework which closely mimics this experience while still providing access to powerful distributed computing architectures. Thus, GridSpace is founded on the following principles:

What you will need

Getting to know a new software platform is always a hurdle; however, we've tried to reduce the learning curve for new users and enable them to hop right in and try their hand at developing experiments using the GridSpace Workbench. You will probably be happy to hear that you do not need to install any software on your computer - all you need is the following:

Accessing the Experiment Workbench

Simply point your browser to the GridSpace Experiment Workbench instance (or click here), scroll down to the section labeled This is GridSpace2 installation for ... and click the Login » button. The system will respond by displaying a login screen where you need to select the machine on which your user account has been activated and input your username and password. Once authenticated, you will be presented with the basic view of the Experiment Workbench, which looks a bit like this:

EW basic view

Before we get into a discussion on what's important and what is less so, why not try your hand at writing and executing your very first experiment?

Your first experiment

See the big blank area under the New snippet label of your Workbench? This is where you should input your experiment code. Let's take the Workbench for a spin by writing a simple snippet in bash. The list of available interpreters will depend on the executor. In our case, we need to click the name of the interpreter and select Bash from the pull-down menu, as shown below:

EW interpreter menu

As you can see, GridSpace offers a plethora of interpreters, both general-purpose and application-specific. For now we will resort to writing a shell snippet; thus we select Bash from the list and can now type in our code in the snippet window:

echo 'Hello, world!'

Now position your mouse next to the Actions menu and select the Run snippet icon (this orders the Workbench to execute the snippet code on the host machine and retrieve its results). You should see output similar to the following in the console window:

$ echo 'Hello, world!'
Hello, world!
$ exit

As you can see, the Workbench has executed your snippet and reported its results. Note that as long as the snippet remains in execution mode, a red square button will be displayed in the console window.

Experiments can consist of multiple snippets, so let's try adding another snippet to ours. Click the Add new snippet icon button in the snippet control panel. This will add another snippet at the end of the experiment, where we can again select an interpreter and enter the code we wish to execute (optionally, you can insert a new snippet before the current one by clicking Actions and selecting Insert snippet icon). If you click the interpreter tab, the Workbench will display a list of available interpreters. Let's try selecting Ruby 1.8.7 (or any other version of Ruby) for a change. Once that's done, type in the following Ruby code:

puts 'Hello again!'

You can now execute the new snippet by clicking the Run snippet icon inside it. In our example, after the execution you should see the following new output in your console:

puts 'Hello again!'
 
Hello again!

Once your experiment is ready, you may want to save it. Clicking the Save experiment icon button enables you to store the experiment in your home directory on your server account. Optionally, you can also specify a name, a description and some comments regarding the contents of your experiment by clicking the Experiment metadata icon button (see here for a more involved description of experiment metadata). To save the experiment you just need to provide a file name and click OK.

Basic Workbench features

The Workbench automates common actions associated with developing experiments. Let's take a look at what you may do with the Workbench:

Adding and removing snippets

Each snippet window contains a button labeled Actions where you can activate actions relating to the current snippet. This includes running the snippet, adding another snippet either as a direct successor/predecessor of the current snippet, removing the snippet or merging its contents with the previous snippet.

Saving your experiment

Once your experiment is ready, you can save it by clicking the Save Experiment icon in the experiment toolbar (Save experiment icon). If this is a new experiment, it may also be useful to provide some metadata describing its purpose and usage by clicking Experiment metadata icon. Providing such metadata is optional, though it may enable your collaborators to rerun and extend the experiment with little difficulty (see here for a description of the metadata mechanism).

Saving the experiment serializes it to a file in the current directory on the server machine, the contents of which are displayed in the left-hand panel in the Workbench. Experiment serialization is XML-based and covers all aspects of the experiment.

Once serialized, the experiment may be reopened in the Workbench window or shared with other users of the platform. In order to open an existing experiment in the Workbench simply click the name of the experiment file in the Files panel.

Naturally, you can also close the current experiment by clicking the Experiment close icon in its tab above the snippet window. If you wish to start a new experiment, click the New experiment button adjacent to the open experiment tabs.

Changing execution directory

To let you change the execution directory of a snippet, a text box is provided in the right corner of the snippet panel. The directory should be a path relative to your home directory. Using the Context, GridSpace can easily determine exactly where the files used in the snippet are located and seamlessly transfers them between execution sites and different execution directories. The context directory is created automatically by GridSpace, so you do not have to check whether it exists.

Execution context
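As a hedged illustration, with the Context set to a hypothetical path such as analysis/run1, relative file references in the snippet resolve inside that directory:

# Hypothetical snippet (Bash) executed with Context set to analysis/run1;
# the relative output path resolves to ~/analysis/run1/result.txt on the execution site
echo "42" > result.txt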

Advanced Workbench features

In addition to basic experiment operations, the Workbench also provides some advanced features which may prove helpful. You may want to make use of the following tools:

File management

The Workbench enables you to manage the contents of your server home directory. This directory can be used to store experiments as well as any arbitrary files you may deem useful while working with GridSpace. A sample view of the GridSpace directory browser is presented below.

EW files panel

The following operations are available in the browser:

The Menu link provides additional operations, including the ability to create new files, subdirectories and experiments. Moreover, a pull-down operations menu is available for each listed item. You can access it by clicking the small arrow next to the selected filename. This menu contains the following options:

Publishing experiments with Collage

GridSpace comes integrated with a sophisticated mechanism for publishing the contents of your experiments. Create real live papers by embedding your results and code directly within the scientific publication and enabling readers to interact with it! It's easy - here's how:

To show you how it works, let's start with a very simple bash experiment which produces an output file. The experiment consists of a single snippet:

echo 'Hello, world!' > output_file.txt

As is easy to predict, executing this experiment will produce a file called output_file.txt in your current working directory. This file is then the result of your experiment. Let's say we want to publish it as an asset. In order to do so, we click the menu bar at the bottom of the snippet window, where it currently says No outputs defined for this snippet. This will bring up a menu from which we select Add new simple output. The Workbench will grace us with the following window where we can define the properties of our new output asset:

Output asset panel

The name is an arbitrary element of the asset's metadata record and can be set to whatever value you feel best describes the asset in question. Clicking the Pencil edit icon next to the URL field will bring up the list of executors you're currently logged into and enable you to select a single file from any available executor as the asset payload. The sendable option enables your copy of the output file to be viewed by external clients (otherwise any user who reenacts your experiment will obtain a separate copy of its output assets).

Once you're satisfied with the contents of your experiment and have defined the relevant input and output assets, the experiment can be released. Releasing an experiment enables other users to interact with its contents. To do so, open the drop-down menu next to the Save experiment icon button in the top right-hand corner of the Workbench and select the release scope (you can release the experiment to members of specific groups on the Executor host, to yourself only, or to everyone). Selecting one of the options will save the experiment in its current state and enable its publication through embedding asset URLs in external HTML documents. The Workbench also creates a "sample" view of your publication, with all assets embedded (remember to enable popups for the Workbench site in your browser!). In order to display the "sample" view, go to the Releases panel on the left of the Workbench and click the top-most item. Your sample view should look a bit like this:

Sample view of the released experiment

The goal of this page is to showcase how your assets will look when embedded in a real document. Following some basic information about the experiment, the Workbench displays a numbered list of assets, headed by the so-called Master Widget, which is used to authenticate users and review the status of all payload (input/output/snippet) assets. Since you are already logged into the Workbench, the Master Widget will not require you to log in again - instead, it will simply display your username and the Executor you're using. The icons below your username correspond to experiment assets - if an icon is overlaid with a sunglass image, the corresponding asset is still being loaded; otherwise it has already been loaded.

The View embed code button displays an HTML code fragment which you need to paste into your external publication in order to embed a given asset. Assets can be embedded into any HTML document, just by pasting in their corresponding embed code fragments - thus, Collage can be used to easily enrich HTML documents, wikis, blogs, fora etc. with interactive content!

Exclamation icon Important note: All external documents to which Collage content is to be added must embed the Master Widget in order to enable external users to log into the Executor and display the payload of other assets. Unauthenticated users will not be able to view interactive content.

You will notice that each asset type (input, output, snippet) comes with its own set of controls displayed at the bottom of the asset frame. These buttons enable users to interact with Collage content by uploading files, rerunning code and collecting output:

Additionally, snippet assets display an Output tab above the main text area - clicking this tab will visualize any standard output that may have been generated by the snippet during execution.

Taken together, the three types of assets (input, output and snippet) facilitate broad interaction with the published experiment, enabling users to review and reenact your code, as well as perform computations on their own datasets. Results can be visualized directly in the browser window or retrieved as files.

Exclamation icon Note: When interacting with Collage, each user will receive their own temporary copies of all input, output and snippet assets. Modifying any of these does not impact the original contents of the published experiment. Interaction with assets is therefore transparent from the point of view of other users and does not carry the risk of corrupting the original experiment.

Subsnippets

A subsnippet is a piece of code embedded in another snippet. GridSpace allows you to mix different programming languages by adding a snippet in the middle of another. To do so, choose New subsnippet from the Actions menu or simply use the New subsnippet icon located in the top-right corner of the snippet panel. Each subsnippet may be assigned to any of the available interpreters and executed on any available execution site. GridSpace integrates the subsnippet with the master snippet by injecting simple code in the chosen programming language; execution is performed by GridSpace seamlessly for the user.

Adding new subsnippet

A subsnippet can be executed either synchronously (SYNC) or asynchronously (ASYNC). A synchronous invocation blocks the master snippet until the subsnippet is finished. An asynchronous subsnippet does not block, so a method is needed to synchronize with the end of running subsnippets. The BARRIER element can be used for this purpose. Each barrier has to be bound to an asynchronous subsnippet and blocks until all instances of that subsnippet are finished.

Adding new subsnippet

Each subsnippet can be executed in a directory specified in the Context text box. The value of the context can be any string expression in the master snippet's programming language. In the example above we use the value of i to sweep through execution directories (sub/dir#{i}) inside the for loop. Different directories may be used to separate instances of a subsnippet. Another way of separating execution outputs is to use the GS2_SNIPPET_SESSION_RANK environment variable to distinguish file outputs, as sketched below.
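For illustration, a minimal asynchronous subsnippet body in Bash might tag its output file with the session rank so that concurrent instances do not overwrite each other (the file name and computation are assumptions, not part of GridSpace):

# Hypothetical subsnippet body (Bash): each running instance writes to its own file,
# distinguished by the GS2_SNIPPET_SESSION_RANK environment variable set by GridSpace
echo "partial result from instance ${GS2_SNIPPET_SESSION_RANK}" > result_${GS2_SNIPPET_SESSION_RANK}.txt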

To communicate between the master snippet and a subsnippet, input/output files assigned to the subsnippet are used. The subsnippet's output assets are copied by GridSpace to a directory created automatically on the master snippet's executor: the output files are placed in a <subsnippet_context> directory located inside the <master_context> folder. Inputs are taken from the same directory, but you need to create it manually first in order to put your input files there.

There is nothing to prevent you from embedding a subsnippet in another subsnippet, so the subsnippet hierarchy can be as deep as you need.

WebGUI mechanism

GridSpace Experiment Workbench experiments sometimes require more advanced user interaction. For such cases the WebGUI mechanism was introduced to the platform. It enables any web application (after a few simple tweaks) to become part of an experiment. This comes in handy when intermediary experiment results have to be inspected or the course of the experiment execution has to be steered by the user. For simple interaction cases an out-of-the-box WebGUI implementation is provided, while for more complex cases an integration schema is provided. The two approaches are described in the sections below.

Using embedded WebGUI implementation

To build simple web forms, present them to the user executing an experiment, and afterwards retrieve the values provided by the user, you first need to describe the form and then notify the Workbench about the request with a REST call (details about the REST calls are given in the next section). The form is described using JSON notation, as in the example below:

{webguiDocType: 'request', label: 'Sample Input', data:[
  {name: 'title', label: 'Title', pref: 'text'},
  {name: 'text', label: 'Sample text', pref: 'richTextArea'}
]}

In the example above two fields were requested - a text input and a rich text input. The rendered form should be similar to the one in the picture below:

WebGUI sample form

The following types of fields are supported by the out-of-the-box WebGUI implementation:

If applicable, a default value for a given input can be defined by using the value element (for label, text, textArea, richTextArea and radio inputs) or the values element (for check and select inputs). Each input is decorated by a label which can be defined with the label element. For inputs with multiple selection values the options element should be used (see below for an example). Let us assume that the developer wants to use the out-of-the-box WebGUI web application to request the user's gender (via radio inputs, with the female option checked by default), age (via a single-line text input) and CV (via a rich text input). The following JSON definition would be valid:

{webguiDocType: 'request', label: 'User data', data: [
  {name: 'gender', label: 'Gender', pref: 'radio', options: [
    {label: 'Male', value: 'male'},
    {label: 'Female', value: 'female'}
  ], value: 'female'},
  {name: 'age', label: 'Age', pref: 'text'},
  {name: 'cv', label: 'CV', pref: 'richTextArea'}
]}

If the presented forms are too simple to implement your case you can always write your own web application and integrate it with Workbench by following the instructions in the next section.

Implementing your own WebGUI application

The WebGUI protocol is quite simple (an external web application has to comply with it in order to be used by the WebGUI mechanism). First, from within a snippet of an experiment, a POST call initializes the WebGUI process by sending the following POST parameters to the ${GS2_WEBGUI_ENDPOINT}/start endpoint:

After sending the above POST request the status of data retrieval from the user should be checked by constructing the following GET request:

${GS2_WEBGUI_ENDPOINT}/checkStatus?gs2ExperimentSessionId=${GS2_EXPERIMENT_SESSION_ID}

If the data is not yet available the response body will hold the IN_PROGRESS value. Otherwise a JSON response will be sent. If the user cancels the dialog the CANCEL value will be returned instead.
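As a hedged illustration, a snippet could poll the checkStatus endpoint until the user submits or cancels the form (the polling interval and use of curl are illustrative assumptions):

# Minimal polling sketch (Bash) against the endpoint described above;
# the 5-second interval is an assumption, not a requirement of the protocol
while true; do
  RESPONSE=$(curl -s "${GS2_WEBGUI_ENDPOINT}/checkStatus?gs2ExperimentSessionId=${GS2_EXPERIMENT_SESSION_ID}")
  if [ "$RESPONSE" != "IN_PROGRESS" ]; then
    break
  fi
  sleep 5
done
echo "$RESPONSE"   # JSON with the submitted values, or CANCEL if the user cancelled the dialog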

Experiment metadata

The Experiment metadata icon button placed at the top of the experiment window brings up a menu which enables you to manipulate experiment metadata and define its properties. Experiment properties consist of the following items:

Most of these properties can be manipulated by the Workbench - a sample properties window is presented below.

Experiment metadata panel

GridSpace provides a handy tool with which to prepare experiment manuals (if needed) - clicking Add manual displays another window where you can type your experiment manual with the use of basic formatting tools. Once you're satisfied with the properties you have defined for your experiment, click OK. These properties will be serialized along with the experiment and stored in the experiment file.

Grid jobs

The Experiment Workbench is equipped with a module simplifying gLite job submission. As the job submission feature is available as a regular executor, it can be accessed through the login page. Access is granted to users who are members of the virtual organisation vo.plgrid.pl. To facilitate usage, two authentication methods were introduced: authentication may be performed either by uploading a previously generated proxy (the VOMS extension is required) or by using the proxy generation applet. The browser applet uses a key and certificate (.pem) to securely create a proxy on the client side and upload it to the server.

After logging in, the Workbench shows the contents of the user's LFC home directory lfn://lfc.grid.cyf-kr.edu.pl/grid/<user>. The grid executor can be used to submit snippets for execution on the grid. Previously created snippets may easily be enabled to run on the grid by selecting the appropriate interpreter/executor pair. Analogously to any other method of execution supported by GridSpace, files in the user directory (remember - this applies to the user's LFC files) can be made accessible to the snippet by specifying an Input. Before the job starts, the default SRM replica of the selected LFC file will be copied on behalf of the user to the worker node. Similarly, any files specified as an output will be registered in LFC after the job is completed.
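As a hedged illustration, a grid snippet might refer to its staged input only by the bare file name and write a file declared as an output (the file names and command below are assumptions):

# Hypothetical grid snippet (Bash): input.dat is declared as an Input (staged from LFC
# to the worker node), results.dat as an Output to be registered in LFC after the job
grep "signal" input.dat > results.dat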

How to use Input/Output in the grid executor

There are two fields that have to be specified when creating an Input/Output - Path and URL. Analogously to other executors, Path is the string used in the snippet to point to the file. Remember that when specifying an Input, Path may contain ONLY the name of the input file; this limitation results from the flat structure of the working directory on the worker node before the job runs. The URL field points to the file in LFC. When specifying the URL, three different variants are allowed:

RESTful invocation

The Experiment Workbench provides a REST programming interface for running code on available executors. It allows for asynchronous task submission, management and retrieval of execution information. A user may invoke the REST services after logging in to the executor that is to be called through the REST interface. The experiment session identifier plays the role of an authentication token as long as the session is active on the server side. Additionally, the Workbench allows the REST services to be used after the user has logged out, provided the session is persisted when exiting GridSpace (the user is asked whether to do so when logging out).

Exposed services

The programmatic interface is exposed as a series of services accessible through the POST or GET HTTP methods. Services accept and produce messages containing request bodies encoded in one of three MIME types: application/json, application/xml and text/xml. The URL template presented below should help you understand the invocation mechanism:

<workbench-base-url>/restful/<executor-id>/<service>/<experiment-session-id>[/<controller-id>]

A sample URL of a service listing the tasks submitted in the current session on the grid is the following: https://gs2.plgrid.pl/workbench/restful/grid-executor-0.1.1/list/13b4c7aba29
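As a hedged illustration, such a service can be invoked with any HTTP client, for example curl (the executor id and session identifier below are just the sample values above):

# Hypothetical invocation of the LIST service with curl; the Accept header selects JSON
curl -H "Accept: application/json" \
  "https://gs2.plgrid.pl/workbench/restful/grid-executor-0.1.1/list/13b4c7aba29"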

The list of exposed REST services is presented below.

RUN service

Name: RUN
Http method: POST
Returns: execution controller id (field controllerId in JSON message, controller-id in XML message)
Url template: <workbench-base-url>/restful/<executor-id>/run/<experiment-session-id>
Description: The service allows you to run an execution by specifying the execution code as well as input and output files.

Request message fields:

The listing below presents a sample request message body in JSON format. The sample shows all of the fields present in the run-execution object structure. The cmd, stageIn and stageOut fields are not compulsory.

{
"cmd":"/bin/bash",
"code":"cat file1.txt > file2.txt",
"input":[{
	"path":"file1.txt",
	"url":"lfn://abc.def.ghi/file1.txt"
	}],
"output":[{
	"path":"file2.txt",
	"url":"null
	}]
}

And the corresponding success response:

{
"responseStatus":"SUCCESS",
"errorMessage":null,
"controllerId":"601e8c2f-e172-45e2-b642-d9b6e1d46695"
} 

Error message response body:

{
"responseStatus":"ERROR",
"errorMessage":"Sample error cause",
"controllerId":null
}

Exactly the same request in XML format:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<run-execution api-version="v1">
    <cmd>/bin/bash</cmd>
    <code>cat file1.txt > file2.txt</code>
    <input>
        <path>file1.txt</path>
        <url>lfn://abc.def.ghi/file1.txt</url>
    </input>
    <output>
        <path>file2.txt</path>
    </output>
</run-execution>

Corresponding response on success:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<execution-run api-version="v1">
    <response-status>SUCCESS</response-status>
    <controller-id>601e8c2f-e172-45e2-b642-d9b6e1d46695</controller-id>
</execution-run>

Error message response body:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<execution-run api-version="v1">
    <response-status>ERROR</response-status>
    <error-message>Sample error cause</error-message>
</execution-run>
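For illustration, the JSON request above could be submitted from the command line as follows (a minimal sketch, reusing the executor id and session id from the earlier LIST example and assuming the request body is stored in a local file request.json):

# Hypothetical submission of the RUN request with curl; request.json holds the JSON body shown above
curl -X POST -H "Content-Type: application/json" \
  -d @request.json \
  "https://gs2.plgrid.pl/workbench/restful/grid-executor-0.1.1/run/13b4c7aba29"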

LIST service

Name: LIST
Http method: GET
Url template: <workbench-base-url>/restful/<executor-id>/list/<experiment-session-id>
Description: The service lists all running executions.

CANCEL service

Name: CANCEL
Http method: POST
Url template: <workbench-base-url>/restful/<executor-id>/cancel/<experiment-session-id>/<controller-id>
Description: The service cancels a running execution.

STATUS service

Name: STATUS
Http method: GET
Url template: <workbench-base-url>/restful/<executor-id>/status/<experiment-session-id>/<controller-id>
Description: The service returns the status of a running execution.

INFO service

Name: INFO
Http method: GET
Url template: <workbench-base-url>/restful/<executor-id>/info/<experiment-session-id>/<controller-id>
Description: The service returns information about a running execution. The information contains the status, the time of submission and the time of execution end.

OUTPUT service

Name: OUTPUT
Http method: GET
Url template: <workbench-base-url>/restful/<executor-id>/output/<experiment-session-id>/<controller-id>
Description: The service returns the standard output of an execution. A failure message is returned when the output is not ready yet.
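As a hedged illustration, the STATUS and OUTPUT services can be combined to check on and collect the results of an execution submitted earlier (the identifiers below reuse the sample values shown elsewhere in this section):

# Hypothetical check of a previously submitted execution (Bash); the session and
# controller identifiers are the sample values used earlier in this guide
BASE="https://gs2.plgrid.pl/workbench/restful/grid-executor-0.1.1"
SESSION="13b4c7aba29"
CONTROLLER="601e8c2f-e172-45e2-b642-d9b6e1d46695"
curl "$BASE/status/$SESSION/$CONTROLLER"     # check whether the execution has finished
curl "$BASE/output/$SESSION/$CONTROLLER"     # retrieve its standard output once ready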

ERROR service

Name: ERROR
Http method: GET
Url template: <workbench-base-url>/restful/<executor-id>/error/<experiment-session-id>/<controller-id>
Description: The service returns the standard error of an execution. A failure message is returned when the standard error is not ready yet.