
Galaxy

Introduction

Galaxy is a program that performs dynamic transitive closure analysis (DTCA) on correlations of input data sets. Transitive closure establishes whether a meaningful connection or relationship, direct or indirect, exists between two system elements. The analysis is dynamic in that it progressively eliminates from consideration direct relationships that do not meet a minimum criterion. By examining a range of minimum criteria and recording the value at which transitive closure between two elements is lost, the program characterizes the strength of any transitive relationship between them.
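The threshold scan described above can be illustrated with a minimal Python sketch. It is not Galaxy's implementation: the function names, the use of a symmetric distance matrix with values in [0, 1], and the union-find connectivity test are all assumptions made for illustration. The scan runs from the strictest criterion to the loosest, so the first threshold at which two elements become connected is the same value at which their transitive closure would be lost when scanning in the other direction.

```python
def find(parent, x):
    """Return the representative of x's connected component (union-find)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def closure_thresholds(dist, resolution):
    """Scan thresholds 0, 1/resolution, ..., 1 over a hypothetical distance matrix.

    For each pair (i, j), record the smallest scanned threshold at which the
    two elements are connected, directly or indirectly, using only edges
    whose distance does not exceed the threshold. Pairs that are never
    connected keep the value None.
    """
    n = len(dist)
    result = [[None] * n for _ in range(n)]
    for step in range(resolution + 1):
        threshold = step / resolution
        parent = list(range(n))
        # Keep only the direct relationships that meet the current criterion.
        for i in range(n):
            for j in range(i + 1, n):
                if dist[i][j] <= threshold:
                    parent[find(parent, i)] = find(parent, j)
        # Record pairs that have just become transitively connected.
        for i in range(n):
            for j in range(i + 1, n):
                if result[i][j] is None and find(parent, i) == find(parent, j):
                    result[i][j] = result[j][i] = threshold
    return result


# Hypothetical distances: elements 0 and 2 are far apart directly (1.0),
# but are linked through element 1 by two shorter edges.
example = [
    [0.0, 0.25, 1.0],
    [0.25, 0.0, 0.5],
    [1.0, 0.5, 0.0],
]
thresholds = closure_thresholds(example, resolution=4)
```

In this made-up example, elements 0 and 2 receive a threshold of 0.5 rather than their direct distance of 1.0, because the indirect path through element 1 keeps them connected until edges of length 0.5 are eliminated.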

Version

Version 64 is currently available at OSC.

Availability

Galaxy is available on the Altix.

Usage

The input data set must be named galaxy.in and must reside in the directory from which the executable is invoked.

The following run-time parameters must be supplied:

  1. number of rows to analyze (the number of data rows to read from the galaxy.in file): depends on the size of the file
  2. number of cell lines to read (the number of data columns to read from the galaxy.in file; maximum value is 64): depends on the number of columns in the file
  3. enter resolution of DTCA scan (the number of points between 0 and 1 at which to examine transitive closure; 1000 gives 1/1000th interval accuracy; maximum value is 10,000): recommended value of 1000
  4. enter output level (0 for minimal; 3 for debug output): recommended value of 0
  5. enter path tracking value (1 to track paths; 0 to omit path tracking): 0
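The limits listed above can be checked before a run with a small helper such as the following Python sketch. The function name `build_answers` is hypothetical, and the assumption that Galaxy reads the five values from standard input, one per line and in the listed order, is not confirmed by this page; treat the returned string as an illustration of the parameter order only.

```python
def build_answers(rows, columns, resolution=1000, output_level=0, track_paths=0):
    """Validate the five run-time parameters and join them one per line.

    The limits (64 columns, resolution of 10,000, output levels 0 to 3,
    path tracking 0 or 1) come from the parameter list; the one-value-per-line
    layout is an assumption.
    """
    if rows < 1:
        raise ValueError("number of rows must be at least 1")
    if not 1 <= columns <= 64:
        raise ValueError("number of cell lines must be between 1 and 64")
    if not 1 <= resolution <= 10000:
        raise ValueError("resolution must be between 1 and 10000")
    if output_level not in (0, 1, 2, 3):
        raise ValueError("output level must be between 0 and 3")
    if track_paths not in (0, 1):
        raise ValueError("path tracking value must be 0 or 1")
    values = (rows, columns, resolution, output_level, track_paths)
    return "\n".join(str(v) for v in values) + "\n"


# Answers for a file with 3 data rows and 3 data columns, using the
# recommended resolution and output level.
answers = build_answers(rows=3, columns=3)
```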

The following is a sample galaxy.in file.

  Cell1 Cell2 Cell3
Drug1 0.5 0.4 0.1
Gene2 0.3 0.9 0.5
Gene3 1.0 0.0 0.9
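As a rough check on a data set before running Galaxy, the layout of the sample can be read with a short Python sketch like the one below. It assumes whitespace-separated fields, a first line holding only column labels, and a row label at the start of every data line; these assumptions are inferred from the sample rather than from a format specification, and `parse_galaxy_input` is a hypothetical helper name.

```python
def parse_galaxy_input(text):
    """Split galaxy.in-style text into column labels, row labels, and values."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    columns = lines[0]
    labels = []
    rows = []
    for fields in lines[1:]:
        values = [float(v) for v in fields[1:]]
        if len(values) != len(columns):
            raise ValueError(f"row {fields[0]!r} has {len(values)} values, "
                             f"expected {len(columns)}")
        labels.append(fields[0])
        rows.append(values)
    return columns, labels, rows


SAMPLE = """\
  Cell1 Cell2 Cell3
Drug1 0.5 0.4 0.1
Gene2 0.3 0.9 0.5
Gene3 1.0 0.0 0.9
"""

columns, labels, rows = parse_galaxy_input(SAMPLE)
# len(rows) and len(columns) correspond to the first two run-time parameters.
```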

Two output files are generated:

  1. dtca.dat : this file contains the cumulative transitive closure analysis thresholds between all pairs of elements (indexed by row number in the input). Smaller values in the dtca.dat file identify close edges that must be removed to break transitive closure, indicating strong co-correlation in the overall graph. Larger values indicate a weak co-correlating relationship, in which eliminating long edges is sufficient to break transitive closure between the two elements.
  2. medianpath.dat : this file contains the DTCA profile of the graph, identifying median transitive closure path lengths for the nodes present in the graph. It is used to describe the overall character of the relationships in the graph. In the example provided, a resolution of 1/25th was used. The first value in each row identifies the resolution used for the analysis. The second value identifies the median path distance (-1.0 indicates that no transitive paths are present). The third value identifies the number of transitive paths defined in the system at that threshold.
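The three-value rows of medianpath.dat could be loaded for further inspection with a Python sketch along these lines. The whitespace-separated layout, the `ProfileRow` and `parse_medianpath` names, and the sample text are all assumptions; the sample values are invented for illustration and are not actual Galaxy output.

```python
from typing import List, NamedTuple, Optional


class ProfileRow(NamedTuple):
    resolution: float             # first value in the row
    median_path: Optional[float]  # None when the file reports -1.0
    path_count: int               # number of transitive paths


def parse_medianpath(text: str) -> List[ProfileRow]:
    """Parse medianpath.dat-style text, assuming three numeric fields per row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        median = float(fields[1])
        rows.append(ProfileRow(
            resolution=float(fields[0]),
            median_path=None if median == -1.0 else median,
            path_count=int(float(fields[2])),
        ))
    return rows


# Invented values, used only to exercise the parser.
HYPOTHETICAL_TEXT = "0.04 -1.0 0\n0.08 0.5 2\n"
profile = parse_medianpath(HYPOTHETICAL_TEXT)
```

Mapping -1.0 to None keeps the "no transitive paths present" marker from being mistaken for a real distance in later calculations.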

Documentation

No further documentation is presently available.