

What is GOssToWeb?

GOssToWeb calculates the semantic similarity between genes or between terms in the Gene Ontology.

If you find GOssTo useful, please cite the following publications:

  • Haixuan Yang, Tamás Nepusz and Alberto Paccanaro, Improving GO semantic similarity measures by exploring the ontology beneath the terms and modelling uncertainty, Bioinformatics, vol. 28, iss. 10, pp. 1383-1389, 2012.
  • Horacio Caniza, Alfonso E. Romero, Samuel Heron, Haixuan Yang, Alessandra Devoto, Marco Frasca, Giorgio Valentini and Alberto Paccanaro, GOssTo: a user-friendly stand-alone and web tool for calculating semantic similarities on the Gene Ontology, Bioinformatics, 2014, doi: 10.1093/bioinformatics/btu144.

Note that you can also download the software and run it locally on your machine. Please visit the GOssTo webpage at PaccanaroLab for more details. A copy of our disclaimer and general information about GOssToWeb can be found here.


GOssTo and GOssToWeb are tools to calculate semantic similarity in the Gene Ontology (GOssTo stands for Gene Ontology Semantic Similarity Tool).

GOssTo is a stand-alone tool that can be run through a command-line interface or integrated as a library into a pipeline. It runs on Windows, GNU/Linux and macOS, and no installation is required: just download the file and run it. GOssTo is available here.

GOssToWeb calculates semantic similarities without the need to download and run the stand-alone version. However, fewer parameters are customizable than in the stand-alone version.

GOssToWeb runs best on Mozilla Firefox and Google Chrome.

STEP 1 - Organisms and Gene Ontology Annotations

Select the desired organism -- just start typing the name in the text box below.

Select the evidence codes you want to consider -- we have pre-selected the experimental evidence codes for the annotations.

In order for the semantic similarity calculations to be meaningful, the number of annotated genes cannot be too small. For GOssTo we have set this number to 10 – that is, we need at least 10 annotated genes to compute semantic similarities.

Using the parameters that you have selected, fewer than 10 genes have an annotation, and therefore calculations cannot be carried out.

Try selecting a different set of evidence codes – each annotation is assigned an evidence code, and very likely the codes you have selected exclude most annotations for this organism.

Note that, for less well-studied organisms, you might need to include “non-experimental” evidence codes, as these organisms have few experimental annotations. Details about GO evidence codes can be found here.


You can enter the name of any organism which appears in UniProt GOA. Please keep in mind that we download the annotation files automatically from UniProt GOA. This means that the names and files are exactly those provided by UniProt GOA.

To help you select the desired organism, each name is listed followed by its Taxon ID. For example: Drosophila melanogaster Berkeley (7227). In this case, 7227 is the Taxon ID for this particular organism.

The annotation files are fetched regularly from UniProt GOA. Some model organisms are also available through their common names as listed here. If you want to run GOssTo with a specific annotation file, you will have to download the stand-alone version and run it locally. For more details see the GOssTo website at PaccanaroLab.

Evidence codes. Every annotation matches a gene product to one or more Gene Ontology terms. This matching can be supported by different types of evidence, which is reflected by the use of different evidence codes (for more information see the Gene Ontology site). GOssTo considers only the evidence codes selected in the list above and ignores all annotations whose evidence codes were not selected. If the selected organism has no annotations with the selected evidence codes, GOssTo cannot produce any results, and the result files you receive will indicate this problem.

We have pre-selected the experimental evidence codes for the annotations. Annotations supported by these evidence codes are expected to be of the highest quality.
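The effect of the evidence-code selection and of the 10-gene minimum can be sketched in Python. This is only an illustration, not GOssTo's own code: it assumes the annotation file is in the tab-separated GAF format (gene identifier in column 2, GO term in column 5, evidence code in column 7), the function names are hypothetical, and the set of experimental codes shown is an assumption, since the exact pre-selected codes are not listed here.

```python
from collections import defaultdict

# Assumption: the classic GO experimental evidence codes. The exact set
# pre-selected by GOssToWeb is not listed in the text.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}

# Minimum number of annotated genes, as stated in the text.
MIN_ANNOTATED_GENES = 10


def filter_annotations(gaf_lines, selected_codes):
    """Return {gene_id: set of GO IDs}, keeping only the selected evidence codes."""
    annotations = defaultdict(set)
    for line in gaf_lines:
        # Skip blank lines and GAF comment/header lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # malformed line: not enough columns
        gene_id, go_id, evidence = fields[1], fields[4], fields[6]
        if evidence in selected_codes:
            annotations[gene_id].add(go_id)
    return dict(annotations)


def enough_genes(annotations, minimum=MIN_ANNOTATED_GENES):
    """True when at least `minimum` genes still carry an annotation."""
    return len(annotations) >= minimum
```

If `enough_genes` returns False for the chosen codes, widening the selection (for example by adding non-experimental codes) is the remedy suggested above.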

STEP 2 - Gene Ontology

Select the relations between terms that you want to consider -- we will fetch the most recent version of the Gene Ontology.


This step deals with the configuration of the Gene Ontology. We automatically provide the most up-to-date ontologies from UniProt GOA. Here you have to select the relations you want to consider.

There are several possible relations between the different GO terms, such as is_a, has_part and part_of (see the Gene Ontology website on relations for more information). We have pre-selected the ones most often used when calculating semantic similarity. However, you can select any combination of relations from the list above.
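To see why the choice of relations matters, consider how the ancestors of a term are collected: only edges whose relation type was selected are followed, so a different selection yields a different ancestor set. The following Python sketch illustrates this under stated assumptions. It is not GOssTo's implementation; the `edges` structure, the function name and the toy term labels are all hypothetical.

```python
def ancestors(term, edges, selected_relations):
    """Collect every term reachable from `term` through the selected relations.

    `edges` maps a term to a list of (relation, parent) pairs. This layout is
    an assumption made for the example, not a format used by GOssTo.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for relation, parent in edges.get(current, []):
            # Ignore edges whose relation type was not selected.
            if relation in selected_relations and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Toy ontology with made-up term labels.
toy_edges = {
    "C": [("is_a", "B"), ("part_of", "P")],
    "B": [("is_a", "A")],
    "P": [("is_a", "A"), ("has_part", "H")],
}

only_is_a = ancestors("C", toy_edges, {"is_a"})
with_part_of = ancestors("C", toy_edges, {"is_a", "part_of"})
```

In this toy graph, adding part_of to the selection brings the term P into the ancestor set of C, which in turn changes any similarity computed from those ancestors.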

STEP 3 - Semantic similarity

Select what to calculate -- semantic similarity calculations can be performed on genes or on GO terms. Please select the appropriate option for your experiment.

Optional: Select a specific set of genes.

You can select a specific set of genes from the annotation file by inputting the list of desired genes using their UniProt IDs. Enter one UniProt identifier per line and we will retrieve them.

If you are interested in the semantic similarity of the entire set of genes, leave the box empty.

For more information on this process, please refer to the manual.
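The optional gene list can be pictured as a simple filter on the annotated genes. The Python sketch below is a minimal illustration with hypothetical function names. It assumes the annotations are held as a mapping from UniProt identifier to GO terms, which is a choice made for the example and not a description of GOssToWeb's internals.

```python
def parse_gene_list(text):
    """Return the identifiers entered one per line, or None when the box is empty."""
    ids = [line.strip() for line in text.splitlines() if line.strip()]
    return ids or None


def select_genes(annotations, gene_ids):
    """Restrict {gene: terms} to the requested genes; None keeps the entire set."""
    if gene_ids is None:
        return annotations
    wanted = set(gene_ids)
    return {gene: terms for gene, terms in annotations.items() if gene in wanted}


# Placeholder identifiers and term labels, used only for illustration.
toy_annotations = {"P12345": {"T:1"}, "Q00001": {"T:2"}, "Q00002": {"T:3"}}
subset = select_genes(toy_annotations, parse_gene_list("P12345\n\nQ00002\n"))
everything = select_genes(toy_annotations, parse_gene_list(""))
```

Identifiers that do not appear in the annotations are silently dropped in this sketch; how GOssToWeb reports such identifiers is not covered here.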

Select the semantic similarity you want to calculate -- we will also compute the corresponding Improved Semantic Similarity, which considers the Random Walk Contribution, as described in Yang et al., Bioinformatics, vol. 28, iss. 10, pp. 1383-1389, 2012.


We can calculate two different types of semantic similarities on the Gene Ontology:

  • Term wise: similarities between GO terms for a given organism (GO terms to which some of the organism's genes are annotated)
  • Gene wise: similarities between genes in an organism according to their GO annotations.

Note that simUI and simGIC are designed exclusively for Gene wise calculations and will, therefore, not be available when the Term wise option is enabled.
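The restriction on simUI and simGIC follows from how these measures are commonly defined: both compare the sets of GO terms attached to two genes, so there is nothing to compare for a single pair of terms. The Python sketch below uses those common definitions (simUI as the size of the intersection over the size of the union, simGIC as the same ratio weighted by information content). It is an illustration under these assumptions and not GOssTo's implementation; the function names, the toy term labels and the information-content values are made up.

```python
def sim_ui(terms_a, terms_b):
    """Shared terms divided by all terms of the two genes (a Jaccard index).

    Each set is expected to hold a gene's annotated terms plus their ancestors.
    """
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


def sim_gic(terms_a, terms_b, ic):
    """Like sim_ui, but each term is weighted by its information content.

    `ic` maps a term to its information content, typically -log of the
    fraction of genes annotated to that term.
    """
    union_ic = sum(ic[t] for t in terms_a | terms_b)
    if union_ic == 0:
        return 0.0
    return sum(ic[t] for t in terms_a & terms_b) / union_ic


# Toy data: term labels and information-content values are invented.
gene_1 = {"a", "b", "c"}
gene_2 = {"b", "c", "d"}
toy_ic = {"a": 1.0, "b": 1.0, "c": 2.0, "d": 4.0}

ui = sim_ui(gene_1, gene_2)            # 2 shared terms out of 4
gic = sim_gic(gene_1, gene_2, toy_ic)  # shared IC 3.0 out of total IC 8.0
```

Because rare terms carry a higher information content, simGIC in this example is lower than simUI: the unshared term "d" is the most informative one.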

STEP 4 - Experiment information


Once you click on "Submit" you will be redirected to a page containing a link to your results page. Keep in mind that this link will not be active until the job has completed and the results are ready.

If you provide us with an email address, we will also notify you when the results are ready with an email containing a link to your results.

When will my results be ready?

Running times depend on several factors: the selected organism, the number of annotations available for the organism, the selected relations, the selected evidence codes and the load of the server. Times can range from a few minutes to a few hours.

Once you have submitted your job, we will be able to provide a more accurate time estimate for its completion.