Results

Opening the Result Table

To view the results of an analysis job:

  1. Select a completed analysis job in the Jobs tab.
  2. Under Edit Job:, next to the text Result:, click the View... button.
  3. A browser tab will open with the results displayed as a table. Alternatively, you can download a CSV file containing the corresponding results by clicking the link next to View....
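The downloaded CSV file can also be inspected with a short script. The following is a minimal sketch, not part of the tool itself: the `;` delimiter and the column headers `name`, `rank`, and `kmers` in the sample data are assumptions made for illustration. The actual headers are described in the Genestrip README.

```python
import csv
import io


def load_rows(text_stream, delimiter=";"):
    """Read a result CSV into a list of dicts keyed by the header row.

    The delimiter is an assumption; adjust it to match the downloaded file.
    """
    return list(csv.DictReader(text_stream, delimiter=delimiter))


# Made-up sample data with hypothetical column headers.
sample = io.StringIO("name;rank;kmers\nExample organism;species;1200\n")
rows = load_rows(sample)
print(rows[0]["name"], rows[0]["kmers"])
```

For a real file, pass an opened file object instead of the in-memory sample, for example `open("result.csv", newline="")`, where `result.csv` is a placeholder name.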

Important Table Contents

The table in the browser contains an entry for each organism that had at least one hit. A “hit” refers to a base sequence of length 31 that uniquely belongs to the genome of the corresponding organism.
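To illustrate what a base sequence of length 31 means in practice, the sketch below splits a read into all of its overlapping 31-base windows. It only shows the decomposition step; how such windows are matched against organism genomes is not shown here, and the function name `kmers` and the sample read are made up for illustration.

```python
K = 31  # sequence length stated in the text


def kmers(sequence: str, k: int = K) -> list[str]:
    """Return all overlapping substrings of length k, in reading order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


read = "ACGT" * 10  # a made-up read of 40 bases
windows = kmers(read)
print(len(windows))  # 40 - 31 + 1 = 10 windows
```

A read shorter than 31 bases yields no windows at all and therefore cannot contribute a hit.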

The table includes the following important columns:

  • Pos.: Used to sort organisms primarily by relatedness.
  • Level: The depth of the entry in the taxonomy tree.
  • Name: The name of the organism.
  • Rank: The biological rank of the organism. Typically, the ranks genus, species, and strain are required for unambiguous identification of organisms. Higher ranks such as order, class, or phylum typically group too many related species together. Viruses are an exception, as they are sometimes not (clearly) structured via ranks.
  • Tax Id: The unique identifier of an organism from the globally used NCBI taxonomy.
  • k-mers: The number of hits, counted as so-called k-mers, for the corresponding organism. Important: Low values usually do not indicate the presence of the organism in the sample. Low values can instead be caused by errors or artifacts in the DNA sequencing process. For “real hits”, the number of k-mers is typically at least 2 to 3 orders of magnitude above “false hit values”.
  • Unique k-mers: The number of distinct hits for the corresponding organism. In contrast to k-mers, repeated counts of the same base sequence as hits are eliminated here. The value should (except for viruses) not be considerably smaller than the value for k-mers. Otherwise, the two values are inconsistent, and no conclusions about the presence of the organism in the sample can be drawn.
  • U. k-mers / Exp.: A measure assessing the consistency of the values Unique k-mers and k-mers. The measure usually lies between 0 and 1. A result close to 1 indicates high consistency. A result far below 1 indicates low consistency; this also applies to viruses.
  • Max C. Length: The length of the longest sequence of k-mers (a “contig”) that can be uniquely assigned to the organism. A value of 1 corresponds to 31 bases (one k-mer), a value of 2 corresponds to 32 bases (2 consecutive, overlapping k-mers), and so on. Values significantly greater than 1 further support the presence of the corresponding organism in the sample.
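The column descriptions above can be turned into a small plausibility check. The sketch below is an illustration under stated assumptions, not the tool's own logic: the function names and the three threshold parameters are hypothetical, and this document does not prescribe concrete cut-off values, so the caller must supply them. The conversion from Max C. Length to bases follows directly from the text (1 corresponds to 31 bases, 2 to 32 bases).

```python
K = 31  # k-mer length stated in the text


def contig_bases(max_c_length: int, k: int = K) -> int:
    """Bases covered by a run of max_c_length consecutive, overlapping k-mers."""
    if max_c_length < 1:
        raise ValueError("expected a run of at least one k-mer")
    return max_c_length + k - 1


def review_entry(kmers, consistency, max_c_length, *,
                 min_kmers, min_consistency, min_contig):
    """Return a list of concerns for one table entry.

    All thresholds are hypothetical and must be chosen by the caller.
    `consistency` stands for the value of the column U. k-mers / Exp.
    """
    concerns = []
    if kmers < min_kmers:
        concerns.append("low k-mer count")
    if consistency < min_consistency:
        concerns.append("low consistency")
    if max_c_length < min_contig:
        concerns.append("short contig")
    return concerns


print(contig_bases(1), contig_bases(2))  # 31 32

# Made-up entry and made-up thresholds, for illustration only.
print(review_entry(12, 0.95, 1, min_kmers=1000, min_consistency=0.5, min_contig=5))
```

An empty list means that none of the supplied thresholds was violated. It does not by itself prove that the organism is present in the sample.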

All About the Table Columns

The Genestrip README file contains a detailed, technical description of all table columns for the corresponding CSV file.

Continue to “Further Information” …