Output files

This section describes the final and intermediate output files of gapseq.

gapseq find

*-Pathways.tbl

This file contains detailed information for all pathways that were considered by the gapseq find command. It is a tab-separated text file with the following columns:

Column             Description
ID                 Pathway ID
Name               Pathway name
Prediction         Inferred presence (true/false)
Completeness       Ratio of found/total reactions (%)
VagueReactions     Number of reactions without available sequences
KeyReactions       Total number of important key reactions
KeyReactionsFound  Number of found key reactions
ReactionsFound     Names of found reactions
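
As an illustration of how this table can be used, the following minimal Python sketch lists the pathways whose Prediction column is true. The sample rows are made up, and two details are assumptions rather than documented behaviour: that lines starting with "#" are comments to be skipped, and that the first remaining line is the header row.

```python
import csv
import io

def read_tbl(handle):
    """Read a tab-separated table into a list of dicts.

    Assumption: lines starting with '#' are comments and the first
    remaining line is the header row.
    """
    lines = (line for line in handle if not line.startswith("#"))
    return list(csv.DictReader(lines, delimiter="\t"))

def predicted_pathways(rows):
    """Return (ID, completeness) for pathways whose Prediction is true."""
    return [
        (row["ID"], float(row["Completeness"]))
        for row in rows
        if row["Prediction"].strip().lower() == "true"
    ]

# Made-up sample data using the columns listed above.
SAMPLE = (
    "ID\tName\tPrediction\tCompleteness\tVagueReactions\t"
    "KeyReactions\tKeyReactionsFound\tReactionsFound\n"
    "PWY-A\texample pathway A\ttrue\t100\t0\t1\t1\trxnA1 rxnA2\n"
    "PWY-B\texample pathway B\tfalse\t33\t1\t2\t0\trxnB1\n"
)

rows = read_tbl(io.StringIO(SAMPLE))
print(predicted_pathways(rows))
```

To run this on a real file, replace the `io.StringIO(SAMPLE)` argument with an opened file handle for the *-Pathways.tbl file.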

*-Reactions.tbl

This tab-separated text file contains detailed information about all checked reactions.

Column          Description
rxn             Reaction ID
name            Reaction name
ec              EC number
bihit           Bidirectional hit
qseqid          Query Seq-id
pident          Percentage of identical matches
evalue          Expect value
bitscore        Bit score
qcovs           Query Coverage Per Subject
stitle          Subject Title
sstart          Start of alignment
send            End of alignment
pathway         Associated pathway
status          Blast status of reaction*
pathway.status  Status of associated pathway**
dbhit           Mapped model reactions
complex         Detected protein complex
exception       Higher identity cutoff used***
complex.status  Status of protein complex

  • *The blast status of a reaction reports the result of the homology search. Possible values: bad_blast (a blast hit of lower quality, i.e. with a bitscore, coverage, or identity below the cutoffs), good_blast (a blast hit satisfying all cutoffs), no_blast (no blast hit found), no_seq_data (no sequence data available), spontaneous (no enzyme needed).

  • **The status of the associated pathway indicates the criterion by which the pathway was predicted. Possible values: full (all reactions were found), keyenzyme (the found key reactions indicate pathway presence, with at least 66% of the pathway reactions present), NA (the pathway is not predicted to be present), threshold (the pathway is predicted to be present because at least 80% of its reactions are present).

  • ***For enzymes whose sequences are highly similar to those of enzymes with a different function, a higher identity cutoff is used in the blast search (these exceptions are defined in gapseq/dat/exceptions.tbl).
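
The status values described above lend themselves to a quick summary. The following minimal Python sketch counts how often each blast status occurs. The sample rows are made up and trimmed to a few of the columns, and the counts are per table row, which need not equal the number of distinct reactions.

```python
import csv
import io
from collections import Counter

# Made-up rows, trimmed to a few of the columns listed above.
SAMPLE = (
    "rxn\tname\tstatus\tpathway.status\n"
    "RXN-1\texample reaction 1\tgood_blast\tfull\n"
    "RXN-2\texample reaction 2\tbad_blast\tNA\n"
    "RXN-3\texample reaction 3\tno_seq_data\tNA\n"
    "RXN-4\texample reaction 4\tgood_blast\tthreshold\n"
)

def status_counts(handle):
    """Count the rows of a reactions table per blast status value."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["status"] for row in reader)

counts = status_counts(io.StringIO(SAMPLE))
for status, n in sorted(counts.items()):
    print(status, n)
```

The csv module keeps the pathway status NA as the literal string "NA". Table readers that convert "NA" to a missing value by default would need that conversion switched off to tell it apart from an empty field.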

gapseq find-transport

*-Transporter.tbl

This tab-separated text file lists the transporters that were found.

Column    Description
id        Transporter ID
tc        TC number
sub       Transported substance
exid      Exchange ID
rea       Associated model reaction
qseqid    Query Seq-id
pident    Percentage of identical matches
evalue    E value
bitscore  Bit score
qcovs     Query Coverage
stitle    Subject Title
sstart    Start of alignment
send      End of alignment
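
One possible way to condense this table is to keep, for each transported substance, only the hit with the highest bit score. The following Python sketch shows the idea. The sample rows, including the IDs and TC numbers, are made up and trimmed to a few columns, and selecting by bit score is an illustrative choice, not something gapseq itself is stated to do.

```python
import csv
import io

# Made-up rows, trimmed to a few of the columns listed above.
SAMPLE = (
    "id\ttc\tsub\texid\tbitscore\n"
    "T1\t0.A.0.0.1\tglucose\tEX_example_glc\t120.5\n"
    "T2\t0.A.0.0.2\tglucose\tEX_example_glc\t310.0\n"
    "T3\t0.B.0.0.1\tacetate\tEX_example_ac\t95.2\n"
)

def best_hit_per_substance(handle):
    """Map each substance to the row with the highest bit score."""
    best = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        current = best.get(row["sub"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["sub"]] = row
    return best

best = best_hit_per_substance(io.StringIO(SAMPLE))
for substance, row in sorted(best.items()):
    print(substance, row["id"], row["bitscore"])
```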

gapseq draft

*-rxnWeights.RDS

Reaction weights table (temporary file needed for gapseq fill).

*-rxnXgenes.RDS

Table with gene-X-reaction association (temporary file needed for gapseq fill).

*-draft.RDS

Model draft file as R object.

*-draft.xml

Draft model in SBML format (only created if sybilSBML is installed).

gapseq fill

*.RDS

Final model saved as R object.

*.xml

Final model in SBML format (only created if sybilSBML is installed).