Difference between revisions of "Arbor"

From PROTEOMICA

Revision as of 16:24, 3 October 2017

Arbor v1.05 is a program developed in the Jesus Vazquez Cardiovascular Proteomics Lab at the Centro Nacional de Investigaciones Cardiovasculares. It generates the tree graph of a set of categories, showing the position of a given list of categories in the tree, along with category information.

Arbor takes four input files (the first one is optional):

  • a stats file: the outStats file from SanXoT (using the -z option). If it is omitted, the tree only distinguishes the categories in the list from the other categories above them, without showing the protein values within each category.
  • a list of higher level elements to graph (using the -c option)
  • a relations file (using the -r option)
  • a list of links between higher level elements, such as the table_allPaths.xls from GOconnect (using the -b option)
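The links file can be pictured as a set of parent-to-child links. The following is only a minimal sketch, not Arbor's actual code: it assumes the layout given for the -b option in the Arguments section (tab-separated, first column skipped, one top-to-bottom path per row from the second column on) and assumes there is no header row. The function name read_path_edges and the sample rows are made up for illustration.

```python
import csv
import io

def read_path_edges(handle):
    """Collect (parent, child) links from a tab-separated paths table.

    Assumed layout: column 1 is an identifier that is skipped, and
    columns 2 onward list one path from the top element down to the
    most specific one.
    """
    edges = set()
    for row in csv.reader(handle, delimiter="\t"):
        # Drop the first column and any empty cells.
        path = [cell.strip() for cell in row[1:] if cell.strip()]
        # Each consecutive pair in the path is one parent -> child link.
        edges.update(zip(path, path[1:]))
    return edges

# Made-up example rows (not real GOconnect output).
sample = io.StringIO(
    "P1\troot\tmetabolism\tlipid metabolism\n"
    "P2\troot\tmetabolism\n"
)
edges = read_path_edges(sample)
```

Storing the links in a set means a link shared by several paths is kept only once, which is convenient when the links are later drawn as a tree.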

Arbor delivers three output files:

  • the graph in PNG format (default suffix: "_outTree.png")
  • the DOT language text file used to generate the graph (default suffix: "_outTree.gv")
  • a log file (default suffix: "_logFile")
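As a rough illustration of how these default names fit together, the sketch below joins a prefix with the three suffixes listed above. It assumes the output name is simply prefix plus suffix inside a common folder; the function name default_output_paths and the example prefix and folder are hypothetical, and the way Arbor derives the prefix from the stats file is not reproduced here.

```python
import os

# Default suffixes as listed above.
GRAPH_SUFFIX = "_outTree.png"
DOT_SUFFIX = "_outTree.gv"
LOG_SUFFIX = "_logFile"

def default_output_paths(prefix, folder):
    """Return the three default output paths for a prefix and folder.

    Assumption: each name is the prefix followed by the default suffix.
    """
    return {
        "graph": os.path.join(folder, prefix + GRAPH_SUFFIX),
        "dot": os.path.join(folder, prefix + DOT_SUFFIX),
        "log": os.path.join(folder, prefix + LOG_SUFFIX),
    }

# Hypothetical prefix and folder, for illustration only.
paths = default_output_paths("experiment1", "results")
```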

Usage:

arbor.py -z[stats file] -r[relations file] -c[higher level list file] -b[links file] [OPTIONS]
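When Arbor is called from another script, the usage line can be assembled as an argument list. This is a sketch under assumptions: the helper build_arbor_command and all file names are hypothetical, the interpreter is assumed to be available as "python", and option values are attached directly to the short flags as in the usage line.

```python
def build_arbor_command(relations, category_list, links, stats=None, extra=()):
    """Assemble an argument list that follows the usage line above."""
    # Assumption: the interpreter is reachable as "python".
    cmd = ["python", "arbor.py"]
    if stats is not None:
        # The stats file is optional, so -z is added only when given.
        cmd.append("-z" + stats)
    cmd += ["-r" + relations, "-c" + category_list, "-b" + links]
    cmd += list(extra)
    return cmd

# Hypothetical file names; the long options are taken from the argument list.
cmd = build_arbor_command(
    relations="relations.txt",
    category_list="categories.txt",
    links="table_allPaths.xls",
    stats="experiment1_outStats.xls",
    extra=["--graphformat=svg", "--altmax=3"],
)
# To run it, the list could be passed to subprocess.run(cmd, check=True).
```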

Arguments:

  -h, --help          Display this help and exit.
  -a, --analysis=string
                      Use a prefix for the output files. If this is not
                      provided, then the prefix will be taken from the
                      stats file.
  -b, --biglist       A list of links between higher level elements, such as
                      the table_allPaths.xls from GOconnect. It must be a tab
                      separated values text file, containing any identifier
                      in the first column (this column will not be imported,
                      but originally it was intended to contain protein
                      identifiers for each path), containing in each row (from
                      the second column on) a possible path from top to the
                      most specific element.
  -c, --list=filename The text file containing the higher level elements whose
                      categories we want to relate. If the first element is
                      not read, saving the file in ANSI format might help.
                      If a header is used, then it must be in the form
                      "id>n>Z>FDR" or "id>Z>n" (where ">" means "tab").
  -d, --dotfile=filename
                      To use a non-default name for the text file in DOT
                      language, which is used to generate the graph.
  -g, --graphformat=string
                      File format for the similarity graph (default is "png").
  -G, --outgraph=filename
                      To use a non-default name for the graph file.
  -l, --graphlimits=integer
                      To set the +- limits of the most intense red/green
                      colours in the graph (default is 6).
  -L, --logfile=filename
                      To use a non-default name for the log file.
  -N, --altmax=integer
                      Maximum number of lower level elements that the alt text
                      of the higher level node will show per side. For
                      instance, for N = 3, alt text will show all the elements
                      up to six; beyond this, only the first and last three
                      will be shown. (Default is N = 5.) (Note that this will
                      have effect if the SVG format is used.)
  -p, --place, --folder=foldername
                      To use a different common folder for the output files.
                      If this is not provided, then the folder used will be
                      the same as the stats file folder.
  -r, --relfile, --relationsfile=filename
                      Relations file, with identifiers of the higher level
                      in the first column, and identifiers of the lower
                      level in the second column.
  -z, --outstats=filename
                      The outStats file from a SanXoT integration (optional,
                      see above).
  --selectednodecolor=#rrggbb, --selectednodecolour=#rrggbb
  --defaultnodecolor=#rrggbb, --defaultnodecolour=#rrggbb
  --errornodecolor=#rrggbb, --errornodecolour=#rrggbb
  --mincolor=#rrggbb, --mincolour=#rrggbb
  --middlecolor=#rrggbb, --middlecolour=#rrggbb
  --maxcolor=#rrggbb, --maxcolour=#rrggbb
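The truncation rule described for -N can be sketched as follows. This is only an illustration of the rule as worded in the help text, not Arbor's own code: the function name summarize_elements is hypothetical, and the "..." marker placed between the two halves is an assumption, since the help text does not say how the omitted elements are indicated.

```python
def summarize_elements(elements, n=5):
    """Apply the -N rule: keep everything when there are at most 2*n
    elements, otherwise keep only the first n and the last n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    elements = list(elements)
    if len(elements) <= 2 * n:
        return elements
    # Assumption: omitted elements are marked with "...".
    return elements[:n] + ["..."] + elements[-n:]

# With N = 3, six elements are shown in full and seven are shortened.
six = summarize_elements(["a", "b", "c", "d", "e", "f"], n=3)
seven = summarize_elements(["a", "b", "c", "d", "e", "f", "g"], n=3)
```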