SanXoTSieve

From PROTEOMICA
Revision as of 14:35, 3 October 2017 by Mtrevisan

SanXoTSieve v0.14 is a program developed in the Jesus Vazquez Cardiovascular Proteomics Lab at Centro Nacional de Investigaciones Cardiovasculares. It automatically removes lower level outliers from an integration performed with the SanXoT integrator.

SanXoTSieve needs

  • the two input files of a SanXoT integration (see SanXoT's help): options -d and -r, respectively.
  • the variance resulting from that integration: option -V (read from the info file of the integration) or -v (given directly).

... and delivers two output files:

  • a new relations file (by default suffixed "_tagged"), which is identical to the original relations file except that the relations marked as outliers are tagged in the third column.
  • the log file.
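
As a rough illustration of what the tagged relations file looks like, the sketch below writes a tag into the third column of selected relations. It is not SanXoTSieve's own code: the function name tag_outliers, the tab-separated layout, and the way outliers are passed in (as a set of higher/lower identifier pairs) are assumptions made for the example, and any tag already present in the third column is simply overwritten.

```python
def tag_outliers(lines, outliers, tag="out"):
    """Return relation lines with `tag` in the third column of outlier rows.

    Hypothetical helper: assumes tab-separated rows of
    (higher level id, lower level id[, tag]) and that `outliers`
    is a set of (higher, lower) pairs chosen elsewhere.
    """
    tagged = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Leave blank or malformed lines untouched.
        if len(fields) >= 2 and (fields[0], fields[1]) in outliers:
            if len(fields) > 2:
                fields[2] = tag      # simplification: replace any existing tag
            else:
                fields.append(tag)   # create the third column
        tagged.append("\t".join(fields))
    return tagged


relations = ["P1\tpepA", "P1\tpepB", "P2\tpepC"]
for row in tag_outliers(relations, {("P1", "pepB")}):
    print(row)
```

Rows that are not outliers come back unchanged, which matches the description of the tagged file as identical to the original apart from the tags.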

Usage:

sanxotsieve.py -d[data file] -r[relations file] -V[info file] [OPTIONS]
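
The -V option reads the variance from a text file that contains a line of the form "Variance = [double]" (see the description of -V under Arguments). The following sketch shows one way such a line could be extracted. The function name read_variance and the regular expression are assumptions for illustration, not the parser used by SanXoTSieve, and the sketch is deliberately strict in rejecting files where the line is missing or appears more than once.

```python
import re

# Matches e.g. "Variance = 0.0425" or "Variance = 1e-3" (assumed number format).
_VARIANCE_LINE = re.compile(r"Variance\s*=\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")


def read_variance(text):
    """Return the single variance value found in `text`.

    Hypothetical helper: raises ValueError unless exactly one
    "Variance = <number>" entry is present.
    """
    matches = _VARIANCE_LINE.findall(text)
    if len(matches) != 1:
        raise ValueError(
            "expected exactly one 'Variance = ...' line, found %d" % len(matches)
        )
    return float(matches[0])


info_text = "Integration info\nVariance = 0.0425\nDone\n"
print(read_variance(info_text))
```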

Arguments:

  -h, --help          Display basic help and exit.
  -H, --advanced-help Display this help and exit.
  -a, --analysis=string
                      Use a prefix for the output files. If this is not
                      provided, then the prefix will be garnered from the data
                      file.
  -b, --no-verbose    Do not print result summary after executing.
  -d, --datafile=filename
                      Data file with identifiers of the lower level in the
                      first column, measured values (x) in the second column,
                      and weights (v) in the third column.
  -D, --removeduplicateupper
                      When merging data with relations table, remove duplicate
                      higher level elements (not removed by default).
  -f, --fdrlimit=float
                      Use an FDR limit other than the default 0.01 (1%).
  -L, --infofile=filename
                      To use a non-default name for the log file.
  -n, --newrelfile=filename
                      To use a non-default name for the relations file
                      containing the tagged outliers.
  -o, --outlierrelfile=filename
                      To use a non-default name for the relations responsible
                      for outliers (note that outlier relations are only saved
                      when the --oldway option is active).
  -p, --place, --folder=foldername
                      To use a different common folder for the output files.
                      If this is not provided, the folder used will be the
                      same as the input folder.
  -r, --relfile, --relationsfile=filename
                      Relations file, with identifiers of the higher level
                      in the first column, and identifiers of the lower
                      level in the second column.
  -u, --one-to-one    Remove only one outlier per cycle. This is slightly more
                      accurate than the default mode (where the outermost
                      outlier of each category with outliers is removed in
                      each cycle), but usually exasperatingly slow.
  -v, --var, --varianceseed=double
                      Variance used in the corresponding integration.
                      Default is 0.001.
  -V, --varfile=filename
                      Get the variance value from a text file. It must contain
                      exactly one line with the text "Variance = [double]".
                      This suits the info file from a previous integration
                      (see -L in SanXoT).
  --oldway            Do it the old way: instead of tagging, create two
                      separate relations files, with and without outliers.
  --outliertag=string To select a non-default tag for outliers (default: out)
  --tags=string       To define a tag to distinguish groups to perform the
                      integration. The tag can be used by inclusion, such as
                           --tags="mod"
                      or by exclusion, by prefixing it with the "!" symbol, such as
                           --tags="!out"
                      Tags should be included in a third column of the
                      relations file. Note that the tag "!out" for outliers is
                      implicit.
                      Different tags can be combined using logical operators
                      "and" (&), "or" (|), and "not" (!), and parentheses.
                      Some examples:
                           --tags="!out&mod"
                           --tags="!out&(dig0|dig1)"
                           --tags="(!dig0&!dig1)|mod1"
                           --tags="mod1|mod2|mod3"
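
To make the --tags syntax more concrete, here is a minimal sketch of how an expression built from "&", "|", "!" and parentheses could be evaluated against the tags of one relation. It is an illustration only, not the evaluator used by SanXoTSieve: the function name matches_tags is hypothetical, the tags of a relation are assumed to be available as a Python set, and the operator precedence ("!" binds tighter than "&", which binds tighter than "|") is the conventional one and is assumed here, since the help text does not state it. The implicit "!out" mentioned above is not applied by the sketch.

```python
import re

_OPERATORS = {"(", ")", "&", "|", "!"}


def matches_tags(expr, tags):
    """Evaluate a tag expression such as "!out&(dig0|dig1)" against `tags`.

    Hypothetical helper: `tags` is the set of tags attached to one relation.
    Assumed precedence: "!" first, then "&", then "|".
    """
    tokens = re.findall(r"[()&|!]|[^()&|!\s]+", expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "|":
            advance()
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            advance()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "!":
            advance()
            return not parse_not()
        if peek() == "(":
            advance()
            value = parse_or()
            if advance() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        tok = advance()
        if tok is None or tok in _OPERATORS:
            raise ValueError("expected a tag name")
        return tok in tags  # a bare tag means "this tag is present"

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: %r" % tokens[pos])
    return result


print(matches_tags("!out&(dig0|dig1)", {"dig1"}))
print(matches_tags("(!dig0&!dig1)|mod1", {"dig0"}))
```

Under these assumptions, the first call selects the relation (it is tagged "dig1" and not "out"), while the second rejects it (it carries "dig0" and lacks "mod1").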