SanXoTSieve
SanXoTSieve v0.14 is a program developed in the Jesus Vazquez Cardiovascular Proteomics Lab at the Centro Nacional de Investigaciones Cardiovasculares. It automatically removes lower level outliers from an integration performed with the SanXoT integrator.
SanXoTSieve needs:
- the two input files of a SanXoT integration (see SanXoT's help), passed with the options -d and -r, respectively;
- the resulting variance of the integration that has been performed, passed with -V (read from the info file of the integration) or with -v.
... and delivers two output files:
- a new relations file (by default suffixed "_tagged"), identical to the original relations file except that the relations marked as outliers are tagged in the third column.
- the log file.
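As an illustration of how the tagged relations file might be consumed afterwards, the following minimal Python sketch separates kept relations from outliers by looking at the third column. It assumes a tab-separated file without a header row and a third column holding a single tag; the function name split_tagged_relations and the sample identifiers are hypothetical and not part of SanXoTSieve.

```python
import csv
import io

def split_tagged_relations(lines, outlier_tag="out"):
    """Split relation rows into (kept, outliers) using the third-column tag."""
    kept, outliers = [], []
    # Assumption: columns are tab-separated and there is no header row.
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        higher, lower = row[0], row[1]
        tag = row[2].strip() if len(row) > 2 else ""
        target = outliers if tag == outlier_tag else kept
        target.append((higher, lower))
    return kept, outliers

# Hypothetical sample content standing in for a "_tagged" relations file.
sample = io.StringIO(
    "protA\tpep1\t\n"
    "protA\tpep2\tout\n"
    "protB\tpep3\n"
)
kept, outliers = split_tagged_relations(sample)
print(kept)
print(outliers)
```

If a non-default tag was chosen with --outliertag, the same value would be passed as outlier_tag here.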
Usage:
sanxotsieve.py -d[data file] -r[relations file] -V[info file] [OPTIONS]
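When SanXoTSieve is called from another script, the usage line above can be assembled programmatically. The sketch below only builds the argument list, using the long option names listed under Arguments. The helper name build_sieve_command and the file names are hypothetical, and it assumes sanxotsieve.py sits in the current working directory.

```python
import sys

def build_sieve_command(data_file, relations_file, info_file, fdr_limit=None):
    """Build the argument list for one sanxotsieve.py run (nothing is executed)."""
    cmd = [
        sys.executable, "sanxotsieve.py",   # assumed script location
        "--datafile=" + data_file,
        "--relfile=" + relations_file,
        "--varfile=" + info_file,
    ]
    if fdr_limit is not None:
        # Only add the option when a non-default FDR limit is requested.
        cmd.append("--fdrlimit=" + str(fdr_limit))
    return cmd

# Hypothetical file names, for illustration only.
cmd = build_sieve_command("data.tsv", "relations.tsv", "integration_info.txt",
                          fdr_limit=0.05)
print(" ".join(cmd[1:]))
# To actually launch the program: subprocess.run(cmd, check=True)
```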
Arguments:
-h, --help
    Display basic help and exit.
-H, --advanced-help
    Display this help and exit.
-a, --analysis=string
    Use a prefix for the output files. If this is not provided, the prefix is taken from the data file.
-b, --no-verbose
    Do not print the result summary after executing.
-d, --datafile=filename
    Data file with identifiers of the lower level in the first column, measured values (x) in the second column, and weights (v) in the third column.
-D, --removeduplicateupper
    When merging data with the relations table, remove duplicate higher level elements (not removed by default).
-f, --fdrlimit=float
    Use an FDR limit other than the default of 0.01 (1%).
-L, --infofile=filename
    Use a non-default name for the log file.
-n, --newrelfile=filename
    Use a non-default name for the relations file containing the tagged outliers.
-o, --outlierrelfile=filename
    Use a non-default name for the file of relations responsible for outliers (note that outlier relations are only saved when the --oldway option is active).
-p, --place, --folder=foldername
    Use a different common folder for the output files. If this is not provided, the input folder is used.
-r, --relfile, --relationsfile=filename
    Relations file, with identifiers of the higher level in the first column and identifiers of the lower level in the second column.
-u, --one-to-one
    Remove only one outlier per cycle. This is slightly more accurate than the default mode (where the outermost outlier of each category with outliers is removed in each cycle), but usually exasperatingly slow.
-v, --var, --varianceseed=double
    Variance used in the corresponding integration. Default is 0.001.
-V, --varfile=filename
    Get the variance value from a text file. The file must contain exactly one line with the text "Variance = [double]". This suits the info file from a previous integration (see -L in SanXoT).
--oldway
    Do it the old way: instead of tagging, create two separate relations files, with and without outliers.
--outliertag=string
    Select a non-default tag for outliers (default: out).
--tags=string
    Define a tag to distinguish the groups on which the integration is performed. The tag can be used by inclusion, such as --tags="mod", or by exclusion, by putting the "!" symbol first, such as --tags="!out". Tags should be included in a third column of the relations file. Note that the tag "!out" for outliers is implicit. Different tags can be combined using the logical operators "and" (&), "or" (|), and "not" (!), and parentheses. Some examples:
    --tags="!out&mod"
    --tags="!out&(dig0|dig1)"
    --tags="(!dig0&!dig1)|mod1"
    --tags="mod1|mod2|mod3"
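To make the logic of the --tags expressions more concrete, here is a minimal Python sketch of how such an expression could be evaluated against the tags attached to one relation. It is not SanXoTSieve's own parser. The function name matches_tags is hypothetical, the precedence ("!" binds tighter than "&", which binds tighter than "|") is an assumption, as is the idea that a relation's tags are available as a set, and the implicit "!out" is not modelled.

```python
import re

def matches_tags(expression, row_tags):
    """Evaluate a --tags style expression against the set of tags of one relation."""
    # Split into operators/parentheses and tag names.
    tokens = re.findall(r"[!&|()]|[^!&|()\s]+", expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # parse first so tokens are always consumed
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        token = peek()
        if token is None or token in {"&", "|", ")"}:
            raise ValueError("tag name expected in: " + expression)
        take()
        if token == "!":
            return not parse_not()
        if token == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis in: " + expression)
            take()
            return value
        return token in row_tags  # plain tag name: true if the relation carries it

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token in: " + expression)
    return result

print(matches_tags("!out&(dig0|dig1)", {"dig1"}))         # True
print(matches_tags("!out&(dig0|dig1)", {"dig1", "out"}))  # False
print(matches_tags("(!dig0&!dig1)|mod1", {"dig0"}))       # False
```

A recursive-descent evaluator was chosen here only because it keeps the operator handling explicit; any equivalent boolean expression parser would serve the same purpose.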