Klibrate
Klibrate v1.14 is a program developed in the Jesus Vazquez Cardiovascular Proteomics Lab at the Centro Nacional de Investigaciones Cardiovasculares. It calibrates experimental data, as a first step before integrating these data into higher levels with the SanXoT program.
To perform the calibration, two parameters have to be calculated: k (the weight constant) and the variance. Both are calculated iteratively using the Levenberg-Marquardt algorithm, starting from the seeds the user provides. The iterative calculation can be skipped by forcing both parameters with the -f option. The variance can be recalculated in the integration that follows.
Klibrate needs two input files:
- the original data file, containing a unique identifier for each scan (such as "RawFile05.raw-scan19289-charge2" or "File05B_scannumber12877_z3"), the Xi, which corresponds to log2(A/B), and the Vi, which corresponds to the weight of the measurement.
- the relations file, containing the higher level identifiers in the first column (such as the peptide sequence, for example "CGLAGCGLLK", or the protein, if you wish to integrate scans directly into proteins, for example the UniProt accession number "P01308" or the KEGG Gene ID "hsa:3630"), and in the second column the lower level identifiers used in the original data file (such as "RawFile05.raw-scan19289-charge2").
It delivers one output file:
- the calibrated data file, containing the same information as the original data file, except that the values in the third column (the weights) are replaced by the calibrated weights, which can be used as input for the SanXoT program.
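As a rough illustration of the two input layouts described above, the following Python sketch reads both files and lists the lower level identifiers in the relations file that have no entry in the data file. It assumes tab-separated columns, which this help text does not state, and the function names (read_data, read_relations, missing_lower_ids) are hypothetical and not part of Klibrate. The sample values are made up.

```python
import csv
import io


def read_data(handle, has_header=False):
    """Return {identifier: (x, v)} from a three-column data file.

    Assumption: columns are tab-separated (identifier, Xi, Vi).
    """
    rows = csv.reader(handle, delimiter="\t")
    if has_header:
        next(rows, None)  # discard the header row if there is one
    data = {}
    for row in rows:
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        data[row[0]] = (float(row[1]), float(row[2]))
    return data


def read_relations(handle, has_header=False):
    """Return a list of (higher_id, lower_id) pairs from a relations file."""
    rows = csv.reader(handle, delimiter="\t")
    if has_header:
        next(rows, None)
    return [(row[0], row[1]) for row in rows if len(row) >= 2]


def missing_lower_ids(data, relations):
    """Lower level identifiers that appear in relations but not in data."""
    return sorted({lower for _, lower in relations if lower not in data})


# In-memory stand-ins for the two files (illustrative values only).
data_text = (
    "RawFile05.raw-scan19289-charge2\t0.42\t15.0\n"
    "File05B_scannumber12877_z3\t-0.10\t8.5\n"
)
rel_text = (
    "CGLAGCGLLK\tRawFile05.raw-scan19289-charge2\n"
    "CGLAGCGLLK\tRawFile05.raw-scan00001-charge2\n"
)

data = read_data(io.StringIO(data_text))
relations = read_relations(io.StringIO(rel_text))
print(missing_lower_ids(data, relations))
```

A check of this kind can be run before calibration to catch identifier mismatches between the two files.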
Usage:
klibrate.py [OPTIONS] -r[relations file] -d[original data file] -o[calibrated output file]
Arguments:
   -h, --help          Display this help and exit.
   -a, --analysis=string
                       Use a prefix for the output files. If this is not
                       provided, the prefix is taken from the data file.
   -b, --no-verbose    Do not print the result summary after executing.
   -d, --datafile      Input data file with text identifiers in the first
                       column, measured values (x) in the second column, and
                       uncalibrated weights (v) in the third column.
   -D, --outgraphdata=filename
                       To use a non-default name for the data used to create
                       the calibration graph files.
   -f, --forceparameters
                       Use the parameters (k and variance) as provided,
                       without using the Levenberg-Marquardt algorithm.
   -g, --no-showgraph  Do not show the rank(V) vs 1 / MSD graph after the
                       calculation.
   -G, --outgraphvvalue=filename
                       To use a non-default name for the graph file which
                       shows the value of V (the weight) versus 1 / MSD.
   -k, --kseed         Seed for the weight constant. Default is k = 1.
   -K, --kfile=filename
                       Get the K value from a text file. It must contain a
                       line (not more than once) with the text
                       "K = [float]". This suits the info file from another
                       integration (see -L).
   -L, --infofile=filename
                       To use a non-default name for the info file.
   -m, --maxiterations Maximum number of iterations performed by the
                       Levenberg-Marquardt algorithm to calculate the
                       variance and the k constant. If unused, the default
                       value of the algorithm is taken.
   -o, --outputfile    To use a non-default output calibrated file name (see
                       above for more information on this file).
   -p, --place, --folder=foldername
                       To use a different common folder for the output
                       files. If this is not provided, the folder used will
                       be the same as the input folder.
   -r, --relfile, --relationsfile
                       Relations file, with identifiers of the higher level
                       in the first column, and identifiers of the lower
                       level in the second column.
   -R, --outgraphvrank=filename
                       To use a non-default name for the graph file which
                       shows the rank of V (the weight) versus 1 / MSD.
   -s, --no-showsteps  Do not print the result summary and the steps of each
                       Levenberg-Marquardt iteration.
   -v, --var, --varianceseed
                       Seed used to start calculating the variance. Default
                       is 0.001.
   -V, --varfile=filename
                       Get the variance value from a text file. It must
                       contain a line (not more than once) with the text
                       "Variance = [double]". This suits the info file from
                       another integration (see -L).
   -w, --window        The number of weight-ordered lower level elements
                       (scans, usually) that are taken at a time to
                       calculate the median of the weight, which is compared
                       to the fit. Default is 200.
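The -K and -V options expect a text file with exactly one line of the form "K = [float]" or "Variance = [double]". A minimal Python sketch of how such a file could be checked before passing it to Klibrate is shown here. The function read_parameter is hypothetical and is not Klibrate's own parser; it also assumes that the parameter sits alone on its line, which the help text does not guarantee.

```python
import re

# Plain or scientific-notation decimal number, e.g. 35.28, .5, 1e-3
_NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"


def read_parameter(text, name):
    """Return the value on the single 'name = value' line found in text.

    Raises ValueError if the line is missing or appears more than once,
    mirroring the "not more than once" requirement in the option help.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(name) + r"\s*=\s*(" + _NUMBER + r")\s*$"
    )
    values = []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            values.append(match.group(1))
    if len(values) != 1:
        raise ValueError(
            "expected exactly one '%s = ...' line, found %d" % (name, len(values))
        )
    return float(values[0])


# Illustrative info-file content (made up for this sketch).
info_text = "Calibration summary\nVariance = 0.02922\nK = 35.28\n"

k_value = read_parameter(info_text, "K")
variance = read_parameter(info_text, "Variance")
print(k_value, variance)
```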
Examples:
- To calculate the variance and k starting with seeds v = 0.03 and k = 40, without printing the Levenberg-Marquardt steps or the result summary, and without showing the rank(V) vs 1 / MSD graph afterwards (-g, -b and -s each switch off one of these outputs):
klibrate.py -gbs -v0.03 -k40 -dC:\temp\originalDataFile.txt -rC:\temp\relationsFile.txt -oC:\temp\calibratedWeights.xls
- To get fast results of an integration forcing a variance = 0.02922 and a k = 35.28:
klibrate.py -f -v0.02922 -k35.28 -dC:\temp\originalDataFile.txt -rC:\temp\relationsFile.txt -oC:\temp\calibratedWeights.xls
- To run the same forced calculation (variance = 0.02922 and k = 35.28) without showing the graph afterwards (-g suppresses the graph, which is shown by default):
klibrate.py -gf -v0.02922 -k35.28 -dC:\temp\originalDataFile.txt -rC:\temp\relationsFile.txt -oC:\temp\calibratedWeights.xls
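When Klibrate is called from another script, the argument list can be assembled programmatically. The Python sketch below builds the same kind of command line as the examples above, using only options documented in this help and the concatenated short-option style of the examples. The helper klibrate_args and its parameter names are hypothetical and are not part of Klibrate.

```python
def klibrate_args(datafile, relfile, outfile, k=None, variance=None,
                  force=False, show_graph=True):
    """Build an argument list for klibrate.py (illustrative helper).

    Only options documented in the help text are used:
    -f, -g, -v, -k, -d, -r and -o.
    """
    args = ["klibrate.py"]
    if force:
        args.append("-f")          # use k and variance as provided
    if not show_graph:
        args.append("-g")          # --no-showgraph
    if variance is not None:
        args.append("-v%s" % variance)
    if k is not None:
        args.append("-k%s" % k)
    args += ["-d" + datafile, "-r" + relfile, "-o" + outfile]
    return args


forced = klibrate_args(
    r"C:\temp\originalDataFile.txt",
    r"C:\temp\relationsFile.txt",
    r"C:\temp\calibratedWeights.xls",
    k=35.28,
    variance=0.02922,
    force=True,
)
print(" ".join(forced))
```

The resulting list could be handed to subprocess.run with the Python interpreter prepended, assuming klibrate.py is reachable from the working directory.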