Difference between revisions of "Unit tests for QuiXoT"

From PROTEOMICA

Revision as of 11:17, 25 April 2017

We present here four experiments you may use to check QuiXoT and to test that it works as expected on your machine:

Test 1: 18O quantification and statistical analysis from Proteome Discoverer results

1) Download the zip with the files from here.

2) After unzipping the files, you should have three folders: dir, inv, and raw.

  • The dir folder should contain an XML file called modifications.xml and an MSF file called 110112_VSMC_EN_OG21.msf (the latter being an SQLite file containing the identifications of the target SEQUEST search from Proteome Discoverer 1.4; the decoy search is in the inv folder).
  • The inv folder should contain an MSF file called 110112_VSMC_EN_OG21-01.msf (an SQLite file containing the corresponding decoy SEQUEST search).
  • The raw folder should contain a Thermo RAW file called 110112_VSMC_EN_OG21.RAW, containing all the spectral information saved by the spectrometer.
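The folder layout described above can be checked with a short script before starting. This is only a minimal sketch: the folder and file names come from the list above, while the function names and the idea of checking the SQLite header are illustrative additions, not part of QuiXoT.

```python
from pathlib import Path

# Expected layout, taken from the file names listed above.
EXPECTED = {
    "dir": ["modifications.xml", "110112_VSMC_EN_OG21.msf"],
    "inv": ["110112_VSMC_EN_OG21-01.msf"],
    "raw": ["110112_VSMC_EN_OG21.RAW"],
}


def missing_files(root):
    """Return the expected relative paths that are not present under root."""
    root = Path(root)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED.items()
        for name in names
        if not (root / folder / name).is_file()
    ]


def is_sqlite(path):
    """True if the file starts with the standard SQLite 3 header string."""
    with open(path, "rb") as handle:
        return handle.read(16) == b"SQLite format 3\x00"
```

Calling missing_files on the unzipped folder should return an empty list, and is_sqlite should return True for both MSF files if they are SQLite files as stated above.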

3) After opening pRatio, drag and drop the dir and inv folders onto the Target search and Decoy search fields, respectively. Press the Run! button.

4) Once pRatio has finished, seven files should have been saved in the dir folder: four tab-separated text files (with an XLS extension), an XML file (the QuiXML file), and two TXT files.

5) Now we need to generate the binStack folder, extracting the information from all spectra. Using Thermo RAW files, you just need to open RAWToBinStack, and then drag and drop three different files:

  • The QuiXML generated in step #4
  • The QuiXML schema you are going to use to add the quantitative and statistical information. This is the XSD file within QuiXoT's conf folder, corresponding to the quantification method you are going to use. In this example, this is an 18O experiment using a high-resolution mass spectrometer (Orbitrap), so you will need to drag and drop the file identifications_schema_18O_HR.xsd.
  • The folder where the RAW files are. In this example, this is the raw folder, containing the file 110112_VSMC_EN_OG21.RAW (note that you must drag and drop the folder, not just the file).

6) Fill in the remaining information:

  • Spectrum type: the type of the spectrum where the quantitative information is. In the case of 18O this information is in the Full or ZoomScan; as this is a high-resolution spectrometer, we should select the Full spectrum (ZoomScan is a type of scan used in low-resolution machines). Note that for other techniques, such as iTRAQ, the quantitative information is in the MSMS spectra instead.
  • The position of the spectra containing quantitative information, relative to the spectra used to identify the peptide. For 18O experiments, this should be set to previous, as for these strategies the quantitative information is in the first Full scan prior to the identification (first a Full scan is taken, and then each of the most intense peaks is selected for fragmentation and identification). Note that other strategies, such as iTRAQ, are quantified in the same spectrum where peptides are identified.
  • Usually, importing the whole spectrum leads to huge files that are difficult to manage, so, for 18O experiments, we recommend checking the option import only window around parental mz with 12 m/z (which should be enough to cover the 4 isotopologues of the non-labelled feature + 4 isotopologues of the labelled feature + 4 more m/z to provide some context for possible artefacts). Other strategies such as iTRAQ would require checking import between these mzs (for iTRAQ 8plex, importing between 112 and 122 should be enough to cover the 113-121 range).
  • In this example, we don't need advanced options, so we leave the average spectra feature unchecked.
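The two import options mentioned above amount to keeping only the peaks inside an m/z range. The following is a rough sketch of that idea, not RAWToBinStack's actual code: the function names are hypothetical, peaks are represented as (m/z, intensity) pairs for illustration, and treating the 12 m/z as the total width of a window centred on the parental m/z is an assumption.

```python
def window_around(peaks, parent_mz, width=12.0):
    """Keep (mz, intensity) peaks inside a window of total `width` m/z.

    Centring the window on parent_mz is an assumption made for this sketch.
    """
    half = width / 2.0
    return [(mz, inten) for mz, inten in peaks
            if parent_mz - half <= mz <= parent_mz + half]


def between(peaks, low_mz, high_mz):
    """Keep (mz, intensity) peaks with low_mz <= mz <= high_mz."""
    return [(mz, inten) for mz, inten in peaks if low_mz <= mz <= high_mz]
```

For an iTRAQ 8plex spectrum, between(peaks, 112, 122) would retain the reporter region (113-121) and drop everything else.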

7) Press create binStack. For this small example, the generation of the binStack should be fast (a few seconds), but for normal or large experiments it might take anywhere from a few minutes to several hours, depending on the experiment.

8) The output of RAWToBinStack is:

  • A binStack folder, containing several BFR files and one index.idx file. As these are binary files, they cannot be read with text editors.
  • A *_QuiXML_bs.xml file, which is the same as the input QuiXML file, but containing indexation of spectra (it is written with a new name, instead of being overwritten, in order to prevent information loss in case the previous steps go wrong).
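A quick way to confirm that the binStack folder looks as described is sketched below. The function name is hypothetical, and the sketch assumes that BFR files carry a .bfr extension, which this page does not state explicitly.

```python
from pathlib import Path


def check_binstack(folder):
    """Return (number of BFR files, whether index.idx is present)."""
    names = [p.name for p in Path(folder).iterdir() if p.is_file()]
    # Assumption: BFR files are recognised by a .bfr extension (any case).
    bfr_count = sum(1 for name in names if name.lower().endswith(".bfr"))
    return bfr_count, "index.idx" in names
```

A result with a BFR count of zero, or with the second value False, would suggest that the binStack was not generated correctly.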

9) Now we can open this output file with QuiXoT: run QuiXoT.exe, and then drag and drop the *_QuiXML_bs.xml file. Choose the strategy (for this example, 18O, HR, SEQUEST).

10) To quantify, select all the spectra (double-click on the upper-left corner of the data grid, to the left of the headers), and click on the quantitate button. With this dataset, the quantification should take less than a minute, but with other experiments it might take anything from a few seconds to several hours.

11) You can save the quantified results to compare your data with the results in the unit test: change the name by hand in the QuiXML File field (for example, to VSMC_QuiXML_bs_quant.xml), and click on the write XML button.

12) Now we can perform the statistical analysis. We first need the variances and the calibration constant (K). We can either enter previously estimated values (by clicking on change values), or calculate them from scratch.

13) To calculate them:

  • Click on the var calc button,
  • In Choose the field to be used as Xs, write q_log2Ratio (which contains the fold changes)
  • In Choose the field to be used as Vs, write Vs (which contains the weight associated to every fold-change)
  • Add K = 40 (as recommended for 18O, HR experiments, or alternatively calculate the calibration constant with an independent program)
  • In Filter, write q_A > 0 and q_B > 0 (this ensures that both the labelled and the unlabelled peptide have been found). More complex filters might be needed for other experiments (for example, for spectra with a lot of noise, which tend to be of bad quality when the peaks are not intense, it is advisable to filter out the corresponding low-weight quantifications by adding ... and Vs > 100; some recommendations for the lower Vs-threshold can be found here).
  • Press the OK button
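To make the effect of the filter concrete, the sketch below applies the same conditions to a few rows in plain Python. It only illustrates the logic and is not how QuiXoT evaluates the filter internally; the function name, the id field and all the row values are invented for the example.

```python
def passes_filter(row, min_vs=None):
    """Mimic 'q_A > 0 and q_B > 0', optionally adding 'and Vs > min_vs'."""
    keep = row["q_A"] > 0 and row["q_B"] > 0
    if min_vs is not None:
        keep = keep and row["Vs"] > min_vs
    return keep


# Invented rows, for illustration only (not taken from the test dataset).
rows = [
    {"id": 1, "q_A": 1500.0, "q_B": 1200.0, "Vs": 350.0},
    {"id": 2, "q_A": 0.0, "q_B": 900.0, "Vs": 200.0},
    {"id": 3, "q_A": 80.0, "q_B": 60.0, "Vs": 40.0},
]

basic = [r["id"] for r in rows if passes_filter(r)]
strict = [r["id"] for r in rows if passes_filter(r, min_vs=100)]
```

Here the basic filter discards row 2 (one of the two peptides was not found), and adding the Vs threshold also discards row 3 because of its low weight.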

14) You should obtain, using this example, these results:

  • sigma2s = 0.1692
  • sigma2p = 0.0029
  • sigma2q = 0
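If you want to compare your own values with the ones above programmatically, something along these lines could be used. It is a sketch only: the function name is hypothetical, and the tolerance (half a unit in the fourth decimal place, matching the precision shown above) is an assumption, since this page does not say how closely the results must agree.

```python
import math

# Expected values for Test 1, as listed above.
EXPECTED = {"sigma2s": 0.1692, "sigma2p": 0.0029, "sigma2q": 0.0}


def mismatches(obtained, abs_tol=5e-5):
    """Return {name: (obtained, expected)} for values outside the tolerance.

    abs_tol is an assumed tolerance, not a documented QuiXoT criterion.
    """
    return {
        name: (obtained.get(name), expected)
        for name, expected in EXPECTED.items()
        if name not in obtained
        or not math.isclose(obtained[name], expected, rel_tol=0.0, abs_tol=abs_tol)
    }
```

An empty dictionary means that all three variances agree with the expected values to the displayed precision.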

Test 2: 18O quantification and statistical analysis starting from QuiXML file + binStack

In the following examples, we start with QuiXML/binStack files, so we can skip some of the initial steps.

  1. Download the zip with the files from here.

...

Test 3: iTRAQ statistical analysis starting from QuiXML file + binStack

  1. Download the zip with the files from here.

...

Test 4: SILAC statistical analysis starting from QuiXML file + binStack

  1. Download the zip with the files from here.

...