Difference between revisions of "QuiXoT"

From PROTEOMICA

Revision as of 15:53, 12 September 2013

QuiXoT
Screenshot QuiXoT general.PNG
Screenshot of QuiXoT, depicting different spectra and graphs used.
Last release: v.1.4.00
Release date: 20th Aug 2013
Download link: [[{{{link}}}]]
Source code: QuiXoT at GitHub
Licence: Please read Licencing
Requirements


QuiXoT is an open source software created for the quantitation and statistical analysis of quantitative proteomics experiments. It has been developed at the Cardiovascular Proteomics Laboratory of Prof Jesús Vázquez, at the Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain.

QuiXoT is written in Visual C#, so users must install the .NET Framework 2.0 or higher (not necessary for Windows 7 users), which can be downloaded from this link.

Using QuiXoT

See also the article: DataGrid information in QuiXoT.

Checking an existing QuiXoT analysis

The QuiXML files

QuiXoT makes use of QuiXML files, an ad hoc XML format created to manage the three levels of information treated: identification, quantitation and statistical information. For a list of the different fields used in QuiXML files (i.e., the columns appearing in the main window of QuiXoT), see the article DataGrid information in QuiXoT.

After dragging and dropping the QuiXML file, you will have to choose the quantitation method.

If you just want to see an existing QuiXoT analysis, you only need the corresponding QuiXML file. Drag and drop that file on the main form and select the quantitation method used, which depends on the SIL method (such as 18O, SILAC, etc.) and on the spectrometer (such as high or low resolution).

The binStack folder

The spectra are saved in a folder called binStack, which contains one or more .bfr files and one index.idx file. You do not need this folder if you just want to check the results of a QuiXoT analysis (such as the statistics, identifications or the quantitative information).

However, you will need it if you want to requantitate a spectrum or see the spectrum itself (for example, to compare the theoretical and experimental isotopic envelopes, which are shown in red and blue, respectively). In that case, you should always keep the QuiXML file and its corresponding binStack in the same folder (do not forget to move them together).

As long as the binStack and the QuiXML file are in the same place, you do not need to do anything else to load the spectral information.

The configuration files

In the location where you have copied your version of QuiXoT you will find a conf folder containing the configuration files. It contains three kinds of file:

  • the QuantitationMethods.xml file, which contains the parameters of the different methods used. It specifies which labelling is associated with each method and which spectrum type contains the quantitative information (for instance, SILAC quantitation is performed in the full scan if a high resolution spectrometer has been used, but on a low resolution machine it is performed in the zoom scan immediately preceding the MS2 scan). Examples of other parameters that can be set in this file are:
      • width: the tolerance for the high resolution peaks
      • deltaR: the mass difference between 16O and 18O
      • sumSQtolerance_NG: the tolerance accepted for the sum of squares when comparing the theoretical and the experimental spectrum
      • the iTRAQ masstags and their corrections
  • several xml files which contain information such as the weights of each isotope, the composition of each amino acid or their posttranslational modifications. They also include the correspondence between the different residues and their symbols; for instance, "Y" means "tyrosine", while "*" may refer to an oxidation in methionine, or a SILAC label on arginine. Examples of these files are:
      • isotopes.xml
      • aminoacids.xml
      • aminoacids_SILAC.xml
  • several xsd files, which are the XML schemas that define the structure of the QuiXML file for each quantitation method. Some examples of these files are:
      • identifications_schema_18Ohighres.xsd
      • identifications_schema_mascot_SILAC.xsd
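
As a worked illustration of what a parameter such as deltaR represents, the following Python sketch computes the mass difference between 16O and 18O and the resulting m/z shift for a labelled peptide. The isotope masses are standard monoisotopic reference values, not numbers copied from QuiXoT's configuration files, and the function names are hypothetical.

```python
# Monoisotopic masses in Da. These are standard reference values,
# not numbers copied from QuiXoT's isotopes.xml.
MASS_16O = 15.99491462
MASS_18O = 17.99915961


def delta_r():
    """Mass difference between 18O and 16O, in Da."""
    return MASS_18O - MASS_16O


def mz_shift(n_labels, charge):
    """m/z shift of a peptide carrying n_labels 18O atoms at a given charge."""
    return n_labels * delta_r() / charge


print(round(delta_r(), 5))       # mass difference per 18O atom
print(round(mz_shift(2, 2), 5))  # two 18O atoms on a doubly charged peptide
```

The value actually used by QuiXoT is whatever is written in QuantitationMethods.xml; this sketch only shows where a number of that size comes from.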

Checking spectra by weight

Inspect quantifications with low Vs values. Sort the table by Vs and inspect the spectra by using the spectrum button. At very low Vs values you will find completely useless spectra (bad fittings, mixtures, high background, etc.). You can either eliminate these spectra from the statistics by marking them with numLabel1 = 0, or filter by a minimum Vs value (for instance Vs > 3). Non-quantifiable peptides (i.e. peptides not containing basic C-terminal residues in 18O-labelling or in SILAC) must also be excluded when calculating variances or performing the statistics.
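
The two exclusion rules above can be sketched outside QuiXoT as a simple row filter. This is a minimal Python illustration, assuming the quantifications are available as a list of dictionaries with Vs and numLabel1 fields (the field names come from the text; the data layout, the function name and the sample values are assumptions made for illustration).

```python
def keep_for_statistics(rows, min_vs=3.0):
    """Return the rows that pass both exclusion rules.

    A row is dropped if it was manually marked with numLabel1 == 0
    or if its weight Vs is not above min_vs.
    """
    return [r for r in rows if r["numLabel1"] != 0 and r["Vs"] > min_vs]


# Made-up example rows (not real QuiXoT output).
scans = [
    {"scan": 101, "Vs": 45.2, "numLabel1": 1},
    {"scan": 102, "Vs": 1.7, "numLabel1": 1},   # weight too low
    {"scan": 103, "Vs": 12.0, "numLabel1": 0},  # manually excluded
]

kept = keep_for_statistics(scans)
print([r["scan"] for r in kept])
```

The threshold is a parameter because the text gives Vs > 3 only as an example; a suitable value depends on the data.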

Labelling efficiency for 18O labelling

If you have used 18O-labelling, you can check the labelling efficiency before other analyses. Plot q_f versus Xs (consult how to create graphs). This plot does not differentiate between good and bad quantitations, so it mixes more and less accurate estimations of q_f. It is therefore a good idea to remove bad quantitations from the plot by filtering out the data below an arbitrary minimum Vs value (for instance, keeping only Vs > 30 in ZoomScan-quantitated spectra). Labelling efficiency must be above 0.8 for the vast majority of peptides. A cloud of points with q_f below 0.7 tending to curve towards the right (increasing Xs values) is indicative of a poorly labelled experiment.
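
The check described above can also be summarised numerically. The following Python sketch is an illustration only, assuming rows with Vs and q_f fields (names from the text) held in a list of dictionaries; the function name, the data layout and the sample values are hypothetical, and the thresholds are the example values quoted above.

```python
def labelling_summary(rows, min_vs=30.0, good_qf=0.8, poor_qf=0.7):
    """Summarise labelling efficiency over reliable quantitations only.

    Rows with Vs not above min_vs are ignored, mirroring the filter
    applied before plotting q_f versus Xs.
    """
    reliable = [r for r in rows if r["Vs"] > min_vs]
    if not reliable:
        return None
    n = len(reliable)
    return {
        "n": n,
        "fraction_good": sum(r["q_f"] > good_qf for r in reliable) / n,
        "fraction_poor": sum(r["q_f"] < poor_qf for r in reliable) / n,
    }


# Made-up example rows (not real QuiXoT output).
peptides = [
    {"Vs": 120.0, "q_f": 0.95},
    {"Vs": 80.0, "q_f": 0.90},
    {"Vs": 45.0, "q_f": 0.85},
    {"Vs": 33.0, "q_f": 0.60},
    {"Vs": 5.0, "q_f": 0.20},   # ignored: Vs too low
]

print(labelling_summary(peptides))
```

A summary like this complements the plot but does not replace it, since the curvature towards high Xs values is only visible graphically.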

Performing statistics

Introduce an initial set of statistical parameters (k and variances) for the null-hypothesis model by using the change_values link. You can find a list of typical values for these parameters. Make an initial estimation of variances by pressing the var calc button. At this step you will have to tell QuiXoT which columns are going to be used as Xs and Vs (this is useful for multichannel labelling approaches such as iTRAQ, whose data contain several Xs and Vs values depending on the labels to be compared). Accept the newly calculated variances and perform the statistical analysis by pressing the stats button.

Inspecting spectra and peptides

A high resolution spectrum from an 18O-labelled experiment. Notice the light species (the four peaks on the left) and the heavy species (the 5th to 8th peaks). The theoretical peaks are shown in red, while the original, experimental spectrum is shown in blue. You can see a contaminant (or perhaps another, less abundant peptide) on the right side of the spectrum (in blue only, as it does not match a theoretical spectrum in this case).

Check for outliers at the scan and peptide levels by using the graphs button and plotting Ws (or Wp) as X against Xs (or Xp) as Y, to see whether these data are influencing variance calculation. Sort the data by FDRs (or FDRp) and check the rows with low FDR values (rows below 0.05 are statistically considered outliers). Typically a negligible proportion of outliers is found (less than 1% of the total); this is normal. However, if the number of outliers is too high, it may indicate quantification artefacts and/or problems in the labelling protocol.
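
The outlier check above can be sketched as follows. This is a minimal Python illustration, assuming rows with an FDRs (or FDRp) field held in a list of dictionaries; the function name, the data layout and the sample values are hypothetical, while the 0.05 cutoff and the 1% proportion are the figures quoted in the text.

```python
def outlier_report(rows, fdr_field="FDRs", fdr_cutoff=0.05, max_fraction=0.01):
    """Return (outliers sorted by FDR, outlier fraction, too_many flag)."""
    outliers = sorted(
        (r for r in rows if r[fdr_field] < fdr_cutoff),
        key=lambda r: r[fdr_field],
    )
    fraction = len(outliers) / len(rows) if rows else 0.0
    return outliers, fraction, fraction > max_fraction


# Made-up example: 200 scans, one of them with a low FDRs value.
scan_rows = [{"scan": i, "FDRs": 0.5} for i in range(199)]
scan_rows.append({"scan": 199, "FDRs": 0.01})

outliers, fraction, too_many = outlier_report(scan_rows)
print(len(outliers), fraction, too_many)
```

Passing fdr_field="FDRp" applies the same check at the peptide level.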

Artefacts at the scan level are rare and may be produced by:

a) problems in mass calibration (spectra cannot be fitted to the theoretical mass envelope)
b) excessive noise and/or fluctuations in the detector
c) inadequate fitting parameters in the configuration files.

Artefacts at the peptide level are, however, much more frequent when peptides are labelled after digestion (which is not the case with SILAC). They include:

a) incomplete digestion of one of the samples (this may be easily checked by selecting peptide subpopulations using the st_PartialDig field and the filter tool in Vs versus Xp plots)
b) non-homogeneous methionine oxidation in the samples to be compared (this may be easily checked by filtering by the st_Meth field)
c) partially labelled peptides (with 18O-labelling this is indicated by q_f)

If any of these artefacts are encountered, outliers should be eliminated from variance calculation or statistics by using the filter tool.

A further inspection of proteins showing significant expression changes (low FDRq values) is advisable at this step, since keratins and other external contaminants such as trypsin may not be well balanced between the two samples and may introduce an artefactual variance at the protein level. Eliminate all the quantifications related to these contaminants from the statistics by applying an appropriate filter (consult applying filters to the data).
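
The contaminant filter described above can be sketched in Python as a keyword match on the protein description. This is an illustration only: the "description" field name, the function name, the keyword list and the sample rows are all assumptions, not QuiXoT fields or output.

```python
# Keywords for the contaminants named in the text (hypothetical list).
CONTAMINANT_KEYWORDS = ("keratin", "trypsin")


def remove_contaminants(rows, field="description", keywords=CONTAMINANT_KEYWORDS):
    """Drop rows whose protein description contains a contaminant keyword."""
    def is_contaminant(row):
        text = row[field].lower()
        return any(keyword in text for keyword in keywords)

    return [r for r in rows if not is_contaminant(r)]


# Made-up example rows (not real QuiXoT output).
proteins = [
    {"description": "Keratin, type II cytoskeletal 1", "FDRq": 0.001},
    {"description": "Trypsin", "FDRq": 0.020},
    {"description": "Myoglobin", "FDRq": 0.030},
]

clean = remove_contaminants(proteins)
print([r["description"] for r in clean])
```

Substring matching is deliberately crude and will also remove proteins whose names merely contain a keyword, so the discarded rows should be reviewed before running the statistics.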