Difference between revisions of "Exploring QuiXoT features"
Revision as of 13:48, 27 April 2017
Here we describe some of the features of QuiXoT you can use in everyday work when analysing quantitative proteomics experiments. If this is the first time you run the program, you may first want to check the unit tests for QuiXoT.
Analysis 1
We will explore some features with the 18O experiment acquired with high-resolution spectrometry, which is described in Test 1 of the unit tests. If you did not generate the QuiXML file (including quantification and statistics) by following those steps, you can use the file VSMC_QuiXML_bs_quant_stats.xml in the VSMC_result folder (remember to move this file together with the binStack if you want to see the spectra; or just use the QuiXML alone if you do not need them).
1.1 Getting started
Open QuiXoT.exe, drag and drop the above-mentioned QuiXML file anywhere on its window, and select the 18, HR, SEQUEST strategy. You should see this window:
1.2 Managing spectra
We might want to start by checking what the spectra look like (if you did not copy the binStack, you can skip this and go to generating graphs). Click on the spectrum button, and then click on any row in the datagrid. For the fourth row (with FirstScan = 8502) you should see something like this:
The experimental spectrum is drawn in blue, while the theoretical prediction (taking into account the peptide sequence, the isotope distribution, the labelling, and the labelling efficiency) is drawn in red. The horizontal "lids" of the quantified peaks depict the tolerance used to decide whether the theoretical peak matched the experimental one (note that there is a minimum "lid size" so that it remains visible). We can enlarge it to see better:
Or select the first four quantified peaks with the mouse (click and drag horizontally):
In this view we can see that apparently no co-eluted peaks are present.
Note that at the top of the screen there is an indicator of the precursor mass of the peptide that has been matched:
Let's look at a specific spectrum. To make it faster, we will filter the spectrum with FirstScan = 3017 (write the filter in the filter field and then click on the filter button):
Then open the spectrum and enlarge it:
And then select the third quantified (red) peak:
(Note the warning on the top: the precursor is outside this window.)
This is a clear example of a co-eluted peak, so we can label the corresponding peptide to filter it out of the statistics later. We can go to the Label4 column and directly type s_coelution (or any other tag we prefer, as long as we can filter on it easily later; it is good practice to start tags related to scans with s_, those related to peptides with p_, and those related to proteins with q_, so that filters are easy to tell apart):
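The prefix convention makes tags easy to sort by level when the data are post-processed. A minimal sketch in Python, assuming the tags are available as plain strings; the helper `tag_level` is hypothetical and not part of QuiXoT:

```python
# Hypothetical helper, not part of QuiXoT: maps a tag to the level
# suggested by the s_/p_/q_ prefix convention.
PREFIX_LEVELS = {"s_": "scan", "p_": "peptide", "q_": "protein"}


def tag_level(tag: str) -> str:
    """Return 'scan', 'peptide', 'protein', or 'unknown' for a tag."""
    for prefix, level in PREFIX_LEVELS.items():
        if tag.startswith(prefix):
            return level
    return "unknown"


print(tag_level("s_coelution"))  # scan
```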
We can right-click on the spectrum either to zoom out or to export the data to a text file that can be processed by other software.
1.3 Checking information
Close the spectrum window, and remove the filter from the filter field (delete it and click on the filter button).
In the lower left corner there is a lot of information about every quantification. You can click on any row and move down with the arrow key: the panel shows the scan, peptide and protein data of the selected row:
There are two panels, one for the quantitative information of spectra, peptides and proteins, and another with the identification information (prior to QuiXoT analysis). The details of this are explained in DataGrid information in QuiXoT[1].
Next to these panels there are three bars with important information for 18O experiments:
What they show is easiest to explain with an example. If we select the spectrum with FirstScan = 1492,
it looks as if the non-labelled species (first sample) is slightly more abundant than the labelled one (second sample). However, the QuiXoT analysis shows that in this spectrum the labelling efficiency was f = 0.8989. This means that about 10% of the peptides coming from the second sample failed to be labelled, and hence add up to the non-labelled peptides. Correcting for this, QuiXoT assigns A = 100,501 and B = 101,294, which leads to a ratio of 0.992 and a log2Ratio = -0.0113 (so the second sample is actually slightly more abundant than the first one).
A more extreme case is shown for FirstScan = 3041:
Here too it looks as if the two samples are almost equally abundant. However, the identified peptide is very poorly labelled, with labelling efficiency f = 0.3925. This means that most of the peptide in the second sample has not been labelled and adds up to the first, non-labelled sample. The QuiXoT analysis shows that the abundance in the first sample is A = 4775 and in the second is B = 28,740, so that the log2Ratio = -2.59.
This is shown in the bottom-central bars:
The meaning of these bars is:
- the first bar (bluish green) shows the corrected abundance of the first sample,
- the second bar displays the corrected abundance of the second sample, split into three colours:
  - purple for the amount that has been fully labelled, i.e., with two 18O (both available 16O oxygen atoms in the carboxylic group have been replaced by 18O), so the labelled peptides appear in the spectrum with a 4 Dalton separation,
  - yellow for the amount that has been partially labelled, i.e., with only one 18O (only one of the two available 16O oxygen atoms in the carboxylic group has been replaced by 18O), so the labelled peptides appear in the spectrum with a 2 Dalton separation (superimposed on two of the non-labelled isotopologues),
  - red for the amount that has not been labelled at all, i.e., although the peptides come from the second sample, they are superimposed on the peptides from the first sample,
- the third bar shows the labelling efficiency f, a value between 0 and 1; the bar is green when f > 0.6 and red when f <= 0.6.
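The figures quoted for FirstScan = 3041 and the colour rule of the third bar can be checked with a short Python sketch. It assumes that log2Ratio is the base-2 logarithm of A/B, which is consistent with the values quoted for that scan; the function names are illustrative and not part of QuiXoT:

```python
import math


def log2_ratio(a: float, b: float) -> float:
    """Base-2 logarithm of the corrected abundance ratio A/B (assumed definition)."""
    return math.log2(a / b)


def efficiency_colour(f: float) -> str:
    """Colour of the third bar: green when f > 0.6, red otherwise."""
    return "green" if f > 0.6 else "red"


# Values quoted for FirstScan = 3041
print(round(log2_ratio(4775, 28740), 2))  # -2.59
print(efficiency_colour(0.3925))          # red
# Labelling efficiency quoted for FirstScan = 1492
print(efficiency_colour(0.8989))          # green
```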
Analysis 2
Notes
- [1] Fields containing NaN (or a similar marker, such as NeuN, depending on your system language) indicate that the content of the field is not a number. This happens with proteins having only one peptide, or peptides identified with only one scan, because the calculation of Z involves a division by zero in these cases.
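When exported values are processed with other software, such non-numeric fields have to be skipped explicitly. A minimal Python sketch, assuming the values have been read as strings; the set of markers and the function name are assumptions, not a description of QuiXoT's output format:

```python
import math

# Assumed spellings of "not a number"; extend for your system language.
NAN_MARKERS = {"nan", "neun"}


def parse_field(text: str) -> float:
    """Convert a field to float, mapping NaN-like markers to float('nan')."""
    if text.strip().lower() in NAN_MARKERS:
        return math.nan
    return float(text)


values = [parse_field(t) for t in ["1.25", "NaN", "NeuN", "-0.4"]]
usable = [v for v in values if not math.isnan(v)]
print(usable)  # [1.25, -0.4]
```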