Getting Started#

Launch on Binder

pyopenms_viz allows users to visualize mass spectrometry data with minimal effort, making it ideal for software developers who do not want to focus on plotting, bioinformaticians wanting to explore their data in a python setting or experimental scientists who want more control of their data.

This is accomplished by providing a plotting interface ontop of the already widely used pandas dataframe.

This tutorial demonstrates the basic usage of pyopenms_viz, a plotting interface built on top of pandas for generating mass spectrometry plots like spectra, chromatograms, and peak maps.

Tutorial Requirements:#

  • pyopenms for .mzML data loading

  • alphatims for .d data loading

[1]:
!pip install pyopenms alphatims --quiet
!pip install pandas==2.1.4 --quiet # this version avoids warning messages with alphatims
[2]:
import pyopenms
pyopenms.__version__
[2]:
'3.2.0'
[3]:
import alphatims
alphatims.__version__
[3]:
'1.0.8'
[4]:
import pandas as pd
pd.__version__
[4]:
'2.1.4'

Overview#

PyOpenMS-Viz Figure 1

pyopenms-viz is a plotting framework supporting multiple plot common mass spectrometry plot types including the Spectrum, Chromatogram, PeakMap and Mobilogram. Plotting is achieved by calling the plot() method on a pd.DataFrame object containing mass spectrometry data in long format. For more information on how to format your data into a pd.DataFrame please see Data Formatting

The plot() Method#

The only function required for users of pyopenms_viz is the plot() method. This method is called directly on a pandas dataframe storing mass spectrometry data in the long format. This means that each row represents a single peak.

Required Arguments#

  1. x - The column name of the x-axis

  2. y - The column name of the y-axis

  3. z (only peakmaps) - The column name of the z-axis - represented as colour in the peakmaps

  4. kind - The kind of plot, options include:

    • "spectrum" - mass-to-charge ratio vs intensity vertical line plot

    • "chromatogram" - retention time vs intensity line plot

    • "peakmap" - scatterplot with z axes represented by color

    • "mobilogram" - ion mobility vs intensity line plot

  5. Backend - The backend to plot. can be one of “ms_matplotlib”, “ms_plotly” or “ms_bokeh”

    • ms_matplotlib - Standard python plotting backend, good for generating publication quality static plots

    • ms_bokeh - Lightweight interactive plotting

    • ms_plotly - Feature rich interactive plotting, also supports 3D plots, good integration with streamlit

Note

Backend can also be set globally using

pd.options.plotting.backend = 'ms_bokeh'

Optional Arguments#

Plots are highly customizable with additional arguments passed to the plot() method. For a full list of supported. A few notable optional arguments are mentioned below.

  1. by - Grouping of the data, functionality depends on the plot type.

    • Spectrum = colors peaks based on grouping

    • Chromatogram/Mobilogram = Draw a separate line trace for each grouping

    • PeakMap = Different markers (e.g. “+”, “.” ..) for peaks of different groups

  2. show_plot (default = True) - If True, the method returns None and the plot output. If False the plot is not shown and the function returns the underlying figure/axes object (depending on backend)

For a full list of arguments please see Parameters

Plot a Spectrum#

The Spectrum is one of the most commonly used plots in mass spectrometry. It contains mass-to-charge ratio on the x-axis and intensity on the y-axis.

Downloading the Data#

In this example, we will use pyopenms to load the .mzML file into a pandas dataframe. Examples using different packages can be found here.

[5]:
from urllib.request import urlretrieve

gh = "https://raw.githubusercontent.com/OpenMS/pyopenms-docs/master"
urlretrieve(
       gh + "/src/data/PrecursorPurity_input.mzML", "test.mzML"
   )
urlretrieve(
       gh + "/src/data/YIC(Carbamidomethyl)DNQDTISSK.mzML", "YIC(Carbamidomethyl)DNQDTISSK.mzML"
   )
[5]:
('YIC(Carbamidomethyl)DNQDTISSK.mzML',
 <http.client.HTTPMessage at 0x7f93eff32660>)

Reading the data into a pd.DataFrame with pyopenms#

[6]:
import pyopenms as oms

# Load the raw mass spectrometry data
exp = oms.MSExperiment()
oms.MzMLFile().load("YIC(Carbamidomethyl)DNQDTISSK.mzML", exp)

# Fetch the first spectrum
spectrum = oms.MSSpectrum(exp.getSpectrum(0))

# Export the spectrum to a pandas dataframe
spectrum_df = spectrum.get_df()
spectrum_df.head(5)
[6]:
mz intensity ion_mobility ion_mobility_unit ms_level precursor_mz precursor_charge native_id sequence ion_annotation base peak m/z base peak intensity total ion current lowest observed m/z highest observed m/z filter string preset scan configuration
0 212.012451 6.041170 NaN <NONE> 2 722.324707 2 spectrum=2624 584 906 7561 212 1297 ITMS + c NSI d w Full ms2 722.32@cid35.00 [185... 2
1 217.039612 2.125546 NaN <NONE> 2 722.324707 2 spectrum=2624 584 906 7561 212 1297 ITMS + c NSI d w Full ms2 722.32@cid35.00 [185... 2
2 230.094986 6.776063 NaN <NONE> 2 722.324707 2 spectrum=2624 584 906 7561 212 1297 ITMS + c NSI d w Full ms2 722.32@cid35.00 [185... 2
3 231.250870 3.464486 NaN <NONE> 2 722.324707 2 spectrum=2624 584 906 7561 212 1297 ITMS + c NSI d w Full ms2 722.32@cid35.00 [185... 2
4 232.206757 2.642101 NaN <NONE> 2 722.324707 2 spectrum=2624 584 906 7561 212 1297 ITMS + c NSI d w Full ms2 722.32@cid35.00 [185... 2

Plot Using pyopenms_viz#

Now that we have the spectrum in a Pandas Dataframe, we can plot it using pyopenms_viz. Note that we must specify that the x-axis is the column labelled mz and the y-axis is the column labelled intensity. Furthermore, a backend must also be specified.

[7]:
spectrum_df.plot(kind='spectrum', x='mz', y='intensity', backend='ms_bokeh')
Loading BokehJS ...

As this plot is interactive we can zoom in using the bokeh toolbar and also hovering over the peaks will provide their annotations.

We can instead plot using matploltib by adjusting the backend.

[8]:
spectrum_df.plot(kind='spectrum', backend='ms_matplotlib', x='mz', y='intensity')
_images/Getting_Started_33_0.png

Further Customization#

pyopenms_viz contains many options for plot customization. Some options are shown below. For a full list of options for customizing the spectrum please see the Spectrum Parameters

Base Customization Examples (Avalible for all graph types)#

Below are examples of customizations avaliable for all plot types. For a full list of customization options, please see Parameters

[9]:
base_kwargs = dict(kind='spectrum', x='mz', y='intensity', backend='ms_bokeh')

#spectrum_df.plot(grid=False, **base_kwargs) # no grid

#spectrum_df.plot(relative_intensity=False, **base_kwargs) # relative intensity

#spectrum_df.plot(xlabel='m/z', **base_kwargs) # change x label
Spectrum Specific Customization Examples#

These customizations below are specific to the spectrum plot

[10]:
base_kwargs = dict(kind='spectrum', x='mz', y='intensity', backend='ms_bokeh')


#spectrum_df.plot(bin_peaks=False, **base_kwargs) # no binning of peaks
spectrum_df.plot(annotate_top_n_peaks=0, **base_kwargs) # no peak annotation

Loading BokehJS ...

Even Further Customization#

By default pyopenms_viz plot function returns nothing and just displays the plot. In the case where further customizations are required, by setting show_plot=False pyopenms_viz will return the underlying figure/axes object and further customizations can be performed with the underlying backend.

[11]:
import matplotlib.pyplot as plt
# Save plot in matplotlib axes object
ax = spectrum_df.plot(kind='spectrum', backend='ms_matplotlib', x='mz', y='intensity', relative_intensity=True, show_plot=False)

# add a custom text annotation
ax.text(1000, 60, "a custom annotation")

# display plot
plt.show()
_images/Getting_Started_44_0.png

Saving the Plot#

If you want to save the plot, you need to set show_plot=False, and then use the respective backend’s save method.

  • For matplotlib, you can use plt.savefig(). For more information on saving matplotlib plots, see here.

p = spectrum_df.plot(
    kind="spectrum",
    x="mz",
    y="intensity",
    backend="ms_matplotlib",
    show_plot=False
)
# Save a png file
plt.savefig('spectrum.png')
# Save an svg file
plt.savefig('spectrum.svg')
# Save a pdf file
plt.savefig('spectrum.pdf')
  • For bokeh, you can use output_file method to save an html file of the iteractive plot. To save a static plot you can use the tool bar in the interactive plot to save the plot as a png. Or you can use the export_png method. However, this requires additional dependencies: pip install selenium geckodriver firefox. For more information on saving bokeh plots see here.

from bokeh.io import output_file
p = spectrum_df.plot(
    kind="spectrum",
    x="mz",
    y="intensity",
    backend="ms_bokeh",
    show_plot=False
)
# Save a html file
output_file("spectrum.html")
  • For plotly, you can use fig.write_html() method to save an html file of the interactive plot. To save a static plot you can use the tool bar in the interactive plot to save the plot as a png/svg/pdf. Or you can use the write_image method. However, this requires additional dependencies: pip install -U kaleido. For more information on saving plotly plots see here.

p = spectrum_df.plot(
    kind="spectrum",
    x="mz",
    y="intensity",
    backend="ms_plotly",
    show_plot=False
)
# Save a html file
p.write_html("spectrum.html")

Annotating A Spectrum#

If there is a peptide spectrum match, it is useful to annotate the spectrum with the fragments of the expected peptide so that one can manually evalute the match. Here, we demonstrate how to do this with pyopenms_viz and pyopenms

Create a mirror reference spectrum using pyopenms#

Using pyopenms we can create a theoretical spectrum on the peptide YIC(Carbamidomethyl)DNQDTISSK to compare with the experimental spectrum shown above. This is adapted from the pyopenms tutorial here

[12]:
# Generate theoretical spectrum
tsg = oms.TheoreticalSpectrumGenerator()
theo_spec = oms.MSSpectrum()
peptide = oms.AASequence.fromString("YIC(Carbamidomethyl)DNQDTISSK")
p = tsg.getParameters()
p.setValue("add_y_ions", "true")
p.setValue("add_b_ions", "true")
p.setValue("add_metainfo", "true")
tsg.setParameters(p)
tsg.getSpectrum(theo_spec, peptide, 1, 2)
theo_spec_df = theo_spec.get_df()

Plot the Spectrum and Mirror Spectrum#

To plot with pyopenms_viz, all that is required is the addtional reference_spectrum parameter which contains a pandas dataframe of the theoretical spectrum and to change mirror_spectrum to True

[13]:
spectrum_df.plot(kind='spectrum',
                 backend='ms_bokeh',
                 x='mz',
                 y='intensity',
                 annotate_top_n_peaks=0,
                 reference_spectrum=theo_spec_df,
                 title='Peptide YIC(Carbamidomethyl)DNQDTISSK',
                 title_font_size=12,
                 mirror_spectrum=True)
Loading BokehJS ...

Annotating the Experimental Spectrum#

We can annotate the experimental spectrum by aligning it to the theoretical spectrum and taking the labels of the fragment ions from the theoretical spectrum.

First we perform spectrum alignment using pyopenms

[14]:
alignment = []
spa = oms.SpectrumAlignment()
p = spa.getParameters()
# use 0.5 Da tolerance for m/z (Note: for high-resolution data we could also use ppm by setting the is_relative_tolerance value to true)
p.setValue("tolerance", 0.5)
p.setValue("is_relative_tolerance", "false")
spa.setParameters(p)
# align both spectra
spa.getSpectrumAlignment(alignment, theo_spec, spectrum)

Next we update the spectrum_df with the ion annotations

[15]:
for theo_spec_idx, exp_spec_idx in alignment:
    spectrum_df.loc[exp_spec_idx, 'ion_annotation'] = theo_spec_df.loc[theo_spec_idx, 'ion_annotation']

Now that we have annotated the spectrum we can plot it. To include annotations in the plot pyopenms_viz expects a column name for the ion_annotation argument.

[16]:
spectrum_df.plot(kind='spectrum',
                 backend='ms_bokeh',
                 x='mz',
                 y='intensity',
                 annotate_top_n_peaks=3,
                 ion_annotation='ion_annotation',
                 title='Peptide YIC(Carbamidomethyl)DNQDTISSK',
                 title_font_size=12)
Loading BokehJS ...

By default when we set ion_annotation the y ions are colored in red and the b ions are colored in blue. Howevering over the peaks will now show the matched ion annotation

We can also combine the reference spectrum with the ion annotation in one plot.

[17]:
spectrum_df.plot(kind='spectrum',
                 backend='ms_bokeh',
                 x='mz',
                 y='intensity',
                 annotate_top_n_peaks=0,
                 ion_annotation='ion_annotation',
                 title='Peptide YIC(Carbamidomethyl)DNQDTISSK',
                 title_font_size=12,
                 mirror_spectrum=True,
                 reference_spectrum=theo_spec_df)
Loading BokehJS ...

Plot a PeakMap#

A peakmap is a useful plot for generating an overview of the entire profile (MS1) data. In this representation all analytes can be visualized across time.

Download Data and Load Data#

Here, we are using a slice of 100 seconds of a metabolomics dataset which only contains MS1 information. In peakmap plots, it does not make sense to plot both MS1 and MS2 data in a single plot.

[18]:
gh = "https://raw.githubusercontent.com/OpenMS/pyopenms-docs/master"
urlretrieve(gh + "/src/data/FeatureFinderMetaboIdent_1_input.mzML", "test.mzML")

# open .mzML file using pyopenms
exp = oms.MSExperiment()
oms.MzMLFile().load("test.mzML", exp)

# convert pyopenms object to pandas dataframe
exp_df = exp.get_df(long=True)
exp_df
[18]:
RT mz inty
0 200.232544 250.117401 1314.365601
1 200.232544 250.117935 2756.825439
2 200.232544 250.118469 4262.159180
3 200.232544 250.118988 4490.361816
4 200.232544 250.119522 3139.247803
... ... ... ...
117987 299.812897 274.114136 7654.166992
117988 299.812897 274.114746 10584.880859
117989 299.812897 274.115356 9032.423828
117990 299.812897 274.115967 5456.675293
117991 299.812897 274.116577 2176.139404

117992 rows × 3 columns

[19]:
exp_df.plot(kind='peakmap', backend='ms_bokeh', x='RT', y='mz', z='inty')
Loading BokehJS ...

PeakMap Specific Customization examples#

Below are some customizations that are specific to the "peakMap" plot. Note that some of these customizations, such as binning are also available for the "spectrum" plot however they are not avaliable for all plot types (e.g. "chromatogram"). For a full list of customizations please see the PeakMap Parameters.

[20]:
base_kwargs = dict(kind='peakmap', backend='ms_bokeh', x='RT', y='mz', z='inty')

#exp_df.plot(bin_peaks=False, **base_kwargs) # no binning

#exp_df.plot(num_x_bins=20, num_y_bins=20, **base_kwargs) # manual binning

exp_df.plot(z_log_scale=True, **base_kwargs) # log z scale

Loading BokehJS ...

PeakMap with Marginals#

One useful customization is to plot the peak map with marginal plots. The marginal plots sum the x and y axis respectively to provide a supplemental view of the data.

[21]:
base_kwargs = dict(kind='peakmap', backend='ms_bokeh', x='RT', y='mz', z='inty')

exp_df.plot(add_marginals=True, **base_kwargs)
/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/pyopenms_viz/_bokeh/core.py:601: UserWarning:
You are attempting to set `plot.legend.orientation` on a plot that has zero legends added, this will have no effect.

Before legend properties can be set, you must add a Legend explicitly, or call a glyph method with a legend parameter set.

  y_fig.legend.orientation = "horizontal"
Loading BokehJS ...

Chromatogram Plot#

A chromatogram plot is useful for visualizing intensity across retention time. This can either be the total ion current across retention time or the ion curent from a specific region across m/z as used in targeted apporaches.

In this example, we will visualize the total ion current of the metabolomics dataset above. By default, pyopenms_viz does not manipulate the supplied

[22]:
exp_df.plot(kind='chromatogram', backend='ms_bokeh', x='RT', y='inty')
/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/pyopenms_viz/_core.py:250: UserWarning: Duplicate data detected, data will not be aggregated which may lead to unexpected plots. To enable aggregation set `aggregate_duplicates=True`.
  warnings.warn(
Loading BokehJS ...

Note

By default pyopenms_viz does not manipulate the supplied dataframe meaning that there might be multiple points plotted at a single retention time value leading to unexpected plots. Peaks can be automatically summed up using aggregate_duplicates=True

[23]:
exp_df.plot(kind='chromatogram', backend='ms_bokeh', x='RT', y='inty', aggregate_duplicates=True)
Loading BokehJS ...

In this dataset different features are resolved nicely as indicated by nicely seperable peaks.

Chromatogram Specific Customizations#

For a full list of chromatogram specific customizations please see Chromatogram Parameters

Case Example: Inspecting a Peptide in DIA#

In this example, we will show how pyopenms_viz can be used to inspec a target peptide in Data Indepdent Acquisition. For this, we will be using a timsTOF dataset and show how pyopenms_viz integrates well with alphatims

Download the Data#

[24]:
import requests
import zipfile
import os

# Define the URL and file name
url = 'https://github.com/MannLabs/alphatims/releases/download/0.1.210317/20201207_tims03_Evo03_PS_SA_HeLa_200ng_EvoSep_prot_high_speed_21min_8cm_S1-C8_1_22474.d.zip'
file_name = '20201207_tims03_Evo03_PS_SA_HeLa_200ng_EvoSep_prot_high_speed_21min_8cm_S1-C8_1_22474.d.zip'
extract_dir = './'  # Directory to extract the contents

# Download the file
response = requests.get(url)
with open(file_name, 'wb') as file:
    file.write(response.content)

print(f'File {file_name} downloaded successfully!')

# Unzip the file
with zipfile.ZipFile(file_name, 'r') as zip_ref:
    zip_ref.extractall(extract_dir)

print(f'File extracted to {extract_dir}!')

# Optionally, remove the zip file after extraction
os.remove(file_name)
print(f'Zip file {file_name} removed.')
File 20201207_tims03_Evo03_PS_SA_HeLa_200ng_EvoSep_prot_high_speed_21min_8cm_S1-C8_1_22474.d.zip downloaded successfully!
File extracted to ./!
Zip file 20201207_tims03_Evo03_PS_SA_HeLa_200ng_EvoSep_prot_high_speed_21min_8cm_S1-C8_1_22474.d.zip removed.
[25]:
import alphatims.bruker
bruker_dia_d_folder_name = "./20201207_tims03_Evo03_PS_SA_HeLa_200ng_EvoSep_prot_high_speed_21min_8cm_S1-C8_1_22474.d"
dia_data = alphatims.bruker.TimsTOF(bruker_dia_d_folder_name)
100%|██████████| 11868/11868 [00:06<00:00, 1702.74it/s]

Extracting Data for peptide#

Let’s modify the inspect_peptide method from alphatims/nbs /tutorial.ipynb to return a dataframe with extracted data for the target precursor and fragemnt ions.

[26]:
def inspect_peptide(
    dia_data,
    peptide,
    ppm=50,
    rt_tolerance=30, #seconds
    mobility_tolerance=0.05, #1/k0
):
    precursor_mz = peptide["mz"]
    precursor_mobility = peptide["mobility"]
    precursor_rt = peptide["rt"]
    fragment_mzs = peptide["fragment_mzs"]
    rt_slice = slice(
        precursor_rt - rt_tolerance,
        precursor_rt + rt_tolerance
    )
    im_slice = slice(
        precursor_mobility - mobility_tolerance,
        precursor_mobility + mobility_tolerance
    )
    precursor_mz_slice = slice(
        precursor_mz / (1 + ppm / 10**6),
        precursor_mz * (1 + ppm / 10**6)
    )
    precursor_indices = dia_data[
        rt_slice,
        im_slice,
        0, #index 0 means that the quadrupole is not used
        precursor_mz_slice,
        "raw"
    ]
    prec_df = dia_data.as_dataframe(precursor_indices)
    prec_df['Annotation'] = 'prec'
    # print(f"Info: Shape of prec - {prec_df.shape}")

    all_dfs = []
    all_dfs.append(prec_df)
    for fragment_name, mz in fragment_mzs.items():
        fragment_mz_slice = slice(
            mz / (1 + ppm / 10**6),
            mz * (1 + ppm / 10**6)
        )
        fragment_indices = dia_data[
            rt_slice,
            im_slice,
            precursor_mz_slice,
            fragment_mz_slice,
            "raw"
        ]
        frag_df = dia_data.as_dataframe(fragment_indices)
        frag_df['Annotation'] = fragment_name
        all_dfs.append(frag_df)

    all_df = pd.concat(all_dfs, axis=0)
    all_df['ms_level'] = all_df['Annotation'].apply(lambda x: 1 if x == 'prec' else 2)

    return all_df

Let’s create a peptide dictionary with target coordinates for our target peptide

In order to extract a peptide, we need information on the peptide. For this example we will use peptide YNDTFWK below.

[27]:
peptide = {
    "sequence": "YNDTFWK",
    "mz": 487.22439,
    "mobility": 0.81,
    "rt": 9.011 * 60, #seconds
    "charge": 2,
    "fragment_mzs": {
        "y7": 973.44145,
        "y6": 810.37812,
        "y5": 696.33520,
        "y4": 581.30825,
        "y3": 480.26057,
        "y2": 333.19216,
        "y1": 147.11285,
        "b1": 164.07065,
        "b2": 278.11358,
        "b3": 393.14052,
        "b4": 494.18820,
        "b5": 641.25661,
        "b6": 827.33592,
        "b7": 955.43089,
    }
}

[28]:
dia_df = inspect_peptide(dia_data, peptide, 50, 70, 0.1)
dia_df
[28]:
raw_indices frame_indices scan_indices precursor_indices push_indices tof_indices rt_values rt_values_min mobility_values quad_low_mz_values quad_high_mz_values mz_values intensity_values corrected_intensity_values Annotation ms_level
0 146798839 4438 694 0 4119158 152514 470.926503 7.848775 0.865903 -1.0 -1.0 487.228624 54 54 prec 1
1 146799536 4438 695 0 4119159 152515 470.926503 7.848775 0.864769 -1.0 -1.0 487.232119 68 68 prec 1
2 146802782 4438 700 0 4119164 152516 470.926503 7.848775 0.859099 -1.0 -1.0 487.235614 61 61 prec 1
3 146809110 4438 711 0 4119175 152519 470.926503 7.848775 0.846621 -1.0 -1.0 487.246098 144 144 prec 1
4 146809639 4438 712 0 4119176 152519 470.926503 7.848775 0.845486 -1.0 -1.0 487.246098 94 94 prec 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
151 255542671 5684 671 4 5275423 264156 602.997099 10.049952 0.891960 475.0 500.0 955.470987 145 145 b7 2
152 257300501 5702 746 4 5292202 264157 604.905047 10.081751 0.806864 475.0 500.0 955.475880 158 158 b7 2
153 260842198 5738 715 4 5325579 264145 608.719248 10.145321 0.842081 475.0 500.0 955.417157 79 79 b7 2
154 261754335 5747 749 4 5333965 264155 609.673048 10.161217 0.803453 475.0 500.0 955.466093 93 93 b7 2
155 262643510 5756 730 4 5342298 264152 610.627758 10.177129 0.825048 475.0 500.0 955.451412 152 152 b7 2

19375 rows × 16 columns

Globally Set Plotting Backend#

Let’s globally set the plotting backend so it does not have to be specified in each function call.

[29]:
pd.options.plotting.backend = 'ms_bokeh'

Chromatogram Plotting#

Let’s plot an extracted ion chromatgram, but before we do that, let’s preprocess the data and integrate the intensity values across ion mobility dimension to get an integrated extracted ion chromatogram per precursor and fragment ions.

If we plot the dia_df chromatograms at baseline, we will have multiple points per RT point (due to multiple points across ion mobility). To address this we can use the aggregate_duplicates=True argument. Furthermore, setting by='Annotation tells pyopenms_viz to plot each annotation as a seperate chromatogram.

[30]:
dia_df.plot(x='rt_values',
            y='intensity_values',
            kind='chromatogram',
            by='Annotation', # each annotation as seperate chromatogram
            aggregate_duplicates=True,
            width=700,
            legend={'title':'Annotation'},
            title="Extracted Chromatogram for YNDTFWK_2",
            title_font_size=14)
Loading BokehJS ...

Based on the plot above, it looks like this peptide precursor elutes from 594.40 to 605 seconds. We can draw these boundaries on the chromatogram by specifiying the annotation_data dataframe. We can also extract across retention time to better see the feature.

[31]:
annotation_data = pd.DataFrame(dict(leftWidth=[594.40], rightWidth=[605]))

# filter df across rt
dia_df_small = dia_df[dia_df['rt_values'].between(500, 700)]

dia_df_small.plot(x='rt_values',
            y='intensity_values',
            kind='chromatogram',
            by='Annotation', # each annotation as seperate chromatogram
            aggregate_duplicates=True,
            annotation_data=annotation_data,
            width=700,
            height=500,
            legend = {'title':'Annotation'},
            title="Extracted Chromatogram for YNDTFWK_2",
            title_font_size=14)
Loading BokehJS ...

Grouping is not only limited to Annnotation we can change the by parameter to ms_level to get the total ion chromatograms for MS1 and MS2 levels respectively.

[32]:
annotation_data = pd.DataFrame(dict(leftWidth=[594.40], rightWidth=[605]))

# filter df across rt
dia_df_small = dia_df[dia_df['rt_values'].between(500, 700)]

dia_df_small.plot(x='rt_values',
            y='intensity_values',
            kind='chromatogram',
            by='ms_level', # each ms_level as a seperate chromatogram
            aggregate_duplicates=True,
            annotation_data=annotation_data,
            width=700,
            height=500,
            legend = {'title':'Annotation'},
            title="Extracted Chromatogram for YNDTFWK_2",
            title_font_size=14)
Loading BokehJS ...

Mobilogram Plotting#

We can also plot the extracted ion mobilogram as shown below

[33]:
group_cols=['ms_level', 'Annotation', 'mobility_values']
integrate_col = 'intensity_values'
dia_xim_df = dia_df.apply(
                lambda x: x.fillna(0) if x.dtype.kind in "biufc" else x.fillna(".")
            ) \
            .groupby(group_cols)[integrate_col] \
            .sum() \
            .reset_index()
[34]:
dia_df.plot(x='mobility_values',
                  y='intensity_values',
                  kind='mobilogram',
                  by='Annotation',
                  title = f"Extracted Spectrum for YNDTFWK_2",
                  height=500, width=700,
                  grid=False, # Remove the grid
                  aggregate_duplicates=True
                 )
Loading BokehJS ...

Spectrum Plotting#

Let’s take a look at the extracted ion spectrum data. Since we have annotation labels for the extracted spectrum, we can add these to the plot as shown below. Here we will use the "ms_matplotlib" backend to generate a static plot.

[35]:
dia_df.plot(x="mz_values",
            y="intensity_values",
            kind="spectrum",
            by="Annotation",
            title = f"Extracted Spectrum for YNDTFWK_2",
            backend='ms_matplotlib',
            aggregate_duplicates=True,
            height=500, width=700,
            annotate_top_n_peaks=0,
            grid=False, # Remove the grid
            legend = {'title':'Trace', 'bbox_to_anchor':(1.2,0.5)})
_images/Getting_Started_112_0.png

2D Peak Map Plotting#

An alternative usage to the peakmap plot is to plot retetion time vs ion mobility for a single mass trace. This usage is demonstrated below. Let’s see what the retention time and ion mobility peak map looks like for the mass range of the precursor ion across all MS1 spectra.

[36]:
dia_df_prec = dia_df[dia_df['Annotation'] == 'prec']
[37]:
dia_df_prec.plot(x="rt_values",
               y="mobility_values",
               z="intensity_values",
               kind="peakmap",
               xlabel="Retention Time [sec]",
               ylabel="Ion Mobility",
               height=600,
               width=700,
               grid=False)
/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/pyopenms_viz/_core.py:250: UserWarning: Duplicate data detected, data will not be aggregated which may lead to unexpected plots. To enable aggregation set `aggregate_duplicates=True`.
  warnings.warn(
Loading BokehJS ...

As expected, we see an intense signal with a retention time ~600 consistent with the chromatogram above and ion mobility of ~0.81.

Since intensity can be difficult to gauge only by color, we can also add marginal plots which sums up the x and y-axis respectively

[38]:
dia_df_prec.plot(x="rt_values",
               y="mobility_values",
               z="intensity_values",
               kind="peakmap",
               xlabel="Retention Time [sec]",
               ylabel="Ion Mobility",
               add_marginals=True,
               width=700,
               backend='ms_matplotlib',
               y_kind='chromatogram', # specify on x marginal axis to plot as a chromatogram rather than a spectrum
               x_kind='chromatogram', # specify on x marginal axis to plot as a chromatogram rather than a spectrum
               grid=False)
/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/pyopenms_viz/_core.py:250: UserWarning: Duplicate data detected, data will not be aggregated which may lead to unexpected plots. To enable aggregation set `aggregate_duplicates=True`.
  warnings.warn(
_images/Getting_Started_118_1.png