ns.viz Functions¶
Generating useful visualizations from your annotated data.¶
Every function available in ns.viz was written to accomplish two things:
- Show all of your data as simply and transparently as possible.
- Always be useful for plate-like data, and for any other data format when possible.
Creating plots from a Plate¶
Creating plots from a DataFrame¶
Plots¶
plot_scatter | plot_rof | plot_hm | plot_bar | plot_curve¶
import ninetysix as ns
import pandas as pd
import numpy as np
import bokeh.palettes
from ns_data_generator import (
    generate_plate_data,
    generate_condition_plate,
)
Creating plots from a Plate ¶
Once you create a Plate object, you have quick access to a host of plotting methods to create useful visualizations.
These methods (usually prefixed with plot_ followed by the type of plot) create an interactive holoviews chart. Although ninetysix provides defaults and arguments for many of these charts that help with plate-based analysis, they are almost infinitely tunable through the holoviews and bokeh libraries, as described on the advanced visualization page.
Typical well-value data¶
A well-value scatter chart gives you the most general look at your data: it shows the distribution of values and reveals any edge, row, or column effects, which would show up as patterns in this plot:
# Generate some data and pass to Plate
plate_data = generate_plate_data()
pt = ns.Plate(plate_data)
# Plot a well-value scatter plot
pt.plot_scatter()
We can see that the four lowest points look evenly spaced along the well axis, which will make sense when we apply annotations:
controls = {
    'default': 'experiment',
    '[A-D]10': 'standard',
    '[E-H]10': 'negative',
}
pt = pt.annotate_wells({'controls': controls})
pt.plot_scatter(color='controls')
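To make the well patterns used above more concrete, here is a minimal sketch of how a pattern such as '[A-D]10' could be expanded into individual well names. The helpers `expand_wells` and `_expand_part` are hypothetical and written only for illustration; they are not ninetysix's implementation, and ninetysix may support pattern forms that this sketch does not.

```python
import re
import string


def _expand_part(part, is_row):
    """Expand one half of a pattern, e.g. 'A', '[A-D]', '[A,E]', '10', '[9-12]'."""
    inner = part[1:-1] if part.startswith('[') else part
    items = []
    for piece in inner.split(','):
        piece = piece.strip()
        if '-' in piece:
            start, stop = (p.strip() for p in piece.split('-'))
            if is_row:
                # Letter range, inclusive of both ends
                letters = string.ascii_uppercase
                first, last = letters.index(start), letters.index(stop)
                items.extend(letters[first:last + 1])
            else:
                # Integer range, inclusive of both ends
                items.extend(str(i) for i in range(int(start), int(stop) + 1))
        else:
            items.append(piece)
    return items


def expand_wells(pattern):
    """Hypothetical helper: list the wells matched by a '<rows><columns>' pattern."""
    match = re.fullmatch(r'(\[[A-Z,\-\s]+\]|[A-Z])(\[[\d,\-\s]+\]|\d+)', pattern)
    if match is None:
        raise ValueError(f'Unrecognized well pattern: {pattern!r}')
    rows = _expand_part(match.group(1), is_row=True)
    columns = _expand_part(match.group(2), is_row=False)
    return [f'{row}{column}' for row in rows for column in columns]


print(expand_wells('[A-D]10'))
print(expand_wells('[E-H]10'))
```

Under this reading, the 'standard' and 'negative' annotations above each cover four wells in column 10, and every other well falls back to the 'default' label.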
Adding .hist() to the end of this call adds a histogram showing the distribution of values, a feature available directly from a holoviews Scatter chart:
pt.plot_scatter(color='controls').hist()
We can adjust the x and y axes via the xlim and ylim arguments in the .opts method available to holoviews charts. (You can read more about these types of operations on the advanced visualization page.)
pt.plot_scatter(color='controls').opts(ylim=(0, 50)).hist()
(Note that you can control the layering of points in this type of chart so that certain important groups (e.g., 'standard') do not get eclipsed by other points, as described in the plot_scatter section.)
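The row, column, and edge effects mentioned at the start of this section can also be checked numerically. The sketch below uses plain pandas on a synthetic table whose column names ('row', 'column', 'value') are assumed to mirror those of a Plate; the data and the simulated column effect are invented for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic 96-well table (hypothetical data, not from ninetysix)
rng = np.random.default_rng(0)
rows = list('ABCDEFGH')
columns = range(1, 13)
df = pd.DataFrame(
    [(row, column) for row in rows for column in columns],
    columns=['row', 'column'],
)
df['value'] = rng.normal(loc=20, scale=1, size=len(df))

# Simulate a column effect: every well in column 10 reads lower
df.loc[df['column'] == 10, 'value'] -= 10

# Mean value per row and per column; an outlying mean hints at an effect
row_means = df.groupby('row')['value'].mean()
column_means = df.groupby('column')['value'].mean()

# Edge wells sit in the first/last row or the first/last column
is_edge = (df['row'].isin(['A', 'H']) | df['column'].isin([1, 12])).rename('is_edge')
edge_vs_interior = df.groupby(is_edge)['value'].mean()

print(column_means.round(2))
print(edge_vs_interior.round(2))
```

A summary like this complements the scatter chart rather than replacing it: the chart shows every point, while the grouped means make a suspected pattern easier to confirm.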
Condition Testing¶
In some instances, a plate is used to assess the effect of various conditions on a given activity (or activities). In ninetysix, the conditions map naturally onto annotations and the measured activity onto values, so that you can process and analyze the effect of each condition as efficiently as possible.
# Generate some condition plate data
condition_plate = generate_condition_plate()
pt = ns.Plate(condition_plate)
# Plot
pt.plot_scatter().hist()
Clearly there is a very strong well-dependent effect on our value. We can see this in a different way by looking at a heatmap of the data:
pt.plot_hm()
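Conceptually, a plate heatmap is the long-form well table reshaped into a row-by-column grid. The following sketch shows that reshaping with pandas on synthetic data; it is an illustration of the idea only and makes no claim about how plot_hm is implemented. The column names are assumed to match those of a Plate.

```python
import numpy as np
import pandas as pd

# Synthetic long-form plate data: one line per well (hypothetical values)
rows = list('ABCDEFGH')
columns = list(range(1, 13))
df = pd.DataFrame(
    [(f'{row}{column}', row, column) for row in rows for column in columns],
    columns=['well', 'row', 'column'],
)
df['value'] = np.arange(len(df), dtype=float)

# Reshape into an 8 x 12 grid: plate rows down the side, plate columns across
grid = df.pivot(index='row', columns='column', values='value')

print(grid.shape)
print(grid.loc['A', 1], grid.loc['H', 12])
```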
Since we know what conditions we subjected each well to, we can assign these via annotate_wells. These are generic, but imagine that condition_1 is the concentration of something (a catalyst, substrate, or ligand, for example) and condition_2 is some condition that you suspect is important for the activity (perhaps a method of cell lysis, if purifying a protein/enzyme, or the presence/absence of an activating factor).
# Assign each condition
annotations = {
    'condition_1': {
        '[A,E][1-4]': 1,
        '[A,E][5-8]': 2,
        '[A,E][9-12]': 3,
        '[B,F][1-4]': 4,
        '[B,F][5-8]': 5,
        '[B,F][9-12]': 6,
        '[C,G][1-4]': 7,
        '[C,G][5-8]': 8,
        '[C,G][9-12]': 9,
        '[D,H][1-4]': 10,
        '[D,H][5-8]': 11,
        '[D,H][9-12]': 12,
    },
    'condition_2': {
        '[A-D][1-12]': True,
        '[E-H][1-12]': False,
    },
}
# Annotate
pt = pt.annotate_wells(annotations)
pt
| | well | row | column | condition_1 | condition_2 | value |
|---|---|---|---|---|---|---|
| 0 | A1 | A | 1 | 1 | True | 1.838597 |
| 1 | A2 | A | 2 | 1 | True | 1.865927 |
| 2 | A3 | A | 3 | 1 | True | 1.999423 |
| 3 | A4 | A | 4 | 1 | True | 2.101183 |
| 4 | A5 | A | 5 | 2 | True | 3.932901 |
| ... | ... | ... | ... | ... | ... | ... |
| 91 | H8 | H | 8 | 11 | False | 3.923439 |
| 92 | H9 | H | 9 | 12 | False | 12.853956 |
| 93 | H10 | H | 10 | 12 | False | 13.768902 |
| 94 | H11 | H | 11 | 12 | False | 7.033448 |
| 95 | H12 | H | 12 | 12 | False | 4.983802 |
96 rows × 6 columns
We can confirm that our annotations are present by passing them to the outline argument of plot_hm:
pt.plot_hm(outline='condition_1')
or to the color argument of plot_scatter:
pt.plot_scatter(color='condition_2')
A heatmap can also be used to visualize the conditions in a plate if you pass an annotation to the value_name argument. plot_hm is not optimized for this, since values are usually assumed to be quantitative (this will likely change in a future version):
# Use 'condition_1' as the value name
pt.plot_hm(value_name='condition_1', hm_cmap='viridis')
We used the viridis colormap above, but any list of colors can be passed to cmap arguments in these functions. Many exceptional colormaps are available from resources such as bokeh.palettes (read more here) or colorcet (read more here). We will create a list of 12 colors from bokeh.palettes, which we imported above.
# This creates a colormap of 12 colors
cmap = bokeh.palettes.Category20_12
# Use this cmap
pt.plot_hm(value_name='condition_1', hm_cmap=cmap).opts(color_levels=None)
(Note: the .opts(color_levels=None) addition is currently best if you want to use plot_hm in this way due to the assumption of quantitative value data, but this should be updated in a future version.)
# Colors for 'False', 'True', in condition_2
cmap = ['lightgray', 'cornflowerblue']
pt.plot_hm(value_name='condition_2', hm_cmap=cmap).opts(color_levels=None)
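Because a colormap here is simply a list of colors, you can also build one yourself when no ready-made palette has the number of colors you need. The sketch below uses only the standard library to generate evenly spaced hues as hex strings; `make_palette` is a hypothetical helper, and it assumes hex strings are accepted in the same way as the bokeh palettes used above (which are themselves sequences of hex strings).

```python
import colorsys


def make_palette(n_colors, saturation=0.65, lightness=0.55):
    """Hypothetical helper: return n_colors evenly spaced hues as hex strings."""
    palette = []
    for i in range(n_colors):
        hue = i / n_colors
        # colorsys uses hue, lightness, saturation order, all in [0, 1]
        red, green, blue = colorsys.hls_to_rgb(hue, lightness, saturation)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            round(red * 255), round(green * 255), round(blue * 255)))
    return palette


# One color per level of 'condition_1'
cmap = make_palette(12)
print(cmap)
```

Evenly spaced hues suit categorical annotations; for ordered quantities, a perceptually uniform map such as viridis is usually the better choice.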
Grouping the data¶
The most powerful aspect of annotating data in this way is that it allows you to perform groupby operations, splitting the wells up along different annotations and analyzing the results separately. For example, we can visualize the data grouped by condition_1 using the plot_bar method with condition_1 as the variable on the x-axis:
pt.plot_bar('condition_1', width=600, cmap='viridis')
By default, these methods are designed to show all of your data. If we were only using error bars and did not already look at the scatter chart, the plot above might look convincing enough and not suggest any further analysis. However, because we can see the individual points and (especially) how they are grouped, it becomes clear that there may be another factor affecting our data. We have a second annotation in our Plate, and we can use this to further group our data with the groupby argument:
pt.plot_bar(
    'condition_1',
    groupby='condition_2',
    width=600,
    cmap='viridis',
)
We now have a plot with a dropdown menu to switch between each of the conditions in condition_2. (If you want to see all the plots at once, simply add the argument layout=True above, as shown below.) When condition_2 is absent, our results are attenuated and noisy. However, when condition_2 is present, our results show a linear trend with much lower noise.
You can read more about plot_bar in its section below.
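The grouping described in this section corresponds to an ordinary pandas groupby. As a rough sketch, the summary statistics behind such a bar chart could be computed as follows; the small table is synthetic, and only the column names are taken from the example above.

```python
import pandas as pd

# Synthetic long-form data with two annotations and one value column
df = pd.DataFrame({
    'condition_1': [1, 1, 2, 2, 1, 1, 2, 2],
    'condition_2': [True, True, True, True, False, False, False, False],
    'value': [2.0, 2.2, 4.0, 4.2, 1.0, 5.0, 3.0, 9.0],
})

# Grouping by condition_1 alone pools both levels of condition_2
by_condition_1 = df.groupby('condition_1')['value'].agg(['mean', 'std'])

# Grouping by both annotations separates the tight and the noisy wells
by_both = df.groupby(['condition_2', 'condition_1'])['value'].agg(['mean', 'std'])

print(by_condition_1)
print(by_both)
```

In this toy table the pooled means hide the fact that the spread is much larger when condition_2 is False, which is the same kind of pattern the grouped bar chart makes visible.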
Creating plots from a DataFrame ¶
From the plot_bar charts above, it should be clear that the well information isn't explicitly needed to create this type of chart; it was just needed to assign conditions to each well and then visualize based on the conditions.
plot_bar, like many other plotting functions in ns.viz, takes this into account: it can be called as a stand-alone function that accepts a generic DataFrame, in the event that these plots are needed for non-plate data.
Let's visualize the same dataset above, but without any well information so that it is not compatible with ns.Plate.
nonplate = pt.df[['condition_1', 'condition_2', 'value']].copy()
nonplate
| | condition_1 | condition_2 | value |
|---|---|---|---|
| 0 | 1 | True | 1.838597 |
| 1 | 1 | True | 1.865927 |
| 2 | 1 | True | 1.999423 |
| 3 | 1 | True | 2.101183 |
| 4 | 2 | True | 3.932901 |
| ... | ... | ... | ... |
| 91 | 11 | False | 3.923439 |
| 92 | 12 | False | 12.853956 |
| 93 | 12 | False | 13.768902 |
| 94 | 12 | False | 7.033448 |
| 95 | 12 | False | 4.983802 |
96 rows × 3 columns
We can use this as our object for ns.viz.plot_bar:
ns.viz.plot_bar(
    nonplate,
    'condition_1',
    groupby='condition_2',
    color='value',
    cmap='viridis',
    width=450,
    layout=True,
)
We did not need to specify value_name='value' because value was the right-most column of the DataFrame and, as elsewhere in ninetysix, the right-most column is assumed to hold the value. If that's not the case (as it often may not be with a generic DataFrame), simply pass the additional argument:
left_value_df = pt.df[['value', 'condition_1', 'condition_2']].copy()
# 'value' is now on the left
left_value_df
| | value | condition_1 | condition_2 |
|---|---|---|---|
| 0 | 1.838597 | 1 | True |
| 1 | 1.865927 | 1 | True |
| 2 | 1.999423 | 1 | True |
| 3 | 2.101183 | 1 | True |
| 4 | 3.932901 | 2 | True |
| ... | ... | ... | ... |
| 91 | 3.923439 | 11 | False |
| 92 | 12.853956 | 12 | False |
| 93 | 13.768902 | 12 | False |
| 94 | 7.033448 | 12 | False |
| 95 | 4.983802 | 12 | False |
96 rows × 3 columns
ns.viz.plot_bar(
    left_value_df,
    'condition_1',
    value_name='value',  # <- addition
    groupby='condition_2',
    color='value',
    cmap='viridis',
    width=450,
    layout=True,
)
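The right-most-column convention used in the two calls above can be summarized in a few lines. This is a sketch of the convention as described on this page, not ninetysix's actual code; `resolve_value_name` is a hypothetical helper.

```python
import pandas as pd


def resolve_value_name(df, value_name=None):
    """Hypothetical helper: pick the value column, defaulting to the right-most one."""
    if value_name is None:
        return df.columns[-1]
    if value_name not in df.columns:
        raise ValueError(f"'{value_name}' is not a column in the DataFrame")
    return value_name


right_value = pd.DataFrame(columns=['condition_1', 'condition_2', 'value'])
left_value = pd.DataFrame(columns=['value', 'condition_1', 'condition_2'])

print(resolve_value_name(right_value))          # right-most column is 'value'
print(resolve_value_name(left_value))           # right-most column is 'condition_2'
print(resolve_value_name(left_value, 'value'))  # explicit name wins
```

The second call shows why the explicit argument matters: without it, an annotation column would silently be treated as the value.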
In the event that you try to create a plot from a DataFrame that does not contain well information even though it is needed (e.g., plot_hm requires 'row' and 'column' info), you will receive an error telling you that you must pass this information.
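If you want to catch this situation yourself before calling a well-dependent plot, a simple column check is enough. The function below is a hypothetical illustration; the exception type and message that ninetysix raises are not specified here and may differ.

```python
import pandas as pd


def check_required_columns(df, required=('row', 'column')):
    """Hypothetical check: raise if any required column is missing."""
    missing = [name for name in required if name not in df.columns]
    if missing:
        raise ValueError(f'DataFrame is missing required column(s): {missing}')


# A non-plate DataFrame without any well information
nonplate = pd.DataFrame({'condition_1': [1, 2], 'value': [1.8, 3.9]})

try:
    check_required_columns(nonplate)
except ValueError as error:
    message = str(error)
    print(message)
```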
More information on optimizing these visualizations with the tools available from the holoviews and bokeh packages, including axes/font sizes and how to export to SVG format, is on the Advanced data visualization page.
More information on getting the most out of the Plate class, including using pandas methods directly from a Plate object, can be found on The Plate class page.
Information on constructing and using multi-Plate objects can be found on the Plates page.