ns.viz Functions
Generating useful visualizations from your annotated data.
Every function available in ns.viz was written to accomplish two things:
- Show all of your data as simply and transparently as possible.
- Always be useful for plate-like data and, when possible, for any other data format.
- Creating plots from a Plate
- Creating plots from a DataFrame
- Plots: plot_scatter | plot_rof | plot_hm | plot_bar | plot_curve
import ninetysix as ns
import pandas as pd
import numpy as np
import bokeh.palettes
from ns_data_generator import (generate_plate_data,
                               generate_condition_plate)
Creating plots from a Plate
Once you create a Plate object, you have quick access to a host of plotting methods for creating useful visualizations. These methods (usually named with the prefix plot_ followed by the type of plot) each create an interactive holoviews chart. Although ninetysix provides defaults and arguments for many of these charts that help with plate-based analysis, the charts can be customized much further through the holoviews and bokeh libraries, as described on the advanced visualization page.
Typical well-value data
A well-value scatter chart gives you the most general look at your data: it shows the distribution of the values and reveals any edge, row, or column effects, which would show up as patterns in the plot:
# Generate some data and pass to Plate
plate_data = generate_plate_data()
pt = ns.Plate(plate_data)
# Plot a well-value scatter plot
pt.plot_scatter()
We can see that the four lowest points look evenly spaced along the well axis, which will make sense when we apply annotations:
controls = {
    'default': 'experiment',
    '[A-D]10': 'standard',
    '[E-H]10': 'negative',
}
pt = pt.annotate_wells({'controls': controls})
pt.plot_scatter(color='controls')
Adding .hist() to the end of this call adds a histogram showing the distribution of value, a feature available directly on any holoviews Scatter chart:
pt.plot_scatter(color='controls').hist()
We can adjust the x and y axes via the xlim and ylim arguments of the .opts method available on holoviews charts. (You can read more about these types of operations on the advanced visualization page.)
pt.plot_scatter(color='controls').opts(ylim=(0, 50)).hist()
(Note that you can control the layering of points in this type of chart so that important groups (e.g., 'standard') are not hidden behind other points, as described in the plot_scatter section.)
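Edge effects like those mentioned at the start of this section can also be checked numerically by splitting wells into border and interior groups. The sketch below is plain Python and is not part of ninetysix: is_edge_well is a hypothetical helper, and it assumes well IDs of the form 'A1' to 'H12' on a standard 8 x 12 plate.

```python
import re

def is_edge_well(well, n_rows=8, n_cols=12):
    """Hypothetical helper: True if a well ID such as 'A1' lies on the plate border."""
    match = re.fullmatch(r"([A-Z])(\d+)", well)
    if match is None:
        raise ValueError(f"not a well ID: {well!r}")
    row = ord(match.group(1)) - ord("A") + 1  # 'A' -> 1, 'B' -> 2, ...
    col = int(match.group(2))
    if not (1 <= row <= n_rows and 1 <= col <= n_cols):
        raise ValueError(f"{well!r} is outside a {n_rows}x{n_cols} plate")
    # A well is on the edge if it sits in the first/last row or first/last column
    return row in (1, n_rows) or col in (1, n_cols)

# All 96 well IDs of a standard plate, in row-major order
wells = [f"{r}{c}" for r in "ABCDEFGH" for c in range(1, 13)]
edge = [w for w in wells if is_edge_well(w)]
print(len(edge), len(wells) - len(edge))  # 36 60
```

Comparing the mean value of the two groups is then a quick complement to looking for patterns in the scatter chart.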
Condition Testing
In some instances, a plate is used to assess the effect of various conditions on a given activity (or activities). In ninetysix, the conditions map naturally onto annotations and the measured activity onto values, so that you can process and analyze the effects of each condition as efficiently as possible.
# Generate some condition plate data
condition_plate = generate_condition_plate()
pt = ns.Plate(condition_plate)
# Plot
pt.plot_scatter().hist()
Clearly there is a very strong well-dependent effect on our value. We can see this in a different way by looking at a heatmap of the data:
pt.plot_hm()
Since we know what conditions we subjected each well to, we can assign these via annotate_wells. These are generic, but imagine that condition_1 is the concentration of something (a catalyst, substrate, or ligand, for example) and condition_2 is some condition that you suspect is important for the activity (perhaps a method of cell lysis, if purifying a protein/enzyme, or the presence/absence of an activating factor).
# Assign each condition
annotations = {
    'condition_1': {
        '[A,E][1-4]': 1,
        '[A,E][5-8]': 2,
        '[A,E][9-12]': 3,
        '[B,F][1-4]': 4,
        '[B,F][5-8]': 5,
        '[B,F][9-12]': 6,
        '[C,G][1-4]': 7,
        '[C,G][5-8]': 8,
        '[C,G][9-12]': 9,
        '[D,H][1-4]': 10,
        '[D,H][5-8]': 11,
        '[D,H][9-12]': 12,
    },
    'condition_2': {
        '[A-D][1-12]': True,
        '[E-H][1-12]': False,
    },
}
# Annotate
pt = pt.annotate_wells(annotations)
pt
|     | well | row | column | condition_1 | condition_2 | value |
|---|---|---|---|---|---|---|
| 0 | A1 | A | 1 | 1 | True | 1.838597 |
| 1 | A2 | A | 2 | 1 | True | 1.865927 |
| 2 | A3 | A | 3 | 1 | True | 1.999423 |
| 3 | A4 | A | 4 | 1 | True | 2.101183 |
| 4 | A5 | A | 5 | 2 | True | 3.932901 |
| ... | ... | ... | ... | ... | ... | ... |
| 91 | H8 | H | 8 | 11 | False | 3.923439 |
| 92 | H9 | H | 9 | 12 | False | 12.853956 |
| 93 | H10 | H | 10 | 12 | False | 13.768902 |
| 94 | H11 | H | 11 | 12 | False | 7.033448 |
| 95 | H12 | H | 12 | 12 | False | 4.983802 |
96 rows × 6 columns
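To check which wells a pattern such as '[A,E][9-12]' is meant to cover, it can help to expand it by hand. The sketch below is not the ninetysix parser: expand_wells and _expand_group are hypothetical helpers that assume the pattern semantics implied by the output above (a row group followed by a column group, each either a single item, a comma-separated list, or an inclusive range) on a standard 8 x 12 plate.

```python
import re
from itertools import product

ROWS = list("ABCDEFGH")      # rows of a 96-well plate
COLUMNS = list(range(1, 13)) # columns of a 96-well plate

def _expand_group(group, universe, parse):
    """Expand one group such as 'A-D', 'A,E', '9-12' or '10' into its items."""
    items = []
    for part in group.split(","):
        if "-" in part:
            start, stop = (parse(p) for p in part.split("-"))
            lo, hi = universe.index(start), universe.index(stop)
            items.extend(universe[lo:hi + 1])  # inclusive range
        else:
            items.append(parse(part))
    return items

def expand_wells(pattern):
    """Hypothetical helper: list the wells covered by a pattern like '[A,E][9-12]'."""
    match = re.fullmatch(r"(\[[^\]]+\]|[A-H])(\[[^\]]+\]|\d+)", pattern)
    if match is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    row_part, col_part = (g.strip("[]") for g in match.groups())
    rows = _expand_group(row_part, ROWS, str)
    cols = _expand_group(col_part, COLUMNS, int)
    return [f"{r}{c}" for r, c in product(rows, cols)]

print(expand_wells("[A-D]10"))  # ['A10', 'B10', 'C10', 'D10']
```

Expanding every key of an annotation dictionary this way is a simple check that the patterns cover all 96 wells without overlapping.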
We can confirm that our annotations are present by passing them to the outline argument of plot_hm:
pt.plot_hm(outline='condition_1')
or to the color argument of plot_scatter:
pt.plot_scatter(color='condition_2')
A heatmap can also be used to visualize the conditions in a plate if you pass an annotation to the value_name argument. plot_hm is not optimized for this, since values are usually assumed to be quantitative (this will likely change in a future version):
# Use 'condition_1' as the value name
pt.plot_hm(value_name='condition_1', hm_cmap='viridis')
We used the viridis colormap above, but any list of colors can be passed to the cmap arguments in these functions. Many excellent colormaps are available from packages such as bokeh.palettes or colorcet. We will create a list of 12 colors from bokeh.palettes, which we imported above.
# This creates a colormap of 12 colors
cmap = bokeh.palettes.Category20_12
# Use this cmap
pt.plot_hm(value_name='condition_1', hm_cmap=cmap).opts(color_levels=None)
(Note: adding .opts(color_levels=None) is currently recommended when using plot_hm in this way, because plot_hm assumes quantitative value data; this should be updated in a future version.)
# Colors for 'False', 'True', in condition_2
cmap = ['lightgray', 'cornflowerblue']
pt.plot_hm(value_name='condition_2', hm_cmap=cmap).opts(color_levels=None)
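Because these arguments take a plain list of colors, a palette can also be built by hand when no ready-made one has the right length. The sketch below uses only the standard library; evenly_spaced_hex_colors is a hypothetical helper, and it assumes that hex strings are accepted in the same way as the named colors used above.

```python
import colorsys

def evenly_spaced_hex_colors(n, saturation=0.65, value=0.9):
    """Return n hex color strings with hues spread evenly around the color wheel."""
    if n < 1:
        raise ValueError("n must be at least 1")
    colors = []
    for i in range(n):
        # Hue runs over [0, 1); saturation and value stay fixed for a uniform look
        rgb = colorsys.hsv_to_rgb(i / n, saturation, value)
        colors.append("#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in rgb)))
    return colors

# One color per level of condition_1
cmap = evenly_spaced_hex_colors(12)
print(cmap)
```

The resulting list could then be passed as hm_cmap=cmap in place of the bokeh palette.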
Grouping the data
The most powerful aspect of annotating data in this way is that it allows you to perform groupby operations, splitting the wells up along different annotations and analyzing the results separately. For example, we can visualize the data grouped by condition_1 using the plot_bar method with condition_1 as the variable on the x-axis:
pt.plot_bar('condition_1', width=600, cmap='viridis')
By default, these methods are designed to show all of your data. If we had only plotted error bars and had not already looked at the scatter chart, the plot above might look convincing enough and not suggest any further analysis. However, because we can see the individual points and (especially) how they are grouped, it becomes clear that there may be another factor affecting our data. We have a second annotation in our Plate, and we can use this to further group our data with the groupby argument:
pt.plot_bar(
    'condition_1',
    groupby='condition_2',
    width=600,
    cmap='viridis',
)
We now have a plot with a dropdown menu to switch between the values of condition_2. (If you want to see all the plots at once, simply add the argument layout=True to the call above, as shown in the DataFrame examples below.) When condition_2 is absent, our results are attenuated and noisy. However, when condition_2 is present, our results show a linear trend with much lower noise.
You can read more about plot_bar in its section below.
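The grouping itself can also be inspected numerically. The sketch below is a minimal illustration in plain Python, not how plot_bar is implemented: group_values is a hypothetical helper, and the records are only the ten rows visible in the truncated table shown earlier, so the summary is incomplete by design. On a full DataFrame, a pandas groupby on the two annotation columns would do the same job.

```python
from collections import defaultdict
from statistics import mean

# (condition_1, condition_2, value) records copied from the rows visible
# in the truncated table shown earlier; the other 86 wells are not included
records = [
    (1, True, 1.838597),
    (1, True, 1.865927),
    (1, True, 1.999423),
    (1, True, 2.101183),
    (2, True, 3.932901),
    (11, False, 3.923439),
    (12, False, 12.853956),
    (12, False, 13.768902),
    (12, False, 7.033448),
    (12, False, 4.983802),
]

def group_values(records):
    """Collect values under their (condition_2, condition_1) key."""
    groups = defaultdict(list)
    for condition_1, condition_2, value in records:
        groups[(condition_2, condition_1)].append(value)
    return dict(groups)

groups = group_values(records)
# Per-group count and mean, keyed the same way as the groups
summary = {key: (len(vals), mean(vals)) for key, vals in groups.items()}
for (condition_2, condition_1), (n, avg) in sorted(summary.items()):
    print(f"condition_2={condition_2!s:5} condition_1={condition_1:2d} n={n} mean={avg:.3f}")
```

Keying on both annotations is what the groupby argument adds: values that share a condition_1 level but differ in condition_2 end up in separate groups.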
Creating plots from a DataFrame
From the plot_bar charts above, it should be clear that the well information isn't explicitly needed to create this type of chart; it was only needed to assign conditions to each well and then visualize based on those conditions.
plot_bar and many other plotting functions in ns.viz take this into account: they can be called as stand-alone functions on a generic DataFrame object when these plots are needed for non-plate data.
Let's visualize the same dataset as above, but without any well information, so that it is not compatible with ns.Plate.
nonplate = pt.df[['condition_1', 'condition_2', 'value']].copy()
nonplate
|     | condition_1 | condition_2 | value |
|---|---|---|---|
| 0 | 1 | True | 1.838597 |
| 1 | 1 | True | 1.865927 |
| 2 | 1 | True | 1.999423 |
| 3 | 1 | True | 2.101183 |
| 4 | 2 | True | 3.932901 |
| ... | ... | ... | ... |
| 91 | 11 | False | 3.923439 |
| 92 | 12 | False | 12.853956 |
| 93 | 12 | False | 13.768902 |
| 94 | 12 | False | 7.033448 |
| 95 | 12 | False | 4.983802 |
96 rows × 3 columns
We can use this as our object for ns.viz.plot_bar:
ns.viz.plot_bar(
    nonplate,
    'condition_1',
    groupby='condition_2',
    color='value',
    cmap='viridis',
    width=450,
    layout=True,
)
We did not need to specify value_name='value' because value was the rightmost column of the DataFrame and, as elsewhere in ninetysix, the rightmost column is assumed to be the value. If that's not the case (as it often may not be with a generic DataFrame), simply pass the additional argument:
left_value_df = pt.df[['value', 'condition_1', 'condition_2']].copy()
# 'value' is now on the left
left_value_df
|     | value | condition_1 | condition_2 |
|---|---|---|---|
| 0 | 1.838597 | 1 | True |
| 1 | 1.865927 | 1 | True |
| 2 | 1.999423 | 1 | True |
| 3 | 2.101183 | 1 | True |
| 4 | 3.932901 | 2 | True |
| ... | ... | ... | ... |
| 91 | 3.923439 | 11 | False |
| 92 | 12.853956 | 12 | False |
| 93 | 13.768902 | 12 | False |
| 94 | 7.033448 | 12 | False |
| 95 | 4.983802 | 12 | False |
96 rows × 3 columns
ns.viz.plot_bar(
    left_value_df,
    'condition_1',
    value_name='value',  # <- addition
    groupby='condition_2',
    color='value',
    cmap='viridis',
    width=450,
    layout=True,
)
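The column-selection rule described above can be sketched in a few lines. This is not the ninetysix source: resolve_value_name is a hypothetical helper that assumes the rule is simply "use value_name when it is given, otherwise fall back to the last column".

```python
def resolve_value_name(columns, value_name=None):
    """Hypothetical helper: pick the value column from a list of column names."""
    columns = list(columns)
    if value_name is None:
        if not columns:
            raise ValueError("no columns to choose a value from")
        return columns[-1]  # default: the rightmost column
    if value_name not in columns:
        raise ValueError(f"{value_name!r} is not one of {columns}")
    return value_name

# Rightmost column happens to be the value, so the default works
print(resolve_value_name(["condition_1", "condition_2", "value"]))  # value
# Without value_name, the default would pick the wrong column here
print(resolve_value_name(["value", "condition_1", "condition_2"]))  # condition_2
# Passing value_name makes the choice explicit
print(resolve_value_name(["value", "condition_1", "condition_2"], value_name="value"))  # value
```

The second call shows why an explicit value_name matters for a generic DataFrame: the default silently treats an annotation column as the value.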
If you try to create a plot from a DataFrame that lacks well information the plot needs (e.g., plot_hm requires 'row' and 'column' info), you will receive an error telling you that you must pass this information.
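A check of this kind can also be run before plotting. The sketch below is an assumption about what such a check might look like, not the library's actual validation: missing_columns and require_columns are hypothetical helpers, and the use of ValueError is a guess at the error type.

```python
def missing_columns(columns, required=("row", "column")):
    """Return the required column names that are absent from columns."""
    present = set(columns)
    return [name for name in required if name not in present]

def require_columns(columns, required=("row", "column")):
    """Raise a descriptive error if any required column is missing."""
    missing = missing_columns(columns, required)
    if missing:
        # ValueError is an assumption; the library may raise a different type
        raise ValueError(f"missing required column(s): {', '.join(missing)}")

# The non-plate columns used above lack the well information a heatmap needs
print(missing_columns(["condition_1", "condition_2", "value"]))  # ['row', 'column']
```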
More information on optimizing these visualizations with the tools available from the holoviews and bokeh packages, including axes/font sizes and how to export to SVG format, is on the Advanced data visualization page.
More information on getting the most out of the Plate class, including using pandas methods directly from a Plate object, can be found on The Plate class page.
Information on constructing and using multi-Plate objects can be found on the Plates page.