Plot

Functions for plotting data in 2D

Setup

Dimensionality reduction



reduce_feature

 reduce_feature (data, method='pca', complexity=20, n=2, seed:int=123,
                 **kwargs)

Reduce the dimensionality of a dataframe (or NumPy array) of values

| | Type | Default | Details |
|---|---|---|---|
| data | | | DataFrame or NumPy array |
| method | str | pca | dimensionality reduction method; accepts both upper and lower case |
| complexity | int | 20 | None for PCA; perplexity for t-SNE (recommended: 30); n_neighbors for UMAP (recommended: 15) |
| n | int | 2 | n_components |
| seed | int | 123 | seed for random_state |
| kwargs | VAR_KEYWORD | | |
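The parameters above suggest that `method` selects the reducer and `complexity` is forwarded as the method-specific setting. A minimal sketch of that dispatch follows, assuming scikit-learn's `PCA` and `TSNE`. The UMAP branch is left out to avoid the extra `umap-learn` dependency. The name `reduce_feature_sketch`, the random demo matrix, and the column-naming rule are illustrative assumptions, not the library's actual implementation.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def reduce_feature_sketch(data, method="pca", complexity=20, n=2, seed=123, **kwargs):
    """Hypothetical stand-in for reduce_feature; handles PCA and t-SNE only."""
    method = method.lower()  # accept 'PCA', 'pca', 'TSNE', ...
    if method == "pca":
        # complexity is not used for PCA
        reducer = PCA(n_components=n, random_state=seed, **kwargs)
    elif method == "tsne":
        # complexity is passed as perplexity; it must be smaller than the sample count
        reducer = TSNE(n_components=n, perplexity=complexity, random_state=seed, **kwargs)
    else:
        raise ValueError(f"unsupported method in this sketch: {method!r}")

    embedding = reducer.fit_transform(np.asarray(data))
    # Column naming assumed from the output shown on this page: PCA1, PCA2, ...
    columns = [f"{method.upper()}{i + 1}" for i in range(n)]
    index = data.index if isinstance(data, pd.DataFrame) else None
    return pd.DataFrame(embedding, columns=columns, index=index)


# Random binary matrix as a stand-in for fingerprint bits (not real data)
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.integers(0, 2, size=(50, 64)))
demo_pca = reduce_feature_sketch(demo, "PCA", n=3)
demo_tsne = reduce_feature_sketch(demo, "tsne", complexity=10)
```

Because t-SNE requires the perplexity to be below the number of samples, the demo uses `complexity=10` for its 50 rows.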
import pandas as pd

# Morgan fingerprints
df = pd.read_csv('files/morgan.csv')
df
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 ... 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047
0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
298 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
299 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

300 rows × 2048 columns

pca = reduce_feature(df,'pca',n=10)
pca
PCA1 PCA2 PCA3 PCA4 PCA5 PCA6 PCA7 PCA8 PCA9 PCA10
0 5.055364 -0.201475 0.017985 -1.484066 1.548818 0.998801 -1.959704 -1.446077 2.579568 0.791852
1 0.720893 0.104023 3.616964 0.774077 0.262882 -0.813578 0.194586 0.606086 0.337897 1.006187
... ... ... ... ... ... ... ... ... ... ...
298 -0.911653 -0.834387 0.054771 -0.141513 -0.385500 0.036934 0.139089 -0.157255 -0.316494 -0.042620
299 -0.506653 -0.217700 -0.309063 0.005900 -0.275369 -0.652045 -0.151574 0.838589 -0.281150 -0.323430

300 rows × 10 columns

2d plot



set_sns

 set_sns ()
set_sns()


plot_2d

 plot_2d (X:pandas.core.frame.DataFrame, data=None, x=None, y=None,
          hue=None, size=None, style=None, palette=None, hue_order=None,
          hue_norm=None, sizes=None, size_order=None, size_norm=None,
          markers=True, style_order=None, legend='auto', ax=None)

Make a 2D plot from a dataframe whose first column is x and whose second column is y

| | Type | Default | Details |
|---|---|---|---|
| X | DataFrame | | a dataframe whose first column is x and whose second column is y |
| data | NoneType | None | Input data structure. Either a long-form collection of vectors that can be assigned to named variables or a wide-form dataset that will be internally reshaped. |
| x | NoneType | None | |
| y | NoneType | None | |
| hue | NoneType | None | Grouping variable that will produce points with different colors. Can be either categorical or numeric, although color mapping will behave differently in latter case. |
| size | NoneType | None | Grouping variable that will produce points with different sizes. Can be either categorical or numeric, although size mapping will behave differently in latter case. |
| style | NoneType | None | Grouping variable that will produce points with different markers. Can have a numeric dtype but will always be treated as categorical. |
| palette | NoneType | None | Method for choosing the colors to use when mapping the hue semantic. String values are passed to color_palette. List or dict values imply categorical mapping, while a colormap object implies numeric mapping. |
| hue_order | NoneType | None | Specify the order of processing and plotting for categorical levels of the hue semantic. |
| hue_norm | NoneType | None | Either a pair of values that set the normalization range in data units or an object that will map from data units into a [0, 1] interval. Usage implies numeric mapping. |
| sizes | NoneType | None | An object that determines how sizes are chosen when size is used. List or dict arguments should provide a size for each unique data value, which forces a categorical interpretation. The argument may also be a min, max tuple. |
| size_order | NoneType | None | Specified order for appearance of the size variable levels, otherwise they are determined from the data. Not relevant when the size variable is numeric. |
| size_norm | NoneType | None | Normalization in data units for scaling plot objects when the size variable is numeric. |
| markers | bool | True | Object determining how to draw the markers for different levels of the style variable. Setting to True will use default markers, or you can pass a list of markers or a dictionary mapping levels of the style variable to markers. Setting to False will draw marker-less lines. Markers are specified as in matplotlib. |
| style_order | NoneType | None | Specified order for appearance of the style variable levels, otherwise they are determined from the data. Not relevant when the style variable is numeric. |
| legend | str | auto | How to draw the legend. If “brief”, numeric hue and size variables will be represented with a sample of evenly spaced values. If “full”, every group will get an entry in the legend. If “auto”, choose between brief or full representation based on number of levels. If False, no legend data is added and no legend is drawn. |
| ax | NoneType | None | Pre-existing axes for the plot. Otherwise, call matplotlib.pyplot.gca internally. |
| Returns | matplotlib.axes.Axes | | The matplotlib axes containing the plot. |
plot_2d(pca.iloc[:,:2])
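The signature of `plot_2d` mirrors the keyword arguments of seaborn's `scatterplot`, so one plausible reading is a thin wrapper that takes x and y from the first two columns and forwards everything else. The sketch below works under that assumption. The name `plot_2d_sketch`, the random points, and the `group` labels are hypothetical, and the real function may differ.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def plot_2d_sketch(X, **kwargs):
    """Hypothetical stand-in: scatter the first two columns of X."""
    x_col, y_col = X.columns[:2]
    # Remaining keyword arguments (hue, size, style, ax, ...) go to seaborn unchanged
    return sns.scatterplot(data=X, x=x_col, y=y_col, **kwargs)


# Synthetic 2D points standing in for the first two principal components
rng = np.random.default_rng(0)
points = pd.DataFrame(rng.normal(size=(30, 2)), columns=["PCA1", "PCA2"])
labels = pd.Series(["a"] * 15 + ["b"] * 15, name="group")

fig, ax = plt.subplots()
returned_ax = plot_2d_sketch(points, hue=labels, ax=ax)
```

Passing `hue` as a separate Series, as done here, colors the points by group without adding a label column to the coordinates dataframe.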



plot_corr

 plot_corr (x, y, xlabel=None, ylabel=None, order=3)
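| | Type | Default | Details |
|---|---|---|---|
| x | | | a column of a dataframe |
| y | | | a column of a dataframe |
| xlabel | NoneType | None | x-axis label |
| ylabel | NoneType | None | y-axis label |
| order | int | 3 | polynomial order of the fit; use order=1 for a straight line |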
plot_corr(pca.PCA1,pca.PCA2)
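To illustrate what the `order` parameter means, the sketch below fits polynomials of order 1 and order 3 to the same points with NumPy's `polyfit`. It uses synthetic data following an exact cubic, and it does not claim to reproduce how `plot_corr` fits or draws its curve.

```python
import numpy as np

# Synthetic data following an exact cubic, y = 2x^3 - x + 0.5 (not from the dataset above)
x = np.linspace(-2, 2, 41)
y = 2 * x**3 - x + 0.5

# order=1 gives a straight-line fit; order=3 gives a cubic fit
line_coefs = np.polyfit(x, y, deg=1)
cubic_coefs = np.polyfit(x, y, deg=3)


def rmse(coefs):
    """Root-mean-square error of the fitted polynomial on the sample points."""
    return float(np.sqrt(np.mean((np.polyval(coefs, x) - y) ** 2)))


line_rmse, cubic_rmse = rmse(line_coefs), rmse(cubic_coefs)
```

On curved data the straight line leaves a large residual while the cubic fit is essentially exact. A higher order is only worth using when the relationship is visibly non-linear.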

End