Other Utilities

This section documents (occasionally sparsely) a few other utilities used in the vespa package.

Plotting

vespa.plotutils.plot2dhist(xdata, ydata, cmap='binary', interpolation='nearest', fig=None, logscale=True, xbins=None, ybins=None, nbins=50, pts_only=False, **kwargs)

Plots a 2-d density histogram of the provided data.

Parameters:

xdata,ydata – (array-like) Data to plot.
cmap – (optional) Colormap to use for density plot.
interpolation – (optional) Interpolation scheme for display (passed to plt.imshow).
fig – (optional) Argument passed to setfig().
logscale – (optional) If True then the colormap will be based on a logarithmic scale, rather than linear.
xbins,ybins – (optional) Bin edges to use (if None, then use np.histogram2d to find bins automatically).
nbins – (optional) Number of bins to use (if None, then use np.histogram2d to find bins automatically).
pts_only – (optional) If True, then just a scatter plot of the points is made, rather than the density plot.
**kwargs – Keyword arguments passed to plt.plot if pts_only is True, and to plt.imshow otherwise.
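
The binning step behind a density plot like this can be sketched with np.histogram2d alone. This is a minimal illustration, not the vespa implementation: the helper name density_grid is hypothetical, and the use of log1p for the logarithmic scaling is an assumption made here so that empty bins stay finite.

```python
import numpy as np

def density_grid(xdata, ydata, nbins=50, logscale=True):
    """Bin points into a 2-d grid of counts (hypothetical helper, not part of vespa)."""
    counts, xedges, yedges = np.histogram2d(xdata, ydata, bins=nbins)
    if logscale:
        # Assumption: log1p keeps empty bins at 0 instead of -inf.
        counts = np.log1p(counts)
    # histogram2d puts x along the first axis; imshow treats rows as the
    # vertical axis, so transpose before displaying.
    return counts.T, xedges, yedges

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)
grid, xedges, yedges = density_grid(x, y, nbins=20)
```

The returned grid could then be handed to plt.imshow (with origin='lower' and an extent built from the edges), which is where the cmap and interpolation arguments would come into play.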

vespa.plotutils.setfig(fig=None, **kwargs)

Sets the current figure to ‘fig’ and clears it. If fig is 0, does nothing (e.g. for overplotting).

If fig is None (or anything else), creates a new figure.

I use this in basically every function I write to make a plot. I give the function a “fig=None” keyword argument, so that it will create a new figure by default.

Note

There’s most certainly a better, more object-oriented way of going about writing functions that make figures, but this was put together before I knew how to think that way, so this stays for now as a convenience.
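
The “fig=None” pattern described above can be sketched as follows. This is one reading of the description, not the vespa source: setfig_sketch and plot_line are hypothetical names, and the treatment of a non-zero, non-None fig (select that figure and clear it) is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

def setfig_sketch(fig=None, **kwargs):
    """Hypothetical re-implementation of the behaviour described above."""
    if fig is None:
        plt.figure(**kwargs)        # default: start a new figure
    elif fig == 0:
        pass                        # leave the current figure alone (overplotting)
    else:
        plt.figure(fig, **kwargs)   # assumption: select the requested figure...
        plt.clf()                   # ...and clear it

def plot_line(xs, ys, fig=None, **kwargs):
    """Example plotting function that exposes the fig keyword to its caller."""
    setfig_sketch(fig)
    plt.plot(xs, ys, **kwargs)

plot_line([0, 1, 2], [0, 1, 4])           # creates a new figure
plot_line([0, 1, 2], [0, 2, 8], fig=0)    # overplots on that same figure
n_lines = len(plt.gca().lines)
```

Passing fig=0 on the second call is what lets both curves share one set of axes.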

Stats

vespa.statutils.conf_interval(x, L, conf=0.683, shortest=True, conftol=0.001, return_max=False)

Returns the desired 1-d confidence interval for the provided x and L (the PDF).
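
As an illustration of what a "shortest" interval means for a PDF sampled on a grid, here is a minimal sketch. It is not the algorithm vespa uses: the function name shortest_interval is hypothetical, and the approach (keep the highest-density grid points until they enclose the requested probability) assumes a unimodal PDF.

```python
import numpy as np

def shortest_interval(x, L, conf=0.683):
    """Highest-density interval for a unimodal PDF L sampled at points x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    L = np.asarray(L, dtype=float)
    mass = L * np.gradient(x)          # probability carried by each grid point
    mass /= mass.sum()                 # normalize, in case L is unnormalized
    order = np.argsort(L)[::-1]        # grid points from highest to lowest density
    enclosed = np.cumsum(mass[order])
    n_keep = np.searchsorted(enclosed, conf) + 1
    keep = order[:n_keep]
    return x[keep].min(), x[keep].max()

x = np.linspace(-5, 5, 1001)
L = np.exp(-0.5 * x**2)                # unnormalized standard normal
lo, hi = shortest_interval(x, L, conf=0.683)
```

For a standard normal the result should land close to (-1, 1), up to the grid spacing.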

vespa.statutils.kdeconf(kde, conf=0.683, xmin=None, xmax=None, npts=500, shortest=True, conftol=0.001, return_max=False)

Returns the desired confidence interval for the provided KDE object.

vespa.statutils.qstd(x, quant=0.05, top=False, bottom=False)

Returns the standard deviation, ignoring the outer ‘quant’ percentiles.
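
The idea of a standard deviation that ignores the tails can be sketched with the standard library alone. This is an illustration under stated assumptions, not the vespa code: trimmed_std is a hypothetical name, it trims both ends symmetrically (the top and bottom options are not modelled), and it uses the population standard deviation, which matches NumPy's default ddof=0 convention.

```python
import statistics

def trimmed_std(values, quant=0.05):
    """Population std after dropping the outer `quant` fraction at each end (hypothetical helper)."""
    s = sorted(values)
    k = int(len(s) * quant)            # number of points to drop from each end
    kept = s[k:len(s) - k]
    return statistics.pstdev(kept)

data = list(range(1, 21)) + [1000]     # 21 values, one gross outlier
robust = trimmed_std(data)             # the outlier (and the smallest value) are dropped
naive = statistics.pstdev(data)        # dominated by the outlier
```

Here the trimmed estimate reflects the spread of the bulk of the data, while the plain standard deviation is inflated by the single outlier.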

Hashing

In order to be able to compare population objects, it’s useful to define utility functions to hash ndarrays and DataFrames and to combine hashes in a legit way. This is generally useful and could be its own mini-package, but for now it’s stashed here.

class vespa.hashutils.hashable(wrapped, tight=False)

Hashable wrapper for ndarray objects.

Instances of ndarray are not hashable, meaning they cannot be added to sets, nor used as keys in dictionaries. This is by design - ndarray objects are mutable, and therefore cannot reliably implement the __hash__() method.

The hashable class allows a way around this limitation. It implements the required methods for hashable objects in terms of an encapsulated ndarray object. This can be either a copied instance (which is safer) or the original object (which requires the user to be careful enough not to modify it).

This class was taken from here and edited only slightly.

unwrap()

Returns the encapsulated ndarray.

If the wrapper is “tight”, a copy of the encapsulated ndarray is returned. Otherwise, the encapsulated ndarray itself is returned.
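
A wrapper of this kind can be sketched as below. This is a minimal illustration rather than the vespa class: the name HashableArray is hypothetical, and hashing the array's raw bytes with SHA-1 is an assumption about how the hash could be computed.

```python
import hashlib
import numpy as np

class HashableArray:
    """Hypothetical hashable wrapper around an ndarray (not the vespa class itself)."""

    def __init__(self, wrapped, tight=False):
        self._tight = tight
        # A "tight" wrapper keeps its own copy, so later edits to the original are harmless.
        self._wrapped = np.array(wrapped) if tight else wrapped
        raw = np.ascontiguousarray(self._wrapped).tobytes()
        self._hash = int(hashlib.sha1(raw).hexdigest(), 16)

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return isinstance(other, HashableArray) and np.array_equal(self._wrapped, other._wrapped)

    def unwrap(self):
        # Tight wrappers hand out a copy; loose wrappers hand out the original object.
        return np.array(self._wrapped) if self._tight else self._wrapped

a = np.arange(5)
b = np.arange(5)
lookup = {HashableArray(a, tight=True): "first"}
found = lookup[HashableArray(b)]       # equal contents give an equal hash and compare equal
```

Because the hash is computed once at construction, a loose (non-tight) wrapper becomes inconsistent if the underlying array is modified afterwards, which is the caveat noted above.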

vespa.hashutils.hasharray(arr)

Hashes an array-like object (except a DataFrame).

vespa.hashutils.hashcombine(*xs)

Combines multiple hashes using xor.
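
Combining hashes with xor can be sketched in a couple of lines. The function name combine_hashes is hypothetical and this is not necessarily how vespa implements it.

```python
from functools import reduce
from operator import xor

def combine_hashes(*xs):
    """Fold any number of integer hashes together with xor (hypothetical helper)."""
    return reduce(xor, xs, 0)

h1, h2 = hash("stars"), hash(42)
combined = combine_hashes(h1, h2)
```

Two properties of xor are worth keeping in mind: the result does not depend on the order of the inputs, and two identical hashes cancel to 0.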

vespa.hashutils.hashdf(df)

Hashes a pandas DataFrame, forcing values to float.

vespa.hashutils.hashdict(d)

Hashes a dictionary.
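
One way to hash a dictionary so that insertion order does not matter is to xor together a hash of each key/value pair. This is a sketch under assumptions, not the vespa implementation: hash_dict is a hypothetical name, and it requires every value to be hashable already.

```python
def hash_dict(d):
    """Order-independent hash of a dict with hashable values (hypothetical helper)."""
    result = 0
    for key, value in d.items():
        # Hash each pair as a tuple so a key stays tied to its own value.
        result ^= hash((key, value))
    return result

first = hash_dict({"a": 1, "b": 2})
second = hash_dict({"b": 2, "a": 1})   # same items, different insertion order
```

Values that are not hashable on their own, such as ndarrays or DataFrames, would need to be hashed with a dedicated function first.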