pwtools.signal.smooth
- pwtools.signal.smooth(data, kern, axis=0, edge='m', norm=True)
Smooth N-dim data by convolution with a kernel kern. Uses scipy.signal.fftconvolve().

Note: This function is actually a special case of scipy.ndimage.convolve(), so you may also use that. See the Notes section below for details.

Note that due to edge effect handling (padding) and kernel normalization, the convolution identity convolve(data,kern) == convolve(kern,data) doesn't apply here. We always return an array of data.shape.
- Parameters:
  - data (nd array) – The data to smooth. Example: 1d (N,) or (N,K,3) for a trajectory.
  - kern (nd array) – Convolution kernel. Example: 1d (M,) or (M,1,1) for a trajectory along axis=0 (data length N).
  - axis (int) – Axis along which to do the smoothing. That is actually not needed for the convolution fftconvolve(data, kern), but is used for padding the data along axis to handle edge effects before convolution.
  - edge (str) – Method for edge effect handling. 'm': pad with the mirrored signal, 'c': pad with constant values (i.e. data[0] and data[-1] in the 1d case).
  - norm (bool) – Normalize the kernel. Default is True. This assures that the smoothed signal lies within the data. Note that this is not true for kernels with very big spread (i.e. hann(N*10) or gaussian(N/2, std=N*10)). Then the kernel is effectively a constant.
- Returns:
  ret – Convolved signal.
- Return type:
  array of data.shape
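To illustrate what the function does conceptually, here is a rough 1d-only sketch (not the actual implementation, which handles arbitrary dimensions and may differ in details such as the exact pad slicing): pad the signal, convolve with the normalized kernel, then cut back to data.shape.

>>> import numpy as np
>>> from scipy.signal import fftconvolve
>>> def smooth_1d_sketch(data, kern, edge='m', norm=True):
...     # assumes 1d data with len(kern) < len(data)
...     M = len(kern)
...     if norm:
...         kern = kern / kern.sum()   # normalized kernel keeps the data's scale
...     if edge == 'm':
...         # pad with the signal mirrored at the first and last point
...         left, right = data[1:M+1][::-1], data[-M-1:-1][::-1]
...     else:
...         # pad with the constant values data[0] and data[-1]
...         left, right = np.full(M, data[0]), np.full(M, data[-1])
...     padded = np.concatenate([left, data, right])
...     # convolve the padded signal, then return an array of data.shape
...     return fftconvolve(padded, kern, mode='same')[M:M+len(data)]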
Examples
>>> from pwtools.signal import welch, smooth
>>> from numpy.random import rand
>>> from scipy.signal.windows import hann, gaussian
>>> from scipy.signal import convolve
>>> x = linspace(0,2*pi,500); a=cos(x)+rand(500)
>>> plot(a, color='0.7')
>>> k=hann(21)
>>> plot(smooth(a,k), 'r', label='hann')
>>> # To match hann and gaussian, use std = 1/5 of the hann window length
>>> k=gaussian(51, 21/5)
>>> plot(smooth(a,k), 'g', label='gauss')
>>> k=welch(21)
>>> plot(smooth(a,k), 'y', label='welch')
>>> legend()
>>> # odd kernel [0,1,0] reproduces data exactly, i.e. convolution with
>>> # delta peak
>>> figure(); title('smooth with delta [0,1,0]')
>>> x=linspace(0,2*pi,15); k=hann(3)
>>> plot(cos(x))
>>> plot(smooth(cos(x),k), 'r')
>>> legend()
>>> # edge effects with normal convolution
>>> figure(); title('edge effects')
>>> x=rand(20)+10; k=hann(11);
>>> plot(x); plot(smooth(x,k),label="smooth");
>>> plot(convolve(x,k/k.sum(),'same'), label='convolve')
>>> legend()
>>> # edge effect methods
>>> figure(); title('edge effect methods')
>>> x=rand(20)+10; k=hann(20);
>>> plot(x); plot(smooth(x,k,edge='m'),label="edge='m'");
>>> plot(smooth(x,k,edge='c'),label="edge='c'");
>>> legend()
>>> # smooth a trajectory of atomic coordinates
>>> figure(); title('trajectory')
>>> x = linspace(0,2*pi,500)
>>> a = rand(500,2,3) # (nstep, natoms, 3)
>>> a[:,0,:] += cos(x)[:,None]
>>> a[:,1,:] += sin(x)[:,None]
>>> k=hann(21)[:,None,None]
>>> y = smooth(a,k)
>>> plot(a[:,0,0], color='0.7'); plot(y[:,0,0],'b',
...      label='atom1 x')
>>> plot(a[:,1,0], color='0.7'); plot(y[:,1,0],'r',
...      label='atom2 x')
>>> legend()
References
[1] http://wiki.scipy.org/Cookbook/SignalSmooth
[2] https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.convolve.html#scipy.ndimage.convolve
Notes
scipy.ndimage tools: We only apply padding and convolution along axis (since the main ndarray use case is pwtools.crys.smooth()), while ndimage.convolve supports this for all dimensions. Also, ndimage.convolve has more edge effect handling methods available. It might be slower than scipy.signal.fftconvolve(), though.

>>> a=rand(100,200,300)
>>> k=rand(3,1,1)
>>> # Defaults
>>> np.allclose(pwtools.signal.smooth(a, k, edge="m", norm=True, axis=0),
...             scipy.ndimage.convolve(a, k/k.sum(), mode="mirror"))
>>> # Without kernel normalization
>>> np.allclose(pwtools.signal.smooth(a, k, edge="m", norm=False, axis=0),
...             scipy.ndimage.convolve(a, k, mode="mirror"))
>>> # Use another axis and another edge method
>>> k=rand(1,3,1)
>>> np.allclose(pwtools.signal.smooth(a, k, edge="c", norm=True, axis=1),
...             scipy.ndimage.convolve(a, k/k.sum(), mode="nearest"))
In addition, a Gaussian kernel can also be applied using scipy.ndimage.gaussian_filter(). This implements the same edge effect handling as ndimage.convolve.
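For smoothing along one axis only, scipy.ndimage.gaussian_filter1d() is the matching tool. A consistency check along the lines below is one way to verify this; the agreement depends on how far both gaussians are truncated, so expect a small but nonzero difference:

>>> import numpy as np
>>> from numpy.random import rand
>>> from scipy.ndimage import gaussian_filter1d
>>> from scipy.signal.windows import gaussian
>>> from pwtools.signal import smooth
>>> a = rand(500, 2, 3)
>>> std = 5
>>> # kernel wide enough (M > 6*std) so that both gaussians are truncated similarly
>>> y1 = smooth(a, gaussian(10*std + 1, std)[:,None,None], axis=0, edge='m', norm=True)
>>> y2 = gaussian_filter1d(a, std, axis=0, mode='mirror', truncate=5.0)
>>> np.abs(y1 - y2).max()   # small, but not exactly zero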
Kernels:
Even kernels result in shifted signals, odd kernels are better. However, for N >> M, it doesn’t make a difference really.
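As a quick ad-hoc illustration of the even/odd point (not part of the examples above): smooth a delta peak and look where the maximum ends up.

>>> import numpy as np
>>> from scipy.signal.windows import hann
>>> from pwtools.signal import smooth
>>> x = np.zeros(101); x[50] = 1.0   # delta peak in the middle
>>> # odd kernel: the smoothed peak is expected to stay centered at index 50
>>> np.argmax(smooth(x, hann(21)))
>>> # even kernel: the smoothed peak sits between two samples, i.e. roughly a half-sample shift
>>> np.argmax(smooth(x, hann(20)))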
Usual kernels (window functions) are created by e.g. scipy.signal.windows.hann(M). For kern=scipy.signal.windows.gaussian(M, std), two values are needed, namely M and std, where M determines the number of points calculated for the convolution kernel, as in the other cases. But what is actually important is std, which determines the "used width" of the gaussian. For gaussian(), M should be bigger than std. The convolved signal will converge with increasing M. Good values are M=6*std and bigger. To match the results of a Hann window of length M_hann, use approximately std=1/5*M_hann, as in hann(21) and gaussian(51, 21/5) with a large enough M, here 51. For lorentz(), much wider kernels are needed, such as M=100*std, because of the long tails of the Lorentz function. Testing is mandatory!
Edge effects:
We use padding of the signal with M=len(kern) values at both ends such that the convolution with kern doesn't zero the data at the signal edges. We have two methods. edge='m': pad with the signal mirrored at 0 and -1, or edge='c': use the constant values data[0] and data[-1]. Many more of these variants may be thought of. The choice of how to extend the data essentially involves an assumption about how the signal would continue, which is signal-dependent. In practice, we usually have M << N (e.g. scipy.signal.hann(M)) or std << N (scipy.signal.gaussian(M, std)). Then both methods are identical in the middle and show only very small differences at the edges. Essentially, edge effect handling shall only ensure that the smoothed signal doesn't go to zero, and that must be independent of the method, which is the case.
Memory:
For big data, fftconvolve() can easily eat up all your memory, for example:
>>> # assume axis=0 is the axis along which to convolve
>>> arr = ones((100000,200,3))
>>> kern = scipy.signal.windows.hann(101)
>>> ret = scipy.signal.fftconvolve(arr, kern[:,None,None])
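For orientation, a rough estimate assuming float64 data: the input alone holds 100000*200*3 values, about 0.48 GB, the full-mode output is roughly the same size again, and fftconvolve additionally allocates complex-valued FFT work arrays, so the peak memory use is several times the size of arr.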
Then it is better to loop over some or all of the remaining dimensions:
>>> ret = np.empty_like(arr)
>>> for jj in range(arr.shape[1]):
...     ret[:,jj,:] = smooth(arr[:,jj,:], kern[:,None])
or:
>>> for jj in range(arr.shape[1]):
...     for kk in range(arr.shape[2]):
...         ret[:,jj,kk] = smooth(arr[:,jj,kk], kern)
The size of the chunk over which you explicitly loop depends on the data, of course. We do exactly this in pwtools.crys.smooth().