# Cross-correlation exercise

This exercise focuses on the cross-correlation of similar waveforms. The first example uses cross-correlation to improve the pick of a seismic onset.

In signal processing, cross-correlation is a measure of the similarity of two series as a function of the displacement of one relative to the other. It is also known as a sliding dot product or sliding inner product. It is commonly used to search a long signal for a shorter, known feature. In seismology, cross-correlation is used to compare seismograms of the same event recorded at different stations. It can reveal differences caused by longer propagation distances or different wave paths.

For seismograms recorded at two stations (1 and 2), the cross-correlation is defined as

\begin{equation}
  f_{21}(t)=\int_{-\infty}^{\infty} f_2(\tau)f_1(\tau + t)\,d\tau
\end{equation}

where $t$ is the displacement, also known as lag.
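
The definition can be checked numerically. The following minimal sketch assumes sampled signals that are zero outside the recorded window. It evaluates the discrete analogue $f_{21}[l]=\sum_k f_2[k]\,f_1[k+l]$ with an explicit loop and compares the result with `numpy.correlate`. The helper `sliding_dot_product` and the two short test arrays are made up for illustration.

```python
import numpy as np

def sliding_dot_product(f2, f1, lag):
    """Discrete analogue of f21(lag) = sum_k f2[k] * f1[k + lag] (zero outside the arrays)."""
    total = 0.0
    for k in range(len(f2)):
        j = k + lag
        if 0 <= j < len(f1):
            total += f2[k] * f1[j]
    return total

# hypothetical test signals: f1 is f2 delayed by two samples
f2 = np.array([1.0, 2.0, 1.0, 0.0, 0.0, 0.0])
f1 = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0])

lags = np.arange(-(len(f2) - 1), len(f1))
f21 = np.array([sliding_dot_product(f2, f1, lag) for lag in lags])

# numpy.correlate(a, v, "full") computes sum_n a[n + k] * v[n], i.e. the same sum
f21_numpy = np.correlate(f1, f2, mode="full")
best_lag = lags[np.argmax(f21)]
```

Because `f1` is a copy of `f2` delayed by two samples, the maximum is found at a lag of 2 samples.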

In the frequency domain, the equation for real-valued signals becomes

\begin{equation}
  F_{21}(\omega)=F_2^\ast(\omega)F_1(\omega) \quad.
\end{equation}

$F_2^\ast$ denotes the complex conjugate of $F_2$. In an autocorrelation, which is the cross-correlation of a signal with itself, there is always a peak at a lag of zero, and its height equals the signal energy. Furthermore, the normalised correlation includes a standardising factor chosen so that its values lie between $-1$ and $+1$.
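
The statements about the autocorrelation can be illustrated with a short sketch. It assumes a random test signal and computes the autocorrelation in the frequency domain as the inverse transform of the spectrum multiplied by its complex conjugate. The zero-padding length `nfft` and all variable names are choices made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)          # arbitrary real-valued test signal
n = len(x)

# Zero-pad to 2n-1 samples so the circular correlation implied by the FFT does not wrap around.
nfft = 2 * n - 1
X = np.fft.rfft(x, nfft)
acf_circ = np.fft.irfft(X * np.conj(X), nfft)

# Reorder so that the lags run from -(n-1) to +(n-1).
acf = np.concatenate((acf_circ[-(n - 1):], acf_circ[:n]))
lags = np.arange(-(n - 1), n)

energy = np.sum(x ** 2)              # signal energy
acf_norm = acf / energy              # normalised autocorrelation, equal to 1 at zero lag
```

The maximum of `acf` lies at zero lag and equals `energy`. After division by the energy, all values lie between -1 and +1.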

As an example, consider two real-valued functions $f_1$ and $f_2$ that differ only by an unknown shift along the x-axis. The cross-correlation can be used to find how far $f_2$ must be shifted along the x-axis to make it identical to $f_1$. The formula essentially slides one function past the other along the x-axis and calculates the integral of their product at each position. When the functions match, the value of $f_{21}(t)$ is maximized.
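
A minimal sketch of this shift estimate on synthetic data follows. The sampling rate, the damped 1.5 Hz test wavelet and the onset samples are assumptions chosen for illustration and are not taken from the BUG recordings.

```python
import numpy as np

samp_rate = 20.0          # assumed sampling rate [Hz]
n = 600                   # 30 s of data

def wavelet(onset_sample):
    """Synthetic damped 1.5 Hz oscillation starting at the given sample (hypothetical signal)."""
    tau = (np.arange(n) - onset_sample) / samp_rate
    return np.where(tau >= 0, np.exp(-0.8 * tau) * np.sin(2 * np.pi * 1.5 * tau), 0.0)

f1 = wavelet(200)         # onset at 10.0 s
f2 = wavelet(270)         # same waveform, onset 3.5 s later

# cc[k] = sum_n f1[n + k] * f2[n], the discrete form of f21
cc = np.correlate(f1, f2, mode="full")
lags = np.arange(-(n - 1), n)

best_lag = lags[np.argmax(cc)]       # shift to apply to f2 [samples]
shift_s = best_lag / samp_rate       # the same shift [s]

# apply the shift; np.roll wraps around, which is harmless here because the wrapped samples are zero
f2_aligned = np.roll(f2, best_lag)
```

With this sign convention the lag of the maximum is negative, which means that `f2` has to be moved 3.5 s towards earlier times to coincide with `f1`.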

We will use this to detect the nuclear weapons tests of the DPRK (North Korea) in the recordings of station BUG. The largest event, from Sep. 3, 2017, serves as the template event.

### Matplotlib settings and Imports

In [None]:
# show figures inline
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('ggplot')                            # Matplotlib style sheet - nicer plots!
plt.rcParams['figure.figsize'] = 12, 8             # Slightly bigger plots by default

# python imports
from obspy import read, UTCDateTime
from matplotlib.pyplot import xcorr
import numpy as np

#from obspy.signal.cross_correlation import xcorr_pick_correction # 1st exercise
#from obspy.signal.trigger import coincidence_trigger # 2nd exercise
#from pprint import pprint

### Get waveform of template event

In [None]:
# download the waveform of the template event

nsec = 300 # 5 minutes

# use FDSNws to get the data from station BUG
from obspy.clients.fdsn import Client
client = Client("BGR")

t1 = UTCDateTime("2017-09-03T03:40:00Z")
t2 = t1 + nsec
template = client.get_waveforms("GR", "BUG", "", "BHZ", t1, t2, attach_response=True)
template.remove_response(output="VEL");

In [None]:
# try different starttime, endtime and filter settings to display the onset of the event at BUG
st = template.copy()
st.plot(starttime=t1+10, endtime=t1+200);
st.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+10, endtime=t1+200);

### Get more Events

In [None]:
# Test Event 1
t1 = UTCDateTime("2016-09-09T00:40:00Z")
t2 = t1 + nsec
st1 = client.get_waveforms("GR", "BUG", "", "BHZ", t1, t2, attach_response=True)
st1.remove_response(output="VEL")

st_ = st1.copy()
st_.plot(starttime=t1+90, endtime=t1+160);
st_.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+90, endtime=t1+160);

In [None]:
# Test Event 2
t1 = UTCDateTime("2016-01-06T01:40:00Z")
t2 = t1 + nsec
st2 = client.get_waveforms("GR", "BUG", "", "BHZ", t1, t2, attach_response=True)
st2.remove_response(output="VEL")

st_ = st2.copy()
st_.plot(starttime=t1+90, endtime=t1+160);
st_.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+90, endtime=t1+160);

### Calculate Cross-correlation of template event data and test event data

In [None]:
# time window for cross-correlation
nsec1 = 90
nsec2 = 160

# extract data of template event
st_ = template.copy()
t1 = UTCDateTime("2017-09-03T03:40:00Z")
st_.filter('bandpass', freqmin=0.8, freqmax=3.0).detrend().trim(starttime=t1+nsec1, endtime=t1+nsec2)
st_.trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)
tr_template = st_[0].copy()

# extract data of 1st and 2nd test event
st_ = st1.copy()
t1 = UTCDateTime("2016-09-09T00:40:00Z")+0
st_.filter('bandpass', freqmin=0.8, freqmax=3.0).trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)
tr_event1 = st_[0].copy()

st_ = st2.copy()
t1 = UTCDateTime("2016-01-06T01:40:00Z")+0
st_.filter('bandpass', freqmin=0.8, freqmax=3.0).trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)
tr_event2 = st_[0].copy()

In [None]:
# plot template and test events
tr_template.normalize().plot();
tr_event1.normalize().plot();
tr_event2.normalize().plot();

In [None]:
# set parameters
cc_maxlag = nsec # maximum lag time [s]
samp_rate = tr_template.stats['sampling_rate']
shift_len = int(cc_maxlag * samp_rate)

# calculate cross correlation
a_t1 = xcorr(tr_template.data, tr_event1.data, maxlags=shift_len) # matplotlib
cc_lags = a_t1[0]   #lag vector
cc = a_t1[1]   #correlation vector

# extract parameters from result
cc_t = np.linspace(-cc_maxlag, cc_maxlag, shift_len*2+1)
cc_max = max(cc)
cc_shift = cc_lags[cc.argmax()]     # lag of the maximum [samples]; len(cc)/2 is not an integer index
cc_shift_t = cc_shift / samp_rate   # lag of the maximum [s]

# plot result
fig = plt.figure(1)
plt.clf()                     # remove the stem plot drawn by xcorr
fig.set_size_inches(25, 10)   # resize this figure; rcParams only affect figures created later
plt.plot(cc_t, cc, 'k')
plt.title('cross-correlation of template and test event 1, max %.2f at %.2f s shift' %(cc_max, cc_lags[np.argmax(cc)]/samp_rate))
plt.xlabel('time lag [sec]')
plt.show()

In [None]:
# set parameters
cc_maxlag = nsec # maximum lag time [s]
samp_rate = tr_template.stats['sampling_rate']
shift_len = int(cc_maxlag * samp_rate)

# calculate cross correlation
a_t2 = xcorr(tr_template.data, tr_event2.data, maxlags=shift_len) # matplotlib
cc_lags = a_t2[0]   #lag vector
cc = a_t2[1]   #correlation vector

# extract parameters from result
cc_t = np.linspace(-cc_maxlag, cc_maxlag, shift_len*2+1)
cc_max = max(cc)
cc_shift = cc_lags[cc.argmax()]     # lag of the maximum [samples]; len(cc)/2 is not an integer index
cc_shift_t = cc_shift / samp_rate   # lag of the maximum [s]

# plot result
fig = plt.figure(1)
plt.clf()                     # remove the stem plot drawn by xcorr
fig.set_size_inches(25, 10)   # resize this figure; rcParams only affect figures created later
plt.plot(cc_t, cc, 'k')
plt.title('cross-correlation of template and test event 2, max %.2f at %.2f s shift' %(cc_max, cc_lags[np.argmax(cc)]/samp_rate))
plt.xlabel('time lag [sec]')
plt.show()
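
The two cells above differ only in the test event, so the computation could be wrapped in a function. The sketch below is one possible way to do this with NumPy alone. The name `normalized_xcorr` is hypothetical, and the function is only intended to mirror the normalised output of matplotlib's `xcorr` without drawing a figure; treat that equivalence as an assumption. The synthetic template and event stand in for the real traces.

```python
import numpy as np

def normalized_xcorr(x, y, maxlags):
    """Normalised cross-correlation of two equal-length arrays for lags -maxlags..maxlags.

    Convention: c[k] = sum_n x[n + k] * y[n] / sqrt(sum(x**2) * sum(y**2)).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    if not 0 < maxlags < n:
        raise ValueError("maxlags must be between 1 and len(x) - 1")
    c = np.correlate(x, y, mode="full") / np.sqrt(np.dot(x, x) * np.dot(y, y))
    lags = np.arange(-maxlags, maxlags + 1)
    return lags, c[n - 1 - maxlags:n + maxlags]

# synthetic stand-ins for tr_template.data and tr_event1.data
rng = np.random.default_rng(42)
n = 400
wave = np.sin(2 * np.pi * np.arange(40) / 10.0) * np.hanning(40)
template = np.zeros(n)
template[100:140] = wave                   # zero-padded template, as in the cells above
event = 0.02 * rng.standard_normal(n)      # weak background noise
event[160:200] += 0.5 * wave               # same waveform, smaller amplitude, 60 samples later

lags, cc = normalized_xcorr(template, event, maxlags=n - 1)
peak_lag = lags[np.argmax(cc)]             # negative: the waveform arrives later in the event
peak_value = cc.max()
```

With the real data, `lags / samp_rate` would give the lag axis in seconds, and `peak_value` would play the role of `cc_max`.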

## Exercise

* Create a plot showing the template and the test events in one plot with the onset times aligned.