DataAnalysis2022/03-Cross_Correlation/1-simple_cross_correlation.ipynb

367 lines
11 KiB
Plaintext
Raw Normal View History

2022-05-30 13:32:09 +02:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Cross correlation exercise\n",
"\n",
"This exercise focuses on the use of cross correlation of similar waveforms. The first example uses the cross correlation to enhance the pick of a seismic onset.\n",
"\n",
"In signal processing, cross-correlation is a measure of similarity of two series as a function of the displacement of one relative to the other. This is also known as a sliding dot product or sliding inner-product. It is commonly used for searching a long signal for a shorter, known feature. In seismology cross-correlation is used to compare seismograms from the same event recorded at different stations. It can give information about differences due to longer wave distances or different wave paths. \n",
"\n",
"For seismograms recorded at two Stations (1,2), the cross-correlation is defined as\n",
"\n",
"$$\\begin{equation} f_{21}(t)=\\int_{- \\infty}^\\infty f_2(\\tau)f_1(\\tau + t)d\\tau \\end{equation} $$\n",
"\n",
"where $t$ is the displacement, also known as lag."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"In the frequency range the equation is\n",
"\n",
"\\begin{equation}\n",
" F_{21}(\\omega)=F_2(\\omega)F_1^\\ast (\\omega) \\quad.\n",
"\\end{equation}\n",
"\n",
"$F_1^\\ast $ denotes the complex conjugate of $F_1$. In an autocorrelation, which is the cross-correlation of a signal with itself, there will always be a peak at a lag of zero, and its size will be the signal energy. Furthermore, the definition of correlation always includes a standardising factor in such a way that correlations have values between 1 and +1.\n",
"\n",
"As an example, consider two real valued functions $f_1$ and $f_2$ differing only by an unknown shift along the x-axis. One can use the cross-correlation to find how much $f_2$ must be shifted along the x-axis to make it identical to $f_1$. The formula essentially slides the $f_2$ function along the x-axis, calculating the integral of their product at each position. When the functions match, the value of $f_{21}(t)$ is maximized."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We will use this to detect the nuclear weapons tests of the DPRK (North Korea) in the recordings of station BUG. The largest event from Sep. 3, 2017 is used as template event."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Matplotlib settings and Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# show figures inline\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('ggplot') # Matplotlib style sheet - nicer plots!\n",
"plt.rcParams['figure.figsize'] = 12, 8 # Slightly bigger plots by default\n",
"\n",
"# python imports\n",
"from obspy import read, UTCDateTime\n",
"from matplotlib.pyplot import xcorr\n",
"import numpy as np\n",
"\n",
"#from obspy.signal.cross_correlation import xcorr_pick_correction # 1st exercise\n",
"#from obspy.signal.trigger import coincidence_trigger # 2nd exercise\n",
"#from pprint import pprint"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Get waveform of template event"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"## compute and plot cross correlation of recordings of two events\n",
"\n",
"nsec = 300 # 5 minutes\n",
"\n",
"# use FDSNws to get the data from station BUG\n",
"from obspy.clients.fdsn import Client\n",
"client = Client(\"BGR\")\n",
"\n",
"t1 = UTCDateTime(\"2017-09-03T03:40:00Z\")\n",
"t2 = t1 + nsec\n",
"template = client.get_waveforms(\"GR\", \"BUG\", \"\", \"BHZ\", t1, t2, attach_response=True)\n",
"template.remove_response(output=\"VEL\");"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# try different starttime, enttime and filtersettings to display the onset of the event at BUG\n",
"st = template.copy()\n",
"st.plot(starttime=t1+10, endtime=t1+200);\n",
"st.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+10, endtime=t1+200);"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Get more Events"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Test Event 1\n",
"t1 = UTCDateTime(\"2016-09-09T00:40:00Z\")\n",
"t2 = t1 + nsec\n",
"st1 = client.get_waveforms(\"GR\", \"BUG\", \"\", \"BHZ\", t1, t2, attach_response=True)\n",
"st1.remove_response(output=\"VEL\")\n",
"\n",
"st_ = st1.copy()\n",
"st_.plot(starttime=t1+90, endtime=t1+160);\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+90, endtime=t1+160);"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Test Event 2\n",
"t1 = UTCDateTime(\"2016-01-06T01:40:00Z\")\n",
"t2 = t1 + nsec\n",
"st2 = client.get_waveforms(\"GR\", \"BUG\", \"\", \"BHZ\", t1, t2, attach_response=True)\n",
"st2.remove_response(output=\"VEL\")\n",
"\n",
"st_ = st2.copy()\n",
"st_.plot(starttime=t1+90, endtime=t1+160);\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+90, endtime=t1+160);"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Calculate Cross-correlation of template event data and test event data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# time window for cross-corelation\n",
"nsec1 = 90\n",
"nsec2 = 160\n",
"\n",
"# extract data of template event\n",
"st_ = template.copy()\n",
"t1 = UTCDateTime(\"2017-09-03T03:40:00Z\")\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).detrend().trim(starttime=t1+nsec1, endtime=t1+nsec2)\n",
"st_.trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)\n",
"tr_template = st_[0].copy()\n",
"\n",
"# extract data of 1st and 2nd test event\n",
"st_ = st1.copy()\n",
"t1 = UTCDateTime(\"2016-09-09T00:40:00Z\")+0\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)\n",
"tr_event1 = st_[0].copy()\n",
"\n",
"st_ = st2.copy()\n",
"t1 = UTCDateTime(\"2016-01-06T01:40:00Z\")+0\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)\n",
"tr_event2 = st_[0].copy()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# plot template and test events\n",
"tr_template.normalize().plot();\n",
"tr_event1.normalize().plot();\n",
"tr_event2.normalize().plot();"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# set parameters\n",
"cc_maxlag = nsec # maximum lag time [s]\n",
"samp_rate = tr_template.stats['sampling_rate']\n",
"shift_len = int(cc_maxlag * samp_rate)\n",
"\n",
"# calculate cross correlation\n",
"a_t1 = xcorr(tr_template.data, tr_event1.data, maxlags=shift_len) # matplotlib\n",
"cc_lags = a_t1[0] #lag vector\n",
"cc = a_t1[1] #correlation vector\n",
"\n",
"# extract parameters from result\n",
"cc_t = np.linspace(-cc_maxlag, cc_maxlag, shift_len*2+1)\n",
"cc_max = max(cc)\n",
"cc_shift = cc.argmax() - len(cc)/2\n",
"cc_shift_t = cc_shift / samp_rate\n",
"\n",
"# plot result\n",
"fig = plt.figure(1)\n",
"plt.clf()\n",
"plt.rcParams[\"figure.figsize\"] = [25,10]\n",
"plt.plot(cc_t, cc, 'k')\n",
"plt.title('cross-correlation of template and test event 1, max %.2f at %.2f s shift' %(cc_max, cc_lags[np.argmax(cc)]/samp_rate))\n",
"plt.xlabel('time lag [sec]')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# set parameters\n",
"cc_maxlag = nsec # maximum lag time [s]\n",
"samp_rate = tr_template.stats['sampling_rate']\n",
"shift_len = int(cc_maxlag * samp_rate)\n",
"\n",
"# calculate cross correlation\n",
"a_t2 = xcorr(tr_template.data, tr_event2.data, maxlags=shift_len) # matplotlib\n",
"cc_lags = a_t2[0] #lag vector\n",
"cc = a_t2[1] #correlation vector\n",
"\n",
"# extract parameters from result\n",
"cc_t = np.linspace(-cc_maxlag, cc_maxlag, shift_len*2+1)\n",
"cc_max = max(cc)\n",
"cc_shift = cc.argmax() - len(cc)/2\n",
"cc_shift_t = cc_shift / samp_rate\n",
"\n",
"# plot result\n",
"fig = plt.figure(1)\n",
"plt.clf()\n",
"plt.rcParams[\"figure.figsize\"] = [25,10]\n",
"plt.plot(cc_t, cc, 'k')\n",
"plt.title('cross-correlation of template and test event 2, max %.2f at %.2f s shift' %(cc_max, cc_lags[np.argmax(cc)]/samp_rate))\n",
"plt.xlabel('time lag [sec]')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Exercise\n",
"\n",
"* create a plot showing the template and the test events in one plot with the onset times alligned"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"interpreter": {
"hash": "b40d316b6eb93f8d3cd4b30bf241a4edb2b8ca4ea8a7206ab7ccdcc909e9c051"
},
"kernelspec": {
"display_name": "Python 3.6.13 ('obspy')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}