add exercises on cross correlation

This commit is contained in:
Kasper D. Fischer 2022-05-30 13:32:09 +02:00
parent 81479e92af
commit 85977c1a3a
4 changed files with 1436 additions and 0 deletions

View File

@ -0,0 +1,366 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Cross correlation exercise\n",
"\n",
"This exercise focuses on the use of cross correlation of similar waveforms. The first example uses the cross correlation to enhance the pick of a seismic onset.\n",
"\n",
"In signal processing, cross-correlation is a measure of similarity of two series as a function of the displacement of one relative to the other. This is also known as a sliding dot product or sliding inner-product. It is commonly used for searching a long signal for a shorter, known feature. In seismology cross-correlation is used to compare seismograms from the same event recorded at different stations. It can give information about differences due to longer wave distances or different wave paths. \n",
"\n",
"For seismograms recorded at two Stations (1,2), the cross-correlation is defined as\n",
"\n",
"$$\\begin{equation} f_{21}(t)=\\int_{- \\infty}^\\infty f_2(\\tau)f_1(\\tau + t)d\\tau \\end{equation} $$\n",
"\n",
"where $t$ is the displacement, also known as lag."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"In the frequency range the equation is\n",
"\n",
"\\begin{equation}\n",
" F_{21}(\\omega)=F_2(\\omega)F_1^\\ast (\\omega) \\quad.\n",
"\\end{equation}\n",
"\n",
"$F_1^\\ast $ denotes the complex conjugate of $F_1$. In an autocorrelation, which is the cross-correlation of a signal with itself, there will always be a peak at a lag of zero, and its size will be the signal energy. Furthermore, the definition of correlation always includes a standardising factor in such a way that correlations have values between 1 and +1.\n",
"\n",
"As an example, consider two real valued functions $f_1$ and $f_2$ differing only by an unknown shift along the x-axis. One can use the cross-correlation to find how much $f_2$ must be shifted along the x-axis to make it identical to $f_1$. The formula essentially slides the $f_2$ function along the x-axis, calculating the integral of their product at each position. When the functions match, the value of $f_{21}(t)$ is maximized."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We will use this to detect the nuclear weapons tests of the DPRK (North Korea) in the recordings of station BUG. The largest event from Sep. 3, 2017 is used as template event."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Matplotlib settings and Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# show figures inline\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('ggplot') # Matplotlib style sheet - nicer plots!\n",
"plt.rcParams['figure.figsize'] = 12, 8 # Slightly bigger plots by default\n",
"\n",
"# python imports\n",
"from obspy import read, UTCDateTime\n",
"from matplotlib.pyplot import xcorr\n",
"import numpy as np\n",
"\n",
"#from obspy.signal.cross_correlation import xcorr_pick_correction # 1st exercise\n",
"#from obspy.signal.trigger import coincidence_trigger # 2nd exercise\n",
"#from pprint import pprint"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Get waveform of template event"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"## compute and plot cross correlation of recordings of two events\n",
"\n",
"nsec = 300 # 5 minutes\n",
"\n",
"# use FDSNws to get the data from station BUG\n",
"from obspy.clients.fdsn import Client\n",
"client = Client(\"BGR\")\n",
"\n",
"t1 = UTCDateTime(\"2017-09-03T03:40:00Z\")\n",
"t2 = t1 + nsec\n",
"template = client.get_waveforms(\"GR\", \"BUG\", \"\", \"BHZ\", t1, t2, attach_response=True)\n",
"template.remove_response(output=\"VEL\");"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# try different starttime, enttime and filtersettings to display the onset of the event at BUG\n",
"st = template.copy()\n",
"st.plot(starttime=t1+10, endtime=t1+200);\n",
"st.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+10, endtime=t1+200);"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Get more Events"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Test Event 1\n",
"t1 = UTCDateTime(\"2016-09-09T00:40:00Z\")\n",
"t2 = t1 + nsec\n",
"st1 = client.get_waveforms(\"GR\", \"BUG\", \"\", \"BHZ\", t1, t2, attach_response=True)\n",
"st1.remove_response(output=\"VEL\")\n",
"\n",
"st_ = st1.copy()\n",
"st_.plot(starttime=t1+90, endtime=t1+160);\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+90, endtime=t1+160);"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Test Event 2\n",
"t1 = UTCDateTime(\"2016-01-06T01:40:00Z\")\n",
"t2 = t1 + nsec\n",
"st2 = client.get_waveforms(\"GR\", \"BUG\", \"\", \"BHZ\", t1, t2, attach_response=True)\n",
"st2.remove_response(output=\"VEL\")\n",
"\n",
"st_ = st2.copy()\n",
"st_.plot(starttime=t1+90, endtime=t1+160);\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).plot(starttime=t1+90, endtime=t1+160);"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Calculate Cross-correlation of template event data and test event data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# time window for cross-corelation\n",
"nsec1 = 90\n",
"nsec2 = 160\n",
"\n",
"# extract data of template event\n",
"st_ = template.copy()\n",
"t1 = UTCDateTime(\"2017-09-03T03:40:00Z\")\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).detrend().trim(starttime=t1+nsec1, endtime=t1+nsec2)\n",
"st_.trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)\n",
"tr_template = st_[0].copy()\n",
"\n",
"# extract data of 1st and 2nd test event\n",
"st_ = st1.copy()\n",
"t1 = UTCDateTime(\"2016-09-09T00:40:00Z\")+0\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)\n",
"tr_event1 = st_[0].copy()\n",
"\n",
"st_ = st2.copy()\n",
"t1 = UTCDateTime(\"2016-01-06T01:40:00Z\")+0\n",
"st_.filter('bandpass', freqmin=0.8, freqmax=3.0).trim(starttime=t1, endtime=t1+nsec, pad=True, fill_value=0.0)\n",
"tr_event2 = st_[0].copy()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# plot template and test events\n",
"tr_template.normalize().plot();\n",
"tr_event1.normalize().plot();\n",
"tr_event2.normalize().plot();"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# set parameters\n",
"cc_maxlag = nsec # maximum lag time [s]\n",
"samp_rate = tr_template.stats['sampling_rate']\n",
"shift_len = int(cc_maxlag * samp_rate)\n",
"\n",
"# calculate cross correlation\n",
"a_t1 = xcorr(tr_template.data, tr_event1.data, maxlags=shift_len) # matplotlib\n",
"cc_lags = a_t1[0] #lag vector\n",
"cc = a_t1[1] #correlation vector\n",
"\n",
"# extract parameters from result\n",
"cc_t = np.linspace(-cc_maxlag, cc_maxlag, shift_len*2+1)\n",
"cc_max = max(cc)\n",
"cc_shift = cc.argmax() - len(cc)/2\n",
"cc_shift_t = cc_shift / samp_rate\n",
"\n",
"# plot result\n",
"fig = plt.figure(1)\n",
"plt.clf()\n",
"plt.rcParams[\"figure.figsize\"] = [25,10]\n",
"plt.plot(cc_t, cc, 'k')\n",
"plt.title('cross-correlation of template and test event 1, max %.2f at %.2f s shift' %(cc_max, cc_lags[np.argmax(cc)]/samp_rate))\n",
"plt.xlabel('time lag [sec]')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# set parameters\n",
"cc_maxlag = nsec # maximum lag time [s]\n",
"samp_rate = tr_template.stats['sampling_rate']\n",
"shift_len = int(cc_maxlag * samp_rate)\n",
"\n",
"# calculate cross correlation\n",
"a_t2 = xcorr(tr_template.data, tr_event2.data, maxlags=shift_len) # matplotlib\n",
"cc_lags = a_t2[0] #lag vector\n",
"cc = a_t2[1] #correlation vector\n",
"\n",
"# extract parameters from result\n",
"cc_t = np.linspace(-cc_maxlag, cc_maxlag, shift_len*2+1)\n",
"cc_max = max(cc)\n",
"cc_shift = cc.argmax() - len(cc)/2\n",
"cc_shift_t = cc_shift / samp_rate\n",
"\n",
"# plot result\n",
"fig = plt.figure(1)\n",
"plt.clf()\n",
"plt.rcParams[\"figure.figsize\"] = [25,10]\n",
"plt.plot(cc_t, cc, 'k')\n",
"plt.title('cross-correlation of template and test event 2, max %.2f at %.2f s shift' %(cc_max, cc_lags[np.argmax(cc)]/samp_rate))\n",
"plt.xlabel('time lag [sec]')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Exercise\n",
"\n",
"* create a plot showing the template and the test events in one plot with the onset times alligned"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"interpreter": {
"hash": "b40d316b6eb93f8d3cd4b30bf241a4edb2b8ca4ea8a7206ab7ccdcc909e9c051"
},
"kernelspec": {
"display_name": "Python 3.6.13 ('obspy')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@ -0,0 +1,153 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Simple cross correlation to enhance pick accuracy\n",
"\n",
"This part is taken from the ObsPy tutorial at https://docs.obspy.org/master/tutorial/code_snippets/xcorr_pick_correction.html.\n",
"\n",
"This example shows how to align the waveforms of phase onsets of two earthquakes in order to correct the original pick times that can never be set perfectly consistent in routine analysis. A parabola is fit to the concave part of the cross correlation function around its maximum, following the approach by [Deichmann 1992](https://docs.obspy.org/master/citations.html#deichmann1992).\n",
"\n",
"To adjust the parameters (i.e. the used time window around the pick and the filter settings) and to validate and check the results the options plot and filename can be used to open plot windows or save the figure to a file.\n",
"\n",
"See the documentation of [xcorr_pick_correction()](https://docs.obspy.org/master/packages/autogen/obspy.signal.cross_correlation.xcorr_pick_correction.html#obspy.signal.cross_correlation.xcorr_pick_correction) for more details."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# show figures inline\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('ggplot') # Matplotlib style sheet - nicer plots!\n",
"plt.rcParams['figure.figsize'] = 12, 8 # Slightly bigger plots by default\n",
"\n",
"# python imports\n",
"from obspy import read, UTCDateTime\n",
"from obspy.signal.cross_correlation import xcorr_pick_correction"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# read example data of two small earthquakes\n",
"path = \"https://examples.obspy.org/BW.UH1..EHZ.D.2010.147.%s.slist.gz\"\n",
"st1 = read(path % (\"a\", ))\n",
"st2 = read(path % (\"b\", ))\n",
"\n",
"# display info and plot data\n",
"print(st1)\n",
"st1.plot();\n",
"print(st2)\n",
"st2.plot();"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# select the single traces to use in correlation.\n",
"# to avoid artifacts from preprocessing there should be some data left and\n",
"# right of the short time window actually used in the correlation.\n",
"tr1 = st1.select(component=\"Z\")[0]\n",
"tr2 = st2.select(component=\"Z\")[0]\n",
"\n",
"# these are the original pick times set during routine analysis\n",
"t1 = UTCDateTime(\"2010-05-27T16:24:33.315000Z\")\n",
"t2 = UTCDateTime(\"2010-05-27T16:27:30.585000Z\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# estimate the time correction for pick 2 without any preprocessing and open\n",
"# a plot window to visually validate the results\n",
"dt, coeff = xcorr_pick_correction(t1, tr1, t2, tr2, 0.05, 0.2, 0.25, plot=True)\n",
"print(\"No preprocessing:\")\n",
"print(\" Time correction for pick 2: %.6f s\" % dt)\n",
"print(\" Correlation coefficient: %.2f\" % coeff)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# estimate the time correction with bandpass prefiltering\n",
"dt, coeff = xcorr_pick_correction(t1, tr1, t2, tr2, 0.05, 0.2, 0.25, plot=True,\n",
" filter=\"bandpass\",\n",
" filter_options={'freqmin': 1, 'freqmax': 10})\n",
"print(\"Bandpass prefiltering:\")\n",
"print(\" Time correction for pick 2: %.6f s\" % dt)\n",
"print(\" Correlation coefficient: %.2f\" % coeff)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@ -0,0 +1,911 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"<div style='background-image: url(\"../images/header.svg\") ; padding: 0px ; background-size: cover ; border-radius: 5px ; height: 250px'>\n",
" <div style=\"float: right ; margin: 50px ; padding: 20px ; background: rgba(255 , 255 , 255 , 0.7) ; width: 50% ; height: 150px\">\n",
" <div style=\"position: relative ; top: 50% ; transform: translatey(-50%)\">\n",
" <div style=\"font-size: xx-large ; font-weight: 900 ; color: rgba(0 , 0 , 0 , 0.8) ; line-height: 100%\">Ambient Seismic Noise Analysis</div>\n",
" <div style=\"font-size: large ; padding-top: 20px ; color: rgba(0 , 0 , 0 , 0.5)\">Cross Correlation </div>\n",
" </div>\n",
" </div>\n",
"</div>\n",
"\n",
"In this tutorial you will try to reproduce one of the figures in Shapiro _et al._. To see which one, execute the second code block below. \n",
"\n",
"Reference: *High-Resolution Surface-Wave Tomography from Ambient Seismic Noise*, Nikolai M. Shapiro, et al. **Science** 307, 1615 (2005);\n",
"DOI: 10.1126/science.1108339\n",
"\n",
"##### Authors:\n",
"* Celine Hadziioannou\n",
"* Ashim Rijal\n",
"---\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### In this notebook\n",
"We will reproduce figure B below. This figure compares: \n",
"1) the seismogram from an event near station MLAC, recorded at station PHL (top)\n",
"2) the \"Greens function\" obtained by correlating noise recorded at stations MLAC and PHL (center and bottom)\n",
"\n",
"All bandpassed for periods between 5 - 10 seconds. \n",
"\n",
"<img src=\"https://raw.github.com/ashimrijal/NoiseCorrelation/master/data/shapiro_figure.png\">\n",
"\n",
"---"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0
],
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# Configuration step (Please run it before the code!)\n",
"\n",
"# MatplotLib.PyPlot\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('ggplot') # Matplotlib style sheet - nicer plots!\n",
"plt.rcParams['figure.figsize'] = 12, 8\n",
"\n",
"# NumPy\n",
"import numpy as np\n",
"\n",
"# ObsPy\n",
"from obspy.core import UTCDateTime, read\n",
"from obspy.clients.fdsn import Client\n",
"try: # depends on obspy version; this is for v1.1.0\n",
" from obspy.geodetics import gps2dist_azimuth as gps2DistAzimuth\n",
"except ImportError:\n",
" from obspy.core.util import gps2DistAzimuth\n",
"\n",
"# ignore warnings from filters\n",
"import warnings\n",
"warnings.filterwarnings('ignore')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### 1. Read in noise data \n",
"Read the noise data for station MLAC into a stream. \n",
"\n",
"Then, **read in** noise data for station PHL.\n",
"Add this to the stream created above.\n",
"\n",
"These two data files contain 90 days of vertical component noise for each station.\n",
"#### If you need data for more than 90 days, it can be downloaded form IRIS database. ###"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0
],
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Shapiro et al. use noise data from MLAC and PHL stations\n",
"\n",
"num_of_days = 90 # no of days of data: change if more than 90days of data is required\n",
"if num_of_days <= 90:\n",
" # get noise data for station MLAC\n",
" stn = read('https://raw.github.com/ashimrijal/NoiseCorrelation/master/data/noise.CI.MLAC.LHZ.2004.294.2005.017.mseed')\n",
" # get noise data for the station PHL and add it to the previous stream\n",
" stn += read('https://raw.github.com/ashimrijal/NoiseCorrelation/master/data/noise.CI.PHL.LHZ.2004.294.2005.017.mseed')\n",
" # if you have data stored locally, comment the stn = and stn += lines above\n",
" # then uncomment the following 3 lines and adapt the path: \n",
" # stn = obspy.read('./noise.CI.MLAC.LHZ.2004.294.2005.017.mseed')\n",
" # stn += obspy.read('noise.CI.PHL.LHZ.2004.294.2005.017.mseed')\n",
" # ste = obspy.read('event.CI.PHL.LHZ.1998.196.1998.196.mseed')\n",
"else:\n",
" # download data from IRIS database\n",
" client = Client(\"IRIS\") # client specification\n",
" t1 = UTCDateTime(\"2004-10-20T00:00:00.230799Z\") # start UTC date/time\n",
" t2 = t1+(num_of_days*86400) # end UTC date/time\n",
" stn = client.get_waveforms(network=\"CI\", station=\"MLAC\",location=\"*\", channel=\"*\",\n",
" starttime=t1, endtime=t2) # get data for MLAC\n",
" stn += client.get_waveforms(network=\"CI\", station=\"PHL\", location=\"*\", channel=\"*\",\n",
" starttime=t1, endtime=t2) # get data for PHL and add it to the previous stream"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### 2. Preprocess noise ###\n",
"***Preprocessing 1***\n",
"* Just to be sure to keep a 'clean' original stream, first **copy** the noise stream with [st.copy()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.copy.html)\n",
"The copied stream is the stream you will use from now on. \n",
"\n",
"* In order to test the preprocessing without taking too long, it's also useful to first **trim** this copied noise data stream to just one or a few days. This can be done with [st.trim()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.trim.html), after defining your start- and endtime. \n",
"\n",
"\n",
"Many processing functionalities are included in Obspy. For example, you can remove any (linear) trends with [st.detrend()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.detrend.html), and taper the edges with [st.taper()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.taper.html). \n",
"Different types of filter are also available in [st.filter()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.filter.html). \n",
"\n",
"* first **detrend** the data. \n",
"* next, apply a **bandpass filter** to select the frequencies with most noise energy. The secondary microseismic peak is roughly between 0.1 and 0.2 Hz. The primary microseismic peak between 0.05 and 0.1 Hz. Make sure to use a zero phase filter! *(specify argument zerophase=True)*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0
],
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# Preprocessing 1\n",
"\n",
"stp = stn.copy() # copy stream\n",
"t = stp[0].stats.starttime\n",
"stp.trim(t, t + 4 * 86400) # shorten stream for quicker processing\n",
"\n",
"stp.detrend('linear') # remove trends using detrend\n",
"stp.taper(max_percentage=0.05, type='cosine') # taper the edges\n",
"stp.filter('bandpass', freqmin=0.1, freqmax=0.2, zerophase=True) # filter data of all traces in the streams"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"***Preprocessing 2***\n",
"\n",
"Some additional useful processing functions are provided in the following **Functions**\n",
"\n",
"* For each trace in the stream, apply **spectral whitening** on the frequency range you chose before (either [0.1 0.2]Hz or [0.05 0.1]Hz), using function **``whiten``**. \n",
"\n",
"\n",
"* For the **time normalization**, the simplest option is to use the one-bit normalization option provided in function **``normalize``**. \n",
"\n",
"* *Optional: play around with different normalization options, such as clipping to a certain number of standard deviations, or using the running absolute mean normalization.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A brief *desription of individual* **functions** (see the next cell) are as follows:\n",
"\n",
"1) **whiten**:\n",
"\n",
" spectral whitening of trace `tr` using a cosine tapered boxcar between `freqmin` and `freqmax`\n",
" (courtesy Gaia Soldati & Licia Faenza, INGV)\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0,
3,
46,
57,
93,
130,
150
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"def whiten(tr, freqmin, freqmax):\n",
" \n",
" nsamp = tr.stats.sampling_rate\n",
" \n",
" n = len(tr.data)\n",
" if n == 1:\n",
" return tr\n",
" else: \n",
" frange = float(freqmax) - float(freqmin)\n",
" nsmo = int(np.fix(min(0.01, 0.5 * (frange)) * float(n) / nsamp))\n",
" f = np.arange(n) * nsamp / (n - 1.)\n",
" JJ = ((f > float(freqmin)) & (f<float(freqmax))).nonzero()[0]\n",
" \n",
" # signal FFT\n",
" FFTs = np.fft.fft(tr.data)\n",
" FFTsW = np.zeros(n) + 1j * np.zeros(n)\n",
"\n",
" # Apodization to the left with cos^2 (to smooth the discontinuities)\n",
" smo1 = (np.cos(np.linspace(np.pi / 2, np.pi, nsmo+1))**2)\n",
" FFTsW[JJ[0]:JJ[0]+nsmo+1] = smo1 * np.exp(1j * np.angle(FFTs[JJ[0]:JJ[0]+nsmo+1]))\n",
"\n",
" # boxcar\n",
" FFTsW[JJ[0]+nsmo+1:JJ[-1]-nsmo] = np.ones(len(JJ) - 2 * (nsmo+1))\\\n",
" * np.exp(1j * np.angle(FFTs[JJ[0]+nsmo+1:JJ[-1]-nsmo]))\n",
"\n",
" # Apodization to the right with cos^2 (to smooth the discontinuities)\n",
" smo2 = (np.cos(np.linspace(0., np.pi/2., nsmo+1))**2.)\n",
" espo = np.exp(1j * np.angle(FFTs[JJ[-1]-nsmo:JJ[-1]+1]))\n",
" FFTsW[JJ[-1]-nsmo:JJ[-1]+1] = smo2 * espo\n",
"\n",
" whitedata = 2. * np.fft.ifft(FFTsW).real\n",
" \n",
" tr.data = np.require(whitedata, dtype=\"float32\")\n",
"\n",
" return tr"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"2) **correlateNoise**:\n",
" \n",
" correlate two stations, using slices of 'corrwin' seconds at a time correlations are also stacked. \n",
" NB hardcoded: correlates 1st with 2nd station in the stream only signals are merged - any data gaps are\n",
" filled with zeros.\n",
" st : stream containing data from the two stations to correlate\n",
" stations : list of stations\n",
" corrwin : correlation window length\n",
" returns 'corr' (all correlations) and 'stack' (averaged correlations)\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0,
3,
46,
57,
93,
130,
150
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"def correlateNoise(st, stations, corrwin):\n",
"\n",
" print ('correlating stations', (stations[0], stations[1]))\n",
"\n",
" # initialize sliding timewindow (length = corrwin) for correlation\n",
" # start 1 corrwin after the start to account for different stream lengths\n",
" timewin = st.select(station=stations[1])[0].stats.starttime + corrwin\n",
"\n",
" # loop over timewindows \n",
" # stop 1 corrwin before the end to account for different stream lengths\n",
" while timewin < st.select(station=stations[0])[-1].stats.endtime - 2*corrwin:\n",
" sig1 = st.select(station=stations[0]).slice(timewin, timewin+corrwin)\n",
" sig1.merge(method=0, fill_value=0)\n",
" sig2 = st.select(station=stations[1]).slice(timewin, timewin+corrwin)\n",
" sig2.merge(method=0, fill_value=0)\n",
" xcorr = np.correlate(sig1[0].data, sig2[0].data, 'same')\n",
"\n",
" try: \n",
" # build array with all correlations\n",
" corr = np.vstack((corr, xcorr))\n",
" except: \n",
" # if corr doesn't exist yet\n",
" corr = xcorr\n",
" \n",
" # shift timewindow by one correlation window length\n",
" timewin += corrwin\n",
"\n",
" # stack the correlations; normalize\n",
" stack = np.sum(corr, 0)\n",
" stack = stack / float((np.abs(stack).max())) \n",
" print (\"...done\")\n",
"\n",
" return corr, stack"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"3) **plotStack**:\n",
"\n",
" plots stack of correlations with correct time axis\n",
" st: stream containing noise (and station information)\n",
" stack: array containing stack \n",
" maxlag: maximum length of correlation to plot (in seconds)\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0,
3,
46,
57,
93,
130,
150
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"def plotStack(st, stack, maxlag, figurename=None):\n",
"\n",
" # define the time vector for the correlation (length of corr = corrwin + 1)\n",
" limit = (len(stack) / 2.) * st[0].stats.delta\n",
" timevec = np.arange(-limit, limit, st[0].stats.delta)\n",
"\n",
" plt.plot(timevec, stack, 'k')\n",
" stations = list(set([_i.stats.station for _i in st]))\n",
" plt.title(\"Stacked correlation between %s and %s\" % (stations[0], stations[1]))\n",
" plt.xlim(-maxlag, maxlag)\n",
" plt.xlabel('time [s]')\n",
"\n",
" if figurename is not None:\n",
" fig.savefig(figurename, format=\"pdf\")\n",
" else:\n",
" plt.show() "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"4) **plotXcorrEvent**:\n",
"\n",
" plot the noise correlation (MLAC, PHL) alongside the 1998 event signal \n",
" st : event stream\n",
" stn : noise stream\n",
" stack : noise correlation array\n",
" maxlag : maximum length of correlation, in seconds\n",
" acausal : set to True to use acausal part (=negative times) of the correlation\n",
" figurename : if a filename is specified, figure is saved in pdf format\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0,
3,
46,
57,
93,
130,
150
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"def plotXcorrEvent(st, stn, stack, maxlag, acausal=False, figurename=None):\n",
"\n",
" eventtime = UTCDateTime(1998,7,15,4,53,21,0) # event near MLAC\n",
"\n",
" # station locations\n",
" latP, lonP = 35.41, -120.55 # station PHL\n",
" latM, lonM = 37.63, -118.84 # station MLAC\n",
" latE, lonE = 37.55, -118.809 # event 1998\n",
" \n",
" # calculate distance between stations\n",
" dist = gps2DistAzimuth(latP, lonP, latM, lonM)[0] # between PHL and MLAC\n",
" distE = gps2DistAzimuth(latP, lonP, latE, lonE)[0] # between event and PHL\n",
" #\n",
" # CROSSCORRELATION\n",
" # reverse stack to plot acausal part (= negative times of correlation)\n",
" if acausal:\n",
" stack = stack[::-1]\n",
" \n",
" # find center of stack\n",
" c = int(np.ceil(len(stack)/2.) + 1)\n",
" \n",
" #cut stack to maxlag\n",
" stack = stack[c - maxlag * int(np.ceil(stn[0].stats.sampling_rate)) : c + maxlag * int(np.ceil(stn[0].stats.sampling_rate))]\n",
" \n",
" # find new center of stack\n",
" c2 = int(np.ceil(len(stack)/2.) + 1)\n",
"\n",
" # define time vector for cross correlation\n",
" limit = (len(stack) / 2.) * stn[0].stats.delta\n",
" timevec = np.arange(-limit, limit, stn[0].stats.delta)\n",
" # define timevector: dist / t\n",
" timevecDist = dist / timevec\n",
" \n",
" # EVENT\n",
" ste = st.copy()\n",
" st_PHL_e = ste.select(station='PHL')\n",
" \n",
" # cut down event trace to 'maxlag' seconds\n",
" dt = len(stack[c2:])/stn[0].stats.sampling_rate #xcorrlength\n",
" st_PHL_e[0].trim(eventtime, eventtime + dt)\n",
" \n",
" # create time vector for event signal\n",
" # extreme values:\n",
" limit = st_PHL_e[0].stats.npts * st_PHL_e[0].stats.delta\n",
" timevecSig = np.arange(0, limit, st_PHL_e[0].stats.delta)\n",
"\n",
" # PLOTTING\n",
" fig = plt.figure(figsize=(12.0, 8.0))\n",
" ax1 = fig.add_subplot(2,1,1)\n",
" ax2 = fig.add_subplot(2,1,2)\n",
"\n",
" # plot noise correlation\n",
" ax1.plot(timevecDist[c2:], stack[c2:], 'k')\n",
" ax1.set_title('Noise correlation between MLAC and PHL')\n",
"\n",
" # plot event near MLAC measured at PHL\n",
" ax2.plot(distE/timevecSig, st_PHL_e[0].data / np.max(np.abs(st_PHL_e[0].data)), 'r')\n",
" ax2.set_title('Event near MLAC observed at PHL')\n",
"\n",
" ax2.set_xlim((0, 8000))\n",
" ax1.set_xlim((0, 8000))\n",
"\n",
" ax2.set_xlabel(\"group velocity [m/s]\")\n",
" \n",
" if figurename is not None:\n",
" fig.savefig(figurename, format=\"pdf\")\n",
" else:\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"5) **Normalize**:\n",
"\n",
" Temporal normalization of the traces, most after Bensen 2007. NB. before this treatment, traces must be\n",
" demeaned, detrended and filtered. Description of argument:\n",
"\n",
" norm_method=\"clipping\"\n",
" signal is clipped to 'clip_factor' times the std\n",
" clip_factor recommended: 1 (times std)\n",
" \n",
" norm_method=\"clipping_iter\"\n",
" the signal is clipped iteratively: values above 'clip_factor * std' \n",
" are divided by 'clip_weight'. until the whole signal is below \n",
" 'clip_factor * std'\n",
" clip_factor recommended: 6 (times std)\n",
" \n",
" \n",
" norm_method=\"ramn\"\n",
" running absolute mean normalization: a sliding window runs along the \n",
" signal. The values within the window are used to calculate a \n",
" weighting factor, and the center of the window is scaled by this \n",
" factor. \n",
" weight factor: w = np.mean(np.abs(tr.data[win]))/(2. * norm_win + 1) \n",
" finally, the signal is tapered with a tukey window (alpha = 0.2).\n",
"\n",
" norm_win: running window length, in seconds.\n",
" recommended: half the longest period\n",
"\n",
" norm_method=\"1bit\"\n",
" only the sign of the signal is conserved\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0,
3,
46,
57,
93,
130,
150
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Functions\n",
"# collection of functions used in noise correlation processing\n",
"\n",
"def normalize(tr, clip_factor=6, clip_weight=10, norm_win=None, norm_method=\"1bit\"): \n",
" \n",
" if norm_method == 'clipping':\n",
" lim = clip_factor * np.std(tr.data)\n",
" tr.data[tr.data > lim] = lim\n",
" tr.data[tr.data < -lim] = -lim\n",
"\n",
" elif norm_method == \"clipping_iter\":\n",
" lim = clip_factor * np.std(np.abs(tr.data))\n",
" \n",
" # as long as still values left above the waterlevel, clip_weight\n",
" while tr.data[np.abs(tr.data) > lim] != []:\n",
" tr.data[tr.data > lim] /= clip_weight\n",
" tr.data[tr.data < -lim] /= clip_weight\n",
"\n",
" elif norm_method == 'ramn':\n",
" lwin = tr.stats.sampling_rate * norm_win\n",
" st = 0 # starting point\n",
" N = lwin # ending point\n",
"\n",
" while N < tr.stats.npts:\n",
" win = tr.data[st:N]\n",
"\n",
" w = np.mean(np.abs(win)) / (2. * lwin + 1)\n",
" \n",
" # weight center of window\n",
" tr.data[st + lwin / 2] /= w\n",
"\n",
" # shift window\n",
" st += 1\n",
" N += 1\n",
"\n",
" # taper edges\n",
" taper = get_window(tr.stats.npts)\n",
" tr.data *= taper\n",
"\n",
" elif norm_method == \"1bit\":\n",
" tr.data = np.sign(tr.data)\n",
" tr.data = np.float32(tr.data)\n",
"\n",
" return tr"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"6) **get_window**:\n",
"\n",
" Return tukey window of length N\n",
" N: length of window\n",
" alpha: alpha parameter in case of tukey window.\n",
" 0 -> rectangular window\n",
" 1 -> cosine taper\n",
" returns: window (np.array)\n",
" \n",
"Doc of [scipy.signal.get_window](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.signal.get_window.html)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0,
3,
46,
57,
93,
130,
150
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"def get_window(N, alpha=0.2):\n",
"\n",
" window = np.ones(N)\n",
" x = np.linspace(-1., 1., N)\n",
" ind1 = (abs(x) > 1 - alpha) * (x < 0)\n",
" ind2 = (abs(x) > 1 - alpha) * (x > 0)\n",
" window[ind1] = 0.5 * (1 - np.cos(np.pi * (x[ind1] + 1) / alpha))\n",
" window[ind2] = 0.5 * (1 - np.cos(np.pi * (x[ind2] - 1) / alpha))\n",
" return window"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Actual preprocessing happens here -- this can take a while!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Preprocessing 2\n",
"st = stp.copy() # copy stream\n",
"\n",
"for tr in st:\n",
" tr = normalize(tr, norm_method=\"1bit\")\n",
" tr = whiten(tr, 0.1, 0.2)\n",
"print ('done!')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Cross-correlation ####\n",
"Once you're happy with the preprocessing, you can calculate the **cross-correlation** using **``correlateNoise``** function. The cross-correlation are computed by slices of a few hours each (specified in *corrwin*). \n",
"\n",
"**For correlateNoise function**\n",
"* input: stream, list of stations (here: ['MLAC', 'PHL']), slice length in seconds\n",
"* output: all individual correlations, stack\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Cross-correlate\n",
"xcorr, stack = correlateNoise(st, ['MLAC','PHL'], 7200)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The resulting stack can be **plotted** with **``plotStack``** function. Since it doesn't make much sense to look at a 2 hour long correlation signal, you can decide to plot only the central part by specifying a ``maxlag`` (in seconds). "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Plotting\n",
"\n",
"plotStack(st, stack, 400)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"If you're only working with a few days of noise (after trimming), this plot probably doesn't look very nice. You could go back to the code block named 'preprocessing 1', and keep a longer noise record (10 days works quite well already). "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Compare to event trace ####\n",
"In 1998, a M = 5.1 event occurred next to station MLAC. This event was recorded at PHL and we read this data.\n",
"\n",
"* **read** the event data to a separate stream"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"ste = read('https://raw.github.com/ashimrijal/NoiseCorrelation/master/data/event.CI.PHL.LHZ.1998.196.1998.196.mseed')\n",
"# if data is stored locally, uncomment the following line and comment the line above:\n",
"#ste = obspy.read('./event.CI.PHL.LHZ.1998.196.1998.196.mseed')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Preprocess event ####\n",
"\n",
"The event signal should be processed in a similar way to the noise. \n",
"\n",
"* **detrend** in the same way as before\n",
"* **bandpass filter** for the same frequencies as chosen above\n",
"* apply **spectral whitening** to each trace in the event stream to ensure the spectrum is comparable to the noise. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Preprocessing\n",
"\n",
"ste.detrend('linear')\n",
"ste.filter('bandpass', freqmin=0.1, freqmax=0.2, zerophase=True)\n",
"for tr in ste:\n",
" tr = whiten(tr, 0.1, 0.2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"#### Plot! ####\n",
"\n",
"A plotting function is provided to plot both signals alongside: **``plotXcorrEvent``**. \n",
"\n",
"* input: event stream, noise stream, stack, maxlag"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"code_folding": [
0
],
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Plotting\n",
"\n",
"plotXcorrEvent(ste, stn, stack, 400)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@ -13,3 +13,9 @@ This has been forked from [Seismo Live](http://seismo-live.org). The source code
### 02 - Fourier-Transform and Basic Filtering
This has been forked from [Seismo Live](http://seismo-live.org). The source code is available at https://github.com/krischer/seismo_live (licensed under a ***CC BY-NC-SA 4.0 License***. &copy; 2015-2022 Lion Krischer).
### 03 - Cross Correlation
1. Simple Cross Correlation: Use the cross-correlation to detect and determine the time shift of two test events with one template event.
2. Enhanced Picker: This has been taken from the [ObsPy Tutorial](https://docs.obspy.org/master/tutorial/code_snippets/xcorr_pick_correction.html). ObsPy is licensed under the [LGPL v3.0](https://www.gnu.de/documents/lgpl-3.0.en.html)
3. Ambient Seismic Noise Analysis from [Seismo Live](http://seismo-live.org). The source code is available at https://github.com/krischer/seismo_live (licensed under a ***CC BY-NC-SA 4.0 License***. &copy; 2015-2019 Lion Krischer).