{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "<div style='background-image: url(\"../images/header.svg\") ; padding: 0px ; background-size: cover ; border-radius: 5px ; height: 250px'>\n",
    "    <div style=\"float: right ; margin: 50px ; padding: 20px ; background: rgba(255 , 255 , 255 , 0.7) ; width: 50% ; height: 150px\">\n",
    "        <div style=\"position: relative ; top: 50% ; transform: translatey(-50%)\">\n",
    "            <div style=\"font-size: xx-large ; font-weight: 900 ; color: rgba(0 , 0 , 0 , 0.8) ; line-height: 100%\">Ambient Seismic Noise Analysis</div>\n",
    "            <div style=\"font-size: large ; padding-top: 20px ; color: rgba(0 , 0 , 0 , 0.5)\">Cross Correlation </div>\n",
    "        </div>\n",
    "    </div>\n",
    "</div>\n",
    "\n",
    "In this tutorial you will try to reproduce one of the figures in Shapiro _et al._. To see which one, execute the second code block below. \n",
    "\n",
    "Reference: *High-Resolution Surface-Wave Tomography from Ambient Seismic Noise*, Nikolai M. Shapiro, et al. **Science** 307, 1615 (2005);\n",
    "DOI: 10.1126/science.1108339\n",
    "\n",
    "##### Authors:\n",
    "* Celine Hadziioannou\n",
    "* Ashim Rijal\n",
    "---\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### In this notebook\n",
    "We will reproduce figure B below. This figure compares: \n",
    "1) the seismogram from an event near station MLAC, recorded at station PHL (top)\n",
    "2) the \"Greens function\" obtained by correlating noise recorded at stations MLAC and PHL (center and bottom)\n",
    "\n",
    "All bandpassed for periods between 5 - 10 seconds. \n",
    "\n",
    "<img src=\"https://raw.github.com/ashimrijal/NoiseCorrelation/master/data/shapiro_figure.png\">\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0
    ],
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "# Configuration step (Please run it before the code!)\n",
    "\n",
    "# MatplotLib.PyPlot\n",
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "plt.style.use('ggplot')                            # Matplotlib style sheet - nicer plots!\n",
    "plt.rcParams['figure.figsize'] = 12, 8\n",
    "\n",
    "# NumPy\n",
    "import numpy as np\n",
    "\n",
    "# ObsPy\n",
    "from obspy.core import UTCDateTime, read\n",
    "from obspy.clients.fdsn import Client\n",
    "try: # depends on obspy version; this is for v1.1.0\n",
    "    from obspy.geodetics import gps2dist_azimuth as gps2DistAzimuth\n",
    "except ImportError:\n",
    "    from obspy.core.util import gps2DistAzimuth\n",
    "\n",
    "# ignore warnings from filters\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### 1. Read in noise data \n",
    "Read the noise data for station MLAC into a stream. \n",
    "\n",
    "Then, **read in** noise data for station PHL.\n",
    "Add this to the stream created above.\n",
    "\n",
    "These two data files contain 90 days of vertical component noise for each station.\n",
    "#### If you need data for more than 90 days, it can be downloaded form IRIS database. ###"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0
    ],
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Shapiro et al. use noise data from MLAC and PHL stations\n",
    "\n",
    "num_of_days = 90                            # no of days of data: change if more than 90days of data is required\n",
    "if num_of_days <= 90:\n",
    "    # get noise data for station MLAC\n",
    "    stn = read('https://raw.github.com/ashimrijal/NoiseCorrelation/master/data/noise.CI.MLAC.LHZ.2004.294.2005.017.mseed')\n",
    "    # get noise data for the station PHL and add it to the previous stream\n",
    "    stn += read('https://raw.github.com/ashimrijal/NoiseCorrelation/master/data/noise.CI.PHL.LHZ.2004.294.2005.017.mseed')\n",
    "    # if you have data stored locally, comment the stn = and stn += lines above\n",
    "    # then uncomment the following 3 lines and adapt the path: \n",
    "    # stn = obspy.read('./noise.CI.MLAC.LHZ.2004.294.2005.017.mseed')\n",
    "    # stn += obspy.read('noise.CI.PHL.LHZ.2004.294.2005.017.mseed')\n",
    "    # ste = obspy.read('event.CI.PHL.LHZ.1998.196.1998.196.mseed')\n",
    "else:\n",
    "    # download data from IRIS database\n",
    "    client = Client(\"IRIS\")                               # client specification\n",
    "    t1 = UTCDateTime(\"2004-10-20T00:00:00.230799Z\")       # start UTC date/time\n",
    "    t2 = t1+(num_of_days*86400)                           # end UTC date/time\n",
    "    stn = client.get_waveforms(network=\"CI\", station=\"MLAC\",location=\"*\", channel=\"*\",\n",
    "                               starttime=t1, endtime=t2)  # get data for MLAC\n",
    "    stn += client.get_waveforms(network=\"CI\", station=\"PHL\", location=\"*\", channel=\"*\",\n",
    "                                starttime=t1, endtime=t2) # get data for PHL and add it to the previous stream"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### 2.  Preprocess noise ###\n",
    "***Preprocessing 1***\n",
    "* Just to be sure to keep a 'clean' original stream, first **copy** the noise stream with [st.copy()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.copy.html)\n",
    "The copied stream is the stream you will use from now on. \n",
    "\n",
    "* In order to test the preprocessing without taking too long, it's also useful to first **trim** this copied noise data stream to just one or a few days. This can be done with [st.trim()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.trim.html), after defining your start- and endtime. \n",
    "\n",
    "\n",
    "Many processing functionalities are included in Obspy. For example, you can remove any (linear) trends with [st.detrend()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.detrend.html), and taper the edges with [st.taper()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.taper.html). \n",
    "Different types of filter are also available in [st.filter()](https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.filter.html). \n",
    "\n",
    "* first **detrend** the data. \n",
    "* next, apply a **bandpass filter** to select the frequencies with most noise energy. The secondary microseismic peak is roughly between 0.1 and 0.2 Hz. The primary microseismic peak between 0.05 and 0.1 Hz. Make sure to use a zero phase filter! *(specify argument zerophase=True)*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0
    ],
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Preprocessing 1\n",
    "\n",
    "stp = stn.copy()                                                 # copy stream\n",
    "t = stp[0].stats.starttime\n",
    "stp.trim(t, t + 4 * 86400)                                       # shorten stream for quicker processing\n",
    "\n",
    "stp.detrend('linear')                                            # remove trends using detrend\n",
    "stp.taper(max_percentage=0.05, type='cosine')                    # taper the edges\n",
    "stp.filter('bandpass', freqmin=0.1, freqmax=0.2, zerophase=True) # filter data of all traces in the streams"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "***Preprocessing 2***\n",
    "\n",
    "Some additional useful processing functions are provided in the following **Functions**\n",
    "\n",
    "* For each trace in the stream, apply **spectral whitening** on the frequency range you chose before (either [0.1 0.2]Hz or [0.05 0.1]Hz), using function **``whiten``**. \n",
    "\n",
    "\n",
    "* For the **time normalization**, the simplest option is to use the one-bit normalization option provided in function **``normalize``**. \n",
    "\n",
    "* *Optional: play around with different normalization options, such as clipping to a certain number of standard deviations, or using the running absolute mean normalization.*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "A brief *desription of individual* **functions** (see the next cell) are as follows:\n",
    "\n",
    "1) **whiten**:\n",
    "\n",
    "        spectral whitening of trace `tr` using a cosine tapered boxcar between `freqmin` and `freqmax`\n",
    "        (courtesy Gaia Soldati & Licia Faenza, INGV)\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0,
     3,
     46,
     57,
     93,
     130,
     150
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def whiten(tr, freqmin, freqmax):\n",
    "    \n",
    "    nsamp = tr.stats.sampling_rate\n",
    "    \n",
    "    n = len(tr.data)\n",
    "    if n == 1:\n",
    "        return tr\n",
    "    else: \n",
    "        frange = float(freqmax) - float(freqmin)\n",
    "        nsmo = int(np.fix(min(0.01, 0.5 * (frange)) * float(n) / nsamp))\n",
    "        f = np.arange(n) * nsamp / (n - 1.)\n",
    "        JJ = ((f > float(freqmin)) & (f<float(freqmax))).nonzero()[0]\n",
    "            \n",
    "        # signal FFT\n",
    "        FFTs = np.fft.fft(tr.data)\n",
    "        FFTsW = np.zeros(n) + 1j * np.zeros(n)\n",
    "\n",
    "        # Apodization to the left with cos^2 (to smooth the discontinuities)\n",
    "        smo1 = (np.cos(np.linspace(np.pi / 2, np.pi, nsmo+1))**2)\n",
    "        FFTsW[JJ[0]:JJ[0]+nsmo+1] = smo1 * np.exp(1j * np.angle(FFTs[JJ[0]:JJ[0]+nsmo+1]))\n",
    "\n",
    "        # boxcar\n",
    "        FFTsW[JJ[0]+nsmo+1:JJ[-1]-nsmo] = np.ones(len(JJ) - 2 * (nsmo+1))\\\n",
    "        * np.exp(1j * np.angle(FFTs[JJ[0]+nsmo+1:JJ[-1]-nsmo]))\n",
    "\n",
    "        # Apodization to the right with cos^2 (to smooth the discontinuities)\n",
    "        smo2 = (np.cos(np.linspace(0., np.pi/2., nsmo+1))**2.)\n",
    "        espo = np.exp(1j * np.angle(FFTs[JJ[-1]-nsmo:JJ[-1]+1]))\n",
    "        FFTsW[JJ[-1]-nsmo:JJ[-1]+1] = smo2 * espo\n",
    "\n",
    "        whitedata = 2. * np.fft.ifft(FFTsW).real\n",
    "        \n",
    "        tr.data = np.require(whitedata, dtype=\"float32\")\n",
    "\n",
    "        return tr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "2) **correlateNoise**:\n",
    "        \n",
    "        correlate two stations, using slices of 'corrwin' seconds at a time correlations are also stacked. \n",
    "        NB hardcoded: correlates 1st with 2nd station in the stream only signals are merged - any data gaps are\n",
    "        filled with zeros.\n",
    "        st : stream containing data from the two stations to correlate\n",
    "        stations : list of stations\n",
    "        corrwin : correlation window length\n",
    "        returns 'corr' (all correlations) and 'stack' (averaged correlations)\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0,
     3,
     46,
     57,
     93,
     130,
     150
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def correlateNoise(st, stations, corrwin):\n",
    "\n",
    "    print ('correlating stations', (stations[0], stations[1]))\n",
    "\n",
    "    # initialize sliding timewindow (length = corrwin) for correlation\n",
    "    # start 1 corrwin after the start to account for different stream lengths\n",
    "    timewin = st.select(station=stations[1])[0].stats.starttime + corrwin\n",
    "\n",
    "    # loop over timewindows \n",
    "    # stop 1 corrwin before the end to account for different stream lengths\n",
    "    while timewin < st.select(station=stations[0])[-1].stats.endtime - 2*corrwin:\n",
    "        sig1 = st.select(station=stations[0]).slice(timewin, timewin+corrwin)\n",
    "        sig1.merge(method=0, fill_value=0)\n",
    "        sig2 = st.select(station=stations[1]).slice(timewin, timewin+corrwin)\n",
    "        sig2.merge(method=0, fill_value=0)\n",
    "        xcorr = np.correlate(sig1[0].data, sig2[0].data, 'same')\n",
    "\n",
    "        try: \n",
    "            # build array with all correlations\n",
    "            corr = np.vstack((corr, xcorr))\n",
    "        except: \n",
    "            # if corr doesn't exist yet\n",
    "            corr = xcorr\n",
    "            \n",
    "        # shift timewindow by one correlation window length\n",
    "        timewin += corrwin\n",
    "\n",
    "        # stack the correlations; normalize\n",
    "        stack = np.sum(corr, 0)\n",
    "        stack = stack / float((np.abs(stack).max()))    \n",
    "    print (\"...done\")\n",
    "\n",
    "    return corr, stack"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "3) **plotStack**:\n",
    "\n",
    "        plots stack of correlations with correct time axis\n",
    "        st: stream containing noise (and station information)\n",
    "        stack: array containing stack       \n",
    "        maxlag: maximum length of correlation to plot (in seconds)\n",
    "     "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0,
     3,
     46,
     57,
     93,
     130,
     150
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def plotStack(st, stack, maxlag, figurename=None):\n",
    "\n",
    "    # define the time vector for the correlation (length of corr = corrwin + 1)\n",
    "    limit = (len(stack) / 2.) * st[0].stats.delta\n",
    "    timevec = np.arange(-limit, limit, st[0].stats.delta)\n",
    "\n",
    "    plt.plot(timevec, stack, 'k')\n",
    "    stations = list(set([_i.stats.station for _i in st]))\n",
    "    plt.title(\"Stacked correlation between %s and %s\" % (stations[0], stations[1]))\n",
    "    plt.xlim(-maxlag, maxlag)\n",
    "    plt.xlabel('time [s]')\n",
    "\n",
    "    if figurename is not None:\n",
    "        fig.savefig(figurename, format=\"pdf\")\n",
    "    else:\n",
    "        plt.show()    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "4) **plotXcorrEvent**:\n",
    "\n",
    "        plot the noise correlation (MLAC, PHL) alongside the 1998 event signal \n",
    "        st : event stream\n",
    "        stn : noise stream\n",
    "        stack : noise correlation array\n",
    "        maxlag : maximum length of correlation, in seconds\n",
    "        acausal : set to True to use acausal part (=negative times) of the correlation\n",
    "        figurename : if a filename is specified, figure is saved in pdf format\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0,
     3,
     46,
     57,
     93,
     130,
     150
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def plotXcorrEvent(st, stn, stack, maxlag, acausal=False, figurename=None):\n",
    "\n",
    "    eventtime = UTCDateTime(1998,7,15,4,53,21,0)                 # event near MLAC\n",
    "\n",
    "    # station locations\n",
    "    latP, lonP = 35.41, -120.55                                  # station PHL\n",
    "    latM, lonM = 37.63, -118.84                                  # station MLAC\n",
    "    latE, lonE = 37.55, -118.809                                 # event 1998\n",
    "    \n",
    "    # calculate distance between stations\n",
    "    dist = gps2DistAzimuth(latP, lonP, latM, lonM)[0]            # between PHL and MLAC\n",
    "    distE = gps2DistAzimuth(latP, lonP, latE, lonE)[0]           # between event and PHL\n",
    "                                                                 #\n",
    "    # CROSSCORRELATION\n",
    "    # reverse stack to plot acausal part (= negative times of correlation)\n",
    "    if acausal:\n",
    "        stack = stack[::-1]\n",
    "    \n",
    "    # find center of stack\n",
    "    c = int(np.ceil(len(stack)/2.) + 1)\n",
    "    \n",
    "    #cut stack to maxlag\n",
    "    stack = stack[c - maxlag * int(np.ceil(stn[0].stats.sampling_rate)) : c + maxlag * int(np.ceil(stn[0].stats.sampling_rate))]\n",
    "    \n",
    "    # find new center of stack\n",
    "    c2 = int(np.ceil(len(stack)/2.) + 1)\n",
    "\n",
    "    # define time vector for cross correlation\n",
    "    limit = (len(stack) / 2.) * stn[0].stats.delta\n",
    "    timevec = np.arange(-limit, limit, stn[0].stats.delta)\n",
    "    # define timevector: dist / t\n",
    "    timevecDist = dist / timevec\n",
    "    \n",
    "    # EVENT\n",
    "    ste = st.copy()\n",
    "    st_PHL_e = ste.select(station='PHL')\n",
    "    \n",
    "    # cut down event trace to 'maxlag' seconds\n",
    "    dt = len(stack[c2:])/stn[0].stats.sampling_rate                  #xcorrlength\n",
    "    st_PHL_e[0].trim(eventtime, eventtime + dt)\n",
    "    \n",
    "    # create time vector for event signal\n",
    "    # extreme values:\n",
    "    limit = st_PHL_e[0].stats.npts * st_PHL_e[0].stats.delta\n",
    "    timevecSig = np.arange(0, limit, st_PHL_e[0].stats.delta)\n",
    "\n",
    "    # PLOTTING\n",
    "    fig = plt.figure(figsize=(12.0, 8.0))\n",
    "    ax1 = fig.add_subplot(2,1,1)\n",
    "    ax2 = fig.add_subplot(2,1,2)\n",
    "\n",
    "    # plot noise correlation\n",
    "    ax1.plot(timevecDist[c2:], stack[c2:], 'k')\n",
    "    ax1.set_title('Noise correlation between MLAC and PHL')\n",
    "\n",
    "    # plot event near MLAC measured at PHL\n",
    "    ax2.plot(distE/timevecSig, st_PHL_e[0].data / np.max(np.abs(st_PHL_e[0].data)), 'r')\n",
    "    ax2.set_title('Event near MLAC observed at PHL')\n",
    "\n",
    "    ax2.set_xlim((0, 8000))\n",
    "    ax1.set_xlim((0, 8000))\n",
    "\n",
    "    ax2.set_xlabel(\"group velocity [m/s]\")\n",
    " \n",
    "    if figurename is not None:\n",
    "        fig.savefig(figurename, format=\"pdf\")\n",
    "    else:\n",
    "        plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "5) **Normalize**:\n",
    "\n",
    "        Temporal normalization of the traces, most after Bensen 2007. NB. before this treatment, traces must be\n",
    "        demeaned, detrended and filtered. Description of argument:\n",
    "\n",
    "        norm_method=\"clipping\"\n",
    "            signal is clipped to 'clip_factor' times the std\n",
    "            clip_factor recommended: 1 (times std)\n",
    "        \n",
    "        norm_method=\"clipping_iter\"\n",
    "            the signal is clipped iteratively: values above 'clip_factor * std' \n",
    "            are divided by 'clip_weight'. until the whole signal is below \n",
    "            'clip_factor * std'\n",
    "            clip_factor recommended: 6 (times std)\n",
    "        \n",
    "        \n",
    "        norm_method=\"ramn\"\n",
    "            running absolute mean normalization: a sliding window runs along the \n",
    "            signal. The values within the window are used to calculate a \n",
    "            weighting factor, and the center of the window is scaled by this \n",
    "            factor. \n",
    "                weight factor: w = np.mean(np.abs(tr.data[win]))/(2. * norm_win + 1) \n",
    "            finally, the signal is tapered with a tukey window (alpha = 0.2).\n",
    "\n",
    "            norm_win: running window length, in seconds.\n",
    "              recommended: half the longest period\n",
    "\n",
    "        norm_method=\"1bit\"\n",
    "            only the sign of the signal is conserved\n",
    "            "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0,
     3,
     46,
     57,
     93,
     130,
     150
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# Functions\n",
    "# collection of functions used in noise correlation processing\n",
    "\n",
    "def normalize(tr, clip_factor=6, clip_weight=10, norm_win=None, norm_method=\"1bit\"): \n",
    "    \n",
    "    if norm_method == 'clipping':\n",
    "        lim = clip_factor * np.std(tr.data)\n",
    "        tr.data[tr.data > lim] = lim\n",
    "        tr.data[tr.data < -lim] = -lim\n",
    "\n",
    "    elif norm_method == \"clipping_iter\":\n",
    "        lim = clip_factor * np.std(np.abs(tr.data))\n",
    "        \n",
    "        # as long as still values left above the waterlevel, clip_weight\n",
    "        while tr.data[np.abs(tr.data) > lim] != []:\n",
    "            tr.data[tr.data > lim] /= clip_weight\n",
    "            tr.data[tr.data < -lim] /= clip_weight\n",
    "\n",
    "    elif norm_method == 'ramn':\n",
    "        lwin = tr.stats.sampling_rate * norm_win\n",
    "        st = 0                                               # starting point\n",
    "        N = lwin                                             # ending point\n",
    "\n",
    "        while N < tr.stats.npts:\n",
    "            win = tr.data[st:N]\n",
    "\n",
    "            w = np.mean(np.abs(win)) / (2. * lwin + 1)\n",
    "            \n",
    "            # weight center of window\n",
    "            tr.data[st + lwin / 2] /= w\n",
    "\n",
    "            # shift window\n",
    "            st += 1\n",
    "            N += 1\n",
    "\n",
    "        # taper edges\n",
    "        taper = get_window(tr.stats.npts)\n",
    "        tr.data *= taper\n",
    "\n",
    "    elif norm_method == \"1bit\":\n",
    "        tr.data = np.sign(tr.data)\n",
    "        tr.data = np.float32(tr.data)\n",
    "\n",
    "    return tr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "6) **get_window**:\n",
    "\n",
    "        Return tukey window of length N\n",
    "        N: length of window\n",
    "        alpha: alpha parameter in case of tukey window.\n",
    "        0 -> rectangular window\n",
    "        1 -> cosine taper\n",
    "        returns: window (np.array)\n",
    "        \n",
    "Doc of [scipy.signal.get_window](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.signal.get_window.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0,
     3,
     46,
     57,
     93,
     130,
     150
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def get_window(N, alpha=0.2):\n",
    "\n",
    "    window = np.ones(N)\n",
    "    x = np.linspace(-1., 1., N)\n",
    "    ind1 = (abs(x) > 1 - alpha) * (x < 0)\n",
    "    ind2 = (abs(x) > 1 - alpha) * (x > 0)\n",
    "    window[ind1] = 0.5 * (1 - np.cos(np.pi * (x[ind1] + 1) / alpha))\n",
    "    window[ind2] = 0.5 * (1 - np.cos(np.pi * (x[ind2] - 1) / alpha))\n",
    "    return window"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Actual preprocessing happens here -- this can take a while!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# Preprocessing 2\n",
    "st = stp.copy()                            # copy stream\n",
    "\n",
    "for tr in st:\n",
    "    tr = normalize(tr, norm_method=\"1bit\")\n",
    "    tr = whiten(tr, 0.1, 0.2)\n",
    "print ('done!')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Cross-correlation ####\n",
    "Once you're happy with the preprocessing, you can calculate the **cross-correlation** using **``correlateNoise``** function. The cross-correlation are computed by slices of a few hours each (specified in *corrwin*). \n",
    "\n",
    "**For correlateNoise function**\n",
    "* input: stream, list of stations (here: ['MLAC', 'PHL']), slice length in seconds\n",
    "* output: all individual correlations, stack\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# Cross-correlate\n",
    "xcorr, stack = correlateNoise(st, ['MLAC','PHL'], 7200)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The resulting stack can be **plotted** with **``plotStack``** function. Since it doesn't make much sense to look at a 2 hour long correlation signal, you can decide to plot only the central part by specifying a ``maxlag`` (in seconds). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# Plotting\n",
    "\n",
    "plotStack(st, stack, 400)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "If you're only working with a few days of noise (after trimming), this plot probably doesn't look very nice. You could go back to the code block named 'preprocessing 1', and keep a longer noise record (10 days works quite well already). "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Compare to event trace ####\n",
    "In 1998, a M = 5.1 event occurred next to station MLAC. This event was recorded at PHL and we read this data.\n",
    "\n",
    "* **read** the event data to a separate stream"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "ste = read('https://raw.github.com/ashimrijal/NoiseCorrelation/master/data/event.CI.PHL.LHZ.1998.196.1998.196.mseed')\n",
    "# if data is stored locally, uncomment the following line and comment the line above:\n",
    "#ste = obspy.read('./event.CI.PHL.LHZ.1998.196.1998.196.mseed')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Preprocess event ####\n",
    "\n",
    "The event signal should be processed in a similar way to the noise. \n",
    "\n",
    "* **detrend** in the same way as before\n",
    "* **bandpass filter** for the same frequencies as chosen above\n",
    "* apply **spectral whitening** to each trace in the event stream to ensure the spectrum is comparable to the noise.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# Preprocessing\n",
    "\n",
    "ste.detrend('linear')\n",
    "ste.filter('bandpass', freqmin=0.1, freqmax=0.2, zerophase=True)\n",
    "for tr in ste:\n",
    "    tr = whiten(tr, 0.1, 0.2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "#### Plot! ####\n",
    "\n",
    "A plotting function is provided to plot both signals alongside: **``plotXcorrEvent``**. \n",
    "\n",
    "* input: event stream, noise stream, stack, maxlag"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "code_folding": [
     0
    ],
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# Plotting\n",
    "\n",
    "plotXcorrEvent(ste, stn, stack, 400)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}