seismic.xcorqc package¶
Submodules¶
seismic.xcorqc.analytic_plot_utils module¶
Utility functions supporting plotting for cross-correlation visualizations.
-
seismic.xcorqc.analytic_plot_utils.
distance
(origin, destination)[source]¶ Compute the distance in km between origin coordinates and destination coordinates. The coordinates are (latitude, longitude) couplets in units of degrees.
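The documentation above does not say which formula distance uses. As a minimal sketch of one common way to get a great-circle distance in km from two (latitude, longitude) couplets in degrees, the haversine formula on a spherical Earth is shown below. The function name haversine_km and the mean Earth radius of 6371 km are assumptions made for illustration, not part of the package.

```python
import math

# Assumed mean Earth radius; the package may use a different value or an ellipsoid.
EARTH_RADIUS_KM = 6371.0


def haversine_km(origin, destination):
    """Great-circle distance in km between two (lat, lon) couplets in degrees."""
    lat1, lon1 = origin
    lat2, lon2 = destination
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


if __name__ == "__main__":
    # One degree of longitude along the equator
    print(round(haversine_km((0.0, 0.0), (0.0, 1.0)), 2))
```

A spherical model is usually adequate for labelling station-pair separations on a plot; an ellipsoidal method would be needed where sub-kilometre accuracy matters.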
-
seismic.xcorqc.analytic_plot_utils.
drawBBox
(min_lon, min_lat, max_lon, max_lat, base_map, **kwargs)[source]¶ Draw bounding box on a basemap
-
seismic.xcorqc.analytic_plot_utils.
timestamps_to_plottable_datetimes
(time_series)[source]¶ Convert a series of float (or equivalent) timestamp values to matplotlib plottable datetimes.
- Parameters
time_series (iterable container) – Series of timestamps
- Returns
Equivalent series of plottable timestamps
- Return type
numpy.array(‘datetime64[ms]’) with millisecond resolution
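A conversion of this kind can be sketched with NumPy alone. The example below assumes the input timestamps are POSIX seconds since 1970-01-01 UTC, which the documentation above does not state explicitly; the function name to_plottable_datetimes is hypothetical.

```python
import numpy as np


def to_plottable_datetimes(time_series):
    """Convert float timestamps (assumed POSIX seconds) to datetime64[ms]."""
    seconds = np.asarray(list(time_series), dtype=float)
    # Round to whole milliseconds before casting to an integer count
    millis = np.round(seconds * 1000.0).astype("int64")
    return millis.astype("datetime64[ms]")


if __name__ == "__main__":
    print(to_plottable_datetimes([0.0, 1.5, 86400.0]))
```

Matplotlib accepts datetime64 arrays directly on a date axis, so the result can be passed straight to a plotting call.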
seismic.xcorqc.correlator module¶
- Description:
Generates cross-correlations for data from station-pairs in parallel
References:
CreationDate: 11/07/18
Developer: rakib.hassan@ga.gov.au
- Revision History:
LastUpdate: 11/07/18 RH
LastUpdate: dd/mm/yyyy Who Optional description
-
seismic.xcorqc.correlator.
process
(data_source1, data_source2, output_path, interval_seconds, window_seconds, window_overlap, window_buffer_length, resample_rate=None, taper_length=0.05, nearest_neighbours=1, fmin=None, fmax=None, netsta_list1='*', netsta_list2='*', pairs_to_compute=None, start_time='1970-01-01T00:00:00', end_time='2100-01-01T00:00:00', instrument_response_inventory=None, instrument_response_output='vel', water_level=50, clip_to_2std=False, whitening=False, whitening_window_frequency=0, one_bit_normalize=False, read_buffer_size=10, ds1_zchan=None, ds1_nchan=None, ds1_echan=None, ds2_zchan=None, ds2_nchan=None, ds2_echan=None, corr_chan=None, envelope_normalize=False, ensemble_stack=False, restart=False, dry_run=False, no_tracking_tag=False)[source]¶ DATA_SOURCE1: Text file containing paths to ASDF files
DATA_SOURCE2: Text file containing paths to ASDF files
OUTPUT_PATH: Output folder
INTERVAL_SECONDS: Length of time window (s) over which to compute cross-correlations; e.g. 86400 for 1 day
WINDOW_SECONDS: Length of stacking window (s); e.g. 3600 for an hour. INTERVAL_SECONDS must be a multiple of WINDOW_SECONDS; no stacking is performed if they are of the same size.
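The constraint between INTERVAL_SECONDS and WINDOW_SECONDS can be checked before launching a run. The helper below is a hypothetical sketch, not part of the package: it validates the multiple-of rule stated above and reports how many non-overlapping windows fit in one interval. It deliberately ignores window_overlap, whose exact effect on the window count is not described here.

```python
def windows_per_interval(interval_seconds, window_seconds):
    """Return the number of non-overlapping windows in one interval.

    Raises ValueError when interval_seconds is not a multiple of window_seconds.
    """
    if interval_seconds <= 0 or window_seconds <= 0:
        raise ValueError("both lengths must be positive")
    if interval_seconds % window_seconds != 0:
        raise ValueError("interval_seconds must be a multiple of window_seconds")
    return interval_seconds // window_seconds


if __name__ == "__main__":
    count = windows_per_interval(86400, 3600)
    # A count of 1 corresponds to the documented no-stacking case
    print(count, "windows;", "stacking" if count > 1 else "no stacking")
```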
seismic.xcorqc.fft module¶
seismic.xcorqc.generate_dispersion_curves module¶
- Description:
Runs Rhys Hawkins’ code in parallel to generate dispersion curves based on cross-correlations of station-pairs. Note that this script calls shell scripts that are expected to be in the current working directory.
todo: remove dependence on shell scripts.
References:
CreationDate: 10/01/20
Developer: rakib.hassan@ga.gov.au
- Revision History:
LastUpdate: 10/01/20 RH
LastUpdate: dd/mm/yyyy Who Optional description
seismic.xcorqc.generate_test_data module¶
seismic.xcorqc.utils module¶
-
class
seismic.xcorqc.utils.
ProgressTracker
(output_folder, restart_mode=False)[source]¶ Bases:
object
-
seismic.xcorqc.utils.
drop_bogus_traces
(st, sampling_rate_cutoff=1)[source]¶ Removes spurious traces with suspect sampling rates.
- Parameters
st – ObsPy Stream
sampling_rate_cutoff – Sampling rate threshold
- Returns
The input stream is updated in place
seismic.xcorqc.validate_xcorr_setup module¶
seismic.xcorqc.xcorqc module¶
- Description:
Cross-correlation functionality
References:
CreationDate: 29/06/17
Developer: laurence.davies@ga.gov.au
- Revision History:
LastUpdate: 29/06/17 LD First commit of xcor code.
LastUpdate: 13/07/17 LD Fixed xcor filtering issue when traces have different sample rates.
LastUpdate: 11/08/17 RH Implement ASDF-based cross-correlation workflow
LastUpdate: 11/07/18 RH Implemented parallel cross-correlator
LastUpdate: 19/07/18 RH Implemented cross-correlation approaches described in Habel et al. 2018
LastUpdate: dd/mm/yyyy Who Optional description
-
seismic.xcorqc.xcorqc.
IntervalStackXCorr
(refds, tempds, start_time, end_time, ref_net_sta, temp_net_sta, ref_sta_inv, temp_sta_inv, instrument_response_output, water_level, ref_cha, temp_cha, baz_ref_net_sta, baz_temp_net_sta, resample_rate=None, taper_length=0.05, buffer_seconds=864000, interval_seconds=86400, window_seconds=3600, window_overlap=0.1, window_buffer_length=0, flo=None, fhi=None, clip_to_2std=False, whitening=False, whitening_window_frequency=0, one_bit_normalize=False, envelope_normalize=False, ensemble_stack=False, outputPath='/tmp', verbose=1, tracking_tag='')[source]¶ This function rolls through two ASDF data sets over a given time range and cross-correlates waveforms from all possible station-pairs from the two data sets. To allow efficient random access to ASDF data sources, an instance of a SeisDB object, instantiated from the corresponding JSON database, is passed in (tempds_db). Although this parameter is not mandatory, data access from large ASDF files will be slow without it.
Station-ids to be processed from the two data-sources can be specified as lists of strings, while wildcards can be used to process all stations. Data is fetched from the sources in chunks to limit memory usage and data-windows with gaps are discarded.
Cross-correlation results are written out for each station-pair, in the specified folder, as NetCDF4 files. Panoply (https://www.giss.nasa.gov/tools/panoply/), already installed on the NCI VDIs, can be used to interrogate these results.
- Parameters
refds (FederatedASDFDataSet) – FederatedASDFDataSet containing reference-station data
tempds (FederatedASDFDataSet) – FederatedASDFDataSet containing temporary-stations data
ref_net_sta (str) – Network.Station for the reference Dataset.
temp_net_sta (str) – Network.Station for the temporary Dataset.
ref_sta_inv (Inventory) – Inventory containing instrument response for station
temp_sta_inv (Inventory) – Inventory containing instrument response for station
instrument_response_output (str) – Output of instrument response correction; can be either ‘vel’ or ‘disp’
water_level (float) – Water-level used during instrument response correction
ref_cha (str) – Channel name for the reference Dataset
temp_cha (str) – Channel name for the temporary Dataset
baz_ref_net_sta (float) – Back-azimuth of ref station from temp station in degrees
baz_temp_net_sta (float) – Back-azimuth of temp station from ref station in degrees
resample_rate (float) – Resampling rate (Hz). Applies to both data-sets
taper_length (float) – Taper length as a fraction of window length
buffer_seconds (int) – The amount of data to be fetched per call from the ASDFDataSets, because we may not be able to fetch all the data (from start_time to end_time) at once. The default is set to 10 days and should be a multiple of interval_seconds.
interval_seconds (int) – The interval in seconds, over which cross-correlation windows are stacked. Default is 1 day.
window_seconds (int) – Length of cross-correlation window in seconds. Default is 1 hr.
window_overlap (float) – Window overlap fraction. Default is 0.1.
window_buffer_length (float) – Buffer length as a fraction of ‘window-seconds’ around actual data windows of interest. This helps exclude effects of tapering and other edge artefacts from data windows before cross-correlation. Default is 0
flo (float) – Lower frequency for Butterworth bandpass filter
fhi (float) – Upper frequency for Butterworth bandpass filter
clip_to_2std (bool) – Clip data in each window to +/- 2 standard deviations
whitening (bool) – Apply spectral whitening
whitening_window_frequency (float) – Window frequency (Hz) used to determine length of averaging window for smoothing spectral amplitude
one_bit_normalize (bool) – Apply one-bit normalization to data in each window
envelope_normalize (bool) – Normalize data by its envelope, computed via the Hilbert transform
ensemble_stack (bool) – Outputs a single CC function stacked over all data for a given station-pair
verbose (int) – Verbosity of printouts. Default is 1; maximum is 3.
tracking_tag (str) – File tag to be added to output file names so runtime settings can be tracked
outputPath (str) – Folder to write results to
- Parameters
start_time – Start-time (UTCDateTime format) for data to be used in cross-correlation
end_time – End-time (UTCDateTime format) for data to be used in cross-correlation
- Returns
1: 1d np.array with time samples spanning [-window_samples+dt:window_samples-dt]
2: A dictionary of 2d np.arrays containing cross-correlation results for each station-pair. Rows in each 2d array represent number of interval_seconds processed and columns represent stacked samples of length window_seconds.
3: A dictionary of 1d np.arrays containing number of windows processed, within each interval_seconds period, for each station-pair. These window-counts could be helpful in assessing robustness of results.
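The clip_to_2std and one_bit_normalize options in the parameter list above are standard time-domain normalizations applied to each data window. A minimal NumPy sketch of both is given below. The function names are hypothetical, and clipping symmetrically about zero (rather than about the window mean) is an assumption, since the parameter description does not say which is used.

```python
import numpy as np


def clip_to_2std_sketch(window):
    """Clip samples to +/- 2 standard deviations (assumed symmetric about zero)."""
    data = np.asarray(window, dtype=float)
    limit = 2.0 * np.std(data)
    return np.clip(data, -limit, limit)


def one_bit_normalize_sketch(window):
    """Keep only the sign of each sample (+1, 0 or -1)."""
    return np.sign(np.asarray(window, dtype=float))


if __name__ == "__main__":
    spike = np.zeros(100)
    spike[-1] = 100.0  # a single large outlier in an otherwise quiet window
    print(clip_to_2std_sketch(spike).max())
    print(one_bit_normalize_sketch([-3.0, 0.0, 2.5]))
```

Both operations reduce the influence of high-amplitude transients, such as earthquakes, on ambient-noise cross-correlations.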
-
seismic.xcorqc.xcorqc.
setup_logger
(name, log_file, level=20)[source]¶ Function to set up a logger; adapted from Stack Overflow
-
seismic.xcorqc.xcorqc.
whiten
(a, sampling_rate, window_freq=0)[source]¶ Applies spectral whitening to trace samples. When window_freq=0, all frequency bins are normalized by their amplitudes, i.e. all frequency bins end up with an amplitude of 1. When window_freq is nonzero, a smoothed amplitude spectrum (smoothing window length is as computed below) is used to normalize the frequency bins.
- Parameters
a – trace samples
sampling_rate – sampling rate
window_freq – smoothing window length (Hz)
- Returns
spectrally whitened samples
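The two whitening modes described above can be illustrated with a short NumPy sketch. This is not the package's implementation: the name whiten_sketch is hypothetical, and the use of a moving average over window_freq / df frequency bins to smooth the amplitude spectrum is an assumption about how the smoothing window length is derived.

```python
import numpy as np


def whiten_sketch(a, sampling_rate, window_freq=0.0):
    """Spectrally whiten samples; smooth the amplitude spectrum if window_freq > 0."""
    a = np.asarray(a, dtype=float)
    n = a.size
    spectrum = np.fft.rfft(a)
    amplitude = np.abs(spectrum)
    if window_freq > 0:
        df = sampling_rate / n  # frequency-bin spacing in Hz
        width = max(1, int(round(window_freq / df)))  # assumed smoothing length in bins
        kernel = np.ones(width) / width
        amplitude = np.convolve(amplitude, kernel, mode="same")
    # Guard against division by zero in empty frequency bins
    tiny = np.finfo(float).tiny
    safe = np.where(amplitude > tiny, amplitude, 1.0)
    whitened = np.where(amplitude > tiny, spectrum / safe, 0.0)
    return np.fft.irfft(whitened, n=n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.standard_normal(256)
    flat = whiten_sketch(samples, sampling_rate=10.0)
    # With window_freq=0 every frequency bin ends up with amplitude 1
    print(np.allclose(np.abs(np.fft.rfft(flat)), 1.0))
```

With window_freq=0 only phase information survives; a nonzero window_freq keeps some of the fine spectral structure while still flattening the broad spectral shape.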
-
seismic.xcorqc.xcorqc.
xcorr2
(tr1, tr2, sta1_inv=None, sta2_inv=None, instrument_response_output='vel', water_level=50.0, window_seconds=3600, window_overlap=0.1, window_buffer_length=0, interval_seconds=86400, taper_length=0.05, resample_rate=None, flo=None, fhi=None, clip_to_2std=False, whitening=False, whitening_window_frequency=0, one_bit_normalize=False, envelope_normalize=False, verbose=1, logger=None)[source]¶
seismic.xcorqc.xcorr_station_clock_analysis module¶
Functions for computing estimated GPS clock corrections based on station pair cross-correlation and plotting in standard layout.
-
class
seismic.xcorqc.xcorr_station_clock_analysis.
XcorrClockAnalyzer
(src_file, time_window, snr_threshold, pcf_cutoff_threshold)[source]¶ Bases:
object
Helper class for bundling of preprocessed cross-correlation data before plotting or subsequent processing.
-
do_clustering
(coeffs)[source]¶ Do DBSCAN clustering on the corrections.
- Parameters
coeffs (tuple(float, float, float)) – Triplet of distance coefficients, corresponding to the sensitivity of the clustering to point separation along 1) x-axis (time), 2) y-axis (correction) and 3) slope (drift rate)
- Returns
Results of sklearn.cluster.dbscan (refer to third party documentation)
-
do_spline_regression
(group_ids, regression_degree)[source]¶ Do univariate spline regression on each cluster of points.
- Parameters
group_ids – Cluster IDs generated from do_clustering()
regression_degree – Desired degree of curve fit for each cluster, one for each non-negative cluster ID
- Returns
dict of regressors that can be applied to arbitrary time values
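The per-cluster fitting idea can be sketched without the class itself. The example below swaps in an ordinary least-squares polynomial fit (numpy.polyfit) in place of the univariate spline regression the method performs, purely to keep the sketch dependency-light. The function name, the explicit times and corrections arguments, and the use of a single degree for all clusters are assumptions for illustration. Negative cluster IDs are skipped, matching the note above that regressors exist only for non-negative IDs.

```python
import numpy as np


def fit_clusters_sketch(times, corrections, group_ids, degree):
    """Fit one polynomial per non-negative cluster ID; return {id: callable}."""
    times = np.asarray(times, dtype=float)
    corrections = np.asarray(corrections, dtype=float)
    group_ids = np.asarray(group_ids)
    regressors = {}
    for gid in np.unique(group_ids):
        if gid < 0:
            continue  # negative IDs are treated as unclustered points
        mask = group_ids == gid
        coeffs = np.polyfit(times[mask], corrections[mask], degree)
        regressors[int(gid)] = np.poly1d(coeffs)
    return regressors


if __name__ == "__main__":
    t = [0, 1, 2, 3, 10, 11, 12, 13, 5]
    c = [0, 1, 2, 3, 5, 5, 5, 5, 99]
    ids = [0, 0, 0, 0, 1, 1, 1, 1, -1]
    regs = fit_clusters_sketch(t, c, ids, degree=1)
    # Each regressor can be evaluated at arbitrary time values
    print(sorted(regs), float(regs[0](4.0)))
```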
-
do_spline_resampling
(group_ids, regressors, sampling_period_seconds)[source]¶ Using pre-computed regressors, resample every cluster at a prescribed frequency
- Parameters
group_ids –
regressors –
sampling_period_seconds –
- Returns
-
plot_clusters
(ax, ids, coeffs, stn_code='')[source]¶ Plot the distinct clusters color coded by cluster ID, with underlying corrections shown in gray.
- Parameters
ax –
ids –
coeffs –
- Returns
-
-
seismic.xcorqc.xcorr_station_clock_analysis.
batch_process_folder
(folder_name, dataset, time_window, snr_threshold, pearson_cutoff_factor=0.5, save_plots=True)[source]¶ Process all the .nc files in a given folder into graphical visualizations.
- Parameters
folder_name (str) – Path to the folder containing the .nc files to process
dataset (FederatedASDFDataSet) – Dataset to be used to ascertain the distance between stations.
time_window (float) – Lag time window to plot (plus or minus this value in seconds)
snr_threshold (float) – Minimum signal to noise ratio for samples to be included into the clock lag estimate
save_plots (bool, optional) – Whether to save plots to file, defaults to True
-
seismic.xcorqc.xcorr_station_clock_analysis.
batch_process_xcorr
(src_files, dataset, time_window, snr_threshold, pearson_cutoff_factor=0.5, save_plots=True, underlay_rcf_xcorr=False, force_save=False)[source]¶ Process a batch of .nc files to generate standard visualization graphics. PNG files are output alongside the source .nc file. To suppress file output, set save_plots=False.
- Parameters
src_files (Iterable of str) – List of files to process
dataset (FederatedASDFDataSet) – Dataset to be used to ascertain the distance between stations.
time_window (float) – Lag time window to plot (plus or minus this value in seconds)
snr_threshold (float) – Minimum signal to noise ratio for samples to be included into the clock lag estimate
save_plots (bool, optional) – Whether to save plots to file, defaults to True
underlay_rcf_xcorr (bool, optional) – Show the correlation of each row sample with the RCF beneath the computed time lag, defaults to False
- Returns
List of files for which processing failed, and associated error.
- Return type
-
seismic.xcorqc.xcorr_station_clock_analysis.
plot_estimated_timeshift
(ax, x_lag, y_times, correction, annotation=None, row_rcf_crosscorr=None)[source]¶
-
seismic.xcorqc.xcorr_station_clock_analysis.
plot_pearson_corr_coeff
(ax, rcf, ccf_masked, y_times)[source]¶
-
seismic.xcorqc.xcorr_station_clock_analysis.
plot_reference_correlation_function
(ax, x_lag, rcf, rcf_corrected, snr_threshold)[source]¶
-
seismic.xcorqc.xcorr_station_clock_analysis.
plot_snr_histogram
(ax, snr, time_window, nbins=10)[source]¶
-
seismic.xcorqc.xcorr_station_clock_analysis.
plot_xcorr_file_clock_analysis
(src_file, asdf_dataset, time_window, snr_threshold, pearson_correlation_factor, show=True, underlay_rcf_xcorr=False, pdf_file=None, png_file=None, title_tag='', settings=None)[source]¶
-
seismic.xcorqc.xcorr_station_clock_analysis.
plot_xcorr_time_series
(ax, x_lag, y_times, xcorr_data, use_formatter=False)[source]¶
-
seismic.xcorqc.xcorr_station_clock_analysis.
read_correlator_config
(nc_file)[source]¶ Read the correlator settings used for given nc file.
- Parameters
nc_file (str) – File name of the .nc file containing the cross-correlation data.
- Returns
Pandas Series with named fields whose values are the runtime settings used for the .nc file
- Return type
-
seismic.xcorqc.xcorr_station_clock_analysis.
station_codes
(filename)[source]¶ Convert a netCDF4 .nc filename generated by the correlator to the corresponding station codes in the format
NETWORK.STATION
Assumed format:
NET1.STA1.NET2.STA2.*.nc
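Parsing a filename of the assumed form can be sketched with the standard library alone. The function name station_codes_sketch, the choice to return a pair of NETWORK.STATION strings, and the example filename are all assumptions for illustration; the real function's return shape is not documented above.

```python
import os


def station_codes_sketch(filename):
    """Split a NET1.STA1.NET2.STA2.*.nc filename into two NETWORK.STATION codes."""
    parts = os.path.basename(filename).split(".")
    if len(parts) < 5 or parts[-1] != "nc":
        raise ValueError("unexpected filename format: " + filename)
    # The first four dot-separated fields identify the station pair
    net1, sta1, net2, sta2 = parts[:4]
    return net1 + "." + sta1, net2 + "." + sta2


if __name__ == "__main__":
    # Hypothetical filename following the assumed pattern
    print(station_codes_sketch("/data/AU.ARMA.AU.QLP.1.0-10.0.nc"))
```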