Stream merge, sampling rates do not match

Hi,

We are continuously writing 16-bit data into 10-second ASDF files (see the ASDF Definition 1.0.2 documentation). The digitizer card is clocked at 200 kHz by the synthesizer output of a GPS card, so the sampling rate should be exact to within the clock's error.

Now, for testing, I wanted to load one trace (location = 25) from each of five consecutive 10-second ASDF snippets into an ObsPy Stream and merge them:

from pathlib import Path
import pyasdf
from obspy.core import Stream
import numpy as np

# Path to asdf files
asdf_folder = Path('path/to/asdf/files')
asdf_list = sorted(asdf_folder.glob('*.h5'))

stream_1 = Stream()
locations = [25]  # or all channels: np.linspace(1, 32, 32, dtype=int)
for index, file_name in enumerate(asdf_list):

    asdf_1 = pyasdf.ASDFDataSet(file_name, mode='r')
    start_time = asdf_1.waveforms.XB_02.raw_recording[0].stats.starttime
    end_time = asdf_1.waveforms.XB_02.raw_recording[0].stats.endtime
    for location in locations:
        stream_1 += asdf_1.get_waveforms(network='XB', station='02',
                                         location=str(int(location)).zfill(2), channel="001",
                                         starttime=start_time,
                                         endtime=end_time,
                                         tag="raw_recording")
    # print(stream_1[index].stats.sampling_rate)
    # stream_1[index].stats.sampling_rate = 200000
    # stream_1.plot(outfile='single_channel.png')
print(stream_1)

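# Force the nominal sampling rate on every trace
for tr in stream_1:
    tr.stats.sampling_rate = 200000.0

# stream_1.merge(misalignment_threshold=0.5)
stream_2 = stream_1.copy()
stream_2.merge(method=0)  # merging the copy works
print(stream_2)
stream_1.merge(method=0)  # merging the original raises the TypeError below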
  • print(stream_1) shows me:

5 Trace(s) in Stream:
XB.02.25.001 | 2021-07-07T09:54:58.629657Z - 2021-07-07T09:55:08.629652Z | 200000.0 Hz, 2000000 samples
XB.02.25.001 | 2021-07-07T09:55:08.629657Z - 2021-07-07T09:55:18.629652Z | 200000.0 Hz, 2000000 samples
XB.02.25.001 | 2021-07-07T09:55:18.629657Z - 2021-07-07T09:55:28.629652Z | 200000.0 Hz, 2000000 samples
XB.02.25.001 | 2021-07-07T09:55:28.629657Z - 2021-07-07T09:55:38.629652Z | 200000.0 Hz, 2000000 samples
XB.02.25.001 | 2021-07-07T09:55:38.629657Z - 2021-07-07T09:55:48.629652Z | 200000.0 Hz, 2000000 samples

  • Merging stream_1, even after forcing the sampling rate of every trace to 200000.0, results in the following error:
    Traceback (most recent call last):
      File "C:\Program Files\JetBrains\PyCharm 2020.2.3\plugins\python\helpers\pydev\pydevd.py", line 1448, in _exec
        pydev_imports.execfile(file, globals, locals) # execute the script
      File "C:\Program Files\JetBrains\PyCharm 2020.2.3\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "E:/Linus_1/22_GMuG_DAQ/01/DUGseis-acquisition/dug_seis/acquisition/scripts/x_plot_saw_teeth_of_consecutive_asdfs.py", line 37, in <module>
        stream_1.merge(method=0)
      File "C:\ProgramData\Anaconda3\envs\DUGSeis\lib\site-packages\obspy\core\stream.py", line 1994, in merge
        self._cleanup(**kwargs)
      File "C:\ProgramData\Anaconda3\envs\DUGSeis\lib\site-packages\obspy\core\stream.py", line 3053, in _cleanup
        cur_trace += trace
      File "C:\ProgramData\Anaconda3\envs\DUGSeis\lib\site-packages\obspy\core\trace.py", line 731, in __add__
        raise TypeError("Sampling rate differs: %s vs %s" %
    TypeError: Sampling rate differs: 199999.99999999997 vs 200000.0

  • Surprisingly, stream_2 (a copy of stream_1) merges without any error:
    1 Trace(s) in Stream:
    XB.02.25.001 | 2021-07-07T09:54:58.629657Z - 2021-07-07T09:55:48.629652Z | 200000.0 Hz, 10000000 samples

I do not really understand why stream_1 cannot be merged, since the timestamps and sampling rates of the traces look fine. I would appreciate any suggestions you might have! In case you’d like to try with the data, you will find the ASDF files here: polybox

Thanks!

Hi @Linvill, I haven’t been able to confirm this, but I think it is related to issue #2600, which Chet Hopp reported for similarly highly sampled data. In that case, though, the problem arose on the copied data rather than the raw data!

The issue seems to stem from small floating-point inaccuracies when computing derived quantities. It looks like forcing the sampling rate in your example doesn’t correct the issue, but forcing the delta does, e.g.:

for tr in stream_1:
    tr.stats.delta = 0.000005
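
As a standalone illustration of the kind of rounding I mean (plain Python, no ObsPy needed): converting 200 kHz to a delta and back does not return exactly 200000.0 in double precision. That ObsPy derives one of these two values from the other as a reciprocal at some point is my assumption here; I have not traced it through the source.

```python
# Standalone illustration of the reciprocal round trip in double precision.
# Variable names are illustrative only; nothing here calls ObsPy.
nominal_rate = 200000.0          # Hz, as in the example above
delta = 1.0 / nominal_rate       # sample spacing in seconds
rate_from_delta = 1.0 / delta    # sampling rate derived back from delta

print(repr(delta))                      # 5e-06
print(repr(rate_from_delta))            # 199999.99999999997
print(rate_from_delta == nominal_rate)  # False
```

The derived value is the same 199999.99999999997 that appears in your error message, which is why I suspect a round trip of this kind rather than a problem with the clock or the data.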

I imagine that something in the reading from ASDF sometimes converts the small delta value from its binary format inexactly.
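
To see where the values first differ, it may help to print them at full precision right after reading, since a rate rounded to one decimal, such as the 200000.0 Hz in the Stream summary, would hide a difference in the last digit. A minimal sketch: sampling_rate_report and distinct_sampling_rates are hypothetical helper names, and they only assume that each trace exposes stats.sampling_rate and stats.delta.

```python
from types import SimpleNamespace


def sampling_rate_report(traces):
    """Return one line per trace showing sampling_rate and delta via repr()."""
    return [
        f"{i}: sampling_rate={tr.stats.sampling_rate!r}, delta={tr.stats.delta!r}"
        for i, tr in enumerate(traces)
    ]


def distinct_sampling_rates(traces):
    """Return the sorted list of exact sampling_rate values that occur."""
    return sorted({float(tr.stats.sampling_rate) for tr in traces})


# Stand-in objects so the sketch runs without ObsPy or the data files;
# with the real data, pass stream_1 instead of fake_traces.
fake_traces = [
    SimpleNamespace(stats=SimpleNamespace(sampling_rate=200000.0, delta=5e-6)),
    SimpleNamespace(stats=SimpleNamespace(sampling_rate=1.0 / 5e-6, delta=5e-6)),
]
for line in sampling_rate_report(fake_traces):
    print(line)
print(len(distinct_sampling_rates(fake_traces)), "distinct sampling rate(s)")
```

Running this on the stream directly after reading, after setting sampling_rate, and after copy() should show at which step the traces stop agreeing.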

Sorry I don’t have a great fix, but hopefully that helps you work out/understand what might be going on.
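
P.S. If you keep the delta workaround in a script, a slightly more defensive variant would refuse to overwrite values that are not already within rounding distance of the nominal one. This is a sketch only: force_nominal_delta and its rel_tol default are made up for illustration and are not part of ObsPy.

```python
import math


def force_nominal_delta(traces, nominal_rate, rel_tol=1e-9):
    """Set the same delta on every trace after checking it is nearly nominal.

    traces is any iterable of objects with a writable stats.delta, for
    example an ObsPy Stream. The function name and the tolerance default
    are illustrative assumptions.
    """
    nominal_delta = 1.0 / nominal_rate
    for tr in traces:
        # Only paper over rounding noise, never a genuinely different rate.
        if not math.isclose(tr.stats.delta, nominal_delta, rel_tol=rel_tol):
            raise ValueError(
                f"delta {tr.stats.delta!r} is not within rel_tol={rel_tol} "
                f"of the nominal {nominal_delta!r}"
            )
        tr.stats.delta = nominal_delta
    return traces


# Intended use with the streams from the question:
# force_nominal_delta(stream_1, 200000.0)
# stream_1.merge(method=0)
```

The check means a trace that was really recorded at a different rate raises an error instead of being silently relabelled.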