MiniSEED to GCF

Is there any way to generate GCF files from miniSEED files? Obspy lists GCF for both read/write supported files. But apparently, obspy only supports writing a GCF file when the stream is GCF packets (re-write support rather than write support). Am I missing something? Thank you in advance for any help.

I don’t see why that shouldn’t work…

st = read('http://examples.obspy.org/BW.KW1..EHZ.D.2011.038', format='MSEED')
st.write("/tmp/tmp.gcf", format="GCF")
st[0].stats.starttime = UTCDateTime(2000, 1, 1)
st.write("/tmp/tmp.gcf", format="GCF")
st = read("/tmp/tmp.gcf", format="GCF")

Although there is an error with fractional start time and at the end trying to read it again there also in some error. CC @cpaitor

The GCF format do not support fractional starttime for sampling rates <= 250 Hz and even for sampling rates only certain starttimes are supported. This is pointed out in the documentation but should perhaps be better highlighted. Supported fractional starttimes can be computed using the function obspy.io.gcf.core.get_time_denominator.

However, the code example given by megies triggered a BUG that I will fix (explained at end)

Note that the read in the code example will work if the keyword argument errorret=True is added to the call, i.e.

st = read('http://examples.obspy.org/BW.KW1..EHZ.D.2011.038', format='MSEED')
st[0].stats.starttime = UTCDateTime(2000, 1, 1)
st.write("tmp.gcf", format="GCF")
st = read("tmp.gcf", format="GCF",errorret=True)

Regarding the getting the write to work I’m not sure I would recomend to change the starttime, perhaps better to trim the stream object instead so that it is aligned with supported starttimes. This could be achieved using the code:

from obspy.io.gcf.core import compatible_sps, get_time_denominator
for tr in st:
    # check that sampling rate is actually supported as not all are
    if not compatible_sps(tr.stats.sampling_rate):
        raise IOError(" sampling rate not supported by GCF format")
    else:
        if tr.stats.starttime.microsecond != 0:
            # trace have fractional starttime
            starttime = None
            if tr.stats.sampling_rate <= 250:
                # fractional starttime not supported, set new starttime to integer second
                starttime = tr.stats.starttime + (1000000 - tr.stats.starttime.microsecond)*1.e-6
            else:
                # fractional starttime supported, find closest supported starttime
                d = int(1.e6/get_time_denominator(tr.stats.sampling_rate))
                for i in range(1,d):
                    if i*d == tr.stats.starttime.microsecond:
                        # Starttime already aligned with supported fractional starttime
                        break
                    elif i*d > tr.stats.starttime.microsecond:
                        # i*d is the closest supported starttime (after current starttime)
                        starttime = tr.stats.starttime + (i*d-tr.stats.starttime.microsecond)*1.e-6
                        break
        if starttime:
            tr.trim(starttime=starttime)

The BUG in megies code example arises as a GCF file consists of 1024 byte blocks, each with a 16 byte header. The header of each block contains a compression code (1, 2, or 4) and an integer giving the number of four-byte records in the. The number of data samples in the block is then equal to the number of four-byte records times the compression code. Hence for compression code 4 the number of data samples must be an integer (<= 250) multiple of 4 (i.e. 4 or 8 or 12 or …). The mseed file in megies code example contains 17,279,919 data samples of which 17,279,200 samples fills up the first 20,236 blocks in the gcf file, leaving 719 data samples for the last block. Data is stored as signed first difference values and in the case here the signed first difference is within the range -128 to 127, hence can be stored using 8-bits each which means compression code 4. Now 719 is not an integer multiple of 4 and since integer division is used to compute the number of four-byte records this becomes 179, i.e. 716 data samples. Yet all samples get written to the file. After the last data sample the reverse integration constant (ric, basically the value of the last data sample) gets written using 32 bits. Upon read however the reader thinks that there are only 716 data samples in the block and then reads the ric which of course gives the wrong number as the first three-bytes are data samples 717-719 and the last byte is the first four bytes of the actual ric. Ergo when comparing the ric to the last data samples the reader raises an exception. The solution is then to split the last block in two, the first holding the proper number of data samples (here 716) at used compression and the last holding the remaining data samples at compression 1.

2 Likes

come to think of it the suggested function to trim the traces to supported starttimes perhaps should be added as a utility function to obspy.io.gcf.core, would this be of interest? (and should I then split the BUG-fix and addition of this function into two PR’s?)

Apart from the fractional starttime, I get the following error from Megies example.

st.write(“tmp.gcf”, format=“GCF”)
Traceback (most recent call last):
File “D:\anaconda3\lib\site-packages\obspy\core\stream.py”, line 1447, in write
format_ep = ENTRY_POINTS[‘waveform_write’][format]
KeyError: ‘GCF’

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “”, line 1, in
File “D:\anaconda3\lib\site-packages\obspy\core\stream.py”, line 1454, in write
raise ValueError(msg % (format,
ValueError: Writing format “GCF” is not supported. Supported types: MSEED, SAC, GSE2, SACXY, Q, SH_ASC, SLIST, TSPAIR, PICKLE, SEGY, SU, WAV, AH

I am not sure what “Writing format “GCF” is not supported” means if obspy supports read/write of GCF files.

which version of obspy are you using (write support for GCF were added in version 1.4.0)?

Thank you Peter and Tobias! I realized that I was using v1.3. Updated and it worked fine.

Absolutely, that was just a quick thing to get the code to run :+1:

1 Like

see updated algorithm to split and encode data into blocks and addded test by paitor · Pull Request #3249 · obspy/obspy · GitHub