Hi there, long shot but I was wondering how I might be able to use obspy to salvage some old miniseed data which appears to have some borked blockette header somewhere or another.
Reading it in the classic way in obspy works, but the start time and the end time (+1 hour theoretically) are the same and it says there are 0 samples. I can scan it in rdseed etc and it shows data is there but seemingly broken into many small slices, and for whatever reason I am struggling to grab hold of it or otherwise get anywhere with the IRIS programs.
I thought maybe if I set the station/location to "normal" values (and not the hex hybrid whatever is shown in bold) w/ msmod that might fix it, but sadly it does not.
In [42]: ri = obspy.io.mseed.util.get_record_information("stc3070320230000.BHZ")
Here's the file (oops new users can't upload, so https://filebin.net/albkvy9g4h6xgfz4 , 184 kb), if anyone has any luck or tips on how to fix this data, highly appreciated
You could read your file record-by-record. This will be very slow, but you will be able to recover all data that is not in one of the broken records, since each 4k (in your case) miniseed record is completely self-contained and does not depend on anything but its own header.
Something like...
from obspy import Stream, read

st_all = Stream()
with open(..., 'rb') as fh:
    while True:
        data = fh.read(4096)  # one 4096-byte record per iteration
        if not data:
            break
        st = read(data, 'MSEED')
        st_all += st
You might have to put the read data in a BytesIO before passing it to read (not sure), or go with a lower-level miniseed reading routine that accepts bytes.
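The BytesIO variant could look roughly like the sketch below. `iter_record_buffers` and `read_all_records` are hypothetical helper names, the 4096-byte record length is the one from this thread, and skipping records that fail to parse is an assumption about what is wanted here. It relies on `obspy.read` accepting file-like objects, so each chunk is wrapped in a `BytesIO` instead of being passed as raw bytes.

```python
import io


def iter_record_buffers(fh, reclen=4096):
    """Yield each fixed-length record of an open binary file as a BytesIO."""
    while True:
        data = fh.read(reclen)
        if not data:
            break
        yield io.BytesIO(data)


def read_all_records(path, reclen=4096):
    """Read a miniSEED file one record at a time, skipping unreadable records."""
    # Imported here so that the chunking helper above needs only the stdlib.
    from obspy import Stream, read

    st_all = Stream()
    with open(path, "rb") as fh:
        for buf in iter_record_buffers(fh, reclen):
            try:
                st_all += read(buf, format="MSEED")
            except Exception as exc:
                # Assumption: one broken record should not stop the whole loop.
                print(f"skipping record: {exc}")
    return st_all
```

Calling `read_all_records("stc3070320230000.BHZ")` would then return one Stream with whatever records could be parsed.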
If the file is broken in other ways, such that records don't start exactly at multiples of your record length (4096 bytes) throughout the whole file, you'd have to come up with some more sophisticated code that looks for the positions of record headers.
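Looking for record header positions could be sketched as below. This is only a heuristic, not something ObsPy provides: it assumes miniSEED 2.x records, whose fixed header starts with a six-digit ASCII sequence number, a quality indicator (D, R, Q or M) and a reserved byte. `find_record_offsets` and `min_gap` are hypothetical names, and the same byte pattern can occur inside waveform data by chance, so the results need checking.

```python
import re

# Heuristic signature of a miniSEED 2.x fixed header: six ASCII digits
# (sequence number), a quality indicator, then a reserved byte.
HEADER_RE = re.compile(rb"\d{6}[DRQM][ \x00]")


def find_record_offsets(raw, min_gap=256):
    """Return byte offsets in raw that look like the start of a record.

    min_gap is an assumed minimum record length; candidate matches closer
    than that to the previously accepted offset are ignored.
    """
    offsets = []
    for match in HEADER_RE.finditer(raw):
        if not offsets or match.start() - offsets[-1] >= min_gap:
            offsets.append(match.start())
    return offsets
```

The offsets returned for a healthy 4096-byte-record file should all be multiples of 4096; anything else points at shifted or truncated records.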
Thanks for the reply! I tried this on both the broken file and an equivalent working file from the same recorder (same size etc.), but both give "embedded null character in path" errors, which I think is just me not understanding something about Python and binary strings.
sliiiightly editing,
import obspy

st_all = obspy.Stream()
with open("stc3070320230000.BHZ", "rb") as fh:
    while True:
        data = fh.read(4096)
        if not data:
            break
        # write each record to a scratch file so obspy.read gets a path
        with open("/dev/shm/crap", "wb") as f:
            f.write(data)
        st = obspy.read("/dev/shm/crap", "MSEED")
        st_all += st
which works! (Something like this would be a handy "emergency record-by-record" obspy function?) It gives me a complete stream with 45 traces in it, but all of those traces still have 0 samples in them, so I'm at a loss. I suspect I'm just going to have to hunt for the original pre-miniseed data and try to build this from scratch again.
I didn't think to use the obspy-mseed-recordanalyzer script on these, but FWIW here's what that looks like. The broken one seems to be missing the 1000 blockette.
Here's a "good" file / example of what I would expect
$ obspy-mseed-recordanalyzer EVA4060203230000.BHZ
FILE: EVA4060203230000.BHZ
Record Number: 0
Record Offset: 0 byte
Header Endianness: Big Endian
FIXED SECTION OF DATA HEADER
Sequence number: 1
Data header/quality indicator: D
Station identifier code: EVA4
Location identifier:
Channel identifier: SHZ
Network code: 7R
Record start time: 2006-02-03T23:00:00.103200Z
Number of samples: 2016
Sample rate factor: 25
Sample rate multiplier: 1
Activity flags: 0
I/O and clock flags: 0
Data quality flags: 0
Number of blockettes that follow: 1
Time correction: 0
Beginning of data: 64
First blockette: 48
BLOCKETTES
1000: Encoding Format: 1
Word Order: 1
Data Record Length: 12
$ obspy-mseed-recordanalyzer stc3070320230000.BHZ
FILE: stc3070320230000.BHZ
Record Number: 0
Record Offset: 0 byte
Header Endianness: Big Endian
FIXED SECTION OF DATA HEADER
Sequence number: 1
Data header/quality indicator: D
Station identifier code: stc3
Location identifier:
Channel identifier: BHZ
Network code: 7S
Record start time: 2007-03-20T23:00:00.218000Z
Number of samples: 2016
Sample rate factor: 25
Sample rate multiplier: 1
Activity flags: 0
I/O and clock flags: 0
Data quality flags: 0
Number of blockettes that follow: 0
Time correction: 0
Beginning of data: 0
First blockette: 0
$ ./add_blockette_1000.py --reclen 4096 --encoding INT16 stc3070320230000.BHZ test.mseed
Traceback (most recent call last):
  File "./add_blockette_1000.py", line 109, in
    raise ValueError("Requires at least 8 bytes between fixed header "
ValueError: Requires at least 8 bytes between fixed header and beginning of data
I obviously know very little about working in binary and probably less about hacking miniseed 2.4's (if it even is 2.4? 2007?) data structure, so this is potentially the end of the line for me unless I'm doing something wrong or it's possible to cram in or overwrite an extra 8 bytes where they need to be. I would also be perfectly happy just being able to access the waveform data with no headers at all.
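For anyone wanting to inspect these header fields without any seismology tooling, the fixed section of the data header printed by the analyzer above can be unpacked with the standard library. This is a minimal sketch assuming the miniSEED 2.x 48-byte fixed header layout and big-endian byte order (as the analyzer reports for both files); `parse_fixed_header` is a hypothetical helper, not part of ObsPy.

```python
import struct

# 48-byte fixed section of a miniSEED 2.x data header, big endian.
FIXED_HEADER = struct.Struct(">6scc5s2s3s2sHHBBBBHHhhBBBBlHH")


def parse_fixed_header(record):
    """Unpack the main fixed-header fields from the first 48 bytes of a record."""
    f = FIXED_HEADER.unpack(record[:FIXED_HEADER.size])

    def text(raw):
        return raw.decode("ascii", "replace").strip()

    return {
        "sequence_number": text(f[0]),
        "quality": text(f[1]),
        "station": text(f[3]),
        "location": text(f[4]),
        "channel": text(f[5]),
        "network": text(f[6]),
        "year": f[7],
        "day_of_year": f[8],
        "number_of_samples": f[14],
        "sample_rate_factor": f[15],
        "sample_rate_multiplier": f[16],
        "number_of_blockettes": f[20],
        "time_correction": f[21],
        "beginning_of_data": f[22],
        "first_blockette": f[23],
    }
```

A record that reports a non-zero `number_of_samples` together with `beginning_of_data == 0`, as the broken file does, can be flagged this way before attempting any repair.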
I spent just a couple of minutes playing with your data and I didn't see a quick solution. You could try to crack open the records, but it looks like the header doesn't tell you what byte the data starts on in the record. When I force it, I get that each record has 0 samples, even though the headers report samples.
Time 2007,079,23:55:06.4582 Samples 2016 Factor 25 Mult 1 (25sps)
Blockettes 0 Correction 0 Data Start 0 First Block 0 Host Swap LE
Record 000043 (42) Type D Network 7S Station Channel BHZ
Time 2007,079,23:56:27.0982 Samples 2016 Factor 25 Mult 1 (25sps)
Blockettes 0 Correction 0 Data Start 0 First Block 0 Host Swap LE
Record 000044 (43) Type D Network 7S Station Channel BHZ
Time 2007,079,23:57:47.7382 Samples 2016 Factor 25 Mult 1 (25sps)
Blockettes 0 Correction 0 Data Start 0 First Block 0 Host Swap LE
Record 000045 (44) Type D Network 7S Station Channel BHZ
Time 2007,079,23:59:08.3782 Samples 1296 Factor 25 Mult 1 (25sps)
Blockettes 0 Correction 0 Data Start 0 First Block 0 Host Swap LE
You might be better off trying to find the original data. You could force the data to have an offset (the error likely being raised in Krischer's code) but I would be a bit worried that you would be decoding the data incorrectly.
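One rough way to judge whether a forced offset is plausible: with an uncompressed encoding, and if the recorder fills each record completely, the samples of a full record should run exactly from the data offset to the end of the record. The sketch below assumes 16-bit integers (the INT16 passed to add_blockette_1000.py above, and encoding format 1 in the good file) and 4096-byte records; `implied_data_offset` is a hypothetical helper.

```python
def implied_data_offset(reclen, nsamples, bytes_per_sample):
    """Offset at which data must start if the samples run to the end of the record."""
    return reclen - nsamples * bytes_per_sample


# Full records in this thread: 4096 bytes, 2016 samples, 2 bytes per sample (assumed INT16).
print(implied_data_offset(4096, 2016, 2))  # prints 64
```

That agrees with the "Beginning of data: 64" reported for the good file. The check only applies to full records; the final record with 1296 samples is simply not filled to the end.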
FYI, I finally managed to fix this by manually forcing the "beginning_of_data" variable to 64 after it is read in Krischer's add_blockette_1000.py script. Due to the corruption, this field was always being read in as 0.
Not sure if there's a sane way to catch that case for general use, but it should probably never be zero, and 64 appears to be a good guess if you ever find yourself in this weird mess.
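For reference, the same kind of repair can be sketched with the standard library alone. This is not Krischer's script: `patch_record` is a hypothetical helper that assumes big-endian miniSEED 2.x records, data starting at byte 64, encoding 1 (INT16) and a record length of 2^12 bytes, and it overwrites bytes 48 to 55 of each record with a blockette 1000, so it should only ever be run on a copy of the data.

```python
import struct


def patch_record(record, data_offset=64, encoding=1, reclen_exp=12):
    """Return a copy of one record with a blockette 1000 added and offsets set.

    Only acts when the header claims no blockettes and a data offset of 0.
    """
    rec = bytearray(record)
    nblockettes = rec[39]
    (begin_of_data,) = struct.unpack_from(">H", rec, 44)
    if nblockettes != 0 or begin_of_data != 0:
        return bytes(rec)  # header does not match the broken pattern, leave it alone
    rec[39] = 1  # number of blockettes that follow
    # Beginning of data (bytes 44-45) and first blockette (bytes 46-47).
    struct.pack_into(">HH", rec, 44, data_offset, 48)
    # Blockette 1000: type, next blockette offset, encoding format,
    # word order (1 = big endian), record length exponent, reserved.
    struct.pack_into(">HHBBBB", rec, 48, 1000, 0, encoding, 1, reclen_exp, 0)
    return bytes(rec)
```

Applied to each 4096-byte chunk of the file and written to a new file, this should give records that the usual readers can at least attempt to decode; whether the decoded samples are correct still depends on the assumed offset and encoding being right.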