Problem with ASCII encoded MiniSeed files

Davit_Mikava · March 15, 2016, 12:42pm

I’m getting ValueError when trying to convert mseed files to SAC, or even when plotting the data, or merging them.
I am using python 2.7 and Obspy 0.10.2 on Ubuntu 14.04. (I have got same error with Obspy 1.0.0).

read() function read this files without warnings or errors. encoding of mseed is ASCII and dtype is ‘|S1’.
This mseed files can be opened using SHM without problem. So i think there is not issue with files itself.

Script:

import functions,sys,MySQLdb,glob,obspy,os,numpy;
numpy.set_printoptions(threshold=numpy.nan)
hostname = “127.0.0.1”
username = “st_user”
password = “qwert”
database = “earthquakes”
db = MySQLdb.connect(hostname,username,password,database)
cursor = db.cursor(MySQLdb.cursors.DictCursor)
storage = “/home/sysop/Projects/Mseed2SAC/sac_files/”
merge_method = 1
gap_fill_value = 0
for id_ in sys.stdin:
sql_query = "SELECT primaries.id, primaries.networkCode, primaries.stationCode, primaries.locationCode,
earthquakes.eq_key FROM primaries INNER JOIN earthquakes ON earthquakes.id = primaries.eq_id AND
primaries.eq_id = %s WHERE networkCode = ‘GO’ or networkCode = ‘KO’ or networkCode = ‘TU’ or
networkCode = ‘II’ or networkCode = ‘IU’ " % id_
cursor.execute(sql_query)
primaries = cursor.fetchall()
year, month, day, hour, minute, second, doy, eq_time = functions.eqkey_parser(primaries[0][‘eq_key’])
path = str(‘%s%s/%s/’ % (storage, str(year), id_.rstrip(‘\n’))) # id_ -s boloshi aqvs \n. rstrip-it vashoreb.
if not os.path.exists(path):
os.makedirs(path)
for station in primaries:
st_name = station[‘stationCode’]
net_code = station[‘networkCode’]
loc_code = functions.get_location(st_name, net_code)
ies_loc_code = station[‘locationCode’]
file_regex_path = functions.file_full_path(year, net_code, st_name, loc_code, doy)
file_path = glob.glob(file_regex_path)
if st_name == “CHVG”: # this is just for reading only problematic files
if file_path:
for file in file_path:
st = obspy.core.read(file, starttime = eq_time-15, endtime = eq_time+300)
channel = st[0].stats.channel
if len(st)>1:
st.merge(method=merge_method, fill_value=gap_fill_value) # gawyvetili an gadafaruli failebis gadabma.
file_name = st_name + “.” + net_code + “.” + ies_loc_code + “.” + channel + “.SAC”
st.write(path + file_name, format=‘SAC’)
else:
print “File Not Found ,”, id_.rstrip(), “,”, st_name, “,”, net_code, “,”, file_regex_path

Error output:

>>> echo 11479 | python ./mseed2sac.py
Traceback (most recent call last):
File “./mseed2sac.py”, line 38, in
st.write(path + file_name, format=‘SAC’)
File “/usr/lib/python2.7/dist-packages/obspy/core/stream.py”, line 1418, in write
writeFormat(self, filename, **kwargs)
File “/usr/lib/python2.7/dist-packages/obspy/sac/core.py”, line 441, in writeSAC
_writeSAC(trace, fh, byteorder=byteorder, **kwargs)
File “/usr/lib/python2.7/dist-packages/obspy/sac/core.py”, line 471, in _writeSAC
t = SacIO(trace)
File “/usr/lib/python2.7/dist-packages/obspy/sac/sacio.py”, line 287, in init
self.readTrace(filen)
File “/usr/lib/python2.7/dist-packages/obspy/sac/sacio.py”, line 787, in readTrace
starttime=trace.stats.starttime)
File “/usr/lib/python2.7/dist-packages/obspy/sac/sacio.py”, line 342, in fromarray
self.seis = np.require(trace, native_str(‘<f4’))
File “/usr/lib/python2.7/dist-packages/numpy/core/numeric.py”, line 649, in require
return asanyarray(a, dtype=dtype)
File “/usr/lib/python2.7/dist-packages/numpy/core/numeric.py”, line 512, in asanyarray
return array(a, dtype, copy=False, order=order, subok=True)
ValueError: could not convert string to float: )

This is the st[0].data output :

[‘\x02’ ‘\xe3’ ‘\x01’ ‘\xd2’ ‘’ … ‘\x0f’ ‘*’ ‘\xaa’ ‘\xaa’ ‘\xaa’]

You can download example of miniseed file from the following link:
https://drive.google.com/file/d/0B7Sj7pYqfJc6X0xzM0ZLLVJxWjQ/view?usp=sharing

P.S: This problem dose not occur with steim1 and steim2 ecnoded mseed files. Only whit ASCII encoding.

Chad_Trabant · March 15, 2016, 9:28pm

Hi Davit,

The miniSEED encoding of ASCII is probably incorrect. I tried reading your test data while forcing the decoding of Steim-1 (encoding value 10) and I got something that looks decently like seismic time series, see attached plot.

The ASCII encoding is identified with a value of 0, which I’m guessing the program that created these records incorrectly put in the header, perhaps memory initialized to 0’s and then never set.

By the way the lower case “e” in the channel codes of “HHe” is invalid SEED.

regards,
Chad

Chad_Trabant · March 15, 2016, 9:33pm

Also, I should mention that ASCII encoded data is rarely time series. In miniSEED, the ASCII encoding is basically opaque data and without external knowledge of the contents a reader will not be able to do anything but perhaps try to print it.

Chad

Davit_Mikava · March 16, 2016, 11:39am

Chad,
Thank you very much for helping.

Can you point me out how to force decoding to steim-1. As i can see there is not any option in Obspy read function to force decoding.

Have i to change value 0 with 10 directly in miniseed file?

regards,
Davit

Chad_Trabant · March 16, 2016, 3:50pm

Hi Davit,

For programs using libmseed, which I believe should be true for all ObsPy miniSEED reading, you can set the UNPACK_DATA_FORMAT environment variable to the encoding you wish it to use. For example (using bash):

export UNPACK_DATA_FORMAT=10

will instruct libmseed to unpack Steim-1 (encoding 10). I presume this will work if you set it prior to starting a Python/ObsPy session/script, but I’ve not tried. Obviously you do not want to forget that you have this set or it could be very confusing when reading miniSEED with different encoding.

If you want to change the miniSEED so that it is self consistent (highly recommended) you can use msmod:
https://seiscode.iris.washington.edu/projects/msmod

There are probably other tools.

Chad

PS. allowed encodings are documented in the SEED manual (http://www.fdsn.org/publications/). For libmseed, there are decoders for the following encodings:

/* SEED data encoding types */
#define DE_ASCII 0
#define DE_INT16 1
#define DE_INT32 3
#define DE_FLOAT32 4
#define DE_FLOAT64 5
#define DE_STEIM1 10
#define DE_STEIM2 11
#define DE_GEOSCOPE24 12
#define DE_GEOSCOPE163 13
#define DE_GEOSCOPE164 14
#define DE_CDSN 16
#define DE_SRO 30
#define DE_DWWSSN 32

Davit_Mikava · March 17, 2016, 8:13am

Chad,

I have tried to set environmental variable and it worked just fine, but as you recommended i am going to change miniSEED files itself.

Thank you again,
Davit

PS: I am just curious, is there any other way to find out encoding of miniSEED file other than encoding value in miniSEED header. Can you explain how SHM can read this files properly.

Chad_Trabant · March 17, 2016, 4:23pm

Hi Davit,

PS: I am just curious, is there any other way to find out encoding of miniSEED file other than encoding value in miniSEED header. Can you explain how SHM can read this files properly.

In full SEED (SEED headers + data) the encoding is documented, but any encoding in the data records (in blockette 1000) should take precedent. It is quite rare nowadays to have data records without blockette 1000, aka miniSEED. There are no other standard ways to mark the encoding. So SHM must be ignoring the encoding, either using a default or have some other external information.

Chad