Seismic data format conversion

Hi,
I have seismic data in SEG-Y format, which I want to convert to miniSEED format. From online resources, I know that ObsPy can read and write various seismic file formats (including SEG-Y and miniSEED). Has anyone ever converted SEG-Y to miniSEED? Please share any piece of code you have. I would be extremely grateful for your help.

Thanks

Abhash Kumar

Hi Abhash,

it works the same as for any other file format in ObsPy:

import obspy
st = obspy.read("filename.segy")
st.write("out.mseed", format="mseed")

Now there are of course a lot of subtle issues with transferring SEG-Y
header values to MiniSEED headers, but that is application-specific, so
you'll have to decide on your own how to map them. The two formats are
just too different to allow a fully automatic header conversion, and
they cannot store the same information.
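
To make the header issue a bit more concrete: MiniSEED identifies a trace by short fixed-width codes (in the MiniSEED 2 fixed header: at most 2 characters for the network, 5 for the station, 2 for the location and 3 for the channel), while SEG-Y trace headers are mostly numeric fields. Below is a minimal sketch of a check you might run on whatever codes you decide to assign. The helper name `check_mseed_codes` and the example codes are made up for illustration and are not part of ObsPy.

```python
# Maximum code lengths in the MiniSEED 2 fixed header.
MSEED_CODE_LIMITS = {"network": 2, "station": 5, "location": 2, "channel": 3}


def check_mseed_codes(codes):
    """Return a list of problems found in a dict of id codes (hypothetical helper)."""
    problems = []
    for key, limit in MSEED_CODE_LIMITS.items():
        value = codes.get(key, "")
        if len(value) > limit:
            problems.append("%s %r is longer than %d characters" % (key, value, limit))
    return problems


# Codes that fit, then a station name that is too long for MiniSEED.
good = {"network": "TX", "station": "ST01", "location": "", "channel": "BHZ"}
bad = {"network": "TX", "station": "GEOPHONE01", "location": "", "channel": "BHZ"}

print(check_mseed_codes(good))
print(check_mseed_codes(bad))
```

An empty list means the codes fit; anything else is something to resolve before calling `st.write(...)`.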

Cheers!

Lion

Hi Lion,

I am grateful for your help. I have SEG-Y files, each with 144 traces corresponding to three components at 48 geophones. After converting each individual trace into a miniSEED file, I also want to put header information (station, channel, network, sampling rate, npts, etc.) into the miniSEED file. I wrote the following code, which loops through the station, channel, and network names corresponding to the 144 traces, fills the header attributes, and writes 144 miniSEED files.

### To create an array of stations

St = []

for i in range(1, 145):
    if i <= 9:
        St.append('ST0' + str(i))
    elif i <= 48:
        St.append('ST' + str(i))
    elif i <= 57:
        St.append('ST0' + str(i - 48))
    elif i <= 96:
        St.append('ST' + str(i - 48))
    elif i <= 105:
        St.append('ST0' + str(i - 96))
    elif i <= 144:
        St.append('ST' + str(i - 96))

### To create an array for channel

C = []

# Channel

for i in range(1, 49):
    C.append('BHZ')

for i in range(49, 97):
    C.append('BHN')

for i in range(97, 145):
    C.append('BHE')

### To create an array of network

N = []
for i in range(1, 145):
    N.append('TX')
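
The three lists above could also be built in a single pass. The following is only a sketch: the function name `build_trace_labels` is made up, and it assumes, as in the post, that the traces are ordered as all 48 vertical components first, then the 48 north, then the 48 east components.

```python
def build_trace_labels(n_stations=48, channels=("BHZ", "BHN", "BHE"), network="TX"):
    """Return (stations, channel_codes, networks) with one entry per trace."""
    stations, channel_codes, networks = [], [], []
    for channel in channels:
        for number in range(1, n_stations + 1):
            stations.append("ST%02d" % number)  # zero-padded: ST01 ... ST48
            channel_codes.append(channel)
            networks.append(network)
    return stations, channel_codes, networks


St, C, N = build_trace_labels()
print(len(St), St[0], St[47], St[48], C[48], St[143], C[143])
# prints: 144 ST01 ST48 ST01 BHN ST48 BHE
```

The `%02d` format takes care of the leading zero, so the separate branches for station numbers below 10 are not needed.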

# Only this import is needed: it returns a Stream, and it would override
# a readSEGY imported from obspy.segy.segy anyway.
from obspy.segy.core import readSEGY
from obspy import Stream, Trace

filename = '1.0.0.sgy'
st2 = readSEGY(filename)  # reading SEG-Y file

i = 0
for tr in st2:
    # filling header attributes
    stats = {'network': N[i], 'station': St[i], 'location': '', 'channel': C[i],
             'npts': 30000, 'sampling_rate': 1000, 'mseed': {'dataquality': 'D'}}

    i = i + 1
    tr = Stream([Trace(data=tr, header=stats)])  ## This is problematic part
    tr.stats.segy.trace_header = stats
    tr.write(str(i) + '.MSEED', format='MSEED')

But I am not able to successfully write the headers to the miniseed file. With this code, I get a value error “ValueError: Trace.data must be a NumPy array”

I would be extremely grateful for any suggestion on how to modify the last loop of the code that is being used to write miniseed file with header information.

Best regards

Abhash Kumar

Hi Abhash,

    tr = Stream([Trace(data=tr, header=stats)]) ##This is problematic part

tr is already a trace object so you'll either have to do:

tr = Stream([Trace(data=tr.data, header=stats)])

or just update the existing trace object:

tr.stats.update(stats)

    tr.stats.segy.trace_header = stats

This line will not do anything as you are writing to MiniSEED so all
segy headers are ignored.

Cheers!

Lion

Hi Lion,

I am grateful for your help. I modified my code, and now the station, channel and network information is written correctly to the miniSEED files. However, I ran into a new problem: the start and end times of the newly created miniSEED files are incorrect. This is the code I run:

filename = '1.0.0.sgy'
st2 = readSEGY(filename, unpack_trace_headers=True)
print(st2.__str__(extended=True))

i = 0
for tr in st2:
    stats = {'network': N[i], 'station': St[i], 'location': '', 'channel': C[i],
             'npts': 30000, 'sampling_rate': 1000, 'mseed': {'dataquality': 'D'}}
    i = i + 1
    tr = Stream([Trace(data=tr.data, header=stats)])
    tr.write(str(i) + '.MSEED', format='MSEED')
    print(tr)

I see the correct start and end time for each SEG-Y trace in the python console when I run the code (as below):
144 Trace(s) in Stream:
Seq. No. in line: 1 | 2015-12-08T19:00:00.000000Z - 2015-12-08T19:00:29.999000Z | 1000.0 Hz, 30000 samples
Seq. No. in line: 2 | 2015-12-08T19:00:00.000000Z - 2015-12-08T19:00:29.999000Z | 1000.0 Hz, 30000 samples

But the start and end time for each miniSEED file appears to be set at January 1970:
1 Trace(s) in Stream:
TX.ST01..BHZ | 1970-01-01T00:00:00.000000Z - 1970-01-01T00:00:29.999000Z | 1000.0 Hz, 30000 samples
1 Trace(s) in Stream:
TX.ST02..BHZ | 1970-01-01T00:00:00.000000Z - 1970-01-01T00:00:29.999000Z | 1000.0 Hz, 30000 samples

I need to extract the time header of each SEG-Y trace and assign it as the start and end time of the corresponding miniSEED file. Can you please suggest where and what I should change in the code above to get the correct time from each SEG-Y trace and assign it to the corresponding miniSEED file? I would be more than grateful for your help.

Best regards

Abhash Kumar

for tr in st2:
    stats = {'network': N[i], 'station': St[i], 'location': '', 'channel': C[i],
             'npts': 30000, 'sampling_rate': 1000, 'mseed': {'dataquality': 'D'}}
    i = i+1
    tr = Stream([Trace(data=tr.data, header=stats)])
    tr.write(str(i)+'.MSEED',format='MSEED')
    print(tr)

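The stats hold all the metadata, including the start time. Because you
build a new Trace from only tr.data and your own dictionary, the time
information is never copied from the old trace to the new one.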
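
The 1970 dates are what a start time of zero looks like: a freshly created Trace has no start time of its own and falls back to the Unix epoch, and the end time then follows from the number of samples and the sampling rate. A small standard-library sketch of that arithmetic (the function name `trace_end_time` is made up for illustration):

```python
from datetime import datetime, timedelta


def trace_end_time(start, npts, sampling_rate):
    """Time of the last sample: (npts - 1) sample intervals after the start."""
    return start + timedelta(seconds=(npts - 1) / float(sampling_rate))


epoch = datetime(1970, 1, 1)              # default start time (timestamp 0)
recorded = datetime(2015, 12, 8, 19, 0)   # start time shown for the SEG-Y traces

print(trace_end_time(epoch, 30000, 1000))     # 1970-01-01 00:00:29.999000
print(trace_end_time(recorded, 30000, 1000))  # 2015-12-08 19:00:29.999000
```

Both end times are 29.999 s after their start, which matches the two listings: the samples and the sampling rate survived, only the start time was lost.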

Just replace

tr = Stream([Trace(data=tr.data, header=stats)])

with

tr.stats.update(stats)

as proposed before and everything should work fine.

Also, it is best to remove the `sampling_rate` key from the dictionary
and just use the one in the existing stats dictionary.

Cheers!

Lion

Hi Lion,

You are great!!! It worked really well, and I am finally able to create my own seismic database. This would not have been possible without your help.

Thanks a lot!!!

Best regards

Abhash Kumar