Reading a SEG2 file format

Good afternoon, I’m Angel Matos from Peru. Here, processing seismological data with Python and ObsPy is not very widespread. I am currently working on my undergraduate thesis, and part of the workflow is to read SEG2 files recorded with an OYO seismograph (OYO Corp., Japan). I have a few questions and I hope you can help me.

a) With ObsPy, I can read files with the .dat extension (from a Geometrics seismograph), but I can’t read files with the .sg2 extension (from an OYO seismograph). The picture below, a screenshot taken from the SeisImagerSW manual, explains the extensions.
[Image: reason_why]
The thing is that I don’t really want to use SeisImager. For my undergraduate thesis, I would like to read seismological data using ObsPy, so is there a way I can read files with the .sg2 extension?
Example of my Python code (Python 3.8.5, ObsPy 1.2.2, working in Jupyter Notebook in an environment created with Anaconda):
from obspy import read
st = read(pathname_or_url="filename_1.dat")  # this works
st = read(pathname_or_url="filename_2.sg2")  # this raises a long error ending with KeyError: 'SAMPLE_INTERVAL'

b) If that’s not possible, is there a way to convert .sg2 to .dat? Does Python code for converting .sg2 to .dat exist? I need to automate the process, since I work with many files.

c) I would appreciate it if you could point me to a bibliography or some standard guide on how to work with these files. While trying to convert the files, I found this on the internet:
[Two images attached]

I sincerely hope that you can help me because this is very important to me. Thanks in advance.

If you attach an example file that fails to load somebody might find time to have a look.

Thanks for the reply. I attach the following files:

This is the .sg2 extension (used by OYO seismographs)
data_OYO.sg2 (204.6 KB)
This is the .dat extension (used by GEOMETRICS seismographs)
data_OYO_fixed.dat (201.0 KB)

Both are SEG2 files containing the same data but with different extensions. ObsPy is able to read only the .dat file; the other one gives me an error. Thanks a lot.

Did you try st = read("data_OYO.sg2", format="SEG2")?

Yes sir, I tried that code too, and it gives me the same error.

It looks like the free form header written by that instrument does not follow SEG2 conventions (although I don’t know much about SEG2; I only know that its definition is ugly and flexible to an extent that makes it hard to deal with).

The binary content of the free form header of the valid file looks like this:

b'\x0f\x00CDP_NUMBER 0\x00\x0e\x00CDP_TRACE 0\x00\n\x00DELAY 0\x00\x1c\x00DIGITAL_HIGH_CUT_FILTER 0\x00\x1b\x00DIGITAL_LOW_CUT_FILTER 0\x000\x00RECEIVER_GEOMETRY 10.000000 0.000000 0.000000\x000\x00RECEIVER_LOCATION 10.000000 0.000000 0.000000\x00\x1c\x00RECEIVER_STATION_NUMBER 0\x00\x19\x00SAMPLE_INTERVAL 0.0005\x00\x19\x00SHOT_SEQUENCE_NUMBER 0\x00-\x00SOURCE_LOCATION 0.000000 0.000000 0.000000\x00\x1a\x00SOURCE_STATION_NUMBER 0\x00\x00\x00\x00\x00\x00'

Looking at our source code, it is a list of items, each consisting of one unsigned integer (2 bytes long) followed by “field_name value” of varying size given by the leading number.
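
A minimal Python sketch of that layout, for illustration only; it is not ObsPy’s own parse_free_form code. The function name parse_length_prefixed is hypothetical. Little-endian byte order is assumed from the sample above, and the leading number is assumed to count its own two bytes plus the terminating NULL, which is consistent with the bytes shown (0x0f = 15 for “CDP_NUMBER 0”).

```python
import struct


def parse_length_prefixed(block: bytes) -> dict:
    """Parse 'length, "KEY value", NULL' items into a dict of strings.

    Hypothetical helper; assumes little-endian 2-byte lengths that
    include the length field itself and the trailing NULL byte.
    """
    header = {}
    pos = 0
    while pos + 2 <= len(block):
        (size,) = struct.unpack_from("<H", block, pos)
        if size == 0:
            # A zero length marks the end of the list.
            break
        raw = block[pos + 2:pos + size].rstrip(b"\x00")
        text = raw.decode("ascii", errors="replace")
        key, _, value = text.partition(" ")
        header[key] = value.strip()
        pos += size
    return header


# Shortened excerpt of the valid header shown above.
sample = (
    b"\x0f\x00CDP_NUMBER 0\x00"
    b"\n\x00DELAY 0\x00"
    b"\x19\x00SAMPLE_INTERVAL 0.0005\x00"
    b"\x00\x00"
)
parsed = parse_length_prefixed(sample)
print(parsed["SAMPLE_INTERVAL"])
```

Values are kept as strings here; converting SAMPLE_INTERVAL with float() is left to the caller.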

The file that can not be read looks different in terms of free form header:

b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf31\x10ACQUISITION_DATE 01/01/2001\x00  ACQUISITION_TIME 00:00:00\x00  CLIENT OYO_CORPORATION\x00  COMPANY OYO_CORPORATION\x00  *DELAY 0.000000\x00  GENERAL_CONSTANT 1\x00  INSTRUMENT PICKWIN95\x00  JOB_ID 1\x00  OBSERVER OYO_CORPORATION\x00  PROCESSING_DATE 01/01/2001\x00  PROCESSING_TIME 00:00:00\x00  RECEIVER_LOCATION 10.000000\x00  SAMPLE_INTERVAL 0.00050000\x00  SOURCE_LOCATION 0.000000\x00  TRACE_SORT COMMON_SOURCE\x00  TRACE_TYPE 1\x00  *UNITS METERS\x00  MANUAL_CHANNEL_NUMBER 0\x00  NOTE \x00                       '

It basically seems to have some garbage in front and also each individual header value is not preceded by its length in bytes, solely relying on a single NULL character to represent the end of one key/value string.
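
As a rough sketch of how such a NULL-terminated header could be scanned in Python (not a tested patch for ObsPy): the helper name parse_null_terminated and the regular expression are assumptions. The pattern assumes keys are upper-case words, optionally prefixed by “*”, followed by one space and a value that runs up to the next NULL byte; leading garbage and padding spaces are simply skipped.

```python
import re

# Optional '*', an upper-case KEY, one space, then the value up to NULL.
_ENTRY = re.compile(rb"\*?([A-Z][A-Z0-9_]*) ([^\x00]*)\x00")


def parse_null_terminated(block: bytes) -> dict:
    """Collect KEY/value pairs from NULL-terminated strings.

    Hypothetical helper; anything that does not look like
    'KEY value' followed by NULL is ignored.
    """
    header = {}
    for match in _ENTRY.finditer(block):
        key = match.group(1).decode("ascii")
        value = match.group(2).decode("ascii", errors="replace").strip()
        header[key] = value
    return header


# Shortened excerpt of the header from the file that fails to load.
sample_oyo = (
    b"\x00\x00\x00\xf31\x10ACQUISITION_DATE 01/01/2001\x00"
    b"  *DELAY 0.000000\x00"
    b"  SAMPLE_INTERVAL 0.00050000\x00"
    b"  NOTE \x00      "
)
parsed_oyo = parse_null_terminated(sample_oyo)
print(float(parsed_oyo["SAMPLE_INTERVAL"]))
```

A pattern match like this is permissive by design, so it could pick up a false key if the leading garbage happened to contain upper-case text followed by a space.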

From the looks of it, it might be possible to read that file, but you would probably have to patch the obspy source code to get it done, replacing or extending the part that reads the free form header (function parse_free_form(...) in obspy/io/seg2/seg2.py). You could set a debugger breakpoint in there and do it interactively for a start.


So it looks like, to read the file, you only need the SAMPLE_INTERVAL from the free form header, so if you manually replace and hard code this line…

        header['delta'] = float(header['seg2']['SAMPLE_INTERVAL'])

…with the actual value…

        header['delta'] = 0.0005

… the file can be read. Obviously this is…

  • dangerous, as you would get wrong data if you read a file that has a different sampling rate, and…
  • you will lose all the other free form header info (although it seems like in the “broken” file all the header variables are either 0 or 1 so they might just be bogus values anyway maybe)
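
One way to soften the first risk, sketched here as a standalone helper rather than as the actual ObsPy code: use the header value whenever it is present and fall back to an explicit, caller-supplied sampling interval only when it is missing. The name resolve_delta and its fallback parameter are hypothetical.

```python
def resolve_delta(seg2_header, fallback=None):
    """Return the sampling interval in seconds.

    Hypothetical helper: prefers SAMPLE_INTERVAL from the parsed
    free form header and only uses `fallback` when the key is absent.
    """
    try:
        return float(seg2_header["SAMPLE_INTERVAL"])
    except KeyError:
        if fallback is None:
            # No a-priori value given, so fail loudly instead of guessing.
            raise
        return float(fallback)


# Header with the key present: the file's own value wins.
print(resolve_delta({"SAMPLE_INTERVAL": "0.00050000"}, fallback=0.001))
# Header without the key: the a-priori value is used.
print(resolve_delta({}, fallback=0.0005))
```

This still depends on knowing the sampling interval a priori for files whose header cannot be parsed.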

Dear Mr. Megies, thank you so much for your reply. I understood the main idea of what you said, but since I’m too inexperienced (I’m not a developer) at working with the source code of modules and libraries, and at understanding binary data, I’m afraid I could damage my data or even my system.

Just to mention, I found a way to convert .sg2 files into .dat files using a piece of software (I’m not mentioning its name, because that might be illegal or be considered promotion under the policy of this community; of course, I’m sure there are many programs that convert files). With that conversion, I can successfully use ObsPy and proceed with my workflow.

As I mentioned before, the ideal for me would be for ObsPy to read .sg2 files directly, or to build a small Python script that converts .sg2 files into .dat files, in order to automate the process instead of using a program and converting the files one by one.

Next year, I expect my university to reopen normally so that I can work with developers. I will be happy to share my findings.

Thank you again, Mr. Megies. Regards from Perú.

Glad you found a solution, and we have no problem at all if you mention what software you converted your files with. :wink:

Like I said, it would totally be possible to modify our source code to read those files, but first we would need somebody more familiar with and/or working with SEG2 to give a statement on how much of a format breach we are looking at and how to judge the situation.

See https://github.com/obspy/obspy/issues/2767

I tested with my own segD reader and it doesn’t work out of the box either. I will see if that hard-coded hack could be safely ported to the seg2 reader.

Well, that hard-coded hack is really only an option if you are sure about the sampling rate a priori, obviously. The file wouldn’t be hard to read; I would just like some input from people familiar with the SEG2 file format definition on how to judge this different header structure.