Trace data normalization/scaling and resampling in waveform section plots

cpaitor · November 11, 2020, 1:01pm

Hi

I just noticed that plotting the same waveforms using either plot() or plot(type='section') can yield rather different results. I can provide a simple script and an input file (or figures) to demonstrate the issue, but for some annoying reason (being a new user on Discourse) I’m not allowed to upload files.

regP

megies · November 12, 2020, 12:47pm

I’ve bumped up your user’s trust level, can you try again if you can upload files now?

cpaitor · November 12, 2020, 1:49pm

Hi T

Sorry, still doesn’t seem to work, or I’m doing something stupid. The example can be picked up here instead: https://www.dropbox.com/s/c7lc32fctp9iiom/obspy_plot.tgz?dl=0

megies · November 16, 2020, 9:09am

I had a look at the settings, and I think any logged-in user should be able to attach pictures using the “upload” button that shows in the symbol bar on top of the text field when typing out a reply.

Screenshot from 2020-11-16 10-08-39

I think some filetypes might be restricted to upload, but images should work, as different people have posted pictures before.

cpaitor · November 16, 2020, 12:24pm

Hi T

Seems to be working fine now, perhaps there is some delay before one can upload files. In any case, attached is a zipped tarball holding the script, input files and figures. Note that two waveforms are plotted, because type='section' seems to have a problem properly scaling the waveform if only a single waveform is plotted.

obspy_plot.tgz (561.1 KB)

megies · November 17, 2020, 9:46am

Look at the docs for the section plot, there’s quite a few parameters to specify, e.g. normalization can be done per trace (which is default) or with a global normalization for the whole dataset.
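To illustrate the difference between the two normalization modes, here is a minimal sketch. It assumes that normalization means dividing by the maximum absolute amplitude, either of each trace or of the whole dataset; the function `normalize` is a hypothetical stand-in for illustration, not ObsPy's implementation.

```python
import numpy as np


def normalize(traces, norm_method="trace"):
    """Hypothetical helper: scale arrays by their peak absolute amplitude.

    'trace'  -> every trace is divided by its own peak
    'stream' -> every trace is divided by the largest peak in the dataset
    Assumes no trace is all zeros (that would divide by zero).
    """
    if norm_method == "trace":
        return [tr / np.abs(tr).max() for tr in traces]
    if norm_method == "stream":
        global_max = max(np.abs(tr).max() for tr in traces)
        return [tr / global_max for tr in traces]
    raise ValueError("norm_method must be 'trace' or 'stream'")


# Two synthetic traces with very different amplitudes
traces = [np.array([1.0, -2.0]), np.array([10.0, 5.0])]

for method in ("trace", "stream"):
    peaks = [float(np.abs(tr).max()) for tr in normalize(traces, method)]
    print(method, peaks)
```

With per-trace normalization both traces fill the same plot width, while with stream normalization the weaker trace stays proportionally smaller.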

megies · November 17, 2020, 9:51am

There also seems to be some resampling done if number of samples in a trace is more than 10k samples (looking at the source code), and it seems this can not be changed or deactivated by the user right now.

cpaitor · November 17, 2020, 10:14am

The resampling, is this performed only in the case of a section plot? That surely could explain the different appearance. From reading the manual (parameter method) I would expect this not to occur if the number of data samples in the waveform is less than 400k.

I have read about and tested various of the available parameters, but to no avail. My workaround for now when plotting a single waveform in a section plot is to plot the waveform twice with a tiny separation and then scale the two copies enough that the separation is not visible. Now of course plotting a single waveform as a section plot may not seem meaningful, but it may happen if data is not available for all but one channel.

I did try to look into the source code myself, but got lost among the objects (sorry, never really liked object-oriented programming, too much of a black-box mentality for my taste).

megies · November 17, 2020, 2:07pm

The scaling you are talking about can be controlled with the norm_method option, like I mentioned.

Here is the resampling being done:

https://github.com/obspy/obspy/blob/d3067af2bdef5dd92965dbcdba6072e418ce1fbb/obspy/imaging/waveform.py#L1218-L1222

https://github.com/obspy/obspy/blob/d3067af2bdef5dd92965dbcdba6072e418ce1fbb/obspy/imaging/waveform.py#L124-L128
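As a rough illustration of why this resampling can change the look of a waveform, the sketch below applies `scipy.signal.resample` to a synthetic signal. The 10,000-sample target is the section-plot threshold mentioned in this thread; the signal itself (a slow sine with one sharp spike) is made up for the example and is not taken from the attached data.

```python
import numpy as np
from scipy import signal

# Threshold for section plots as described in this thread (an assumption here)
MAX_NPTS_SECTION = 10_000

# Synthetic trace: 50,000 samples of a slow sine plus a single sharp spike
npts = 50_000
t = np.arange(npts) / npts
data = np.sin(2 * np.pi * 5 * t)
data[npts // 2] = 10.0

# Fourier-domain resampling down to the section-plot limit
resampled = signal.resample(data, MAX_NPTS_SECTION)

print("samples:", len(data), "->", len(resampled))
print("peak before:", np.abs(data).max())
print("peak after: ", np.abs(resampled).max())
```

The slow sine survives the resampling, but the short spike is strongly attenuated because its high-frequency content is discarded. Sharp, high-amplitude features can therefore look quite different from a plot of the full data.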

cpaitor · December 1, 2020, 1:50pm

Hi

Sorry for the late reply. Quick suggestion: would it perhaps be reasonable to update the code in line 1219 to read:

if len(tr.data) >= self.max_npts/len(self.stream):

(or better yet, save len(self.stream) to a variable prior to iterating over the traces and use that variable for the scaling), and then remove lines 124-127?
The idea is that resampling would only be triggered if the total number of data samples is roughly 400,000. For an exact solution (including respecting the method argument), an alternative could be:

have_points = sum(len(tr.data) for tr in self.stream)
for _i, tr in enumerate(self.stream):
    if have_points > self.max_npts and (self.plotting_method is None or self.plotting_method == "fast"):
        tmp_data = signal.resample(tr.data, self.max_npts)
    else:
        tmp_data = tr.data

Regarding the scaling when plotting a single trace using type='section': I get the same result regardless of whether I use norm_method='trace' or norm_method='stream', namely a straight thin line. E.g. using this (240 KB) file and the code

from obspy import read
wf = read('wf1.mseed')
wf[0].stats.distance = 1000
wf.plot()
wf.plot(type='section')
wf.plot(type='section', norm_method='trace')
wf.plot(type='section', norm_method='stream')

Displays the waveform nicely only in the first case, wf.plot(), whereas in the cases where type='section' is used the trace is just a straight line (presumably due to some issue with scaling algorithm when only one trace is in the stream object - and yes I read that norm_method defaults to 'trace').

(while at it there seems to be a space missing between normalisations and are in the message returned if an invalid value is set on norm_method - message at present is: ValueError: Define a normalisation method. Valid normalisationsare 'trace', 'stream'. See documentation.)

megies · December 4, 2020, 1:01pm

Not sure really what the value of a section plot with a single trace is. Probably nobody ever tried that case, so I would not be too surprised if funny things happen in that case.

As to the resampling with many points, I guess we could add some way to control the number of samples when this is done, or enable the existing method='full' switch that is used for the regular wiggle plot. This part is best handled in a GitHub issue, I think.

cpaitor · December 4, 2020, 2:09pm

Agree, a single trace in a section plot may seem an ill choice of plot type and sure not all cases can be tested/thought of at the time of coding.

The use case here is that I’ve put together a small utility to quickly plot the traces from the stations closest to a given point in time and space. Occasionally some of the stations were temporarily malfunctioning during the requested time window, resulting in traces being available from a single station only, ergo a single trace ending up in the section plot.

As we are frequently contacted by both media and the public when there has been a felt shaking of the ground (be it seismic in nature or not), and the majority of the seismic events in Sweden are blasts or mining-induced events, this utility is helpful to quickly get a sense of whether there was a seismic event and, if so, its likely source.

Knowing that obspy does not properly scale a single trace when plotted as a section plot, it’s not really a major issue to work around this, but perhaps it could be worth adding a sentence to the documentation saying that this is the case, to aid whoever stumbles across it again in the future (if anyone).

Regarding the resampling, perhaps also here a clarification in the documentation that resampling is done already for traces with > 10,000 samples if type=section is used would be fair. Ultimately enabling method=full also for type=section would be appreciated and if adding a github ticket helps in getting this done at some point I’d be glad to do so.

megies · December 7, 2020, 8:43am

So I took the time to have a look at the section plot parameters (docs linked above), and everything is in your control. Scaling looks OK; as mentioned, use norm_method="stream" for global scaling. I see that with a single trace we run into an edge case (again, can’t blame anybody for not considering that case for a section plot when implementing this routine), but you can get around it by explicitly specifying the plot limits with offset_min and offset_max. The plot is scaled to the full y-axis range if you do that, so in addition you can set a somewhat smaller scale, e.g. scale=0.5.

#! /usr/bin/env python

from obspy import read

st = read("wf1.mseed")
st[0].stats.distance = 2000
# st += read("wf2.mseed")
# st[1].stats.distance = 1000

st.plot(type="section", alpha=1, orientation='horizontal',
        norm_method="stream", offset_max=2500, offset_min=1500, scale=0.5)
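For a utility that sometimes ends up with only one trace, the workaround could be wrapped in a small helper along these lines. This is only a sketch: `section_plot_kwargs` and its `pad` argument are hypothetical names introduced here, and the padding of 500 simply mirrors the limits used in the example above. The keyword arguments it returns are the ones already used in this thread.

```python
def section_plot_kwargs(distances, pad=500.0, scale=0.5):
    """Hypothetical helper: build keyword arguments for a section plot.

    If all traces share one offset (e.g. a single trace), explicit
    offset limits are added so the offset range is not zero.
    """
    if not distances:
        raise ValueError("need at least one trace distance")

    kwargs = {"type": "section", "norm_method": "stream"}
    if min(distances) == max(distances):
        kwargs["offset_min"] = distances[0] - pad
        kwargs["offset_max"] = distances[0] + pad
        kwargs["scale"] = scale
    return kwargs


print(section_plot_kwargs([2000.0]))
print(section_plot_kwargs([1000.0, 2000.0]))

# Intended use with an ObsPy Stream `st` whose traces have stats.distance set:
# st.plot(**section_plot_kwargs([tr.stats.distance for tr in st]))
```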

imaging/waveform.py (lines 1188-1197)

        # Define minimum and maximum offsets
        if self.sect_offset_min is None:
            self._offset_min = self._tr_offsets.min()
        else:
            self._offset_min = self.sect_offset_min

        if self.sect_offset_max is None:
            self._offset_max = self._tr_offsets.max()
        else:
            self._offset_max = self.sect_offset_max
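The logic in this snippet can be reproduced in isolation to see what happens with one trace. The function below is a standalone re-statement for illustration (the name `offset_limits` is made up here), and the idea that the resulting zero-width offset range is what flattens the trace is an assumption based on this thread rather than a confirmed diagnosis.

```python
def offset_limits(tr_offsets, offset_min=None, offset_max=None):
    """Standalone re-statement of the default offset-limit logic."""
    lo = min(tr_offsets) if offset_min is None else offset_min
    hi = max(tr_offsets) if offset_max is None else offset_max
    return lo, hi


# Single trace at 2000: the default limits coincide, so the range is zero
lo, hi = offset_limits([2000.0])
print("default limits: ", lo, hi, "range:", hi - lo)

# Explicit limits, as in the workaround, give a non-zero range
lo, hi = offset_limits([2000.0], offset_min=1500, offset_max=2500)
print("explicit limits:", lo, hi, "range:", hi - lo)
```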

megies · December 7, 2020, 8:45am

Btw. for proper scaling in physical units if you have different instrumentation, you’d need to do instrument response removal yourself before plotting, but that should be clear enough I think.

cpaitor · December 7, 2020, 9:07am

Great, thanks.

(adding some extra characters as post needs to be at least 20 characters, yayks)

megies · December 7, 2020, 9:18am

See pull request #2764 in obspy/obspy on GitHub: “fix flatline in section plot with single trace”.