Hi
I’m currently putting together a convenience function that wraps data into a Trace object so that I can later use it with obspy. While numpy.ma.MaskedArray can be used for data with gaps, the obspy docs do not make clear what value to assign to the Trace.stats.npts attribute when a masked array is used, so hopefully someone can help with this. In principle the question boils down to:
If a numpy.ma.MaskedArray is used as the value of Trace.data, should Trace.stats.npts be set to the size of the underlying numpy.ndarray or to the number of unmasked samples in the numpy.ma.MaskedArray?
I think it should be set to the size of the underlying numpy.ndarray.
However, you don’t need to set npts at all; it will be determined automatically by obspy.
I was also under the impression that one did not have to set npts explicitly. However, the following code:
import numpy as np
from obspy.core import trace, UTCDateTime

n = 1000
data = np.arange(n)  # synthetic samples 0..n-1

# Build the metadata as a Stats object
stats = trace.Stats()
stats.network = "X"
stats.station = "Y"
stats.channel = "Z"
stats.starttime = UTCDateTime(0)
stats.sampling_rate = 10

tr = trace.Trace(data=data, header=stats)
print(tr)
tr.stats.npts = n  # set npts by hand
print(tr)
when run outputs:
X.Y..Z | 1970-01-01T00:00:00.000000Z - 1970-01-01T00:00:00.000000Z | 10.0 Hz, 0 samples
X.Y..Z | 1970-01-01T00:00:00.000000Z - 1970-01-01T00:01:39.900000Z | 10.0 Hz, 1000 samples
This indicates that one has to. Furthermore, if I try to plot the created trace before setting npts properly, I get the following error:
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
>>> import obspy
>>> from obspy.core import trace, UTCDateTime
>>> import numpy as np
>>> print(obspy.__version__)
1.1.1
>>> n = 1000
>>> data = [i for i in range(n)]
>>> stats = trace.Stats()
>>> stats.network = "X"
>>> stats.station = "Y"
>>> stats.channel = "Z"
>>> stats.starttime = UTCDateTime(0)
>>> stats.sampling_rate = 10
>>> tr = trace.Trace(data=np.array(range(n)),header=stats)
>>> print(tr)
X.Y..Z | 1970-01-01T00:00:00.000000Z - 1970-01-01T00:00:00.000000Z | 10.0 Hz, 0 samples
>>> tr.plot()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/obspy/core/trace.py", line 893, in plot
    waveform = WaveformPlotting(stream=self, **kwargs)
  File "/usr/lib/python3/dist-packages/obspy/imaging/waveform.py", line 218, in __init__
    self.title = kwargs.get('title', self.stream[0].id)
  File "/usr/lib/python3/dist-packages/obspy/core/stream.py", line 661, in __getitem__
    return self.traces.__getitem__(index)
IndexError: list index out of range
I see. The problem with your code is that header should be a dictionary; then it will work. The Stats object already sets the npts attribute; please check with print(stats).
Hmm, I played around with this a bit more since, if I understand it correctly, the attributes of a Stats object can also be accessed with dict-like syntax, e.g. the value of stats.npts can be accessed as stats['npts'].
So my conclusions then are:
When creating a new Trace object, the header argument takes a dict as input and stores its key:value pairs in a Stats object, is this correct?
It seems, though, that it is also possible to pass a Stats object as header, but in this case the npts attribute will not be updated with the actual size of the input data vector, is this correct?
The reason for this is that Stats already has an npts attribute (which defaults to 0), so npts in the trace metadata will not get updated. Similarly, if the dict passed as header has an npts key, then npts in the metadata of the created Trace object will be set to the value of this key and not to the size of the input data vector, is this correct?
All observations are correct. I think the npts entry in header (be it a dict or a Stats object) should probably be ignored by ObsPy if data is present. Note that you can create pure metadata traces; in that case it is important to set npts.
Sure, agree on this.
I guess getting all the nuances into the documentation would be quite a task, but would it be sensible to add a section to the obspy tutorial on creating a trace object from scratch, including various potential pitfalls? (I consider the topic of this post to be one such pitfall, as the documentation of Trace states that the header argument will accept a Stats object as input.)