exception handling for IRIS web services getWaveform method

I wrote an obspy script driven by an Antelope database to fetch data from the IRIS DMC using the obspy.iris web service interface object. I have two questions for this community:
1. I've gotten a lot of failures (approaching 25% of about 600000 waveforms from USARRAY from 2012) in which the getWaveform method throws an exception. Does the exception thrown carry any message content, or is it just a generic throw? I tried to track through the code, but there seem to be a lot of levels and I confess to giving up. The documentation says nothing about what error handlers could be implemented, so I only used a generic except clause. It would be helpful to know more about why I am seeing these failures and any details about any error object that might be thrown (if there is one).
2. Another reason for wanting to know more about these errors is my second question. Has anyone done a large transfer with IRIS web services before? I am presently running a cleanup job on the 2012 data, covering all the waveforms for which getWaveform threw an error the first time round, and the success rate is now very high. That is, originally I was getting failures approaching 25%, but now the only ones I'm missing are ones I would expect, because I should be fetching them from a different repository. The issue is whether IRIS DMC's server was just not responding and getWaveform threw an error from some timeout condition. This links back to (1), because if the exception thrown distinguished between conditions, this could be sorted out. Not a big deal with 10 waveforms, but a big deal when you are trying to fetch about 5 million waveforms like I am with this script. The fact that I have a cleanup approach makes this tractable, but it is a fundamental problem when the data volume is large.

Thanks for any help any of you can provide.

Gary,

I don't know your exact workflow, but let me try to give you some hints.

Gary Pavlis [10.09.2013 18:31]:

1. I've gotten a lot of failures (approaching 25% of about 600000 waveforms from USARRAY from 2012) in which the getWaveform method throws an exception. Does the exception thrown carry any message content, or is it just a generic throw? I tried to track through the code, but there seem to be a lot of levels and I confess to giving up. The documentation says nothing about what error handlers could be implemented, so I only used a generic except clause. It would be helpful to know more about why I am seeing these failures and any details about any error object that might be thrown (if there is one).

This reads as if you have something like

try:
     st = client.getWaveform( ... )
except:
     logfile.write("Unknown exception caught\n")

Instead you should retrieve as much information as possible about the exception, which is quite easy:

import sys, traceback

try:
     st = client.getWaveform( ... )
except:
     exc_info = traceback.format_exception(*sys.exc_info())
     for line in exc_info: logfile.write(line)
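
With hundreds of thousands of requests it may also help to tally the failures by exception type and message, so you can see at a glance whether they are mostly timeouts or mostly "no data" responses. The following is only a minimal sketch: fetch_all and its arguments are made-up names, and fetch stands for whatever function wraps your own client.getWaveform( ... ) call.

```python
import collections
import traceback


def fetch_all(requests, fetch, logfile):
    """Call fetch(request) for every request and tally the failures.

    `fetch` is a hypothetical callable wrapping the real download call.
    Returns (counts, failed): a Counter keyed by (exception type name,
    first line of the message) and the list of requests that failed.
    """
    counts = collections.Counter()
    failed = []
    for request in requests:
        try:
            fetch(request)
        except Exception as e:
            # Group by type and first message line so similar errors add up.
            key = (type(e).__name__, str(e).split("\n", 1)[0])
            counts[key] += 1
            failed.append(request)
            # Keep the full traceback in the log for later inspection.
            logfile.write(traceback.format_exc())
    return counts, failed
```

Printing counts.most_common() after a run shows which kinds of error dominate, and the failed list can be fed straight into a cleanup pass.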

HTH.

I can't help you with the second issue. About a week ago, however, some temporary server issues (slow responses and timeouts) were reported on the IRIS webservices mailing list. Could that be related to the problems you experienced?
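
If the failures do turn out to be transient server problems, one option is to retry each failed request a few times with an increasing pause before giving up, instead of leaving everything to a separate cleanup run. This is just a sketch under that assumption: fetch_with_retry, its parameters and the default values are invented for illustration, and fetch again stands for your own wrapper around the download call.

```python
import time


def fetch_with_retry(fetch, request, attempts=3, delay=5.0, sleep=time.sleep):
    """Try fetch(request) up to `attempts` times, doubling the wait each time.

    `fetch` is a hypothetical callable wrapping the real download call.
    The last exception is re-raised if every attempt fails.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(request)
        except Exception as e:
            last_error = e
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                sleep(delay * 2 ** attempt)
    raise last_error
```

Retrying only makes sense for errors that can go away on their own, such as timeouts. A request for data that really is not at the DMC will fail every time, so it is worth looking at the exception messages first.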

Cheers
Joachim

Hi Gary,

Not sure if this is an ObsPy issue or not, without any specific example to see where the error came from. I would guess that if you looked at the tracebacks you should be able to see whether the obspy.iris.Client was raising an error from lack of connection or timeout. From a quick look at the Client code, it looks like Client.bulkdataselect catches an HTTPError and raises an Exception with a “No waveform data available” message (with the original error and message added). So if you are getting those, then it may be IRIS.

In addition to Joachim’s advice, if you’re doing long-running or real-time stuff, I highly recommend the excellent python ‘logging’ module, which allows you to log exceptions directly from a handler, e.g.:

# here 'my_logger' is an instance of logging.Logger

try:
    do_some_process()
    my_logger.debug("Processing worked!")
except Exception as e:
    my_logger.exception(e)

That will log any error, with its message and the entire traceback, and keep going, since ordinary error classes all inherit from Exception (KeyboardInterrupt and SystemExit do not, so Ctrl-C still stops the job).
http://docs.python.org/2/library/logging.html
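
For a long-running job you would normally send the log to a file with timestamps. A minimal setup could look like the sketch below; make_logger, the logger name and the format string are arbitrary choices for illustration, not anything ObsPy requires.

```python
import logging


def make_logger(path, name="fetch"):
    """Return a logger that appends timestamped records to the file `path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # getLogger returns the same object for the same name, so only attach
    # a handler the first time to avoid duplicated log lines.
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Calling my_logger = make_logger("fetch.log") once at the start of the script is enough. Every my_logger.exception(e) inside an except block then writes the message and the full traceback to that file.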

We have a base class example that does real-time processing/logging from an Antelope MQ here:
https://github.com/NVSeismoLab/rtapps-python/blob/master/rtapps/rtapp.py

-Mark

Hi Gary,

Joachim and Mark pretty much said it all but here are some more thoughts:

- if the same code didn't work one day and then worked again I agree
with Joachim that it might have been connected with the note about
server issues on the IRIS list
- you could try to set a higher timeout value when setting up the client
(probably won't really help with more severe server side problems
though, I guess)

- since you are doing a lot of automated requests (and are probably
working with the latest stable ObsPy version 0.8.4) you should be aware
of an issue that was reported last week
(http://lists.swapbytes.de/archives/obspy-users/2013-September/000980.html)
and since fixed in master (https://github.com/obspy/obspy/issues/623).
Basically, with ObsPy version 0.8.4 it might be better to use
.availability(...) in connection with .bulkdataselect(... ,
longestonly=False) in case of requesting data that contains gaps.

best,
Tobias