MassDownloader for a large number of short waveform chunks


Is MassDownloader() potentially faster than RoutingClient(...).get_waveforms_bulk(...)?

If yes, is there a way to harness MassDownloader to download a large number of short waveform chunks, the way one would with a RoutingClient bulk request (RoutingClient for automatic selection of the best data provider, bulk request for short data-chunk windowing) of this form:

[[network0, station0, location0, channel0, start time0, end time0], # first short chunk
 [network1, station1, location1, channel1, start time1, end time1], # next one
 ...]  # many more (basically every P phase of every event in a given catalog)

If not, what is the best way to speed up get_waveforms_bulk? Should the request be split and parallelised? Any advice on how to do that?



Not too familiar with MassDownloader myself, maybe @LionKrischer can comment.

Thanks for pushing :slight_smile:

The mass downloader does not allow this kind of fine-grained selection. It takes more general criteria, like a geographical domain plus a start and an end time, and then attempts to fetch everything it can for those settings.
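To illustrate what "general criteria" means in practice, here is a minimal sketch of a mass downloader call. The bounding box, the channel code, the output directory, and the function name `download_region` are placeholders I made up for illustration; `start` and `end` are assumed to be `obspy.UTCDateTime` objects.

```python
def download_region(start, end, out_dir="data"):
    """Fetch whatever is available in a lat/lon box between start and end."""
    # Imported lazily so the sketch can be loaded without ObsPy installed.
    from obspy.clients.fdsn.mass_downloader import (
        MassDownloader, RectangularDomain, Restrictions)

    # Illustrative bounding box only; pick one that matches your study area.
    domain = RectangularDomain(minlatitude=40.0, maxlatitude=50.0,
                               minlongitude=5.0, maxlongitude=15.0)

    # One time span for the whole request, not one window per station.
    restrictions = Restrictions(starttime=start, endtime=end, channel="BHZ")

    mdl = MassDownloader()  # no providers given: all known ones are tried
    mdl.download(domain, restrictions,
                 mseed_storage=out_dir + "/waveforms",
                 stationxml_storage=out_dir + "/stations",
                 threads_per_client=3)
```

Note that there is no place in this call to pass a per-station, per-phase time window, which is why it does not map onto the bulk list from the question.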

In terms of overall speed I also don’t know. The routing client first asks the routing web service which data center holds which piece of data and then launches one thread per data center, so it is already parallel at that level. The mass downloader by design goes through the data centers sequentially, one after the other, but parallelizes the downloads within each data center. It only uses up to 3 threads per data center, though, as the data centers don’t like more than that.
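For reference, a bulk request through the routing client could look roughly like the sketch below. The helper names `build_bulk` and `fetch_windows`, the pick tuple layout, and the window lengths are my own assumptions for illustration; the routing service name is just one of the options ObsPy accepts.

```python
def build_bulk(picks, before=5.0, after=30.0):
    """Turn (net, sta, loc, cha, p_time) picks into bulk request rows.

    p_time only needs to support +/- seconds, e.g. obspy.UTCDateTime.
    """
    return [(net, sta, loc, cha, t - before, t + after)
            for net, sta, loc, cha, t in picks]


def fetch_windows(picks, routing="iris-federator"):
    """Send all windows in one bulk request via the routing client."""
    # Imported lazily so build_bulk can be used without ObsPy installed.
    from obspy.clients.fdsn import RoutingClient

    client = RoutingClient(routing)
    # The client asks the routing service who holds what, then queries
    # the data centers; the result is a single obspy Stream.
    return client.get_waveforms_bulk(build_bulk(picks))
```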

In general there is often only so much one can do to improve the download speed, as many factors are not in one’s control. You can split and parallelize a bit, but beyond a certain point any given data center will not deliver the data any faster. Parallelizing per data center is a much saner choice, I think.
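If you do want to try splitting, a minimal sketch using only the standard library could look like this. `chunked` and `fetch_in_chunks` are hypothetical helpers, the chunk size and worker count are guesses rather than recommendations, and `fetch` stands for whatever callable does the actual download for one chunk.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fetch_in_chunks(bulk, fetch, chunk_size=200, max_workers=3):
    """Call fetch(chunk) for each chunk on a small thread pool.

    Returns one result per chunk, in the original chunk order.
    """
    chunks = chunked(bulk, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the input order and re-raises errors from fetch
        # when the corresponding result is collected.
        return list(pool.map(fetch, chunks))
```

With ObsPy, `fetch` could be something like `lambda chunk: RoutingClient("iris-federator").get_waveforms_bulk(chunk)`, and the returned Streams can then be added together. Keep `max_workers` small: since the routing client already runs one thread per data center, whether this gains anything at all is something you would have to measure.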

Hope it helps!


Very interesting!

It is really good to know that the routing client is that clever.

Thanks a lot.