And it won’t validate if I set validate=True, or if I don’t validate, I get this error:
UserWarning: 'quakeml:earthquake.usgs.gov/fdsnws/event/1/query?starttime=2014-07-17T00%3A00%3A00.000000&endtime=2014-07-19T00%3A00%3A00.000000&latitude=46.2&longitude=-122.19&maxradius=0.25' is not a valid QuakeML URI. It will be in the final file but note that the file will not be a valid QuakeML file.
warnings.warn(msg % obj.id)
Thanks for reporting this. It’s mainly an issue of USGS presenting an invalid QuakeML file and it should be reported upstream at USGS.
We could add a check for this and fix it somehow, but ultimately it’s not really our problem. We’d still have to change a file/value that we shouldn’t have to touch, and we’d have to issue a warning that we did so.
That is correct: both “%3A” and “:” are invalid characters in the PublicID according to the QuakeML XSD spec. In the previous email I was trying to show that removing the colon is the only way to pass the validation check for starttime & endtime values in the PublicID.
We looked into the ObsPy code more and found that when you call client.get_events(), it eventually performs a urlencode() on each parameter in the FDSN QuakeML request. Our FDSN web service uses the request URI in eventParameters[PublicID]. This means that if the request URI contains a URL-encoded value like “%3A”, we output that same URL-encoded value, and it breaks the validation on your end.
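To illustrate the encoding step described above, here is a minimal sketch using only Python’s standard urllib.parse.urlencode(). It is not the actual ObsPy request code; the parameter values are simply copied from the warning message quoted earlier, and the date-only values are an assumed variant added for contrast.

```python
from urllib.parse import urlencode

# Time values as they appear (decoded) in the warning message above.
params = {
    "starttime": "2014-07-17T00:00:00.000000",
    "endtime": "2014-07-19T00:00:00.000000",
}

# urlencode() percent-encodes reserved characters, so every ":" becomes "%3A".
query = urlencode(params)
print(query)

# Assumed variant: date-only values contain no colon, so nothing in them
# needs percent-encoding and no "%" ends up in the query string.
date_only = urlencode({"starttime": "2014-07-17", "endtime": "2014-07-19"})
print(date_only)
```

If the service copies the query string verbatim into the PublicID, the first form carries “%3A” into the file, while the second form carries neither “%” nor “:”.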
Assuming that this doesn’t break the ObsPy workflow, you could remove the problematic time values from your call to our FDSN web service. This should remove the colons, which means that ObsPy will not urlencode anything in the starttime/endtime values. See your code in red, and the suggested updates in green:
We currently do not handle the eventParameters[PublicID] generation very gracefully. I will put in a change request to handle all URL-encoded values. For now, the ObsPy handling of the starttime & endtime query string parameters is introducing the “%” character that is breaking the PublicID validation.
Yes, like I said, it’s something they should fix on their end. In many cases, the resource ID / public ID of the catalog is not really important. For other objects like origins, it is meaningful, since it can be used to uniquely identify or link a certain item. But these catalogs are just created ad hoc without any traceability anyway. So, long story short, you could put anything in there and it would not be a big loss. Having the original request URL might be nice; you could put that in a comment instead and either cut the public ID short (which you don’t have to do manually, of course, but automatically after requesting) or just generate a safe one. If you load multiple catalogs that have the same ID, it might have side effects, so it probably does not hurt to use autogenerated custom IDs.
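A rough sketch of the “generate a safe one” idea, using only the standard library. The helper name make_safe_public_id and the “local” authority are made up for illustration, and the regular expression is only an approximation of the QuakeML resource identifier pattern (an assumption; the XSD is the authoritative source).

```python
import re
import uuid

# ASSUMPTION: an approximation of the QuakeML resource identifier pattern.
# It allows neither ":" nor "%" after the scheme; check the QuakeML XSD
# for the authoritative definition.
QUAKEML_URI = re.compile(
    r"^(smi|quakeml):[\w\d][\w\d\-\.\*\(\)_~']{2,}/"
    r"[\w\d\-\.\*\(\)_~'][\w\d\-\.\*\(\)\+\?_~'=,;#/&]*$"
)


def make_safe_public_id(authority="local"):
    """Return an autogenerated, unique ID (hypothetical helper)."""
    return "smi:%s/%s" % (authority, uuid.uuid4())


# Shortened version of the ID from the warning; "%3A" makes it invalid.
original = ("quakeml:earthquake.usgs.gov/fdsnws/event/1/query?"
            "starttime=2014-07-17T00%3A00%3A00.000000")
new_id = make_safe_public_id()

original_ok = bool(QUAKEML_URI.match(original))
new_ok = bool(QUAKEML_URI.match(new_id))
print(original_ok, new_ok)
```

In ObsPy this could presumably be applied right after the request, for example with cat.resource_id = ResourceIdentifier(id=new_id) and cat.comments.append(Comment(text=original)) to keep the request URL around, with both classes imported from obspy.core.event. Because each generated ID is unique, loading several catalogs should not produce clashing IDs.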