[webservices] ws_station network identifier

Philip Crotwell crotwell at seis.sc.edu
Tue Jun 14 18:08:37 PDT 2011


Hi John

I have had more than a few headaches along the lines of what you are
describing. There is good news and bad news from my experience. The
good news is that you can mostly use the network code alone for
permanent networks, and the network code plus begin year for temporary
networks, i.e. BK and XA2007 are mostly unique and fixed. The bad news
is that even this is only "mostly" a unique identifier. In general I
think permanent network codes are single and unique, and temporary
network codes are issued for a given begin year. While a temporary
network may be extended, i.e. its end date changes, it would be really
weird for the begin date to change.
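
To make the "mostly unique" key concrete, here is a minimal sketch. It
assumes temporary network codes can be recognized by a leading X, Y, Z
or digit (my reading of the FDSN convention, so treat it as an
assumption). The names network_key and is_temporary are hypothetical
and not part of any web service.

```python
from datetime import date

# Assumption: temporary network codes start with X, Y, Z or a digit.
TEMPORARY_PREFIXES = set("XYZ0123456789")


def is_temporary(code: str) -> bool:
    """Guess whether a network code belongs to a temporary network."""
    return code[:1].upper() in TEMPORARY_PREFIXES


def network_key(code: str, begin: date) -> str:
    """Build the 'mostly unique' key: code alone for permanent networks,
    code plus begin year for temporary ones."""
    code = code.strip().upper()
    if is_temporary(code):
        return f"{code}{begin.year}"
    # Permanent networks: ignore the begin date, since it can move.
    return code


if __name__ == "__main__":
    print(network_key("BK", date(1930, 1, 1)))
    print(network_key("XA", date(2007, 3, 1)))
```

With this, a permanent network keeps the same key even if its begin
date moves, and an extended temporary network keeps its key because
only the end date changes.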

You should NOT use the begin date as part of the key for permanent
networks, as those have changed over the years. At some point in the
past the begin time for permanent networks was dynamically determined
from the earliest data at the DMC; not sure if that is still the case.
So some networks were in the database with some data and then later
sent in additional "old" data, causing the begin times to move
backwards. For example, BK used to start in the 80s I think, but now
starts in the 30s?

More bad news is that the AF network (I think I am remembering
correctly), a single permanent network, at some point split into two
networks due to issues related to some data being restricted and some
not. So my software started having real problems because it was coded
to assume that the two-character network code was unique for permanent
networks, and suddenly there were 2 distinct networks (at least at the
software level) with the same code. I think there is work at the DMC
to redo the notion of restricted data so that this bifurcation of that
network will no longer be an issue in the future, but I am just
pointing it out as an example of how limited the options are for
creating a unique ID based on any data "in" a network. Basically all
fields are subject to change, meaning nothing can be assured to be a
unique id.
Big :(

I think this is the argument made way back when people were developing
database normalization theory and arguing for meaningless integer
database ids: any ID based on real world data is subject to change
and so cannot be counted on as a good id.
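
For a local cache, the meaningless-integer-id idea might look roughly
like the sketch below. It uses Python's sqlite3 module; the table and
column names are made up for illustration and do not correspond to
any DMC schema. Note this only gives you a stable LOCAL id; deciding
which upstream record matches which local row is still the hard part.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open a local cache whose network rows have a surrogate integer id."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS network (
               id INTEGER PRIMARY KEY,   -- meaningless local id
               code TEXT NOT NULL,       -- may repeat, e.g. a split network
               begin_date TEXT,
               end_date TEXT,
               description TEXT)"""
    )
    return conn


def add_network(conn, code, begin_date, end_date, description=""):
    """Insert a network and return its local surrogate id."""
    with conn:
        cur = conn.execute(
            "INSERT INTO network (code, begin_date, end_date, description)"
            " VALUES (?, ?, ?, ?)",
            (code, begin_date, end_date, description),
        )
    return cur.lastrowid


def update_dates(conn, network_id, begin_date, end_date):
    """Change the dates of an existing row; the id stays the same."""
    with conn:
        conn.execute(
            "UPDATE network SET begin_date = ?, end_date = ? WHERE id = ?",
            (begin_date, end_date, network_id),
        )
```

Anything else in the cache (stations, channels) would then refer to
network.id rather than to the code or dates, so a changed begin date
is an update to one row instead of a new network.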

One more piece of bad news: the same problems that exist at the
network level also exist at the station and channel level, except that
they are even more likely to change.

I should also say that this is not a fault of the DMC; they don't
control when or how networks make changes to their metadata. But it is
a problem nonetheless, as we simply do not have a globally unique,
non-changing identifier for any of our metadata. You do the best you
can and try to put code in to catch when things change. I have had
very limited success and grumble with regularity about how hard it is
to keep a metadata database in sync with the upstream one. It is just
a really really hard problem with no good solutions as far as I can
see. If you come up with a good answer please, please let me know.
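
As a rough illustration of "code to catch when things change", the
sketch below compares a local list of networks against an upstream
list. Everything in it is an assumption for the example: the dict
fields (code, begin, end), ISO date strings, the key rule (code alone
for permanent networks, code plus begin year for codes starting with
X, Y, Z or a digit), and the function names. It does not solve the
problem; it only reports where the key stopped being unique or where
fields moved.

```python
from collections import defaultdict


def natural_key(net):
    """Code alone for permanent networks, code + begin year for temporary."""
    code = net["code"].upper()
    if code[0] in "XYZ0123456789":   # assumed temporary-code convention
        return code + net["begin"][:4]
    return code


def diff_networks(local, upstream):
    """Group both lists by natural key and report what differs."""
    def index(nets):
        groups = defaultdict(list)
        for net in nets:
            groups[natural_key(net)].append(net)
        return groups

    loc, up = index(local), index(upstream)
    report = {"changed": [], "added": [], "removed": [], "ambiguous": []}
    for key in sorted(set(loc) | set(up)):
        mine, theirs = loc.get(key, []), up.get(key, [])
        if len(mine) > 1 or len(theirs) > 1:
            # Key is no longer unique (e.g. a split network): needs a human.
            report["ambiguous"].append(key)
        elif not mine:
            report["added"].append(key)
        elif not theirs:
            report["removed"].append(key)
        elif mine[0] != theirs[0]:
            fields = sorted(
                f for f in set(mine[0]) | set(theirs[0])
                if mine[0].get(f) != theirs[0].get(f)
            )
            report["changed"].append((key, fields))
    return report
```

The "ambiguous" bucket is the important one: it is where the
assumption of a unique key has broken and an automatic update would
probably do the wrong thing.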

Good luck...
Philip

On Tue, Jun 14, 2011 at 8:45 PM, Chad Trabant <chad at iris.washington.edu> wrote:
>
> Got it. The network start/end dates don't change often but on occasion they
> do.  I think the most common case is when a temporary network code is
> extended to match an extended experiment time window.  The only other useful
> identifier that I can think of is the network description contained in the
> <Description> tags, although that is subject to change as well but also
> doesn't change often.  Perhaps by checking the description you can figure
> out when it's the same network versus something new more often than not.
> Chad
> On Jun 14, 2011, at 5:04 PM, John D. West wrote:
>
> That was what I assumed from the output of the web service. The question is:
> can a start date or end date EVER change? If an incorrect date is entered
> and then later corrected, I end up with overlapping networks because network
> code + start date + end date combine to form the unique identifier.
>      -- John
>
>
> On Tue, Jun 14, 2011 at 4:58 PM, Chad Trabant <chad at iris.washington.edu>
> wrote:
>>
>> Hello.
>>
>> In general, networks, like stations and channels, have the notion of a
>> start time and an end time.  For permanent networks there are normally not
>> breaks in the continuity.  For temporary networks there are often blocks of
>> years allocated for specific experiments, for example XY 2005-2006, XY
>> 2007-2009 and XY 2010-2010.  We would not consider those temporary networks
>> to be modifications of an existing network, but instead to be logically
>> different networks.  Essentially the network code combined with the start
>> and end time uniquely identifies a "network"; when the dates change and the
>> network code is recycled it should be considered a new network.  Not sure I
>> understood your question; did that help at all?
>>
>> Chad
>>
>> On Jun 14, 2011, at 2:00 PM, John D. West wrote:
>>
>> > Hello.
>> >
>> > I'm using the station webservice in EMERALD to maintain a local cache of
>> > network, station, and component metadata. In the Network level, reuse of
>> > network codes makes it difficult to differentiate between new and modified
>> > networks, e.g., if a network EndDate changes, my system registers it as a
>> > new usage of the network code instead of modification of an existing
>> > network.
>> >
>> > Is there some unique identifier for each network which can be included
>> > in the web service?
>> >
>> > Thanks!
>> >
>> >      -- John
>> > _______________________________________________
>> > webservices mailing list
>> > webservices at iris.washington.edu
>> > http://www.iris.washington.edu/mailman/listinfo/webservices
>>
>
>
>
>
>


