Hi, Philip.<div><br></div><div>Thanks for all of the info. I&#39;m working on a set of rules for handling such updates and would like your thoughts on them when I&#39;m done. It seems clear that there will always be exceptions, so I think EMERALD should include a way to automatically disseminate corrections when needed.</div>


<div><br></div><div>Incidentally, I&#39;m a big believer in numeric surrogate primary keys on database tables and use them throughout EMERALD.</div><div><br></div><div>Thanks!</div><div><br clear="all">     -- John<br>
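<div><br></div><div>For illustration, here is a minimal sketch of the surrogate-key idea using Python&#39;s built-in sqlite3 module. The table layout, the function names open_cache and upsert_network, and the matching rule (same network code and same start year counts as the same network) are all assumptions made for this example, not EMERALD&#39;s actual schema or rules.</div>

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open the cache and create the hypothetical network table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS network (
            id          INTEGER PRIMARY KEY,  -- meaningless surrogate key
            code        TEXT NOT NULL,
            start_date  TEXT NOT NULL,        -- ISO date, e.g. 2007-01-01
            end_date    TEXT,
            description TEXT
        )
        """
    )
    return conn


def upsert_network(conn, code, start_date, end_date, description):
    """Insert or update a network and return its surrogate id.

    Assumed matching rule: same code and same start year is the same network.
    """
    row = conn.execute(
        "SELECT id FROM network WHERE code = ? AND substr(start_date, 1, 4) = ?",
        (code, start_date[:4]),
    ).fetchone()
    if row is None:
        # No match: treat this as a new use of the network code.
        cur = conn.execute(
            "INSERT INTO network (code, start_date, end_date, description) "
            "VALUES (?, ?, ?, ?)",
            (code, start_date, end_date, description),
        )
        return cur.lastrowid
    # Match: update the existing row in place so its id stays stable.
    conn.execute(
        "UPDATE network SET start_date = ?, end_date = ?, description = ? "
        "WHERE id = ?",
        (start_date, end_date, description, row[0]),
    )
    return row[0]
```

<div>Under this assumed policy, a corrected end date updates the existing row and keeps its id, while a reused code with a different start year gets a fresh id. The matching rule is the part that would still need exception handling.</div>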

<br><br><div class="gmail_quote">On Tue, Jun 14, 2011 at 6:08 PM, Philip Crotwell <span dir="ltr">&lt;<a href="mailto:crotwell@seis.sc.edu">crotwell@seis.sc.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


Hi John<br>

<br>

I have had more than a few headaches along the lines of what you are<br>

describing. There is good news and bad news from my experiences. The<br>

good news is that mostly you can use the network code alone for<br>

permanent networks and network code and begin year for temporary<br>

networks, e.g., BK and XA2007 are mostly unique and fixed. The bad news<br>

is that even this is only &quot;mostly&quot; a unique identifier. In general I<br>

think the permanent network codes are single and unique, and temporary<br>

network codes are issued for a given begin year. While they may be<br>

extended, i.e., the end date changes, it would be really weird for the begin<br>

date to change.<br>
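<br>

As a rough sketch, the rule above could be written as follows. The function name network_key is hypothetical, and the test for a temporary network (code starting with X, Y, Z or a digit) is an assumption about the code-assignment convention, not something the web service reports.<br>

```python
def network_key(code, begin_year):
    """Best-effort natural key: code alone if permanent, code plus begin year if temporary."""
    # Assumption: temporary network codes start with X, Y, Z or a digit.
    is_temporary = code[0] in "XYZ0123456789"
    if is_temporary:
        return f"{code}{begin_year}"
    # Permanent networks: the begin year is ignored because it can move.
    return code
```

This is only "mostly" unique, so it works as a matching heuristic rather than a guaranteed identifier.<br>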

<br>

You should NOT use the begin date as part of the key for permanent<br>

networks as those have changed over the years. At some point in the<br>

past the begin time for permanent networks was dynamically determined<br>

from the earliest data at the DMC; I am not sure if that is still the case.<br>

So some networks were in the database with some data and then later<br>

they sent in additional &quot;old&quot; data, causing the begin times to move<br>

backwards. For example BK used to start in the 80s I think, but now<br>

starts in the 30s?<br>

<br>

More bad news is that the AF network (I think I am remembering<br>

correctly), a single permanent network, at some point split into two<br>

networks due to issues related to some data being restricted and some<br>

not. So my software started having real problems because it was coded<br>

to assume that the 2 char network code was unique for permanent<br>

networks and suddenly there were 2 distinct networks (at least at the<br>

software level) with the same code. I think there is work at the DMC<br>

to redo the notion of restricted data so that this bifurcation of that<br>

network will no longer be an issue in the future, but just pointing it<br>

out as an example of how limited the options are for creating a unique<br>

ID based on any data &quot;in&quot; a network. Basically all fields are<br>

subject to change, meaning nothing can be assured to be a unique id.<br>

Big :(<br>

<br>

I think this is the argument given way back when people were creating<br>

database normalization theories and arguing for meaningless integer<br>

database ids, because any ID based on real world data is subject to<br>

change and so cannot be counted on as a good id.<br>

<br>

One more piece of bad news: the same problems that exist at the<br>

network level also exist at the station and channel level, except that<br>

they are even more likely to change.<br>

<br>

I should also say that this is not a fault of the DMC, they don&#39;t<br>

control when or how networks make changes to their metadata. But it is<br>

a problem nonetheless, as we simply do not have a globally unique,<br>

non-changing identifier for any of our metadata. You do the best you<br>

can and try to put code in to catch when things change. I have had<br>

very limited success and grumble with regularity about how hard it is<br>

to keep a metadata database in sync with the upstream one. It is just<br>

a really really hard problem with no good solutions as far as I can<br>

see. If you come up with a good answer please, please let me know.<br>

<br>

Good luck...<br>

<font color="#888888">Philip<br>

</font><div><div></div><div class="h5"><br>

On Tue, Jun 14, 2011 at 8:45 PM, Chad Trabant &lt;<a href="mailto:chad@iris.washington.edu">chad@iris.washington.edu</a>&gt; wrote:<br>

&gt;<br>

&gt; Got it. The network start/end dates don&#39;t change often but on occasion they<br>

&gt; do.  I think the most common case is when a temporary network code is<br>

&gt; extended to match an extended experiment time window.  The only other useful<br>

&gt; identifier that I can think of is the network description contained in the<br>

&gt; &lt;Description&gt; tags, although that is subject to change as well but also<br>

&gt; doesn&#39;t change often.  Perhaps by checking the description you can figure<br>

&gt; out when it&#39;s the same network versus something new more often than not.<br>

&gt; Chad<br>

&gt; On Jun 14, 2011, at 5:04 PM, John D. West wrote:<br>

&gt;<br>

&gt; That was what I assumed from the output of the web service. The question is:<br>

&gt; can a start date or end date EVER change? If an incorrect date is entered<br>

&gt; and then later corrected, I end up with overlapping networks because network<br>

&gt; code + start date + end date combine to form the unique identifier.<br>

&gt;      -- John<br>

&gt;<br>

&gt;<br>

&gt; On Tue, Jun 14, 2011 at 4:58 PM, Chad Trabant &lt;<a href="mailto:chad@iris.washington.edu">chad@iris.washington.edu</a>&gt;<br>

&gt; wrote:<br>

&gt;&gt;<br>

&gt;&gt; Hello.<br>

&gt;&gt;<br>

&gt;&gt; In general, networks, like stations and channels, have the notion of a<br>

&gt;&gt; start time and an end time.  For permanent networks there are normally not<br>

&gt;&gt; breaks in the continuity.  For temporary networks there are often blocks of<br>

&gt;&gt; years allocated for specific experiments, for example XY 2005-2006, XY<br>

&gt;&gt; 2007-2009 and XY 2010-2010.  We would not consider those temporary networks<br>

&gt;&gt; to be modifications of an existing network, but instead to be logically<br>

&gt;&gt; different networks.  Essentially the network code combined with the start<br>

&gt;&gt; and end time uniquely identifies a &quot;network&quot;; when the dates change and the<br>

&gt;&gt; network code is recycled it should be considered a new network.  Not sure I<br>

&gt;&gt; understood your question, did that help at all?<br>

&gt;&gt;<br>

&gt;&gt; Chad<br>

&gt;&gt;<br>

&gt;&gt; On Jun 14, 2011, at 2:00 PM, John D. West wrote:<br>

&gt;&gt;<br>

&gt;&gt; &gt; Hello.<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; I&#39;m using the station webservice in EMERALD to maintain a local cache of<br>

&gt;&gt; &gt; network, station, and component metadata. At the network level, reuse of<br>

&gt;&gt; &gt; network codes makes it difficult to differentiate between new and modified<br>

&gt;&gt; &gt; networks, e.g., if a network EndDate changes, my system registers it as a<br>

&gt;&gt; &gt; new usage of the network code instead of modification of an existing<br>

&gt;&gt; &gt; network.<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; Is there some unique identifier for each network which can be included<br>

&gt;&gt; &gt; in the web service?<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; Thanks!<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt;      -- John<br>

&gt;&gt; &gt; _______________________________________________<br>

&gt;&gt; &gt; webservices mailing list<br>

&gt;&gt; &gt; <a href="mailto:webservices@iris.washington.edu">webservices@iris.washington.edu</a><br>

&gt;&gt; &gt; <a href="http://www.iris.washington.edu/mailman/listinfo/webservices" target="_blank">http://www.iris.washington.edu/mailman/listinfo/webservices</a><br>

&gt;&gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>


&gt;<br>

&gt;<br>

</div></div></blockquote></div><br></div>