[sac-dev] Subroutine interface to SAC XML datasets

Brian Savage savage at uri.edu
Thu Jan 31 06:46:08 PST 2008


George

I think moving away from the original header names might be a good idea.
The header names such as stlo, stla and kstnm can be cryptic upon  
first encounter.
I would suggest more readable names
<station latitude="" longitude="" elevation="" name="" network=""></station>
or
<station>
	<latitude></latitude>
	<name></name>
	....
</station>
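
A reader does not have to care which of the two layouts is used. Here is a
rough sketch in Python (standard library only; it is not part of MacSAC or
any SAC release) that accepts either one. The element and attribute names
are the hypothetical readable names suggested above, and the values TEST
and 40 are placeholders borrowed from James's example further down.

```python
import xml.etree.ElementTree as ET

# Hypothetical readable field names; not taken from any published SAC DTD.
FIELDS = ("latitude", "longitude", "elevation", "name", "network")


def read_station(xml_text):
    """Return a dict of station fields from either layout."""
    station = ET.fromstring(xml_text)
    info = {}
    for field in FIELDS:
        value = station.get(field)            # attribute layout
        if value is None:
            value = station.findtext(field)   # child-element layout
        if value is not None:
            info[field] = value.strip()
    return info


# Placeholder values only, not real station coordinates.
attr_form = '<station latitude="40" name="TEST"></station>'
elem_form = """<station>
    <latitude>40</latitude>
    <name>TEST</name>
</station>"""

print(read_station(attr_form))
print(read_station(elem_form))
```

Both calls give the same dictionary, so the choice between attributes and
child elements is mostly a matter of taste for the reading side.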

Also, would it be possible to include references to the sac binary  
files, as they are now, in the xml format?
<sacdataset>
	 <trace file="PAS.CI.BHZ.SAC" />
	 <trace file="PAS.CI.BHE.SAC" />
	 <trace file="PAS.CI.BHN.SAC" />
	 <trace id="HRV.IU.BHZ.SAC">
		<header >
			...
		</header>
		<trace>
			...
		</trace>
	</trace>
</sacdataset>
This would allow the data to be stored either in the current (binary)  
format or in the xml file.
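
To illustrate how a reader could tell the two cases apart, here is a
minimal sketch in Python (standard library only). It assumes the layout
proposed above, where a "file" attribute marks an external binary file and
an "id" attribute marks a trace stored inline. The function name
classify_traces is made up for this example, and the elided "..." contents
are left empty.

```python
import xml.etree.ElementTree as ET

# The dataset layout proposed above, with the elided contents left empty.
DATASET = """<sacdataset>
    <trace file="PAS.CI.BHZ.SAC" />
    <trace file="PAS.CI.BHE.SAC" />
    <trace file="PAS.CI.BHN.SAC" />
    <trace id="HRV.IU.BHZ.SAC">
        <header></header>
        <trace></trace>
    </trace>
</sacdataset>"""


def classify_traces(xml_text):
    """Split top-level traces into external file references and inline ids."""
    root = ET.fromstring(xml_text)
    external, inline = [], []
    # findall("trace") matches direct children only, so the nested <trace>
    # data element inside an inline trace is not counted a second time.
    for trace in root.findall("trace"):
        path = trace.get("file")
        if path is not None:
            external.append(path)            # data stays in the binary file
        else:
            inline.append(trace.get("id"))   # header and data are in the XML
    return external, inline


external, inline = classify_traces(DATASET)
print("external:", external)
print("inline:", inline)
```

A real reader would then open each external file with the existing binary
SAC routines and parse the inline traces from the XML itself.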


Cheers,
Brian

On Jan 31, 2008, at  4:59 AM , George Helffrich wrote:

> Dear All -
>
> 	The key idea here, which is a good one, is subgroupings of the  
> information in the header:  1) station  information; 2) event  
> information; 3) data characteristics.  A fourth item, not well- 
> served by the present SAC file structure, is more complete response  
> information.
>
> 	Whether you express header information by <stel>500</stel> or <h  
> name="stel">500</h> is a stylistic choice.  The DTD description is  
> more concise in the latter case.
>
> On 31 Jan 2008, at 09:34, James Wookey wrote:
>
>> Hi Rob, George;
>>
>> I can see the point that the data format currently proposed is a  
>> terse one, basically a minimalist description of a set of SAC  
>> traces. This does have some significant advantages: it is  
>> efficient in file size, and it provides a direct connection to the  
>> header variables which, after all, SAC users (as well as  
>> programmers) still have to refer to by their short name. I don't  
>> think we should go the KML route, which as George says, makes my  
>> eyes water with all the detail that is required. As a  
>> 'consultation' format for SACML it is well designed, as it is  
>> simple to understand and is conceptually very close to the binary  
>> SAC format, and George has done sterling work implementing it.
>>
>> However, in the longer term, I can also see the value in a limited  
>> expansion of the structure of SACML, if it is going to represent a  
>> large step forward in the SAC file format. If we are going to pay  
>> the price of adopting a verbose format like XML (and I think we  
>> should) we might as well try to reap some of the rewards, and also  
>> build enough flexibility into the format to allow incorporation of  
>> future things (even if they are currently ignored by the current  
>> input routines - the ability to do that is one of the advantages  
>> of XML). It seems to me that one thing worth considering is  
>> structuring the header. So one possible format might look like:
>>
>> <sacdataset>
>>    <trace>
>>       <header>
>>          <station>
>>            <kstnm>TEST</kstnm>
>>            <stla>40</stla>
>> 	   ...
>>          </station>
>> 	 <event>
>>             <evla>-20</evla>
>>             ...
>>          </event>
>>          <trace_info>
>>             <delta>0.05</delta>	
>>             ...
>>          </trace_info>
>>       </header>
>>       <data>
>>          ...
>>       </data>
>>    </trace>
>> </sacdataset>
>>
>> This has the advantage of still being easy to 'one-sweep' read  
>> with event-driven parsers like SAX (because you simply ignore the  
>> container elements), plus providing a more object-oriented format  
>> for use with parsers like DOM or xpath. We might also want to  
>> include/allow a subgrouping of traces within the file: a  
>> <tracegroup> container element for example.
>>
>> Cheers,
>>
>> James
>>
>> On 31 Jan 2008, at 08:38, George Helffrich wrote:
>>
>>> Dear Rob -
>>>
>>> 	The XML DTD is versioned, and one could imagine defining a new  
>>> DTD with alternative element groupings that would reflect the  
>>> data structure.  One can obsess in describing data details, and  
>>> one view won't necessarily coincide with another person's or  
>>> community's view.  Google Earth's KML comes to mind --  
>>> unbelievably baroque for putting points on a map for a  
>>> seismologist, but probably glibly expressive for GISers.
>>>
>>> 	Another view to take of the data is programming semantics,  
>>> however.  Programmers see 1) header variables that are peeked and  
>>> poked at; 2) data.  That was the view I took of the present DTD  
>>> definition.
>>>
>>> On 31 Jan 2008, at 00:12, Robert Casey wrote:
>>>
>>>>
>>>> 	Hi George-
>>>>
>>>> 	An interesting effort with SAC XML and you've made a lot of  
>>>> progress from the looks of it.  I hate to comment too harshly on  
>>>> something that may already be an established standard, so my  
>>>> comments are only meant as an observation:
>>>>
>>>> 	It seems to me that the SAC XML format only half-divorces  
>>>> itself from fixed-format files due to the naming scheme for the  
>>>> header elements you've provided.  Essentially, you've got just  
>>>> two entity names inside of <trace> that have no semantic quality  
>>>> to them: 'h' and 'd'.
>>>>
>>>> 	The nature of XML is such that the entities tend to be more  
>>>> grouped and self-descriptive as far as their names go.  So  
>>>> instead of having <h name="STEL">, why could it not instead be  
>>>> <STEL>  ?  To indicate this as a subcomponent of a SAC header of  
>>>> a SAC trace, you'd form a hierarchy:
>>>>
>>>> <sacdataset>
>>>> 	<trace>
>>>> 		<header>
>>>> 			<stel>
>>>>
>>>> 	Even better would be to just call it <elevation>, maybe with a  
>>>> reference to its SAC field abbreviation as an attribute:   
>>>> <elevation id="STEL">.
>>>>
>>>> 	The reason for the comment is not just for human readability,  
>>>> but for the notion that many XML parsers will treat such  
>>>> elements as objects, carrying the element name with them.   
>>>> Having your fields broken down into meaningful names means that  
>>>> your objects will be more independent and have stronger  
>>>> encapsulation properties.
>>>>
>>>> 	If your example format is already set in stone, then please  
>>>> continue with what works.  I just felt the floor was open to  
>>>> address some naming aesthetics for consideration.  I can imagine  
>>>> that writing a parsing engine for XML in Fortran is headache  
>>>> enough, so I don't want to cause you a migraine on top of it.
>>>>
>>>> 	Cheers,
>>>>
>>>> 	-Rob
>>>>
>>>> On Jan 30, 2008, at 5:05 AM, George Helffrich wrote:
>>>>
>>>>> Dear All -
>>>>>
>>>>> 	I designed and implemented a subroutine interface to SAC XML  
>>>>> datasets in the latest release of MacSAC.  This message is to  
>>>>> make you aware of the design ideas for architectural comment.   
>>>>> I think that it shows the way forward to
>>>>> how SAC can move away from a purely binary data format to  
>>>>> one that
>>>>> embraces current practice in structuring and delivering data.
>>>>>
>>>>> 	A test program illustrates the concepts.  Here is Fortran  
>>>>> source code of
>>>>> an actual program used for testing during development:
>>>>>
>>> 								George Helffrich
>>> 								george at geology.bristol.ac.uk
>>>
>>>
> 								George Helffrich
> 								george at geology.bristol.ac.uk
>
> _______________________________________________
> sac-dev mailing list
> sac-dev at iris.washington.edu
> http://www.iris.washington.edu/mailman/listinfo/sac-dev
>


