[sac-dev] Subroutine interface to SAC XML datasets

George Helffrich george at gly.bris.ac.uk
Thu Jan 31 01:59:10 PST 2008


Dear All -

	The key idea here, which is a good one, is to subgroup the 
information in the header:  1) station information; 2) event 
information; 3) data characteristics.  A fourth item, not well served 
by the present SAC file structure, is more complete response 
information.
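
	As a rough sketch of why such subgrouping costs a reader little, 
here is a one-sweep SAX reader in Python that skips the container 
elements and recovers the flat set of header variables.  This is not 
code from MacSAC: the sample document follows the layout in James's 
message quoted below, and the names CONTAINERS, FlatHeaderHandler and 
read_flat_header are invented for illustration.

```python
import xml.sax

# Hypothetical grouped document, following the layout James sketches below.
GROUPED = """<sacdataset>
  <trace>
    <header>
      <station><kstnm>TEST</kstnm><stla>40</stla></station>
      <event><evla>-20</evla></event>
      <trace_info><delta>0.05</delta></trace_info>
    </header>
  </trace>
</sacdataset>"""

# Assumed set of grouping elements; they hold no value of their own,
# so an event-driven reader can simply ignore them.
CONTAINERS = {"sacdataset", "trace", "header",
              "station", "event", "trace_info", "data"}


class FlatHeaderHandler(xml.sax.ContentHandler):
    """Collect header variables into one flat dict, ignoring containers."""

    def __init__(self):
        super().__init__()
        self.header = {}
        self._name = None   # header variable currently being read
        self._text = []     # character chunks seen for that variable

    def startElement(self, name, attrs):
        if name not in CONTAINERS:
            self._name = name
            self._text = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        if self._name is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == self._name:
            self.header[name] = "".join(self._text).strip()
            self._name = None


def read_flat_header(xml_text):
    handler = FlatHeaderHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.header
```

	The sketch handles a single trace only; with several traces in a 
file, the handler would need to start a new dict at each <trace>.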

	Whether you express header information by <stel>500</stel> or <h 
name="stel">500</h> is a stylistic choice.  The DTD description is more 
concise in the latter case.
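
	To illustrate that the two spellings carry the same information, 
here is a minimal Python sketch that reads either form into the same 
table of header variables.  The <header> wrapper, the sample values 
other than stel, and the function name header_fields are assumptions 
made for the example, not part of the present DTD.

```python
import xml.etree.ElementTree as ET

# Two hypothetical fragments carrying the same header values.
NAMED = "<header><stel>500</stel><stla>40</stla></header>"
GENERIC = '<header><h name="stel">500</h><h name="stla">40</h></header>'


def header_fields(xml_text):
    """Return {variable name: text value} for either element style."""
    fields = {}
    for elem in ET.fromstring(xml_text):
        # <h name="stel"> keeps the variable name in an attribute;
        # <stel> keeps it in the element name itself.
        key = elem.get("name") if elem.tag == "h" else elem.tag
        fields[key.lower()] = (elem.text or "").strip()
    return fields
```

	Both fragments yield the same dictionary, so from the programmer's 
side the choice mainly affects how long the DTD has to be.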

On 31 Jan 2008, at 09:34, James Wookey wrote:

> Hi Rob, George;
>
> I can see the point that the data format currently proposed is a terse 
> one, basically a minimalist description of a set of SAC traces. This 
> does have some significant advantages: it is efficient in file size, 
> and it provides a direct connection to the header variables which, 
> after all, SAC users (as well as programmers) still have to refer to 
> by their short name. I don't think we should go the KML route, which 
> as George says, makes my eyes water with all the detail that is 
> required. As a 'consultation' format for SACML it is well designed, as 
> it is simple to understand and is conceptually very close to the 
> binary SAC format, and George has done sterling work implementing it.
>
> However, in the longer term, I can also see the value in a limited 
> expansion of the structure of SACML, if it is going to represent a 
> large step forward in the SAC file format. If we are going to pay the 
> price of adopting a verbose format like XML (and I think we should) we 
> might as well try to reap some of the rewards, and also build enough 
> flexibility into the format to allow incorporation of future things 
> (even if they are ignored by the current input routines - 
> the ability to do that is one of the advantages of XML). It seems to 
> me that one thing worth considering is structuring the header. So one 
> possible format might look like:
>
> <sacdataset>
>    <trace>
>       <header>
>          <station>
>             <kstnm>TEST</kstnm>
>             <stla>40</stla>
>             ...
>          </station>
>          <event>
>             <evla>-20</evla>
>             ...
>          </event>
>          <trace_info>
>             <delta>0.05</delta>
>             ...
>          </trace_info>
>       </header>
>       <data>
>          ...
>       </data>
>    </trace>
> </sacdataset>
>
> This has the advantage of still being easy to 'one-sweep' read with 
> event-driven parsers like SAX (because you simply ignore the container 
> elements), plus providing a more object-oriented format for use with 
> parsers like DOM or XPath. We might also want to include/allow a 
> subgrouping of traces within the file: a <tracegroup> container 
> element for example.
>
> Cheers,
>
> James
>
> On 31 Jan 2008, at 08:38, George Helffrich wrote:
>
>> Dear Rob -
>>
>> 	The XML DTD is versioned, and one could imagine defining a new DTD 
>> with alternative element groupings that would reflect the data 
>> structure.  One can obsess over describing data details, and one view 
>> won't necessarily coincide with another person's or community's view.  
>> Google Earth's KML comes to mind -- unbelievably baroque for putting 
>> points on a map for a seismologist, but probably glibly expressive 
>> for GISers.
>>
>> 	Another view to take of the data is programming semantics, however.  
>> Programmers see 1) header variables that are peeked and poked at; 2) 
>> data.  That was the view I took of the present DTD definition.
>>
>> On 31 Jan 2008, at 00:12, Robert Casey wrote:
>>
>>>
>>> 	Hi George-
>>>
>>> 	An interesting effort with SAC XML and you've made a lot of 
>>> progress from the looks of it.  I hate to comment too harshly on 
>>> something that may already be an established standard, so my 
>>> comments are only meant as an observation:
>>>
>>> 	It seems to me that the SAC XML format only half-divorces itself 
>>> from fixed-format files due to the naming scheme for the header 
>>> elements you've provided.  Essentially, you've got just two element 
>>> names inside of <trace> that have no semantic quality to them: 'h' 
>>> and 'd'.
>>>
>>> 	The nature of XML is such that the elements tend to be more grouped 
>>> and self-descriptive as far as their names go.  So instead of having 
>>> <h name="STEL">, why could it not instead be <STEL>?  To indicate 
>>> this as a subcomponent of a SAC header of a SAC trace, you'd form a 
>>> hierarchy:
>>>
>>> <sacdataset>
>>> 	<trace>
>>> 		<header>
>>> 			<stel>
>>>
>>> 	Even better would be to just call it <elevation>, maybe with a 
>>> reference to its SAC field abbreviation as an attribute:  <elevation 
>>> id="STEL">.
>>>
>>> 	The reason for the comment is not just for human readability, but 
>>> for the notion that many XML parsers will treat such elements as 
>>> objects, carrying the element name with them.  Having your fields 
>>> broken down into meaningful names means that your objects will be 
>>> more independent and have stronger encapsulation properties.
>>>
>>> 	If your example format is already set in stone, then please 
>>> continue with what works.  I just felt the floor was open to address 
>>> some naming aesthetics for consideration.  I can imagine that 
>>> writing a parsing engine for XML in Fortran is headache enough, so I 
>>> don't want to cause you a migraine on top of it.
>>>
>>> 	Cheers,
>>>
>>> 	-Rob
>>>
>>> On Jan 30, 2008, at 5:05 AM, George Helffrich wrote:
>>>
>>>> Dear All -
>>>>
>>>> 	I designed and implemented a subroutine interface to SAC XML 
>>>> datasets in the latest release of MacSAC.  This message is to make 
>>>> you aware of the design ideas for architectural comment.  I think 
>>>> that it shows the way forward for how SAC can move away from a 
>>>> purely binary data format to one that embraces current practice in 
>>>> structuring and delivering data.
>>>>
>>>> 	A test program illustrates the concepts.  Here is Fortran source 
>>>> code of an actual program used for testing during development:
>>>>
>> 								George Helffrich
>> 								george at geology.bristol.ac.uk
>>
>>
								George Helffrich
								george at geology.bristol.ac.uk
