[sac-dev] Subroutine interface to SAC XML datasets
George Helffrich
george at gly.bris.ac.uk
Thu Jan 31 07:04:48 PST 2008
Dear Brian -
Indeed, header field names are cryptic and they could be improved on.
With a clearer idea of who the intended users and/or viewers of the
elements are, the naming scheme should become clearer.
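
To make the renaming idea concrete, here is a minimal Python sketch of how the current <h name="..."> entries inside <trace> might be mapped onto the more readable <station> attributes suggested below. The mapping table and the function name readable_station are hypothetical, for illustration only; they are not part of the DTD or of MacSAC. The sample values are the ones used elsewhere in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from SAC header field names to readable
# attribute names; an assumption for illustration, not part of any DTD.
STATION_FIELDS = {
    "kstnm": "name",
    "knetwk": "network",
    "stla": "latitude",
    "stlo": "longitude",
    "stel": "elevation",
}

def readable_station(trace):
    """Build a <station> element from the <h name="..."> entries of a trace."""
    station = ET.Element("station")
    for h in trace.findall("h"):
        key = h.get("name", "").lower()
        if key in STATION_FIELDS:
            station.set(STATION_FIELDS[key], (h.text or "").strip())
    return station

# Sample trace in the terse h/d style; non-station fields are skipped.
trace = ET.fromstring(
    '<trace><h name="kstnm">TEST</h><h name="stla">40</h>'
    '<h name="stel">500</h><h name="delta">0.05</h></trace>'
)
station = readable_station(trace)
print(ET.tostring(station, encoding="unicode"))
```

Keeping the short names as the lookup keys means the readable form can be generated from the terse one, so both kinds of reader could be served from a single dataset.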
I'm not sure about <trace file="PAS.CI.BHN.SAC" />, but it is an
interesting idea. It is analogous to a Unix symbolic link, with
similar semantic confusion (stat/lstat). Here it raises the issues of
1) whether a dataset is entirely self-contained; and 2) what it means
to "write" a dataset that contains links as well as trace data.
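
A minimal Python sketch of the first issue, assuming the <trace file="..."/> form proposed below and a <data> element for inline samples (both are assumptions, as are the function names and the sample values). It behaves like lstat rather than stat: it reports the links without following them.

```python
import xml.etree.ElementTree as ET

# Example dataset: one trace is a link to a binary SAC file, the other
# carries its header and data inline. Sample values are made up.
DATASET = """
<sacdataset>
  <trace file="PAS.CI.BHZ.SAC" />
  <trace id="HRV.IU.BHZ.SAC">
    <header><stel>500</stel></header>
    <data>0.0 1.0 0.5</data>
  </trace>
</sacdataset>
"""

def linked_files(root):
    """Return the file names of all top-level traces that are links."""
    return [t.get("file") for t in root.findall("trace") if t.get("file")]

def is_self_contained(root):
    """True if no trace in the dataset refers to an external file."""
    return not linked_files(root)

root = ET.fromstring(DATASET)
print(linked_files(root))
print(is_self_contained(root))
```

A writer would face the same choice: copy the links through unchanged, or resolve each one and emit the trace data inline so that the output is self-contained.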
On 31 Jan 2008, at 14:46, Brian Savage wrote:
> George
>
> I think moving away from the original header names might be a good
> idea.
> The header names such as stlo, stla and kstnm can be cryptic upon
> first encounter.
> I would suggest more readable names
> <station latitude="" longitude="" elevation="" name=""
> network=""></station>
> or
> <station>
> <latitude></latitude>
> <name></name>
> ....
> </station>
>
> Also, would it be possible to include references to the SAC binary
> files, as they are now, in the XML format?
> <sacdataset>
> <trace file="PAS.CI.BHZ.SAC" />
> <trace file="PAS.CI.BHE.SAC" />
> <trace file="PAS.CI.BHN.SAC" />
> <trace id="HRV.IU.BHZ.SAC">
> <header>
> ...
> </header>
> <trace>
> ...
> </trace>
> </trace>
> </sacdataset>
> This would allow the data to be stored either in the current (binary)
> format or in the XML file.
>
>
> Cheers,
> Brian
>
> On Jan 31, 2008, at 4:59 AM , George Helffrich wrote:
>
>> Dear All -
>>
>> The key idea here, which is a good one, is subgroupings of the
>> information in the header: 1) station information; 2) event
>> information; 3) data characteristics. A fourth item, not well-served
>> by the present SAC file structure, is more complete response
>> information.
>>
>> Whether you express header information by <stel>500</stel> or
>> <h name="stel">500</h> is a stylistic choice. The DTD description is
>> more concise in the latter case.
>>
>> On 31 Jan 2008, at 09:34, James Wookey wrote:
>>
>>> Hi Rob, George;
>>>
>>> I can see the point that the data format currently proposed is a
>>> terse one, basically a minimalist description of a set of SAC
>>> traces. This does have some significant advantages: it is efficient
>>> in file size, and it provides a direct connection to the header
>>> variables which, after all, SAC users (as well as programmers) still
>>> have to refer to by their short name. I don't think we should go the
>>> KML route, which as George says, makes my eyes water with all the
>>> detail that is required. As a 'consultation' format for SACML it is
>>> well designed, as it is simple to understand and is conceptually
>>> very close to the binary SAC format, and George has done sterling
>>> work implementing it.
>>>
>>> However, in the longer term, I can also see the value in a limited
>>> expansion of the structure of SACML, if it is going to represent a
>>> large step forward in the SAC file format. If we are going to pay
>>> the price of adopting a verbose format like XML (and I think we
>>> should) we might as well try to reap some of the rewards, and also
>>> build enough flexibility into the format to allow incorporation of
>>> future things (even if they are currently ignored by the current
>>> input routines - the ability to do that is one of the advantages of
>>> XML). It seems to me that one thing worth considering is structuring
>>> the header. So one possible format might look like:
>>>
>>> <sacdataset>
>>> <trace>
>>> <header>
>>> <station>
>>> <kstnm>TEST</kstnm>
>>> <stla>40</stla>
>>> ...
>>> </station>
>>> <event>
>>> <evla>-20</evla>
>>> ...
>>> </event>
>>> <trace_info>
>>> <delta>0.05</delta>
>>> ...
>>> </trace_info>
>>> </header>
>>> <data>
>>> ...
>>> </data>
>>> </trace>
>>> </sacdataset>
>>>
>>> This has the advantage of still being easy to 'one-sweep' read with
>>> event-driven parsers like SAX (because you simply ignore the
>>> container elements), plus providing a more object-oriented format
>>> for use with parsers like DOM or xpath. We might also want to
>>> include/allow a subgrouping of traces within the file: a
>>> <tracegroup> container element for example.
>>>
>>> Cheers,
>>>
>>> James
>>>
>>> On 31 Jan 2008, at 08:38, George Helffrich wrote:
>>>
>>>> Dear Rob -
>>>>
>>>> The XML DTD is versioned, and one could imagine defining a new DTD
>>>> with alternative element groupings that would reflect the data
>>>> structure. One can obsess in describing data details, and one view
>>>> won't necessarily coincide with another person's or community's view.
>>>> Google Earth's KML comes to mind -- unbelievably baroque for
>>>> putting points on a map for a seismologist, but probably glibly
>>>> expressive for GISers.
>>>>
>>>> Another view to take of the data is programming semantics,
>>>> however. Programmers see 1) header variables that are peeked and
>>>> poked at; 2) data. That was the view I took of the present DTD
>>>> definition.
>>>>
>>>> On 31 Jan 2008, at 00:12, Robert Casey wrote:
>>>>
>>>>>
>>>>> Hi George-
>>>>>
>>>>> An interesting effort with SAC XML and you've made a lot of
>>>>> progress from the looks of it. I hate to comment too harshly on
>>>>> something that may already be an established standard, so my
>>>>> comments are only meant as an observation:
>>>>>
>>>>> It seems to me that the SAC XML format only half-divorces itself
>>>>> from fixed-format files due to the naming scheme for the header
>>>>> elements you've provided. Essentially, you've got just two entity
>>>>> names inside of <trace> that have no semantic quality to them: 'h'
>>>>> and 'd'.
>>>>>
>>>>> The nature of XML is such that the entities tend to be more
>>>>> grouped and self-descriptive as far as their names go. So instead
>>>>> of having <h name="STEL">, why could it not instead be <STEL>?
>>>>> To indicate this as a subcomponent of a SAC header of a SAC trace,
>>>>> you'd form a hierarchy:
>>>>>
>>>>> <sacdataset>
>>>>> <trace>
>>>>> <header>
>>>>> <stel>
>>>>>
>>>>> Even better would be to just call it <elevation>, maybe with a
>>>>> reference to its SAC field abbreviation as an attribute:
>>>>> <elevation id="STEL">.
>>>>>
>>>>> The reason for the comment is not just for human readability, but
>>>>> for the notion that many XML parsers will treat such elements as
>>>>> objects, carrying the element name with them. Having your fields
>>>>> broken down into meaningful names means that your objects will be
>>>>> more independent and have stronger encapsulation properties.
>>>>>
>>>>> If your example format is already set in stone, then please
>>>>> continue with what works. I just felt the floor was open to
>>>>> address some naming aesthetics for consideration. I can imagine
>>>>> that writing a parsing engine for XML in Fortran is headache
>>>>> enough, so I don't want to cause you a migraine on top of it.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> -Rob
>>>>>
>>>>> On Jan 30, 2008, at 5:05 AM, George Helffrich wrote:
>>>>>
>>>>>> Dear All -
>>>>>>
>>>>>> I designed and implemented a subroutine interface to SAC XML
>>>>>> datasets in the latest release of MacSAC. This message is to
>>>>>> make you aware of the design ideas for architectural comment. I
>>>>>> think that it shows the way forward for how SAC can move away
>>>>>> from a purely binary data format to one that embraces current
>>>>>> practice in structuring and delivering data.
>>>>>>
>>>>>> A test program illustrates the concepts. Here is Fortran source
>>>>>> code of an actual program used for testing during development:
>>>>>>
>>>> George Helffrich
>>>> george at geology.bristol.ac.uk
>>>>
>>>>
>> George Helffrich
>> george at geology.bristol.ac.uk
>>
>> _______________________________________________
>> sac-dev mailing list
>> sac-dev at iris.washington.edu
>> http://www.iris.washington.edu/mailman/listinfo/sac-dev
>>
George Helffrich
george at geology.bristol.ac.uk