SQL Syntax

tables | definitions | SQL syntax

Contents of this document:


INTRODUCTION

This document is written for users who wish to run their own queries of the database.

Before beginning your first "customized" query you would be wise to familiarize yourself with the TABLE and fieldnames and definitions - these will figure prominently in your query.

Why query the database? The DMC stores terabytes of data from networks and stations around the globe. Users wanting to request this data must first inform themselves of what networks and stations send their data to the DMC for archiving. Once this information is obtained it is in the users best interest to make sure, for example, that data is indeed available for a given time period or station channel BEFORE their request is submitted so as to avoid the "no data" message from the DMC.

It is also possible to query for "non-data" related information like seismometer type, station site names, problem reports, etc..

WHAT IS SQL?

SQL stands for Structured Query Language

QUERY SYNTAX

Wouldn't it be great if you could just type a simple question in plain english and get the answer you need from the database? Well - you can - sort of. The format of your query "question" will be like the format of a real sentence but you will only use nouns, verbs and conjunctions. The following is a list of commands. Examples of their usage can be found in the section titled EXAMPLES.

In the following examples or explanations we will use [ ] to denote optional parts and ( ) to denote one or the other required.

Here is the flow of a query:

select [distinct] ( [TABLE.]fieldname [, [TABLE.]fieldname] | *)
from TABLE [,TABLE]
[where condition [and condition]]
[order by [TABLE.]fieldname [, [TABLE.]fieldname]
OR [group by [TABLE.]fieldname [, [TABLE.]fieldname];

select - begins most queries (more on other options later). You can do something like "select distinct * from STATION;" - distinct is an option that will eliminate redundant lines and will cause the search to take longer because the data is sorted. You need to decide if you want to get all fields (*) or select fields (fieldnames) --- (see EXAMPLE 2).

from - one can select to query information from more than one TABLE. This is important as one of the main functions of a relational database is to link TABLEs together (see TABLE.fieldname Usage and EXAMPLE 3 ).

where - this is the condition identifier. You can make a query without setting any conditions (see EXAMPLE 1) but you will probably want to set some.

Comparison Operators:

Operator Purpose Example
= Equality test select * from STATION where
   STATION.station = 'ANMO'
!=, ^=, <> Inequality test. select * from STATION where
   STATION.station != 'ANMO'
>

<
"Greater than"
and
"less than" tests
select latitude, longitude, station
   from STATION where
   STATION.latitude > 0 and < 10
>=

<=
"Greater than or equal to"
and
"less than or equal to" tests
select latitude, longitude, station
   from STATION where
   STATION.latitude >= 0 and <= 10
IN "Equal to any member of" test.
Equivalent to "=ANY"
select * from STATION where
   STATION.station IN
   ('ANMO','GRFO')
NOT IN Equivalent to "!=ALL".
Evaluates to FALSE if any member of the set is NULL
select * from STATION where
   STATION.station NOT IN
   ('ANMO','GRFO')
BETWEEN Greater than or equal to x
and less than or equal to y
select * from STATION where
   STATION.starttime between
   '1990/01/01' and '1992/01/01'
x [NOT] LIKE y TRUE if x does [not] match the pattern y. Within y, the character '%' matches any string of zero or more characters except null. The character '_' matches any single character. select * from STATION where
   STATON.station like 'A%'

TABLE.fieldname USAGE

Because this is a relational database, TABLEs are often linked to other TABLEs with related information (hence the name). Many of the TABLEs contain fieldnames of the same name. When writing a query that will ask for information from more than one TABLE it is important to distinguish the fieldnames in your condition statement by always using the TABLE name as a prefix. For example, the STATION and CHANNEL TABLEs have several like fieldnames. If you were to set any conditions in your query statement you would need to be specific about for which TABLE you were setting the condition. Here is an example:

select STATION.station, channel, location
from STATION, CHANNEL
where STATION.station = 'ANMO';

This query will return information where the condition is matched in the STATION TABLE. You must include the TABLE name in the select fieldname list. The channel and location field only appear in the CHANNEL table so no need for the TABLE name prefix but it is a good habit to always use the TABLE.fieldname format to avoid confusion. There are instances where fieldnames with the same name will contain different information specific to a certain context. (Please view TABLEs and definitions for more information about TABLE and fieldnames and usage).

DATE AND TIME FORMATS

Writing a query using a date is not complicated but is, perhaps, a bit tedious. The default format is 'MM/DD/YYYY HH24:MI:SS'. If you are using a simple 'where' condition, you might write something like this:

select * from STATION where starttime > '04/01/1997 14:23:59';

To query using the 'between' condition is just as simple - just separate your times with 'and' .

The tricky part comes in if you want to use a different time format (like day of year). For this you need to use the 'to_date' function. To use to_date, give the date you want in single quotes as the first argument, the format for the date in single quotes as the second argument:

select * from STATION where
starttime between to_date('1998,091,00:00:00' , 'YYYY,DDD,HH24:MI:SS') and
                  to_date('1998,121,00:00:00' , 'YYYY,DDD,HH24:MI:SS')

This will also work:

select * from STATION where
starttime between to_date('08,01,98' , 'MM,DD,YY') and
                  to_date('08,02,98' , 'MM,DD,YY')

If you don't give an explicit time, the default is 12:00:00am.

WILDCARDS

For those of you used to the UNIX wildcards, get ready to be confused.

Operator Purpose Example
*
Returns ALL data for ALL fieldnames. ONLY used after select [distinct]. select * from CHANNEL
%
[like UNIX *]
Can match zero of more characters in value. Cannot match a null. select * from CHANNEL
   where CHANNEL.channel
   like '%H%'
_
[like UNIX ?]
Can match exactly one character in the value. select * from CHANNEL
   where CHANNEL.channel
   like 'BH_'

OUTPUT FORMAT

When using the on-line query tool (SeismiQuery) you will receive the output of your query in the form of an HTML table printed to your screen. This can be saved to your machine as TEXT, HTML source, PostScript or printed from the screen.

EXAMPLES

For the sake of clearity, in the following examples all TABLE names will be CAPITALIZED and all fieldnames will be lowercase.

EXAMPLE 1: a simple query:

select * from STATION;

The output of this query would be a very large list (table) with all the information from all the fields in the TABLE STATION and would look something like this:

NE STATI LATITUDE LONGITUDE ELEVATION SITE LONG_WORD WORD STARTTIME ENDTIME LOOKUP_GA
II AAK 42.639 74.494 1645 Ala Archa, Kyrgyzstan 3210 10 10/12/1990 00:00:00 04/17/1991 00:00:00 76
II AAK 42.639 74.494 1645 Ala Archa, Kyrgyzstan 3210 10 04/17/1991 00:00:00 09/11/1991 00:00:00 76
II AAK 42.639 74.494 1645 Ala Archa, Kyrgyzstan 3210 10 09/11/1991 00:00:00 01/12/1994 00:00:00 76
II ALE 82.5033 -62.35 60 Alert, N.W.T., Canada 3210 10 02/19/1990 00:00:00 04/09/1992 00:00:00 76
KN AAK 42.6333 74.4944 1680 Ala-Archa, Kyrgyzstan 3210 10 09/01/1991 00:00:00 12/31/2599 23:59:59 126
KN AML 42.1311 73.6941 3400 Almayashu, Kyrgyzstan 3210 10 09/01/1991 00:00:00 12/31/2599 23:59:59 126
MN AQU 42.354 13.405 710 L'Aquila, Italy 3210 10 07/31/1991 00:00:00 08/27/1995 00:00:00 523
and so on.....

EXAMPLE 2: Let's make a more select query from the TABLE STATION:

select distinct network, station, site from STATION
where STATION.station = 'ANMO';

First, we have chosen specific fieldnames (network, station, site) from one TABLE (STATION) instead of using the * wildcard. This will decrease our output by quite a bit. By using 'distinct' we will also eliminate duplicates. We have also used our first 'condition' by limiting the output to information related to the station 'ANMO'. The output looks like this:

NE STATI SITE
IU ANMO Albuquerque, New Mexico, USA
SR ANMO Albuquerque, New Mexico, USA

EXAMPLE 3: Querying more than one TABLE, using distinct and setting multiple conditions

select distinct STATION.network , STATION.station, B33.text
from STATION, B33
where STATION.lookup_ga = B33.id
and B33.text like '%USGS%'
and STATION.station in ('ANMO', 'GRFO', 'COLA');

A note from our database creator, Sue Schoch --- "The order in which you list the TABLEs in your query can directly effect the speed of your query. In the case of this example you can think of it like 'I have a station record, now what is the network affilition' rather than 'I have a network affiliation, now what is the station' - which would be the case if you reversed the STATION and B33 calls. The first in the list of TABLEs in the from clause is the driving TABLE."

Using the distinct option in the select clause will assure the elimination of duplicates BUT will also slow down the query because the output is sorted first. The output of the query would look something like this:

NETWORK

STATION

TEXT

IU GRFO (GSN) IRIS/USGS, S. Z.(Germany)
IU COLA (GSN) Global Seismograph Network (IRIS/USGS)
IU ANMO (GSN) Global Seismograph Network (IRIS/USGS)

EXAMPLE 4: Time and/or date queries (see Date and Time Formating for details)

select * from IU_1998 where
starttime between to_date('1998,091,00:00:00' , 'YYYY,DDD,HH24:MI:SS') and
                  to_date('1998,121,00:00:00' , 'YYYY,DDD,HH24:MI:SS') where IU_1998.channel = 'BHZ';

The output table would look something like this:

NETWORK STATION CHANNEL STARTTIME ENDTIME FILENAME BYTES LOCATION
IU ADK BHZ 1998-04-01 00:00:00.0 1998-04-02 00:00:00.0 ADK.IU.BHZ.98.091 2359296  
IU AFI BHZ 1998-04-01 00:00:00.0 1998-04-02 00:00:00.0 AFI.IU.BHZ.98.091 2125824  
IU ANMO BHZ 1998-04-01 00:00:00.0 1998-04-02 00:00:00.0 ANMO.IU.BHZ.98.091 1236992  
IU ANTO BHZ 1998-04-01 00:00:00.0 1998-04-02 00:00:00.0 ANTO.IU.BHZ.98.091 1966080  
IU CCM BHZ 1998-04-01 00:00:00.0 1998-04-02 00:00:00.0 CCM.IU.BHZ.98.091 1437696  
IU CHTO BHZ 1998-04-01 00:00:00.0 1998-04-02 00:00:00.0 CHTO.IU.BHZ.98.091 1314816  
and so on...

Return to TOC