Virtual Solar Observatory

VSO Data Model

Version 1.8



The VSO Data Model provides a set of template descriptions for the information required to describe, access, and search solar data sets in a variety of archives. It is an abstract model, not a suggested set of keywords to be used in data or in databases. Because of the ubiquity of the FITS standard and the wide use of certain conventions, we provide illustrative values of FITS keywords for certain data elements; but neither the adoption of any particular set of keywords nor the use of the FITS data model at all is required for a data description to conform to the model. The VSO Element Names are used at a level of abstraction once removed from the search parameters of the data providers. They should be completely internal to the VSO procedures for decoding information from user interfaces, and should not be sent in queries to data providers. We have deliberately avoided the use of FITS-compatible keyword names to emphasize this point.

VSO Search Parameters

VSO search parameters are those data descriptors for which queries are supported by the VSO on behalf of client applications or requests. These are the parameters that can best discriminate among a large collection of heterogeneous data. They must therefore be supported by the data providers as search parameters applicable to a large subset of the data archives. They must map to parameters in the server data dictionaries in a well-defined and meaningful way. They must also be selected so that the number of data sets meeting a particular selection criterion is small compared to the total number: for the VSO, an astronomical-type search parameter of Object (Sun) is not particularly useful as a discriminator.

The VSO search parameters are divided into a few groups, each described under one of the major subsections. These categories are understood to be orthogonal, in the sense that they can be used to construct non-trivial AND queries. Of course they are not strictly orthogonal: selection of a particular data source (instrument) may automatically restrict the available observing times for example, and vice-versa. Nonetheless it is useful to treat the major categories as if they were orthogonal and treat any dependencies as implicit selections or limits.

No particular set of search parameters is required. In the absence of a relevant element or group of elements in its data description, a dataset is assumed to match all queries. For example, if no wavelength information is supplied, then the server will return all records for any selected (or deselected) wavelength interval. If a parameter is not searchable but has a default value, then that value can be supplied directly in the data description. For example, an archive of data all taken at the same wavelength is unlikely to have wavelength as a searchable key in its database, but could (and should) supply that wavelength as a fixed value in its data description to avoid inappropriate satisfaction of client queries.
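The matching rule described above can be sketched as follows. This is only an illustration, not VSO code: the dictionary layout and the function names are assumptions, and the sketch further assumes that a wavelength query matches when the declared range overlaps the requested interval.

```python
def interval_overlaps(lo1, hi1, lo2, hi2):
    """True if the closed intervals [lo1, hi1] and [lo2, hi2] share any point."""
    return lo1 <= hi2 and lo2 <= hi1


def matches_wavelength(description, query_min, query_max):
    """Apply the 'absent element matches every query' rule to a wavelength query.

    `description` is a hypothetical dict standing in for a data description.
    The keys Wave_Minimum and Wave_Maximum follow the element names used in
    the nickname list later in this document.
    """
    if "Wave_Minimum" not in description or "Wave_Maximum" not in description:
        # No wavelength information supplied: the data set matches any interval.
        return True
    return interval_overlaps(description["Wave_Minimum"],
                             description["Wave_Maximum"],
                             query_min, query_max)


# An archive that declares nothing about wavelength matches every interval.
undeclared = {}
# A single-wavelength archive supplies a fixed value in its description,
# even though wavelength is not a searchable key in its own database.
h_alpha_only = {"Wave_Minimum": 6562.0, "Wave_Maximum": 6564.0}

print(matches_wavelength(undeclared, 3900.0, 4000.0))    # True
print(matches_wavelength(h_alpha_only, 3900.0, 4000.0))  # False
print(matches_wavelength(h_alpha_only, 6500.0, 6600.0))  # True
```

The second call shows why supplying the fixed value matters: without it, the H-alpha archive would be returned for a query near 3950 Å.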

The current parameter list is not intended to be exhaustive, and it may be useful to add additional search elements and categories in future. The categories chosen are those for which the VSO either has attempted to implement a search service or contemplates doing so. So far, only a few of the parameters can be searched in the VSO, and these are marked with asterisks in the following list. The elements are described in detail by group under the following sections.

1. Observing Time

Observing time is by general consensus the most likely parameter to be used as a first case for searches, the most ubiquitous indexing parameter for data, and one on which there is widespread agreement and understanding of representations, scales, and units. Most of the complexity involved is in the descriptions of data translation. Here it is sufficient to specify a simple uniform description.

(Most observational data are expected to be associated with observing times, and so far all VSO query structures have been assumed to include a time search parameter. It is possible however that some data may not be; model data are an example. As described above, such data would automatically satisfy any time interval query, and at least one additional parameter would be required to make them selectable.)

type: time
FITS keyword: T_OBS
The time at which the data comprising an atomic data set were originally recorded. If the duration of the data in the atomic data unit is large compared with the search time resolution, the Observation_Time is to be understood to correspond to the center (mid-point) of the observation(s), weighted as appropriate. For purposes of the Data Model, Observation_Time is given in calendar-clock form, e.g. 2004.03.08_16:25. Times are assumed to be UTC. The time resolution is one minute, so for much data the conversion from, say, the start time of an exposure to Observation_Time should not matter. Likewise the conversions between UTC and other time scales such as ET, TAI, and GPS should not be a matter of much concern. A data match is assumed to include all data from 30 seconds before the target time to 30 seconds after, inclusive (closed at both ends), so that a data Observation_Time can in principle fall into two adjacent target times. Note that since Jan 1, 1999, TAI = UTC + 32 sec, and GPS = UTC + 13 sec.
type: number
unit: second
FITS keyword: T_LENGTH
The interval between the start and end of observation in the atomic data unit. For a single image or spectrum, this is simply the exposure time; for a movie, it is the time difference between the start of the first image and the end of the last.
type: number
unit: second
FITS keyword: T_STEP
The interval between successive time samples (data records) in a dataset.
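The one-minute resolution and the closed 30-second match window described for Observation_Time can be sketched as below. The function names are hypothetical, and the `strptime` pattern is simply one reading of the calendar-clock example 2004.03.08_16:25.

```python
from datetime import datetime, timedelta, timezone

VSO_TIME_FORMAT = "%Y.%m.%d_%H:%M"   # calendar-clock form, one-minute resolution
HALF_WINDOW = timedelta(seconds=30)  # match window on either side of the target


def parse_target_time(text):
    """Parse a calendar-clock string such as '2004.03.08_16:25' as UTC."""
    return datetime.strptime(text, VSO_TIME_FORMAT).replace(tzinfo=timezone.utc)


def time_matches(observation_time, target_text):
    """True if the observation lies within 30 s of the target, ends included."""
    target = parse_target_time(target_text)
    return abs(observation_time - target) <= HALF_WINDOW


# An observation exactly on the half-minute falls into two adjacent targets.
obs = datetime(2004, 3, 8, 16, 25, 30, tzinfo=timezone.utc)
print(time_matches(obs, "2004.03.08_16:25"))  # True
print(time_matches(obs, "2004.03.08_16:26"))  # True
print(time_matches(obs, "2004.03.08_16:27"))  # False
```

Because the window is closed at both ends, the first two calls both succeed, which is the "two adjacent target times" case noted above.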

2. Target Location

Target location, by which is meant the spatial location of the target region of imaged or pointed observations on or around the Sun or in the heliosphere, has not yet been built into any VSO query models, although it is a fairly natural selection criterion for observations with a restricted field of view. It may suffice to specify a simple uniform description, although the multi-dimensionality of space makes this harder than one for time. For two-dimensional image data we assume a bounding circle as the simplest model. For this model it is sufficient to specify the center location and radius of the bounding circle. Most real image data are actually described by a bounding rectangle, but this requires specifying at least five parameters (e.g. the coordinates of opposite corners and a position angle).
type: number
unit: arc-second
FITS keyword: CENT_WST
type: number
unit: arc-second
FITS keyword: CENT_NRT
A pair of coordinates specifying the location of the center of the image data circle with respect to the Earth-Sun line at the nominal Observation_Time. This origin is close to the center of the apparent solar image for Earth-based or near-Earth observers, but not necessarily for deep space observations. The North coordinate is measured in the direction of the Carrington axis (RA 286°.13, δ 63°.87 J2000.0), and the West coordinate in the direction of solar rotation.
type: number
unit: arc-second
FITS keyword: R_BOUND
The radius of the bounding circle about the Observation_Center. For the VSO Data Model the bounding circle is to be understood as either the maximum inscribed circle in the bounding data rectangle (polygon), or the minimum circumscribed circle, depending on whether the query is for included data (presumably the normal default) or excluded data, respectively.
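As a rough illustration of the two bounding-circle conventions, the sketch below derives both radii from a bounding rectangle. It assumes the circle is centered on the rectangle's center; the function names and the sample field of view are hypothetical.

```python
import math


def bounding_radius(width, height, excluded=False):
    """Bounding-circle radius (arc-seconds) for a width x height rectangle.

    Queries for included data use the largest inscribed circle; queries for
    excluded data use the smallest circumscribed circle.
    """
    if excluded:
        return math.hypot(width, height) / 2.0  # half the diagonal
    return min(width, height) / 2.0             # half the shorter side


def point_in_circle(west, north, center_west, center_north, radius):
    """True if (west, north) lies inside or on the bounding circle."""
    return math.hypot(west - center_west, north - center_north) <= radius


# A 600" x 800" field of view centered 100" west and 200" south of the origin.
r_in = bounding_radius(600.0, 800.0)
r_out = bounding_radius(600.0, 800.0, excluded=True)
print(r_in, r_out)                                          # 300.0 500.0
print(point_in_circle(350.0, -200.0, 100.0, -200.0, r_in))  # True
```

The inscribed circle guarantees that any matched point really is covered by data, while the circumscribed circle guarantees that any point outside it really is not.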

3. Observer Location

No Search Parameters have been defined to describe observer location. Two classes of description are appropriate, one for ground-based observations and one for space-based data, particularly in situ measurements. For Earth observatories, a straightforward geographic latitude / longitude / altitude description should suffice, but it is not clear how useful this would be as a discriminator for data searches. For space platforms, where the description of location for in situ data is especially important, we defer to the model (to be) adopted by the VSPO. It should be noted, though, that as stereoscopic imaging of the Sun from space observatories becomes more important, search parameters associated with observer location with respect to solar coordinate frames may have to be introduced.

4. Spectral Range

The electromagnetic wavelength interval or equivalent over which observations are made is the fundamental discriminator among many types of solar image and other data. The model needs to apply to both narrow-band ("monochromatic" or single-line) and broad-band data. Different branches of the field use different units depending on their spectral band -- frequency at the lowest ranges (of frequency), wavelength at intermediate ranges, energy at the highest. Again for the sake of simplicity we define a single model, assuming that the necessary conversions can be simply made.
type: menu
FITS keyword: WV_TYPE
The class of spectral data, relating to both the nominal spectral bandpass and the spectral target. Three values are recognized:
  • broad : the spectral range of the measurement is large compared to the width of absorption/emission lines within the range, and encompasses multiple lines as well as continuum (unless blanketed)
  • line : the spectral range of the measurement is of the same order or less than the width of the target line, and is centered on a wavelength within the wings of the line
  • narrow : the spectral range of the measurement is of the same order or less than the typical width of lines in the neighborhood, but is centered on a continuum wavelength, outside of any significant lines. This designation is used to distinguish narrow-band continuum (or "white-light") data from true broad-band data. For data of this description, the matching spectral range should be much broader than the instrumental bandpass, on the understanding that the data are proxies for broadband measurements.
The exact definition of the bandpass (e.g. FWHM) is not prescribed, but is left up to the terminology of the data provider. In the absence of a provider definition, FWHM should be used.
type: number
unit: Ångström (0.1 nm)
FITS keyword: WV_MIN
type: number
unit: Ångström (0.1 nm)
FITS keyword: WV_MAX
The nominal minimum (maximum) of the observing spectral bandpass associated with the data. As discussed above, for narrowband continuum data, the range should be much larger than the instrumental bandpass; it should correspond to the spectral range over which the data are useful as a proxy, typically an octave or more.
type: number
FITS keyword: WV_NBAND
The number of wavelength bands in the observation
type: number
unit: Ångström (0.1 nm) / pixel
FITS keyword: WV_STP
The spectral dispersion
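Since the model uses a single wavelength unit, bands quoted in frequency or energy have to be converted before they can be compared. The sketch below shows the standard conversions; the function names are hypothetical, and the value of hc is rounded.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, exact by definition
ANGSTROM_PER_METER = 1.0e10
HC_KEV_ANGSTROM = 12.398419843  # h*c in keV * Angstrom (rounded)


def frequency_to_angstrom(freq_hz):
    """Wavelength in Angstrom of radiation with the given frequency in Hz."""
    return SPEED_OF_LIGHT / freq_hz * ANGSTROM_PER_METER


def energy_kev_to_angstrom(energy_kev):
    """Wavelength in Angstrom of a photon with the given energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def band_to_wave_range(low, high, convert):
    """Convert a band given in frequency or energy to (minimum, maximum) wavelength.

    Higher frequency or energy means shorter wavelength, so the ends swap.
    """
    return convert(high), convert(low)


# The 2.8 GHz radio flux corresponds to a wavelength of about 10.7 cm.
print(frequency_to_angstrom(2.8e9))
# A 3-6 keV X-ray band expressed as a wavelength range in Angstrom.
print(band_to_wave_range(3.0, 6.0, energy_kev_to_angstrom))
```

Swapping the ends in `band_to_wave_range` keeps the minimum below the maximum after conversion, which is easy to get wrong when translating provider metadata.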

5. Observable

It is in the description of the dependent variables, what the data in fact measure, that there is the greatest variation in terminology among data archives. Most solar observational data consist of direct measurements of the intensity of radiation as a function of time, direction (location), wavelength, and polarization, or combinations of intensities associated with different independent variables (e.g. line shifts and splittings, Stokes parameters). These data may be interpreted as measurements of certain physical observables, such as temperature, velocity, emission measure, etc. via models. There are of course some important exceptions: some solar data archives include in situ measurements of such observables as particle fluxes and compositions and magnetic field strengths; some solar data sets represent not direct observation but the results of complex inversions or modeling, such as the frequencies of acoustic modes, or the interior structure; and there are catalogs, histories, and descriptions of features and events. As long as the various observable classes are orthogonal, however, these additional cases should present no problem.

The model of describing observables in terms of particular combinations of intensity measurements or the associated physical parameters to be derived from them is a natural one for data deriving from imaging spectrographs, such as magnetographs and helioseismic instruments. For cameras or radiometers measuring only intensity or flux at selected wavelengths, it is not so natural. People dealing with data from such instruments tend to think of the observables as being associated with the spectral wavelength or band selected, or for monochromatic instruments, even the spatial-temporal target of the observations. It is important to understand that the meaning of the term "observable" in the VSO Search Parameter model may not at all agree with the meaning of the term as used by the data providers.

type: menu
FITS keyword: PHYS_OBS
The following values are currently recognized:
the direct intensity, either integrated over the spectral observing range or as a function of wavelength (spectral density)
differences between intensities measured at nearby wavelengths, typically in line cores, wings, and nearby continuum, whether measured as an intensity difference or an equivalent width
the net linear polarization
the frequency/wavelength Zeeman splitting between opposite circular polarizations of a magnetically-sensitive line
field strengths and directions inferred from Stokes polarimetry
the displacement of line center from rest wavelength/frequency in an arbitrary polarization state
two- or three-dimensional velocities, typically inferred from helioseismic inversion or from directly measured velocities transverse to the line of sight, possibly combined with Doppler velocities
These all refer to solar internal or atmospheric acoustic-gravity wave measurements. The mode parameters could include frequencies, splittings, amplitudes, widths, etc.
in-situ observations
In addition to the above, the following classes have been suggested:

6. Data Organization

The data organization describes the physical meaning of the independent variable(s) with respect to which the observables are measured. This is useful for knowing whether and how different data sets can be directly compared, overlaid, mapped, or otherwise transformed.
type: menu
FITS keyword: DATA_ORG
The following values are recognized:
data organized by two dimensions corresponding to angular displacement along the axes; examples include photograms (digital or digitized photographs), spectroheliograms, magnetograms, and Dopplergrams
data organized by two dimensions corresponding to spatial displacement along the axes; examples include synoptic charts
data organized by one dimension corresponding to temporal displacement along the axis; note that this is not the same as a time-tagged set of data records, since it implies sampling uniformity and provision for data gaps
data organized by three dimensions corresponding to spatial or angular displacement along two axes and temporal displacement along the principal (most slowly varying) axis
data organized by one dimension corresponding to displacement in electromagnetic wavelength or frequency along the axis
data organized by one or more dimensions corresponding to the quantum numbers of oscillations
data organized by two dimensions corresponding to displacement in wavelength or frequency along one axis and temporal displacement along the other
data organized by three dimensions, two corresponding to spatial or angular image axes and one corresponding to electromagnetic spectral displacement

7. Wave Mode Sampling

These parameters relate to data sets derived from helioseismic analysis of solar image data, specifically to global-mode analysis. No such data sets are currently available from any of the providers, so these search parameters have not yet been implemented.
type: number
FITS keyword: L_MIN
type: number
FITS keyword: L_MAX
The nominal minimum (maximum) of the spherical harmonic degree range associated with the data.
type: number
FITS keyword: L_STP
The spacing between spherical harmonic degrees in the data (a dimensionless number).

8. Data Source

type: menu
An identifier of the observatory, space platform, or network of observatories (or spacecraft) from which the data originate. In the case of networks such as GONG or CLUSTER, the particular observatory site or spacecraft may be identified by Instrument if each member is single-instrument. In the case of multi-instrument multi-site networks, another Data Source search parameter (Site or Instance or Platform or Network) may be required. Note that network is used in the sense of functionally identical instruments deployed in different locations, and not coordinated data collections from distinct instruments, such as the H-alpha Network; that is considered a Provider.
Some recognized values (click HERE and select "source" at Level 0 for the current registry):
  • BBSO : Big Bear Solar Observatory
  • Evans Solar Telescope, Sacramento Peak
  • GONG : Global Oscillations Network Group
  • JSPO : Jeffreys South Pole Observatory
  • KANZ : Kanzelhöhe Solar Observatory
  • KPVT : Kitt Peak Vacuum Tower Telescope
  • McMath Solar Telescope, Kitt Peak
  • MEDN : Observatoire de Paris, Meudon
  • MLSO : Mauna Loa Solar Observatory
  • MtWilson : Mt. Wilson 60ft Tower Telescope
  • Nançay Radio Observatory
  • OACT : Osservatorio Astrofisico di Catania
  • PicMidi : Observatoire du Pic du Midi
  • SOHO : Solar and Heliospheric Observatory
  • SOLIS : Synoptic Optical Long-term Investigations of the Sun
  • OBSPM : Observatoire de Paris, Meudon
  • OVRO : Owens Valley Radio Observatory
  • TON : Taiwan Oscillations Network
  • YNAO : Yunnan Astronomical Observatory
  • Yohkoh
Instrument
type: menu
FITS keyword: INSTRUMT
For multi-instrument space observatories, an identifier of the particular instrument from which the data originate. For observatories, the Instrument may refer to a particular telescope or to one of multiple standard configurations of telescope plus detectors. For the list of instruments registered, click HERE and select "instrument" at Level 0.
type: menu
The identifier of the data archive providing search and retrieval functions for the data in question. The same data may of course be mirrored at two or more archives. Since the provider id is at least implicit in a data registry, this just means that the same data set would appear in multiple registries. Some data providers may be virtual, that is the query (but not the archive and distribution) services may be handled by other servers with access to their database information as proxies.
Recognized values:

    Suggestions for Additional Search Parameters

    The following search parameters or categories are under consideration for possible inclusion in future versions of the VSO Data Model:


    Nicknames for famous combinations of Search Parameters were introduced in Version 1.7 of the Data Model in a separate table. Here they are incorporated in the defining document. Certain problems remain to be resolved. For example, mechanisms are required for designating a logical OR of menu-type parameters, and for specifying whether a Bounding_Radius is an inner radius or an outer radius.
    White-light image
    Observable=intensity, Data_Layout=image
    Wave_Type={broad | narrow} Wave_Minimum≥3000, Wave_Maximum≤10000
    coronagraph image
    Observable=intensity, Data_Layout=image
    |Observation_Center_West|≤20, |Observation_Center_North|≤20, Bounding Radius≥950 (excluded)
    H-alpha image
    Observable=intensity, Data_Layout=image
    Wave_Type=line, Wave_Minimum≥6558, Wave_Maximum≤6568
    Ca-K image
    Observable=intensity, Data_Layout=image
    Wave_Type=line, Wave_Minimum≥3919, Wave_Maximum≤3952
    He 10830 image
    Observable=intensity, Data_Layout=image
    Wave_Type=line, Wave_Minimum≥10825, Wave_Maximum≤10833
    Na-D image
    Observable=intensity, Data_Layout=image
    Wave_Type=line, Wave_Minimum≥5888, Wave_Maximum≤5898
    Hard X-ray image
    Observable=intensity, Data_Layout=image
    Wave_Minimum≥0.2, Wave_Maximum≤10,
    Soft X-ray image
    Observable=intensity, Data_Layout=image
    Wave_Minimum≥5, Wave_Maximum≤150,
    EUV image
    Observable=intensity, Data_Layout=image
    Wave_Minimum≥100, Wave_Maximum≤1250,
    UV image
    Observable=intensity, Data_Layout=image
    Wave_Minimum≥900, Wave_Maximum≤3800,
    10.7 cm image
    Observable=intensity, Data_Layout=image
    Wave_Type=narrow, Wave_Minimum≥1.06*10^9, Wave_Maximum≤1.08*10^9,
    Continuum image
    Observable=intensity, Data_Layout=image
    Full-disk magnetogram
    Wave_Type=line, Data_Layout=image
    Observable={LOS_magnetic field|vector_magnetic field}
    |Observation_Center_West|≤20, |Observation_Center_North|≤20, Bounding Radius≥800
    LOS magnetogram
    Observable=LOS_magnetic field, Data_Layout=image
    vector magnetogram
    Observable=vector_magnetic field, Data_Layout=image
    Full-disk dopplergram
    Observable=LOS_velocity, Data_Layout=image
    |Observation_Center_West|≤20, |Observation_Center_North|≤20, Bounding Radius≥800
    Na-D dopplergram
    Observable=LOS_velocity, Data_Layout=image
    Wave_Type=line, Wave_Minimum≥5888, Wave_Maximum≤5898
    Ni-6768 dopplergram
    Observable=LOS_velocity, Data_Layout=image
    Wave_Type=line, Wave_Minimum≥6767, Wave_Maximum≤6769
    K-7699 dopplergram
    Observable=LOS_velocity, Data_Layout=image
    Wave_Type=line, Wave_Minimum≥7698, Wave_Maximum≤7700
    EUV Spectrum
    Observable=intensity, Data_Layout=spectrum
    Wave_Type=broad, Wave_Minimum≥100, Wave_Maximum≤1250
    UV Spectrum
    Observable=intensity, Data_Layout=spectrum
    Wave_Type=broad, Wave_Minimum≥900, Wave_Maximum≤3800
    Visible Spectrum
    Observable=intensity, Data_Layout=spectrum
    Wave_Type=broad, Wave_Minimum≥3500, Wave_Maximum≤10000
    IR Spectrum
    Observable=intensity, Data_Layout=spectrum
    Wave_Type=broad, Wave_Minimum≥7000, Wave_Maximum≤3.5*10^6
    Atlas Spectrum
    Observable=intensity, Data_Layout=spectrum
    Wave_Type=broad (?)
    Helioseismic Time series
    Observable={wave_power | wave_phase | oscillation_mode_parameters}
    Light Curve Time series
    Observable=intensity, Data_Layout=time_series
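One possible way to hold a few of these nicknames in code is shown below. This is a sketch under stated assumptions, not the VSO implementation: the table layout, the use of sets to express a logical OR of menu values, and the function name `satisfies` are all hypothetical. Elements absent from a data description are treated as matching, following the rule given under VSO Search Parameters.

```python
# Each nickname expands to a set of constraints. Menu-type constraints are
# sets of allowed values, so a logical OR is simply a set with several members.
NICKNAMES = {
    "White-light image": {
        "Observable": {"intensity"},
        "Data_Layout": {"image"},
        "Wave_Type": {"broad", "narrow"},
        "Wave_Minimum": 3000.0,  # data Wave_Minimum must be >= this
        "Wave_Maximum": 10000.0, # data Wave_Maximum must be <= this
    },
    "H-alpha image": {
        "Observable": {"intensity"},
        "Data_Layout": {"image"},
        "Wave_Type": {"line"},
        "Wave_Minimum": 6558.0,
        "Wave_Maximum": 6568.0,
    },
    "LOS magnetogram": {
        "Observable": {"LOS_magnetic field"},
        "Data_Layout": {"image"},
    },
}

MENU_KEYS = ("Observable", "Data_Layout", "Wave_Type")


def satisfies(description, nickname):
    """True if a data description (a plain dict) is consistent with a nickname."""
    rule = NICKNAMES[nickname]
    for key in MENU_KEYS:
        if key in rule and key in description and description[key] not in rule[key]:
            return False
    if "Wave_Minimum" in rule and "Wave_Minimum" in description:
        if description["Wave_Minimum"] < rule["Wave_Minimum"]:
            return False
    if "Wave_Maximum" in rule and "Wave_Maximum" in description:
        if description["Wave_Maximum"] > rule["Wave_Maximum"]:
            return False
    return True


h_alpha = {"Observable": "intensity", "Data_Layout": "image",
           "Wave_Type": "line", "Wave_Minimum": 6562.3, "Wave_Maximum": 6563.3}

print(satisfies(h_alpha, "H-alpha image"))      # True
print(satisfies(h_alpha, "White-light image"))  # False
print(satisfies(h_alpha, "LOS magnetogram"))    # False
```

Representing menu constraints as sets is one answer to the open question about a logical OR of menu-type parameters; it does not address the inner versus outer Bounding_Radius question.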


    Mon, 2 May 2005 about 9 AM PDT