SECTION 1: INTRODUCTION

1.1 OVERVIEW

Geo-EAS (Geostatistical Environmental Assessment Software) is a collection of interactive software tools for performing two-dimensional geostatistical analyses of spatially distributed data. Programs are provided for data file management, data transformations, univariate statistics, variogram analysis, cross validation, kriging, cokriging, post plots, and line/scatter graphs. Features such as hierarchical menus, informative messages, full-screen data entry, parameter files, and graphical displays are used to provide a high degree of interactivity, and an intimate view of results. Users may easily alter parameters and re-calculate results or reproduce graphs, providing a "what if" analysis capability.

Geostatistical methods are useful for site assessment and monitoring situations where data are collected on a spatial network of sampling locations, and are particularly suited to cases where contour maps of pollutant concentration (or other variables) are desired. Examples of environmental applications include lead and cadmium concentrations in soils surrounding smelter sites, outdoor atmospheric NO2 concentrations in metropolitan areas, and regional sulfate deposition in rainfall. Kriging is a weighted moving average method used to interpolate values from a sample data set onto a grid of points for contouring. The kriging weights are computed from a variogram, which measures the degree of correlation among sample values in the area as a function of the distance and direction between samples.

Kriging has a number of advantages over most other interpolation methods:

Estimation of the variogram from sample data is a critical part of a geostatistical study. The procedure involves interpretation and judgment, and often requires a large number of "trial and error" computer runs. The lack of inexpensive, easy-to-use software has prevented many people from acquiring the experience necessary to use geostatistical methods effectively. This software is designed to make it easy for the novice to begin using geostatistical methods and to learn by doing, as well as to provide sufficient power and flexibility for the experienced user to solve real-world problems.
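To make the idea of estimating a variogram from sample data more concrete, here is a minimal Python sketch of an omnidirectional experimental semivariogram. It uses the classical estimator: for each lag class, half the average squared difference between all pairs of sample values whose separation falls in that class. This is not the code used by the Geo-EAS programs; the function name, the lag binning rule, and the three sample points are assumptions made purely for illustration.

```python
import math

def experimental_semivariogram(points, lag_width, n_lags):
    """Omnidirectional experimental semivariogram (illustrative sketch).

    points: list of (x, y, value) tuples.
    Returns a list of (mean_distance, gamma, pair_count), one entry for
    each lag class that contains at least one pair.
    """
    sums = [0.0] * n_lags    # sum of squared value differences per lag class
    dists = [0.0] * n_lags   # sum of pair distances per lag class
    counts = [0] * n_lags    # number of pairs per lag class

    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)   # separation distance
            k = int(h // lag_width)            # lag class index (assumed rule)
            if k < n_lags:
                sums[k] += (zi - zj) ** 2
                dists[k] += h
                counts[k] += 1

    result = []
    for k in range(n_lags):
        if counts[k] > 0:
            gamma = sums[k] / (2.0 * counts[k])
            result.append((dists[k] / counts[k], gamma, counts[k]))
    return result


# Hypothetical samples on a line, 10 units apart.
samples = [(0.0, 0.0, 1.0), (10.0, 0.0, 2.0), (20.0, 0.0, 4.0)]
for mean_h, gamma, n in experimental_semivariogram(samples, 10.0, 3):
    print(f"h={mean_h:.1f}  gamma={gamma:.3f}  pairs={n}")
```

In a real study the lag width, number of lags, and direction tolerances are exactly the parameters that get adjusted over many "trial and error" runs.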

1.2 EQUIPMENT REQUIREMENTS

The PC software was designed to run under DOS (Disk Operating System) on an IBM PC, XT, AT, PS/2, or compatible computer. The version described here is an adaptation of that original software for the UNIX environment. The program was converted on a Sun SPARCstation, and has been ported to a DG.

1.3 SOFTWARE AVAILABILITY

The DOS Geo-EAS software in its executable form is entirely in the public domain, and can be obtained by sending the appropriate number of diskettes (PREFORMATTED, PLEASE!) to the following address:

                     Evan J. Englund (Geo-EAS)
                     USEPA EMSL-LV, EAD
                     P.O. Box 93478
                     Las Vegas, NV 89193-3478

The executable files and example data sets take approximately 3 megabytes of storage, and require the following number of diskettes, depending on the type:

                     Type              Number
                     ------------------------
                     5 1/4" 1.2MB         3
                     5 1/4" 360KB         9
                     3 1/2" 1.44MB        3
                     3 1/2" 720KB         6

The source code is written in FORTRAN 77 for the Microsoft (Microsoft Corporation, Redmond WA) FORTRAN compiler (version 4.01). With the exception of slightly modified proprietary Graflib (Sutra Software, Sugarland TX) subroutines used for generating screen graphics, the source code is also in the public domain. For further information on the source code and programmer documentation, contact:

                     Geo-EAS
                     Computer Sciences Corporation
                     P.O. Box 93478
                     Las Vegas, NV 89193-3478

The UNIX Geo-EAS software is entirely in the public domain, and can be obtained by anonymous ftp from math.arizona.edu, currently in the incoming/unix.geoeas directory.

1.4 USER PROFILE

To use this system, you should have some familiarity with UNIX. For more information on UNIX, consult a UNIX user's manual. It is assumed that you have a working knowledge of geostatistics, and that you understand the basic geostatistical concepts. For a list of references on the subject of geostatistics, refer to Appendix A, References.

SECTION 2: SYSTEM SUMMARY

2.1 INSTALLING THE SYSTEM

2.1.1 Extracting the code

The code comes in a tar file, called

Go to the directory where you want it installed, then do the following:

or

and away it will go.

to see how much luck you have in compiling the individual programs. Then, when things don't work out, type

and send your vituperative comments to me (Andy Long). We'll talk, maybe have lunch. On you.

Below is a list of the programs which are included in the distribution:

Miscellaneous files and programs:

2.2 Running it:

Go to the bin directory, type setup.sh, and it will take care of this for you. YOU NEED TO BE IN THE BIN DIRECTORY (of the unix_version directory) WHEN YOU DO THIS! Then "source" your startup files (alternatively, type "ksh" or "csh", or exit, then come back in, and things should be okay).

Type "geoeas", and away you go! Try

The other way:

You will need to set one global variable for your shell. If you use the ksh, like a wise person, then you will add the following to your $HOME/.kshrc file:

for example, my directory for this stuff is

so I have the line

in my .kshrc file. If you're a csh person, then you'll put the line

in your $HOME/.cshrc file.

You may also need to set your display variable to the machine you are working on: for example, on my machine (gila.math.arizona.edu) then

if you use sh or ksh, then at the prompt

if you use csh, then

2.2.1 Using the Geo-EAS System Menu

We use a program called "xgen", which comes with the public domain package GRASS, to serve as an interface for the programs. The xgen file we use is in the bin directory, and is called menu.xgen, so you'd type

xgen $GEO_EAS/bin/menu.xgen

to get the nice menu system. Try looking in your GRASS directory for it, in something like $GISBASE/../src.related/xgen/src (which is ours).

To obtain xgen, try

There are some readmes, and some binary stuff, so you'll have to get the ascii first, then switch to binary.

If you have xgen, and use this menu, then you'll probably soon want to remove that hideous yellow disclaimer: to do so, edit the bin/menu.xgen file and follow the instructions that you find there. Alternatively, the person who installed it can do it from the help menu.

2.2.2 Using the Programs From Batch Files

If you don't have xgen, then you could use our poor man's batch file: it does the same thing, but doesn't look as nice. It's called geoeas, and the idea is that it compiles or calls the files as needed. It is found in the bin directory also. You just give the name of the program you're interested in, e.g.

and it will run vario for you. If you just type

it will give you a list of program options.

SECTION 3: SYSTEM OPERATION

3.1 DATA

3.1.1 Geo-EAS Data Files

All programs in the system use a common format for data files. (Note: the term "Data File" is used to denote a specific type of file used by Geo-EAS programs, as opposed to Pair Comparison files or Parameter files). Data files are simple ASCII text files which may be created with any text editor. It is important to be familiar with this format, and to make sure your data files are compatible, or the programs will not be able to read them. An example data file has been included with the distribution diskettes. It is called "example.dat". Below is an explanation of the data file format.

Line 1 - Title

This line is a descriptive title which may contain up to 80 characters. Most programs display the title on the screen when the file is read into memory. Some programs will use the title as the default title for graphics screens.

Line 2 - Number of Variables (NVAR)

This line tells the programs how many variables are in the data file. The data are stored in rows and columns, where each column contains a different variable, or measured quantity, and each row represents a different sample location, time, etc. The data file may hold up to 48 variables (columns). Different programs have different limits on the number of samples (rows) which can be read. Typically, up to 1000 samples may be read. If a program encounters more than its limit of samples, the remaining samples will not be read into memory, and will not be used for computation.

Line 3 to NVAR+2 - Variable Names and Measurement Units

The lines following the number of variables should contain the names, and optionally the measurement units, for each variable (1 line per variable). The variable name used in many of the programs will be the first 10 characters of the line, and the units (optional) will be characters 11-20. When a data file is accessed by a program, the variable names are stored into toggle fields. This allows one to select variables by name, and provides some internal documentation of data file contents. Variable names will be used as default labels for graph axes, in graphic displays.

Line NVAR+3 To End of File - the Data Matrix

This is where the data are stored. Columns represent variables, and rows represent samples. The data may be in "free format", which means that in a given line in the file, variable values must be separated by at least one space, or a single comma. For readability, columns of numbers should line up, although this is not required. Variable values must be numeric with no embedded blanks. In many cases, several variables may be present in a data set, but for some reason a value could not be obtained for a particular variable in a particular sample. A special value may be given to the variable in this sample which will indicate to a program that the value is missing, so that it will not be used in calculations. The special value reserved for this is 1.E31. This is "scientific notation" for a 1 followed by 31 zeros. If your data set has missing values, be sure to type a 1.E31 where the real value would have appeared. Below is a portion of the file Example.dat. It contains 5 variables and 60 samples.

     Example.dat - Geostatistical Environmental Assessment Software
     5
     Easting   feet
     Northing  feet
     Arsenic   ppm
     Cadmium   ppm
     Lead      ppm
     288.0  311.0   .850  11.5  18.25
     285.6  288.0   .630  8.50  30.25
     273.6  269.0   1.02  7.00  20.00
     280.8  249.0   1.02  10.7  19.25
     273.6  231.0   1.01  11.2  151.5
     276.0  206.0   1.47  11.6  37.50
     285.6  182.0   .720  7.20  80.00
     288.0  164.0   .300  5.70  46.00
     292.8  137.0   .360  5.20  10.00
      ...    ...    ...   ...    ...
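The format described above is simple enough to read with a few lines of code. The following Python sketch parses the text of a Geo-EAS data file into a title, variable names, units, and a data matrix, replacing the reserved value 1.E31 with None. It is only an illustration of the file layout, not part of the Geo-EAS distribution: the function name parse_geoeas is hypothetical, and the second data row of the sample has been altered to include a missing value.

```python
MISSING = 1.0e31  # value reserved by Geo-EAS to mark missing data

def parse_geoeas(text):
    """Parse the text of a Geo-EAS data file (illustrative sketch).

    Returns (title, names, units, rows); missing values become None.
    """
    lines = text.splitlines()
    title = lines[0].strip()              # line 1: title
    nvar = int(lines[1].split()[0])       # line 2: number of variables

    names, units = [], []
    for line in lines[2:2 + nvar]:        # lines 3 to NVAR+2
        names.append(line[:10].strip())   # characters 1-10: variable name
        units.append(line[10:20].strip()) # characters 11-20: units (optional)

    rows = []
    for line in lines[2 + nvar:]:         # the data matrix
        # Free format: values separated by spaces or a single comma.
        fields = line.replace(",", " ").split()
        if not fields:
            continue                      # skip blank lines
        values = [float(f) for f in fields]
        rows.append([None if v == MISSING else v for v in values])
    return title, names, units, rows


# Hypothetical two-sample file; the Cadmium value in row 2 is "missing".
sample = """Example.dat - Geostatistical Environmental Assessment Software
5
Easting   feet
Northing  feet
Arsenic   ppm
Cadmium   ppm
Lead      ppm
288.0 311.0 .850 11.5 18.25
285.6 288.0 .630 1.E31 30.25
"""

title, names, units, rows = parse_geoeas(sample)
print(names)
print(units)
print(rows[1])
```

Note that the sketch does not enforce the limits mentioned above (48 variables, or a program's maximum number of samples); a stricter reader would check those as well.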

3.1.2 File Naming Conventions

Only valid UNIX file names will be accepted by the programs. For more information on UNIX file names, refer to the UNIX reference manual. All file names used by the Geo-EAS programs are associated with a File Prefix. The File Prefix provides a means of specifying a directory where data files should be accessed. This option is discussed in detail in the section below (Common Menu Options). Although the programs place no restriction on file extensions, it is good practice to use consistent naming conventions for file extensions. Below are the suggested extensions which are used as defaults in Geo-EAS programs.

Geo-EAS File Extensions:

.dat - a geo-eas data file

.pcf - a pair comparison file, created by prevar, read by vario

.grd - a gridded geo-eas data file (could be produced by krige)

.cpf - cokrig parameter file

.kpf - krige/vario/xvalid parameter file

.pol - polygon boundary file, used by krige

3.2 INTERACTIVE SCREENS

3.2.1 Screen Format

All Geo-EAS programs have similar interactive features. Each program uses interactive screens for selection of program options and display of results. The screens are composed of several common components. Figure 3-1 displays an example interactive screen from program Stat1. Below is a description of the common components.

A. The Screen Frame

This is the large double-line rectangle which encloses each screen. Program inputs and results are displayed in this area. Typically, the screen frame is subdivided into smaller single-line rectangles. Each of these smaller rectangles contains a functionally-related group of one or more input parameters, or program results.

B. The Message Line

This is the double line rectangle at the bottom of the screen frame. This area is used to display program error messages, yes/no prompts, prompts for additional information, or instructions for using a program option.

C. The Menu Line

This is the line of text located just below the screen frame. It contains a set of menu option names and a highlighted box (cursor bar). The cursor bar can be moved along the menu line by using the cursor control keys. As the cursor bar is moved over a menu option name, a short description of the menu option is displayed on the line just below the menu line. This line is called the menu description line. In addition, on the main screen for each program, more detailed descriptions of the menu options are displayed. You may explore the possible choices in a program by moving the cursor bar and reading the descriptive messages which accompany each menu option. To select a menu option, move the cursor bar over the desired menu option name, and press <Enter>. An alternative (and faster) way to select menu options is to press the key which corresponds to the first letter in the menu option name. The result is the same as using the cursor control keys and pressing <Enter>. In program Stat1, for example, you would choose to enter the data file name by pressing <D> (for the Data option) from the main menu.

D. Parameter Groups

Typically, a functionally-related group of program input parameters (fields) is enclosed on the screen by a single-line rectangle. These groups of parameters are accessed through the menu. When a menu option is selected (as described above), a cursor bar appears at the screen field, and a message describing what action to take appears on the message line. When such a group contains several fields, the cursor control keys are used to move to subsequent fields. Exiting from the last field in the group will return the cursor bar to the menu line. In some programs, parameter groups are arranged in a tabular fashion (rows and columns). To return to the menu line from such a group, move the cursor bar to the left or bottom of the group with the cursor control keys.

3.2.2 Types of Screen Input Fields

Several types of input fields are provided to allow flexibility in program parameter specification. Below is a list of these types, and an example of each field type in the Stat1 screen:

Alphanumeric Fields - These fields may contain character strings of alphabetic or numeric characters. Any alphanumeric characters may be entered. The "Prefix", and "Data" menu options in Stat1 require alphanumeric values to be entered. To specify a data file name, select the Data option on the menu, and type the name of the input data file.

Numeric Fields - Only numeric data may be entered into numeric fields. Some numeric fields will only accept integer (non-decimal) numbers. The programs will respond to any erroneous keystrokes (such as alphabetic keys) with a low-pitched error tone. Examples of numeric fields in program Stat1 are the two fields accessed through the Limits option. Values must be entered in the conventional manner: legal characters are <0> through <9> and <.>, and exponential notation for numeric values is not allowed.

Toggle Fields - A toggle field is a special type of field which contains a list of 2 or more preset choices. Only one of these choices is displayed in the field. The key is used to change the displayed choice, and the key is used to make the selection. Two examples of toggle fields in program Stat1 are the "Variable" field and the "Log" field. Once a file name has been specified, the "Variable" toggle field will contain the names of all variables in the file. When the Variable option on the menu line is selected, this field will be highlighted, and each time the key is pressed, a new variable name will appear in the field. When the desired variable name appears, press the key to select it. The "Log" field is an example of a toggle field with only two choices ("On", or "Off"). If "On" is chosen, then statistics will be calculated for the log of the selected variable.

Yes/No prompts, prompts for additional information - These prompts are for information which will not be displayed permanently on the screen. They will appear temporarily on the message line. A Yes/No prompt will typically have the form: "Question...?". To respond Yes, press the <Y> key; to respond No, press any other key. A typical Yes/No prompt is the "Do you really want to Quit ?" prompt which is displayed after the "Quit" (terminate program) option is selected. Some menu choices will result in prompts for additional information. These prompts will appear on the message line and may be of the alphanumeric, numeric, or toggle type.

3.2.3 The Menu Tree

The programs in the Geo-EAS system require input typically from data files and through interaction on the screen. These program inputs are arranged in a hierarchy of functionally-related groups. Each group, or individual program parameter value is accessed through a menu of choices. Some choices will lead to other menus, while some will lead to prompts for groups of one or more inputs. Such an arrangement can be represented in a "menu tree" as illustrated below for program Stat1.

Example menu tree

     Stat1 ____ Prefix
            |__ Data
            |__ Variable
            |__ Limits
            |__ Execute _________ Histogram ______ Type
            |                 |                |__ Class Limits
            |                 |                |__ Axes
            |                 |                |__ Title
            |                 |                |__ Results
            |                 |                |__ View Graph
            |                 |                |__ Quit
            |                 |
            |                 |__ Probability Plot
            |                 |__ Examine
            |                 |__ Quit
            |
            |__ Batch Statistics
            |__ Quit

In the Stat1 menu tree, as in other programs, some menu choices will lead to program inputs, and some will produce numeric or graphical results. This hierarchy of options and results is a natural and convenient way of providing choices for program use. The "menu tree" representation of program options provides a "road map" for each program which summarizes the functional capabilities of a program. You may explore the hierarchy of options by traversing the menu tree and reading the descriptive messages which appear.
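For readers who think in data structures, the Stat1 menu tree can be written down as a nested dictionary and traversed like a road map. The Python sketch below is only a way of picturing the hierarchy; it is not how the Geo-EAS programs store their menus. The names STAT1_MENU, walk, and select_by_letter are hypothetical, and the rule that the first matching option wins when a letter is typed is an assumption.

```python
# Each key is a menu option; a nested dict is a submenu, None is a leaf.
STAT1_MENU = {
    "Prefix": None,
    "Data": None,
    "Variable": None,
    "Limits": None,
    "Execute": {
        "Histogram": {
            "Type": None,
            "Class Limits": None,
            "Axes": None,
            "Title": None,
            "Results": None,
            "View Graph": None,
            "Quit": None,
        },
        "Probability Plot": None,
        "Examine": None,
        "Quit": None,
    },
    "Batch Statistics": None,
    "Quit": None,
}

def walk(menu, path=()):
    """Yield the full path to every option in the tree, depth first."""
    for name, submenu in menu.items():
        here = path + (name,)
        yield here
        if submenu:
            yield from walk(submenu, here)

def select_by_letter(menu, key):
    """Return the first option in one menu whose name starts with `key`.

    Assumption: the first match wins; returns None if nothing matches.
    """
    for name in menu:
        if name[0].lower() == key.lower():
            return name
    return None


# Print the tree with two spaces of indentation per level.
for p in walk(STAT1_MENU):
    print("  " * (len(p) - 1) + p[-1])

print(select_by_letter(STAT1_MENU, "d"))
```

Each Quit entry corresponds to moving up one level in the tree, which is why it appears once per menu.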

3.2.4 Common Menu Options

Many of the programs in the Geo-EAS system share common menu options. These will be discussed in this section to avoid redundancy. Any minor differences which apply to a particular program will be discussed later in detail. The following is a list of options common to many programs:

Prefix

This option is common to all Geo-EAS programs. It is used to specify a string of up to 50 alphanumeric characters which are used as a prefix for all files accessed by a program. Typically, it is used to include a disk drive and/or a directory specification. Before a file is accessed by a program, a file name is constructed which consists of the File Prefix followed by the given file name. File name errors are not caught until a program actually attempts to access the file, so a "file not found" error message is issued only at that point. Such a message may be due to a mistake in either the file prefix or the file name specified.

Data

This option is common to most Geo-EAS programs. It is used to indicate the Geo-EAS input data file to be used by the program. File names may consist of up to 14 alphanumeric characters. Any valid UNIX file name may be used. The File Prefix (discussed above) is used to construct the entire file name when the file is accessed by the program. If any errors occur while the programs are accessing or reading a data file, a message indicating the problem will be issued. If a "file not found" message is displayed, the problem may be with the file prefix (see above). If no errors occur, the variable names are read from the data file and stored into toggle fields for use by the Variables option.

Variable(s)

This option is common to all programs which use Geo-EAS data files. It allows you to specify the variable (or variables) which the program will use. Some programs only use one variable (e.g. Stat1) and others require more. Typically, this option will provide access to one or more toggle fields which contain the variable names. Some programs include additional fields for selection of other parameters related to the choice of variables. These will be explained in the particular section which describes the program.

Execute

This option is common to all Geo-EAS programs. It is used to initiate processing of data by the program. Although the processing and interaction subsequent to the selection of this option is different for each program, it shares the common function of initiation of processing. The individual differences in processing will be described in more detail in the subsections which describe the programs.

Read Parameters, Save Parameters

These options are common to all Geo-EAS programs which make use of "Parameter Files". Parameter Files are files which contain values for all parameter choices available in a particular program. If a program provides this feature, you may save the values of parameters for later use, by using the Save Parameters option. Selection of this option will result in a prompt for the output parameter file name. The File Prefix is used to create or access the file. The Read Parameters option is used to load the parameter values into the program. When this option is selected, an input parameter file name must be entered. Typically, a program will attempt to load all data and set all parameter values based upon the information in the input parameter file. It is assumed that the data file associated with the parameter file is in the same location (subdirectory, etc.) as it was when the parameter file was saved. If any errors occur while accessing or reading the parameter file or the associated data file, an error message will be issued and the program will re-initialize all parameter values to their defaults. Conventions should be used when naming parameter files so that they can be associated with the appropriate data files and programs. A suggested convention for file extensions is given in a previous section (File Naming Conventions). It is also suggested that the first part of the file name have some similarity to the associated data file name.

Quit

This option is common to all Geo-EAS programs. It is used to exit from a menu, or program. Using the analogy of the Menu Tree, the Quit option allows you to "move up" one level in the tree. When the Quit option is used from the main menu of a particular program, a Yes/No prompt is issued: "Do you really want to quit ?". The <Q> key is typically used to select this option. The Yes/No prompt is a means of ensuring that a series of keystrokes will not cause inadvertent termination of the program.
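The way the Prefix and Data options combine into a full file name, as described under Prefix above, can be illustrated with a short Python sketch. It assumes the prefix and file name are joined by plain concatenation with no separator added, which is the natural reading of "the File Prefix followed by the given file name"; the function name and the directory shown are hypothetical.

```python
MAX_PREFIX = 50  # the Prefix field holds up to 50 characters

def full_file_name(prefix, name):
    """Build the name a program would try to open (illustrative sketch).

    Assumption: plain concatenation, so a directory prefix must end
    with its own "/" separator.
    """
    if len(prefix) > MAX_PREFIX:
        raise ValueError("prefix longer than 50 characters")
    return prefix + name


# With the trailing slash the file is found in the intended directory.
print(full_file_name("/home/user/geoeas/data/", "example.dat"))

# Without it, a different (probably nonexistent) name is produced,
# which is one way to end up with a "file not found" message.
print(full_file_name("/home/user/geoeas/data", "example.dat"))
```

If the assumption holds, a missing trailing slash in the prefix is a likely first thing to check when a file cannot be found.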

3.3 Geo-EAS GRAPHICS

3.3.1 On-Screen Graphics

Many Geo-EAS programs have graphics capability. Each such program uses graphics in one of two ways. Programs Stat1, Vario, Xvalid, and Krige plot graphics directly on the screen. This approach is used to provide a quick look at data, or program results. Such graphics displays may be saved to a postscript file. When a graphics screen is displayed, the program will wait for a key to be pressed. Pressing "q" will quit the graphics. Pressing R2 ("PrSc" on our keyboard) will produce a postscript file.

3.3.2 Postscript Output

You can get postscript files of your plots, by punching the R2 button on your keyboard (the PrSc key on some keyboards), before leaving the graphics window with a "q". Once you type "q" the file will be created. Typing the R2 key once you have typed "q" will only annoy the window, and it may get out of hand and bite you.

One other thing you should know: the default size for the window is about right for making ps output. The fonts and such don't scale properly, so if you use a bigger window the ps file can look worse, rather than better (as you might otherwise suspect). So if you're going to be taking ps output, consider using the default-sized windows.

3.4 ERROR AND RECOVERY PROCEDURES

Normal Error Processing

A great deal of effort has been put into error checking in the programs. This includes bounds checking on numeric parameters, file names, file Input/Output, and file existence. When errors of these types are encountered in programs, error messages are displayed on the message line at the bottom of the screen. These messages are displayed in a black-on-white format (reverse). A typical error message is "Error encountered while reading data file...(press any key)". These messages will remain on the screen until a key is pressed. To return to the interactive screen after such a message is displayed, press any key.

Bugs

As everyone knows, people make mistakes. Because computer programs are designed by people, they usually contain mistakes, or errors in program(mer) logic called "bugs". The Geo-EAS programs have been extensively tested, and many bugs have been uncovered and corrected. No known bugs have been allowed to remain; however, it is entirely possible that a few bugs are still lurking in the depths of some of the programs. There may still be situations which cause a program to "crash" or "fail" (terminate prematurely), or to "lock up" (pause indefinitely with no response). If a program terminates prematurely, it is probably due to a bug.

Bug Reporting (how you can help):

If you encounter a problem and you think it may be a significant bug, a bug report would be greatly appreciated. When you suspect a bug and want to report it, there are several steps that you should take: First, you should try to reproduce it (make it happen more than once). If a bug is not reproducible, it will be very difficult to determine the cause. Secondly, you should make note of the exact sequence of inputs which caused the problem, including your hardware configuration (if known). This information should be included on paper or diskette along with a description of the problem, so that the bug may be corrected. A detailed description and all program inputs will allow the programmers to reproduce the error and solve it more quickly. Every effort will be made to correct significant program errors as soon as possible. Check the READ.ME file in the software distribution, to get the address where known bugs may be reported.

3.5: Important Additional Notes:

-----------------------------------------------------------------------------

1) Crashes:

If your xterm is not big enough for the program it will crash most undiplomatically (segmentation fault). Make sure that your window is big enough before running the program! The menu.xgen and geoeas programs take care of this. If you're going to use geoeas without these interfaces, you have to do it yourself! No Mom!

Try

to be sure. That's what geoeas does.

2) Graphics windows:

The reason you need to give input through the graphics window (for those programs with graphics) is that by doing so we keep that window active (for resizing, redrawing, and the like). It is now the case that ALL input comes through the graphics windows (it used to be otherwise) for those programs with graphics.

2.5) Graphics windows and COLORS:

You may find that the colors act funny in the graphics windows. If this is the case, you probably have some other application running which has played with the color table. You may have to stop that one (e.g. GRASS, which does this) before running Geo-EAS. We're trying to fix this.

3) PS output:

You can get a postscript copy of most of your graphical output, by typing your PrSc key (if you have a sun) or the R2 key in general. You must do this right before quitting the plot via "q". In order to get ps output of the grids in the cases of krige and xvalid, you have to select the print option when you initialize the debug options.

For the moment, those files have obvious names (e.g. histogram.ps). In the future, you will have the option of naming them yourself.

4) corres 3-d-spin feature:

We are using a public domain program called XGobi to do our 3-d data spinning. It is available from a number of servers by anonymous ftp, and is well worth the trouble of obtaining it.

I got my version this way:

There is a big doc file too, which serves as a manual, called xgobi.doc.full.ps.Z.

Then go about detarring it, etc. The version I last got is included, in the geo-gobi directory. You will need to have it referenced by typing simply

at the prompt, as that is what corres will call. So alias it if necessary, or add it to your path. Your administrator can tell you how to do these things.

5) More on XGobi:

First check to see if it is already on your system. If not, the tar file we currently use is distributed in the same directory as our source code. If you want to use the version I've included, then ftp it over into the $GEO_EAS/geo-gobi directory: once in ftp at our site, get the xgobi.tar.Z file and put it there.

If it is still in the compressed (.Z) format, then

else, you simply type

(takes awhile),

(or wherever you want it in your path).

6) generic: To add your own program, which will look like the rest

In this directory is most of what you need to use the geoeas-type interface to your favorite program that needs no more than a prefix, filename, and (optionally) a selection of variables to run.

You are starting out with a generic program which will read in the data (note: double precision! change generic.inc and datascr.f files from real*8 to real*4 to make a single precision version), and optionally divide the data into three different subsets (kept, supplementary, and ignored).

You take it from there in the execscr.f program. This is called when you choose option "Yours" from the menu of screen three.

Also remove the lines from yours.f as noted there.

For an example of how this is done, consider the geo-gobi program. It is based on this idea. It simply calls another program, after writing a few files that xgobi finds handy.

7) Colors (weird things happening to the X Window displays):

Some programs, like GRASS, do funny things to color tables. If you are getting really outrageous looking plots, try turning off other programs. (Splus, etc.) If all else fails, you can replace the color.dec file in the common directory by

(In other words, all white!).

8) Recompiling:

Do a "make clean" in the $GEO_EAS directory if you change anything in the common directory, anything of the form *.inc or *.dec, etc. I haven't got all the dependencies right in the makefiles yet, so the safest thing to do is always to start from scratch, i.e.

(If you bring in another version, and de-tar it, any object files lying around from the previous version may get stuck into the new compiled programs, wreaking havoc!)

Basically, any time you change stuff in a directory, removing the *.o files should take care of any problems.

9) Core Dumps

One of the bugs of xgen is that it can leave you with big core dumps (see its man page). One of the ways around the whole "core dump thing" (for those of you who use ksh, and never look at them anyway) is to set this in your .kshrc file:

I got this idea from the ksh man page; the relevant entry is below:

     ulimit [ -cdfmpt ] [ n ]
          -c   imposes a size limit of n blocks on the size of
               core dumps (BSD only).

10) Window not accepting input:

You may have trouble getting the X Window to accept input if you have your flip focus set improperly. If you're using OpenWindows on a Sun, for example, you want to have the line

in your .Xdefaults file.

That's all for now folks!

-----------------------------------------------------------------------------

SECTION 4: USING Geo-EAS IN A GEOSTATISTICAL STUDY: AN EXAMPLE

4.1 OVERVIEW

This section will demonstrate how to use Geo-EAS software to conduct a geostatistical study. Starting with an example data set from a hypothetical pollution plume, you will work through a complete study, using many of the Geo-EAS programs in the process. Of necessity, this exercise will be somewhat abbreviated. We will conduct a relatively straightforward study, illustrating the options which are likely to be most commonly used. The data set (example.dat) has been included with the software, so that you may repeat the exercise as a tutorial, or to test the software.

The scenario for the example is that data has been acquired from analyses of 60 soil samples at a site contaminated with arsenic, cadmium, and lead. The basic objectives are to examine the data set for possible errors or outliers, and to construct contour maps of each of the variables to define the areas of highest concentration. In this example you will work primarily with the variable Cadmium; you are encouraged to try out these procedures with the arsenic or lead data.

The Example Data Set

The Example.dat data set is an ASCII file in the Geo-EAS format. It contains data from 60 sample locations. The file structure is described in Section 3 above. The first few lines are as follows:

     Example.dat - Geostatistical Environmental Assessment Software
     5
     Easting     feet
     Northing    feet
     Arsenic     ppm
     Cadmium     ppm
     Lead        ppm
     288.0     311.0     .850      11.5      18.25
     285.6     288.0     .630      8.50      30.25
     273.6     269.0     1.02      7.00      20.00
     280.8     249.0     1.02      10.7      19.25
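The layout shown above (a title line, the number of variables, one name-and-units line per variable, then one row of values per sample) is simple enough to read with a short script. The Python sketch below is not part of Geo-EAS. The function name parse_geoeas is hypothetical, and the sketch assumes that variable names contain no spaces and that fields are separated by whitespace.

```python
from typing import List, Tuple


def parse_geoeas(text: str) -> Tuple[str, List[str], List[str], List[List[float]]]:
    """Return (title, names, units, rows) from Geo-EAS formatted text."""
    # Skip blank lines so that double-spaced listings parse as well.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0]
    nvar = int(lines[1])
    names: List[str] = []
    units: List[str] = []
    for ln in lines[2:2 + nvar]:
        parts = ln.split(None, 1)  # assumption: the name is the first token
        names.append(parts[0])
        units.append(parts[1] if len(parts) > 1 else "")
    rows: List[List[float]] = []
    for ln in lines[2 + nvar:]:
        values = [float(v) for v in ln.split()]
        if len(values) != nvar:
            raise ValueError("expected %d values, got %d" % (nvar, len(values)))
        rows.append(values)
    return title, names, units, rows


# The first few lines of Example.dat, as listed above.
SAMPLE = """Example.dat - Geostatistical Environmental Assessment Software
5
Easting feet
Northing feet
Arsenic ppm
Cadmium ppm
Lead ppm
288.0 311.0 .850 11.5 18.25
285.6 288.0 .630 8.50 30.25
273.6 269.0 1.02 7.00 20.00
280.8 249.0 1.02 10.7 19.25
"""

title, names, units, rows = parse_geoeas(SAMPLE)
cadmium = [row[names.index("Cadmium")] for row in rows]
print(names)
print(cadmium)
```

Reading the header first and checking each row against the declared number of variables catches truncated or misaligned rows early, which is useful when screening a data set for errors.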

4.2 EXPLORATORY DATA ANALYSIS

The first order of business in any data analysis is to become familiar with the data set. You will use a combination of statistics and graphical displays to look at the range and shape of the frequency distribution, to look for data outliers which may be erroneous or unrepresentative, to look at the "spatial coverage" of the data, and to look for spatial patterns in the data.

Begin by taking a look at a map of the data produced by the program 3-d View. Assuming that you have already copied the software and data into a directory called Geoeas, and have used the command "cd /Geoeas" to access the directory, you can run 3-d View either by: (1) using the command-line argument

or by (2) running the program directly from the menu system by tapping on the program 3-d View.

When the program begins execution, it first displays a screen with introductory information. When you press a key to proceed, you will see the program main screen and menu, as displayed in Figure 4-1.

Figure 4-1 3-d View Main Screen

The bottom line on the screen provides the list of available options. The Variables option allows you to select variables against which to plot glyph, color, or row name (this makes them supplementary and moves them to the end of the xgobi file so that they can remain identifiable). The 3-d View option starts the actual processing portion of the program, writing the data in a format that xgobi likes, while Quit moves you to the preceding menu (or out of the program). All menus in the system operate in a similar way: options are selected by moving the highlighted bar to the desired option name with the arrow keys and pressing Enter, or by typing the first character of the option name.

*** NOTE *** Whenever feasible, the programs will use default options and values. These may be preset, computed from the available data, or passed from a previously run program. Be careful! The computer doesn't understand your problem or your data. Defaults make it easy to get a result quickly, not necessarily to get an appropriate result. In this example we will usually use the defaults to get quick results. Try other options to get familiar with the full range of system capabilities.

In 3-d View, the minimum that you MUST do to obtain a plot is:

1. Use the Data option to enter the name of a Geo-EAS data file (or accept the default name, if one is provided).

2. Use the 3-d View option to run xgobi.

The program reads the data file, and automatically executes the Variable option. We want Cadmium as glyph, so we select Cadmium (make it supplementary). Now use the Execute option; it will write the necessary files, and call xgobi. After examining the plots, 3-d spins, and playing around in xgobi to your heart's content, exit, then hit "q" in 3-d View to return to the main menu, and use the Quit option to exit.

To simplify our explanations in the future, an abbreviated notation for the above sequence of events will be used. A general formula exists for each option: initiate the option; then take one or more actions, each of which may result in a screen field taking a particular value. The above sequence thus becomes:

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
DATA             Enter    Data File        Example.dat
VARIABLE         Select   variable         Cadmium
3-d View         play around in xgobi; exit Q Q Y

The resulting Post plot is shown in Figure 4-2.

Figure 4-2: Xgobi, with glyph weighted by Cadmium

From this plot you can see that the samples fall in a rectangular area about 250 by 220 feet. The sample locations are irregularly spaced, and although there are some gaps and clusters, they provide relatively uniform coverage over the entire rectangle. The symbols represent the tenths for the cadmium values. A general trend can be seen in the cadmium values: The highest values occur in a rough E-W band through the center of the plot, while the lowest fall in parallel bands at the north and south margins. The * in the northwest corner of the area seems to be an exception to the trend; it seems too high compared to its surroundings. Such "spatial outliers" should be checked to confirm that their coordinates and data values are valid. For this example, you will assume that this sample is valid.

Now you should use program Stat1 to generate some statistics on the data. When Stat1 is initiated, the Main screen will be displayed as in Figure 4-3.

Figure 4-3 Stat1 Main Screen

The option sequence below is the minimum required to compute univariate statistics and display a histogram and a probability plot for the variable Cadmium. Note that a default file name (Example.dat) has been carried forward from the previous program. When you finish examining the histogram, you do not go directly back to the main menu; an intermediate menu lets you select alternate options for replotting the histogram.

COMMAND          ACTION   FIELD            VALUE
----------------------------------------------------------
DATA             Accept   Data File        Example.dat
VARIABLE         Select   variable         Cadmium
                 Accept   weight           None
                 Accept   log option       Off
EXECUTE
HISTOGRAM
QUIT
PROB.PLOT

The univariate statistics, histogram, and probability plot are generated. Figure 4-4 displays the univariate statistics for cadmium. Figures 4-5 and 4-6 display the histogram and probability plot for cadmium.

Figure 4-4 Stat1 Results Screen

Figure 4-5 Histogram of Cadmium

Figure 4-6 Cadmium Probability Plot

From the histogram and the statistics, it can be seen that this data set is nearly symmetrical about the mean value (the mean is close to the median, and approximately halfway between the minimum and maximum values). There are no suspect outliers. The probability plot shows that the data set approximates a normal distribution (a probability plot is a cumulative frequency plot scaled so that a normal distribution plots as a straight line). Whether a distribution is normal, log-normal, or something else has no particular geostatistical significance, except that it is often more difficult to interpret variograms for highly skewed distributions such as the log-normal, and in such cases it may be useful to also compute variograms on log-transformed data.

4.3 VARIOGRAM ANALYSIS

The computation, interpretation, and modeling of variograms is the "heart" of a geostatistical study. The variogram model is your interpretation of the spatial correlation structure of the sample data set. It controls the way that kriging weights are assigned to samples during interpolation, and consequently controls the quality of the results.

All interpolation and contouring methods make the assumption that some type of spatial correlation is present, that is, they assume that a measurement at any point represents nearby locations better than locations farther away. Variogram analysis attempts to quantify this relationship: How well can a measurement be expected to represent another location a specific distance (and direction) away? Experimental variograms plot a measure of the average difference between pairs of measurements (strictly, one-half the average squared difference, referred to here as the variance) against the distances separating the pairs. If you had measurements at all possible sample locations, you could compute the "true variogram" for a site, i.e., the variance of all pairs of measurements which satisfy each combination of distance and direction. In practice, with limited data, you compute the variances for groups of pairs of measurements in class intervals of similar distance and direction. You then plot a graph of the variances versus distance for a particular direction, and fit a model curve to the graph; the model is assumed to be an approximation of the "true variogram".
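The calculation described above can be sketched in a few lines of Python. This is an illustration only, not the Vario source code: the function name experimental_variogram is hypothetical, all directions are pooled (an omnidirectional variogram), and a pair at distance h is assumed to fall in class number int(h / lag). For each class the sketch reports the mean pair distance, one-half the mean squared difference, and the number of pairs.

```python
import math
from typing import List, Sequence, Tuple


def experimental_variogram(
    x: Sequence[float],
    y: Sequence[float],
    z: Sequence[float],
    lag: float,
    max_dist: float,
) -> List[Tuple[float, float, int]]:
    """Return (mean distance, half mean squared difference, pair count) per lag class."""
    nlags = int(math.ceil(max_dist / lag))
    sq_sums = [0.0] * nlags
    dist_sums = [0.0] * nlags
    counts = [0] * nlags
    n = len(z)
    for i in range(n - 1):
        for j in range(i + 1, n):
            h = math.hypot(x[i] - x[j], y[i] - y[j])
            if h <= 0.0 or h > max_dist:
                continue  # skip coincident samples and pairs beyond the maximum
            k = min(int(h / lag), nlags - 1)
            sq_sums[k] += (z[i] - z[j]) ** 2
            dist_sums[k] += h
            counts[k] += 1
    result = []
    for k in range(nlags):
        if counts[k] > 0:
            result.append(
                (dist_sums[k] / counts[k], sq_sums[k] / (2.0 * counts[k]), counts[k])
            )
    return result


# Cadmium values at the four sample locations listed in Section 4.1.
east = [288.0, 285.6, 273.6, 280.8]
north = [311.0, 288.0, 269.0, 249.0]
cadmium = [11.5, 8.50, 7.00, 10.7]
for mean_h, gamma, npairs in experimental_variogram(east, north, cadmium, 25.0, 75.0):
    print(round(mean_h, 1), round(gamma, 2), npairs)
```

With only four samples the classes hold one or two pairs each, so the printed values are very noisy; the full 60-sample data set is needed before the shape of the curve means anything.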

Continuing with the example, we will use Prevar to create an intermediate file of data pairs, and Vario to compute, plot, and model variograms. No automatic model fitting is provided; we will use Vario to superimpose plots of various model curves on the experimental variogram until we find one that looks right.

Prevar is a simple program with only a few options to allow you to reduce the number of sample pairs in the output file to the maximum allowed by Vario by setting minimum and maximum limits on X and Y, and by setting a maximum distance for pairs. This is necessary when the number of samples in the data set exceeds 181. Upon initiating Prevar, the main screen will be displayed, as in Figure 4-7.
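The reason a limit on samples arises is that the number of distinct pairs grows roughly with the square of the number of samples. The short Python sketch below (not part of Prevar; the function names are hypothetical) counts all pairs for a given number of samples, and counts the pairs that survive a maximum-distance cutoff of the kind Prevar lets you set. The maximum pair count accepted by Vario is not restated here.

```python
import math
from typing import Sequence


def pair_count(n: int) -> int:
    """Number of distinct pairs among n samples: n(n-1)/2."""
    return n * (n - 1) // 2


def pairs_within(x: Sequence[float], y: Sequence[float], max_dist: float) -> int:
    """Number of pairs whose separation does not exceed max_dist."""
    n = len(x)
    return sum(
        1
        for i in range(n - 1)
        for j in range(i + 1, n)
        if math.hypot(x[i] - x[j], y[i] - y[j]) <= max_dist
    )


for n in (60, 181, 182):
    print(n, pair_count(n))
# prints:
# 60 1770
# 181 16290
# 182 16471
```

The 60 samples in Example.dat produce 1770 pairs, well below the count reached at 181 samples, so no limits need to be set for this exercise.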

Figure 4-7 Prevar Main Screen

The option sequence below creates the pair comparison file Example.pcf.

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
FILES            Accept   Data File        Example.dat
                 Accept   Pairs File       Example.pcf
EXECUTE
QUIT             Answer                    Y

Next, initiate Vario, and the Vario Main screen is displayed, as in Figure 4-8.

Figure 4-8 Vario Main Screen

The following option sequence reads the pair comparison file into memory, and moves to the next menu:

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
DATA             Accept   Pairs File       Example.pcf
VARIABLE         Toggle   Variable         Cadmium
                 Accept   Log Option       Off
OPTIONS/EXECUTE

The Options/Execute menu allows us to specify how we want the experimental variogram to be computed. This screen and menu is displayed in Figure 4-9.

Figure 4-9 Vario Options Screen

First, specify the distance class intervals (lags) and directional tolerances for computing the variogram. Finding the "right" combination is a trial and error exercise, but a systematic approach can be helpful:

To start, you will use the default direction, which is an "omnidirectional" variogram. The angular tolerance of 90 degrees on either side of any specified direction line allows all pairs to be included regardless of direction. This maximizes the number of pairs in each distance class, which usually gives the "best" or smoothest variogram. See Figure 10-3 in Section 10 for a more detailed illustration of the direction parameters. From this omnidirectional variogram we can usually get the best estimate of the y-intercept (nugget) and maximum value (sill) parameters for the variogram model, as well as the best idea of what type of model(s) should be fitted.

Next, try several different lag intervals for plotting the experimental variograms. You are trying to obtain the maximum detail at small distances (i.e., small lags) without being misled by structural artifacts due to the particular class interval used. You will have more confidence in a model if it fits experimental variograms computed at several different lag intervals.

The default lag intervals are computed from a rule-of-thumb which states that variograms are generally not valid beyond one-half the maximum distance between samples. The maximum pair distance is therefore divided by two, and then subdivided into ten equal distance classes. Round these to the more convenient numbers of 150 and 15, and plot the resulting variogram (Figure 4-10), as follows:

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
NEW LAGS         Accept   Minimum          0
                 Input    Maximum          150
                 Input    Increment        15
EXECUTE
PLOT

Figure 4-10 Variogram of Cadmium
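The rule of thumb behind the default lags (half the maximum pair distance, divided into ten classes) can be written out as a short Python sketch. The function name default_lags is hypothetical and this is not the Vario source code; the values Vario actually displays for Example.dat are the ones you round to 150 and 15.

```python
import math
from typing import Sequence, Tuple


def default_lags(
    x: Sequence[float], y: Sequence[float], nclasses: int = 10
) -> Tuple[float, float]:
    """Return (maximum lag distance, lag increment) from the rule of thumb."""
    n = len(x)
    max_pair = max(
        math.hypot(x[i] - x[j], y[i] - y[j])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    max_lag = max_pair / 2.0  # variograms are not trusted beyond half the extent
    return max_lag, max_lag / nclasses


# Two samples 5 units apart give a maximum lag of 2.5 and an increment of 0.25.
print(default_lags([0.0, 3.0], [0.0, 4.0]))
```

Rounding the two numbers afterwards, as done in the text, only changes where the class boundaries fall; it does not change which pairs are available.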

This variogram shows a well defined structure. Except for the fifth point, which is too low, the shape is typical of a "spherical" model variogram, i.e., an initial linear increase from the Y-intercept curving relatively sharply into a horizontal constant value. The spherical type of variogram is observed frequently in experimental variograms, and is one of the model options available in Geo-EAS. To fit a spherical model to a variogram, you need to estimate the "nugget" or Y-intercept, the "sill" or difference between the nugget and the maximum value, and the "range" or distance at which the model reaches the maximum value. With a little practice, good fits can usually be obtained within two or three tries.
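For readers who want to see the curve being fitted, the usual textbook form of the spherical model is sketched below in Python. The function name is hypothetical and the code is not taken from Vario. As in the text, "sill" here means the difference between the nugget and the maximum value, so the curve levels off at nugget + sill once the distance reaches the range.

```python
def spherical(h: float, nugget: float, sill: float, rng: float) -> float:
    """Spherical variogram model; 'sill' is the rise above the nugget."""
    if h <= 0.0:
        return 0.0  # a variogram is zero at zero distance by definition
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)


# Compare two candidate ranges at the lag spacing used in the text.
for h in range(15, 151, 15):
    print(h, round(spherical(h, 5.0, 11.0, 80.0), 2), round(spherical(h, 5.0, 11.0, 100.0), 2))
```

Lengthening the range leaves the nugget and the maximum value unchanged and only lowers the curve at short distances, which is why the range is the parameter to adjust when the initial slope looks too steep.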

Try an initial model with a nugget of 5, a sill of 11, and a range of 80, using the following option sequence (Note that after variogram model parameters have been entered, the arrow key can be used to exit the model option). The resulting graph is displayed in Figure 4-11.

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
MODEL
MODEL            Input    Nugget           5
                 Toggle   Type             Spherical
                 Input    Sill             11
                 Input    Range            80
PLOT

Figure 4-11 Variogram with model: Nugget=5, Spherical, Sill=11, Range=80

This model fits reasonably well at the nugget and sill, but the initial slope is too steep, indicating that the range is too low. It appears that the curve would fit well if it were shifted about 25% to the right, so you should try again with a range of 100:

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
MODEL            Accept   Nugget           5
                 Accept   Type             Spherical
                 Accept   Sill             11
                 Input    Range            100
PLOT

The resulting graph is displayed in Figure 4-12.

Figure 4-12 Variogram model: Nugget=5, Spherical, Sill=11, Range=100

This is an excellent fit; about as good as you can get with a spherical model. That low fifth point, however, suggests that an exponential model which has a more gentle curvature may also provide a good fit. (If you are unfamiliar with the four types of models available in VARIO, repeat the option sequence above three more times, changing the model type each time.) A bit of trial and error leads to an exponential model with a nugget of 4.5, sill of 13.5, and range of 160. This graph is displayed in Figure 4-13.

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
MODEL            Input    Nugget           4.5
                 Toggle   Type             Exponential
                 Input    Sill             13.5
                 Input    Range            160
PLOT

Figure 4-13 Variogram with Exponential model: nugget=4.5, Sill=13.5, Range=160
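To see how close the two fitted models are, they can be tabulated side by side. The Python sketch below is an illustration, not Vario code. It assumes that the exponential model's range is a "practical" range, i.e. the distance at which the curve reaches about 95% of the sill, which corresponds to a factor of 3 in the exponent; that convention is an assumption here and is not stated in this section.

```python
import math


def spherical(h: float, nugget: float, sill: float, rng: float) -> float:
    """Spherical model; 'sill' is the rise above the nugget."""
    if h <= 0.0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)


def exponential(h: float, nugget: float, sill: float, rng: float) -> float:
    """Exponential model; assumes 'rng' is the practical (95%) range."""
    if h <= 0.0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-3.0 * h / rng))


# Spherical (5, 11, 100) against exponential (4.5, 13.5, 160), as fitted above.
for h in range(0, 151, 25):
    print(h, round(spherical(h, 5.0, 11.0, 100.0), 2), round(exponential(h, 4.5, 13.5, 160.0), 2))
```

Under this assumption the two curves stay close to each other over the plotted distances, which is consistent with both appearing to fit the same experimental points.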

Some obvious questions that come up at this point are: Which one of these models is best? How do you decide which one to use? What happens if you pick the wrong one? Unfortunately there aren't any simple answers. The best model is the one which most closely matches the true variogram for the site, but of course, you will never know what that is unless you exhaustively sample the site. Although some form of least squares criterion could be used, in Geo-EAS the selection must be made subjectively, simply by picking the model that looks like the best fit. Sometimes the error distributions obtained from cross-validation can help you to decide which model to use. Fortunately, the differences between the spherical and exponential models above will only cause minor differences in the kriged estimates, so that either one would be an acceptable choice.

If you look at variograms computed at different lags, it will become obvious that the experimental variograms contain quite a bit of noise. The shape of the experimental variogram changes as the lag spacing changes, and the model which appears to fit best at one lag spacing may appear to be the worst at another. Try a lag spacing of ten units. The graph of Figure 4-14 will be displayed.

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
QUIT
QUIT
NEW LAGS         Accept   Minimum          0
                 Accept   Maximum          150
                 Input    Increment        10
EXECUTE
MODEL
PLOT

Figure 4-14 Variogram with Exponential Model, Lag Spacing=10

Now repeat the above option sequence with a lag increment of 25. The resulting graph is displayed in Figure 4-15.

Figure 4-15 Variogram with Exponential Model, Lag Spacing=25

As you can see, the ability to define the "true" variogram structure is limited by the particular set of data available. The best you can do is to find a model which fits reasonably well over a range of lag spacings. Both of the models proposed earlier are satisfactory. In practice, kriging estimates calculated with the two variogram models will be almost identical. Kriging standard deviations, however, are more sensitive than the estimates to changes in the variogram model, as well as to differences between the "real world" and the assumptions underlying the kriging equations. For this reason, it is generally not wise to interpret kriging standard deviations as a true measure of the estimation error. For the remaining discussion of variograms the exponential model will be used.

At this point in the structural analysis, anisotropy is the major remaining question. When you looked at the post plot of the data, there appeared to be a tendency for similar values to form elongated E-W bands. Now you want to see if directional variograms confirm this effect. Like lag spacings, directions and angular tolerances require a trade-off between resolution and precision. If you plot four directional variograms at angles of 0, 45, 90, and 135 degrees, with a tolerance of 22.5 degrees, you have effectively divided the pairs in our omnidirectional variogram into four subsets. This causes an increase in noise comparable to reducing the lag spacing by a factor of four. It is therefore advisable to use a larger lag interval for computing directional variograms. You should use a lag spacing of 25 to compute the four directional variograms listed above, superimposing the omnidirectional exponential model on the plot. Run the option sequence below four times, changing only the direction angle. Figures 4-16 through 4-19 display the directional variograms for 0, 45, 90, and 135 degrees.

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
QUIT
QUIT
DIRECTION        Input    Direction        0
                 Input    Tolerance        22.5
                 Accept   Bandwidth        MAX
EXECUTE
MODEL
PLOT

Figure 4-16 Variogram, Exponential Model, Direction=0

Figure 4-17 Variogram, Exponential Model, Direction=45

Figure 4-18 Variogram, Exponential Model, Direction=90

Figure 4-19 Variogram, Exponential Model, Direction=135
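The way a direction and an angular tolerance divide the pairs into subsets can be sketched in Python as follows. This is an illustration rather than the Vario algorithm: the function name is hypothetical, the direction angle is assumed to be measured counterclockwise from the positive X (East) axis, and the bandwidth limit is ignored. A pair and its reverse are treated as the same direction, so angles are compared modulo 180 degrees.

```python
import math


def in_direction(dx: float, dy: float, direction_deg: float, tolerance_deg: float) -> bool:
    """True if the pair separation (dx, dy) lies within the tolerance of the direction."""
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    diff = abs(angle - direction_deg) % 180.0
    if diff > 90.0:
        diff = 180.0 - diff  # fold so that 0 <= diff <= 90
    return diff <= tolerance_deg


# Five hypothetical pair separations, counted in each of the four directions.
pairs = [(10.0, 0.0), (7.0, 7.0), (0.0, 10.0), (-7.0, 7.0), (10.0, 2.0)]
for direction in (0.0, 45.0, 90.0, 135.0):
    n = sum(in_direction(dx, dy, direction, 22.5) for dx, dy in pairs)
    print(direction, n)
```

With a tolerance of 90 degrees every pair passes the test for any direction, which is the omnidirectional case; with 22.5 degrees each pair lands in only one of the four directions (apart from pairs lying exactly on a boundary), so each directional variogram sees roughly a quarter of the pairs.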

These four variograms provide a good illustration of why you usually model the omnidirectional variogram first. In the directional variograms, most of the definition of shape, nugget, and sill has been obscured. You can see, however, a general confirmation of the assumption about the anisotropy. The points on the 0 degree variogram all fall below the omnidirectional model, suggesting the range in that direction should be longer. The opposite is true for the 90 degree variogram, while the 45 and 135 degree variograms are reasonably well fitted by the omnidirectional model. Obviously, one cannot be too precise about fitting ranges to these directional variograms. Likewise, it would not be worthwhile to attempt a more precise definition of the direction of maximum and minimum range.

The program Krige assumes that the directional variogram model ranges form an elliptical pattern. It is therefore only necessary to fit models to the major and minor axis directions to define the entire 2-D structure. One could make a case for the range of the major axis (0 degrees) of the exponential model being anywhere between 250 and 400 units, and the minor axis (90 degrees) being between 60 and 120 units. We will settle on a model with major and minor axes of 300 and 100 units, respectively, and move on to kriging.
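The elliptical pattern of ranges can be made concrete with a short Python sketch. It is not taken from Krige, and the function names are hypothetical. It assumes the angle of the major axis is measured counterclockwise from the positive X (East) axis, and it shows two equivalent views: the range implied by the ellipse in any direction, and the "reduced" distance of a pair expressed as a fraction of the range in its own direction.

```python
import math


def directional_range(theta_deg: float, major: float, minor: float,
                      major_angle_deg: float = 0.0) -> float:
    """Radius of the range ellipse in direction theta."""
    phi = math.radians(theta_deg - major_angle_deg)
    return (major * minor) / math.hypot(minor * math.cos(phi), major * math.sin(phi))


def reduced_distance(dx: float, dy: float, major: float, minor: float,
                     major_angle_deg: float = 0.0) -> float:
    """Separation divided by the range in the direction of the pair."""
    a = math.radians(major_angle_deg)
    u = dx * math.cos(a) + dy * math.sin(a)    # component along the major axis
    v = -dx * math.sin(a) + dy * math.cos(a)   # component along the minor axis
    return math.hypot(u / major, v / minor)


# Ranges implied by the model chosen in the text (major 300, minor 100, angle 0).
for theta in (0, 45, 90, 135):
    print(theta, round(directional_range(theta, 300.0, 100.0), 1))
```

A reduced distance of 1.0 means the pair is exactly one range apart in its own direction, so a single isotropic model evaluated on reduced distances reproduces the anisotropic structure.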

A note on alternate types of variograms -- The Type option on the Variogram Results screen allows you to select and plot any of three alternate estimators of spatial variability. These are sometimes less sensitive to outliers, skewed distributions, or clustered data than ordinary variograms and may help you recognize a structure when the ordinary variogram is too noisy. The relative variogram is analogous to the relative standard deviation often used to measure analytical variability. The "madogram" plots the mean absolute differences. The non-ergodic variogram is a relatively new method (Srivastava, 1987) based on estimates of covariance rather than variance. Non-ergodic variograms have the same units (measurement units squared) as ordinary variograms and may be modeled and used for kriging in the same way. Relative variograms are unitless (decimal fractions squared). When they are modeled and used for kriging, the relative kriging standard deviations must be multiplied by the estimated values to be comparable with kriging standard deviations produced with ordinary variogram models. Madograms are not "true variograms" because they are not based upon squared differences. In general, kriging with madogram models is not recommended.

4.4 KRIGING AND CONTOURING

The program Krige produces a regular grid of interpolated point or block estimates using either "Ordinary" or "Simple" kriging. The default option, ordinary block kriging, is recommended for most environmental applications. Point kriging usually provides estimates very similar to those from block kriging, but if a point being estimated happens to coincide with a sampled location, the estimate is set equal to the sample value. This is not appropriate for contour mapping, which implicitly requires a spatial estimator. Ordinary kriging estimates the point or block values with a weighted average of the sample values within a local search neighborhood, or ellipse, centered on the point or block. Simple kriging also assigns a weight to the population mean, and is in effect making a strong assumption that the mean value is constant over the site; it also requires that the available data be adequate to provide a good estimate of the mean.
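The ordinary kriging estimate described above can be illustrated with a small, self-contained Python sketch. It is not the Krige source code. To stay short it makes several simplifying assumptions: it performs point (not block) kriging, it uses an isotropic exponential model with the practical-range convention (factor 3 in the exponent), it uses every sample passed to it rather than a search neighborhood, and all function names are hypothetical. The weights are obtained by solving the ordinary kriging equations, in which a Lagrange multiplier forces the weights to sum to 1.

```python
import math
from typing import Callable, List, Sequence, Tuple


def exponential(h: float, nugget: float = 4.5, sill: float = 13.5, rng: float = 160.0) -> float:
    """Isotropic exponential variogram (practical-range convention assumed)."""
    if h <= 0.0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-3.0 * h / rng))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def ordinary_point_kriging(
    samples: Sequence[Tuple[float, float, float]],
    x0: float,
    y0: float,
    gamma: Callable[[float], float] = exponential,
) -> Tuple[float, float, List[float]]:
    """Return (estimate, kriging variance, weights) at the point (x0, y0)."""
    n = len(samples)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    b = [0.0] * (n + 1)
    for i, (xi, yi, _) in enumerate(samples):
        for j, (xj, yj, _) in enumerate(samples):
            a[i][j] = gamma(math.hypot(xi - xj, yi - yj))
        a[i][n] = a[n][i] = 1.0  # unbiasedness constraint: weights sum to 1
        b[i] = gamma(math.hypot(xi - x0, yi - y0))
    b[n] = 1.0
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]
    estimate = sum(w * s[2] for w, s in zip(weights, samples))
    variance = sum(w * bi for w, bi in zip(weights, b[:n])) + mu
    return estimate, variance, weights


# Easting, Northing, Cadmium for the four samples listed in Section 4.1.
samples = [(288.0, 311.0, 11.5), (285.6, 288.0, 8.50),
           (273.6, 269.0, 7.00), (280.8, 249.0, 10.7)]
est, var, w = ordinary_point_kriging(samples, 280.0, 280.0)
print(round(est, 2), round(var, 2), [round(v, 3) for v in w])
```

Calling the function at a sampled location, for example (288.0, 311.0), returns that sample's value with a weight of 1, which is the point-kriging behavior the paragraph above warns about for contour mapping.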

In order to execute Krige we must provide the names of the data file and an output grid file, and we must select a variable and enter a variogram model. The program computes a default 10x10 grid, which we will usually want to override with more convenient "round" numbers. The default search ellipse is a circle with a radius about one fourth the maximum x or y dimension of the site, which should be adequate for most cases. The purpose of the search is to reduce computation time by eliminating from the kriging system of equations those samples which are unlikely to get "significant weights". The default search strategy is to treat the search circle as a single "sector", to examine all samples within it and use at least one, but not more than the closest eight. The number of samples required for kriging is related to the value of the nugget term in the variogram compared to the maximum variogram value possible within the search area. The higher the nugget, the more likely that more distant samples will get significant weights. A rough rule of thumb would be to use eight samples when the nugget is near zero, increasing to twenty when the nugget is more than 50% of the maximum value. The more complex sector search options may be useful when you have unusual patterns or clusters of data. We will accept the default search for now, and check during kriging to see how well it works. Initiate program Krige, and the main screen will be displayed (Figure 4-20).
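The default search strategy described above (a single circular sector, at least one sample, and no more than the closest eight) can be sketched in Python. This is an illustration under stated assumptions, not the Krige search routine: the function names are hypothetical, the default radius is taken literally as one fourth of the larger site dimension, and an empty list is returned when too few samples are found, leaving it to the caller to decide what to do with that grid cell.

```python
import math
from typing import List, Sequence


def default_radius(xs: Sequence[float], ys: Sequence[float]) -> float:
    """One fourth of the larger of the X and Y extents of the samples."""
    return max(max(xs) - min(xs), max(ys) - min(ys)) / 4.0


def select_neighbors(
    xs: Sequence[float],
    ys: Sequence[float],
    x0: float,
    y0: float,
    radius: float,
    max_samples: int = 8,
    min_samples: int = 1,
) -> List[int]:
    """Indices of the closest samples inside the search circle, nearest first."""
    candidates = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        d = math.hypot(x - x0, y - y0)
        if d <= radius:
            candidates.append((d, i))
    candidates.sort()
    chosen = [i for _, i in candidates[:max_samples]]
    return chosen if len(chosen) >= min_samples else []


# Twelve hypothetical samples spaced 10 units apart along a line.
xs = [10.0 * i for i in range(12)]
ys = [0.0] * 12
print(default_radius(xs, ys))
print(select_neighbors(xs, ys, 0.0, 0.0, 200.0))
print(select_neighbors(xs, ys, 0.0, 0.0, 25.0))
```

Raising max_samples, as suggested for variograms with a high nugget, only has an effect if the radius is large enough to contain that many samples, so the two settings usually need to be changed together.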

The Krige main menu allows you to retrieve previously saved kriging parameters, or to save the parameters you have just used before exiting the program. Because this is your first attempt with this data set, you must use the Options/Execute option to go directly to the Options screen and menu. This screen is displayed in Figure 4-21.

Figure 4-20 Krige Main Screen

Figure 4-21 Krige Options Screen

Use the following option sequence to specify the file names and grid parameters, and to proceed to the next menu (Figure 4-22):

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
OPTIONS/EXECUTE
DATA             Accept   Data File        Example.dat
                 Accept   Grid File        Example.grd
GRID             Accept   X variable       Easting
                 Accept   Y variable       Northing
                 Input    X origin         260
                 Input    Y origin         120
                 Input    X cell size      20
                 Input    Y cell size      20
                 Input    X # cells        13
                 Input    Y # cells        11
VARIABLES/MODELS

Figure 4-22 Krige Variables/Models Screen
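The grid parameters entered above can be expanded into explicit coordinates with a few lines of Python. This sketch is only an illustration: the function name is hypothetical, and it assumes that the origin is the coordinate of the first grid location and that subsequent locations are one cell size apart. Whether Krige treats the origin as a cell center or a cell corner is not stated in this section.

```python
from typing import List


def grid_coordinates(origin: float, cell_size: float, ncells: int) -> List[float]:
    """Coordinates of ncells grid locations starting at origin."""
    return [origin + i * cell_size for i in range(ncells)]


xs = grid_coordinates(260.0, 20.0, 13)
ys = grid_coordinates(120.0, 20.0, 11)
print(xs[0], xs[-1], ys[0], ys[-1], len(xs) * len(ys))
# prints: 260.0 500.0 120.0 320.0 143
```

The 13 by 11 layout gives 143 estimates and spans 260 by 220 feet when counted in whole cells, which matches the size of the sampled rectangle seen in the post plot.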

The following option sequence selects Cadmium as the variable to be kriged, enters the variogram model, executes the kriging routine, and saves the parameter file:

OPTION           ACTION   FIELD            VALUE
----------------------------------------------------------
NEW VARIABLE     Toggle   Variable         Cadmium
                 Input    Nugget           4.5
                 Toggle   Type             Exponential
                 Input    Sill value       13.5
                 Input    Major range      300
                 Input    Minor range      100
                 Accept   Angle            0
QUIT
EXECUTE
QUIT
SAVE PARAM.      Accept   Parameter file   Example.kpf

Upon selection of the Execute option, the program first displays a color-coded sample location map on the screen, and then overlays this with a color-coded/shaded grid cell as each kriged estimate is computed. At the bottom of the screen, a summary line for each estimate is displayed. As you watch this proceed, you may note that the number of samples being used is only less than the specified 8 for the exterior blocks of the grid, indicating that the default search radii are adequate to obtain 8 neighbors. You can use the "debug" options during the kriging computations to help you understand what is actually happening in the program, and to decide whether you need to change the search options. Activating "n" provides you with a map showing the search ellipse and the samples selected for kriging the current block. The goal of the search is to include all of the samples which are relevant to the estimate, while avoiding spending a lot of time computing negligible weights for samples that do not matter. The "w" key lets you look at a list of the selected samples' coordinates, distances from the block center, and kriging weights. Use to continue to subsequent displays.

Given that sample weights must sum to 1.0, it seems reasonable to conclude that samples with weights of less than 0.01 can be neglected without significantly affecting the kriging results. The goal of your search would therefore be to consistently find a set of samples such that the lowest two or three weights would be at or just below 0.01. When you examine the lists of weights for a number of blocks during this run, you find the lowest weights to generally be in the range of 0.02 or 0.03. This is not really bad, but it might have been better to raise the maximum number of samples to 10 or 15 (and increase the search radii, if necessary).

The kriged results have been written to the file Example.grd. Proceed to contour this grid of kriged values using your favorite contouring program.

4.5 SUMMARY AND EXERCISES

The example exercise just completed contained the basic elements of any geostatistical study. You started with a sample data set, conducted an exploratory statistical analysis, interpreted the spatial correlation structure of the data and inferred an underlying variogram model, and used the model to interpolate a grid of kriged estimates. In the process you had to make a number of "judgment calls" which affected the results. You treated the data set as representing a single population. You chose not to delete any outliers. You chose to represent the spatial correlation structure of the data with an anisotropic exponential model plus a nugget term. You accepted the default kriging option of ordinary block kriging (with blocks approximated by a 2x2 grid of points). Finally, a conclusion was drawn that the default maximum of 8 samples in the kriging program should be increased.

Readers with little or no previous experience in geostatistical analysis may not feel comfortable with this entire process. It is not always obvious which of these factors are most significant and which if any can be ignored. Nor is it easy to define the point at which to conclude that you have done the best you can, given the quality and quantity of data. The best remedy for this situation is practice: Rerun KRIGE with Example.dat, using various combinations of the variogram model, search strategy, kriging type, grid size, etc., until you get a feel for how these factors interact. The two exercises below suggest ways of comparing the results from different kriging options, and also utilize some of the other Geo-EAS programs.

EXERCISE 1 - Compare Anisotropic vs. Isotropic variograms

Step 1. Run Trans to read Example.dat and write a new file named Compare.dat. Use the Create option to create a new variable Cd1 equal to the old variable Cadmium (by adding a constant 0 to the variable Cadmium). This step will help to avoid the problem of creating two kriged variables with the same name. Repeat the process for another new variable Cd2. Delete the variables Arsenic, Cadmium, and Lead, and save the result in Compare.dat.

Step 2. Run Krige with the data file Compare.dat to create a file of kriged estimates called Compare.grd. Krige both variables Cd1 and Cd2 in the same run. Krige Cd1 with the anisotropic exponential variogram model you just used in the example. Krige Cd2 with the equivalent isotropic model (major and minor ranges both equal 160).

Step 3. Run Trans with the grid file Compare.grd to create a new variable called *Cd1-*Cd2. Save the results back into Compare.grd.

Step 4. Run your favorite contouring program with the grid file Compare.grd to plot contour maps of *Cd1, *Cd2, and *Cd1-*Cd2.

Step 5. Run Scatter with the grid file Compare.grd to plot *Cd1 vs. *Cd2.

Step 6. Run Stat1 with the grid file Compare.grd to plot histograms of *Cd1, *Cd2, and *Cd1-*Cd2.

EXERCISE 2 - Compare Point vs. Block kriging

Step 1. Run Krige with the data file Compare.dat to create a grid file of kriged estimates called Comp1.grd. Use the isotropic exponential model, and krige Cd1 with ordinary point kriging.

Step 2. Repeat Step 1 with a grid file called Comp2.grd. Krige Cd2 with ordinary block kriging (2x2), keeping all other parameters the same.

Step 3. Run Dataprep to merge Comp2.grd into Comp1.grd.

Step 4. Repeat Steps 3-6 from Exercise 1, using Comp1.grd.

These exercises can be repeated to make other interesting comparisons. For example, compare the results of kriging with a variogram model consisting of only a nugget term vs. a spherical or exponential model with zero nugget. Or compare kriging with a maximum of 4 samples vs. 20 samples, all other parameters being equal.

SECTION 5: DATAPREP

5.1 WHAT DATAPREP DOES

Dataprep provides utilities for Geo-EAS data files. Dataprep has two divisions, the UNIX Utilities and the File Operations. The UNIX Utilities allows access to commonly used UNIX commands. The File Operations include utilities to manipulate Geo-EAS data files.

The Dataprep File Operations use temporary files called Scratch Files for storing data read from a Geo-EAS data file and for processing. The Append and Merge operations require an additional temporary file for processing. The temporary files are called ZZSCTCH1.FIL and ZZSCTCH2.FIL. Each time an operation is executed the temporary files are generated and then deleted after processing is complete.

5.2 DATA LIMITS

Dataprep reads and generates Geo-EAS data files containing a maximum of 48 variables and 10,000 samples. If a file should contain more than 10,000 samples then only the first 10,000 samples are read and a warning message is displayed. The file operations will generate output data files containing up to 10,000 samples. If a file operation should generate an output data file requiring more than 10,000 samples then only the first 10,000 samples are written. In such a case no warning messages are displayed.

5.3 THE MENU HIERARCHY

Dataprep --+-- Prefix
           |
           +-- UNIX Utilities ---+-- Directory ---------+-- Directory
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Print -------------+-- File
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- List --------------+-- File
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Copy --------------+-- Files
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Rename ------------+-- Files
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Delete ------------+-- File
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- UNIX Command
           |                     |
           |                     +-- Quit
           |
           +-- File Operations --+-- Append ------------+-- Files
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Column Extract ----+-- Files
           |                     |                      +-- Variables
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Row Extract -------+-- Files
           |                     |                      +-- Subsetting Condition
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Compress ----------+-- Files
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- ID Variable -------+-- Files
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Merge -------------+-- Files
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Report ------------+-- Files
           |                     |                      +-- Variables
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Sort --------------+-- Files
           |                     |                      +-- Variable
           |                     |                      +-- Execute
           |                     |                      +-- Quit
           |                     |
           |                     +-- Quit
           |
           +-- Quit

5.4 THE MAIN MENU

The Main screen and menu (Figure 5-1) provides options which allow you to access the UNIX Utilities and File Operations menus. The menu line appears as follows:

Prefix UNIX Utilities File Operations Quit

Prefix

The Prefix option is used to enter the prefix strings for the data file and the scratch file.

UNIX Utilities

The UNIX Utilities option provides access to the UNIX Utilities menu. The UNIX Utilities menu provides access to the UNIX commands such as Directory, Print, List (same as the UNIX "MORE" command), Copy, Rename, and Delete. Refer to your UNIX manual for further information on these UNIX commands. The UNIX Utilities menu is discussed below.

File Operations

The File Operations option provides access to the File Operations menu. The File Operations menu provides the following file operations: Append, Column (variable) extract, Row extract, Compress, ID Variable, Merge, Report and Sort. The File Operations menu is discussed below.

5.5 THE UNIX UTILITIES MENU

The UNIX Utilities screen and menu (Figure 5-2) provides commonly used UNIX commands. These commands operate just as the UNIX commands do. Refer to your UNIX manual for further information. To select an option from the vertical menu, use the up or down arrow key (not the left or right arrow key) to position the cursor bar, then press the Enter key.

Directory

Upon selection of this option the Directory menu is displayed. The Directory menu provides the options necessary to list all directory entries or only those for specified files. The directory option is similar to the UNIX command "Dir". The menu line appears as follows:

Directory Execute

Directory - This option results in a prompt for a character string. The string can be either a directory or file name. The string is entered into an alphanumeric field that is 55 characters long. If the field is left blank then the current directory is used by default.

Execute - Selection of this option causes the directory to be displayed on a new screen. An error message appears if the directory is non-existent. After the directory is listed, a message prompts for a keypress to return to the menu; that keypress returns control to the Directory menu.

Print

Upon selection of this option the Print menu is displayed. The Print menu provides the options necessary to print a file. The Print option is similar to the UNIX command "Print". The menu line appears as follows:

Files Execute Quit

Files - Upon selecting this option, you are prompted for the name of the file that is to be printed.

Execute - Selection of this option causes the specified file to be printed. A yes/no prompt appears asking if the printer is ready. If yes is indicated then the UNIX Print command is executed. When printing begins the Print menu screen is displayed and a message appears indicating that the file is being printed.

List

Upon selection of this option the List menu is displayed. The List menu provides the options necessary to list a file. The List option is similar to the UNIX command "More". The menu line appears as follows:

File Execute Quit

File - You will be prompted for the name of the file that is to be listed.

Execute - Upon selection of this option UNIX executes the "More" command. The screen is cleared and the contents of the file are displayed. When the screen is filled the message "More" appears on the last line, and a keypress causes another page of the file's contents to be displayed. After the entire file has been listed, a further keypress redisplays the List menu screen and returns control to the List menu.

Copy

Upon selection of this option the Copy menu is displayed. The Copy menu provides the options necessary to copy a file. The menu line appears as follows:

Files Execute Quit

Files - You are prompted for two file names. Upon execution, the contents of the first data file are copied over to the second data file.

Execute - Upon selection of this option UNIX executes the "Copy" command. When the file has been copied a message is displayed. An error message appears whenever the "Copy" command cannot be successfully executed.

Rename

Upon selection of this option the Rename menu is displayed. The Rename menu provides the options necessary to rename a file. The menu line appears as follows:

Files Execute Quit

Files - You are prompted for two file names. The name of the file specified first is renamed to the second file name given. If a data file exists with the second file name, then an error message appears.

Execute - Upon selection of this option, UNIX executes the "Rename" command. A message stating that the file has been renamed appears after the process is complete.

Delete

Upon selection of this option the Delete menu will be displayed. The Delete option uses the UNIX command "Del". The menu line appears as follows:

File Execute Quit

File - You will be prompted for the name of the file to be deleted.

Execute - Upon selection of this option, UNIX executes the "Delete" command. After the file has been deleted an informative message is displayed.

UNIX Command

Upon selection of this option you will access UNIX. To return to Dataprep type the command "Exit".

5.6 THE FILE OPERATIONS MENU

The File Operations screen and menu (Figure 5-3) provides a set of useful operations for manipulating Geo-EAS data files. To select an option from the vertical menu, use the up or down arrow key (not the left or right arrow key) to position the cursor, and press the Enter key to select the option. The two Geo-EAS input data files called Demo1.dat and Demo2.dat are used to demonstrate each file operation discussed below.

Demo1.dat - ficticious data set 1

3

Easting feet

Northing feet

Arsenic ppm

320.0 311.0 .850

119.0 119.0 .630

115.0 111.0 .560

114.0 269.0 1.020

114.0 269.0 1.020

431.0 137.0 .67

Demo2.dat - ficticious data set 2

3

Easting feet

Northing feet

Lead ppm

102.0 164.0 .300

122.0 137.0 .360

116.0 119.0 .700

150.0 315.0 .500

148.0 291.0 .710
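Both listings above follow the same layout: a title line, a line giving the number of variables, one line per variable (name and units), and then one data record per line. The following Python sketch reads that layout. It is an illustration only, not part of Geo-EAS: the function name parse_geoeas is hypothetical, and keeping each name-and-units line as a single string is a simplifying assumption.

```python
def parse_geoeas(text):
    """Parse Geo-EAS-style text into (title, labels, rows)."""
    # Skip blank lines so a listing can be pasted exactly as printed.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0]
    nvar = int(lines[1])
    # One line per variable: name and units, kept verbatim (assumption).
    labels = lines[2:2 + nvar]
    rows = [[float(tok) for tok in ln.split()] for ln in lines[2 + nvar:]]
    for row in rows:
        if len(row) != nvar:
            raise ValueError("record does not have %d values: %r" % (nvar, row))
    return title, labels, rows


DEMO2 = """Demo2.dat - ficticious data set 2
3
Easting feet
Northing feet
Lead ppm
102.0 164.0 .300
122.0 137.0 .360
116.0 119.0 .700
150.0 315.0 .500
148.0 291.0 .710
"""

title, labels, rows = parse_geoeas(DEMO2)
print(title)
print(labels)
print(len(rows), "records; first record:", rows[0])
```

The sketch does not enforce the limits of 48 variables and 10,000 samples described in the Data Limits section.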

Append

Upon selection of this option the Append menu will be displayed. The Append menu provides the options necessary to append two Geo-EAS data files. A discussion on the append operation follows in the Execute option. The menu line appears as follows:

Files Execute Quit

Files - You are prompted for three Geo-EAS data file names. Upon execution the second file name specified is appended to the first file name entered. The first and second file names cannot be the same. If they are then an error message appears. The third file name entered is the name of the output file.

Execute - Upon selection of this option, Dataprep will append the second file to the first file and store the results in the output file. After the operation is complete the output variables are displayed on the screen and a message is displayed. Upon pressing any key, control is returned to the Append menu. A discussion of the Append operation follows:

Assume the first input file will be called File1 and the second input file called File2. Variables that exist in both input files (i.e. have identical variable names) will be combined into one variable with the records of File1 preceding those of File2. Variables that exist in only one input file are also written to the output file. The difference is that for File1 variables the records normally occupied by File2 variables are filled with missing values (1.E+31), and for File2 variables the records normally occupied by File1 variables are filled with missing values. As an example, the output file Out1.dat was generated when Demo1.dat (above) as File1 and Demo2.dat (above) as File2 were appended. In the listing the missing value is printed as .10E+32, which is the same number as 1.E+31. Out1.dat is shown below.

Out1.dat

Demo1.dat - ficticious data set 1

4

Easting feet

Northing feet

Arsenic ppm

Lead ppm

320.0 311.0 .850 .10E+32

119.0 119.0 .630 .10E+32

115.0 111.0 .560 .10E+32

114.0 269.0 1.02 .10E+32

114.0 269.0 1.02 .10E+32

431.0 137.0 .670 .10E+32

102.0 164.0 .10E+32 .300

122.0 137.0 .10E+32 .360

116.0 119.0 .10E+32 .700

150.0 315.0 .10E+32 .500

148.0 291.0 .10E+32 .710
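The Append behavior shown in Out1.dat can be sketched in Python as follows. This is an illustration of the rule described above, not the Dataprep source; the function name append_files and the in-memory representation (a list of variable names plus a list of records) are assumptions.

```python
MISSING = 1.0e31  # missing-value code quoted in the text (printed as .10E+32)


def append_files(names1, rows1, names2, rows2):
    """Stack File2's records below File1's, padding absent variables."""
    # Output variables: all of File1's, then those of File2 that File1 lacks.
    out_names = list(names1) + [n for n in names2 if n not in names1]

    def widen(names, rows):
        col = {n: i for i, n in enumerate(names)}
        return [[row[col[n]] if n in col else MISSING for n in out_names]
                for row in rows]

    return out_names, widen(names1, rows1) + widen(names2, rows2)


demo1_names = ["Easting", "Northing", "Arsenic"]
demo1_rows = [[320.0, 311.0, 0.85], [119.0, 119.0, 0.63], [115.0, 111.0, 0.56],
              [114.0, 269.0, 1.02], [114.0, 269.0, 1.02], [431.0, 137.0, 0.67]]
demo2_names = ["Easting", "Northing", "Lead"]
demo2_rows = [[102.0, 164.0, 0.30], [122.0, 137.0, 0.36], [116.0, 119.0, 0.70],
              [150.0, 315.0, 0.50], [148.0, 291.0, 0.71]]

out_names, out_rows = append_files(demo1_names, demo1_rows,
                                   demo2_names, demo2_rows)
print(out_names)
for row in out_rows:
    print(row)
```

The eleven records produced correspond to the eleven data lines of Out1.dat: six from Demo1.dat with Lead missing, then five from Demo2.dat with Arsenic missing.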

Column Extract

Upon selection of this option the Column Extract menu is displayed. The Column Extract menu provides the options necessary to create a Geo-EAS data file with variables extracted from an input file. The menu line appears as follows:

Files Variables Execute Quit

Files - This option is used to specify two file names. The first, an input file, is a Geo-EAS data file. The second file is a Geo-EAS output file.

Variables - This option is used to specify one or more variables which will be extracted and written to the output file when the Execute option is selected. The variables are selected from a toggle field. After each selection, the variable name is displayed on the screen and a yes/no prompt appears asking if you want to select another variable. If yes is indicated then you are prompted for another variable. If no is indicated then control is returned to the Column Extract menu.

Execute - Upon selection of this option, the selected variables and related data are copied and stored in the output file. When the operation is complete, a message is displayed. Upon pressing any key, control is returned to the Column Extract menu.

Row Extract

Upon selection of this option the Row Extract menu is displayed. The Row Extract menu provides the options necessary to perform row extraction. This operation extracts samples from the specified input Geo-EAS data file based upon a test condition. The menu line appears as follows:

Files Subsetting Condition Execute Quit

Files - This option is used to specify two files. The name of first file entered, an input file, is a Geo-EAS data file. The name of the second file entered is an output file.

Subsetting Condition - This option is used to specify a test condition. The test condition is of the form Operand1 Operator Operand2, where Operand1 is a variable, Operator is a logical operator (for example .LE.), and Operand2 can be a variable or a constant.

Upon selection of the Subsetting Condition option the variable names stored in the input file are displayed. You are prompted for a variable for Operand1 from a toggle field. After the selection you are prompted for a logical operator from the toggle field. After this selection, control is passed to the Row Extract Operand menu. The Row Extract Operand menu is the next menu discussed. After a selection is made for the second operand, control is returned to the Row Extract menu.

The Row Extract Operand menu provides the options necessary to select a variable or enter a constant for Operand2. The menu line appears as follows:

Constant Variable

Constant - This option is used to specify a constant (floating point) number for Operand2. The constant is entered into a numeric field. If a constant causes a numeric overflow, then a message appears informing you of the error (pressing any key returns control to the Row Extract Operand menu). After a valid constant has been entered, control returns to the Row Extract menu.

Variable - This option is used to specify a variable for Operand2. The variable is selected from a toggle field. After the selection, control returns to the Row Extract menu.

Execute - When the Execute option is selected, the Row Extraction process is initiated. Each sample of Operand1, described above, is checked to see if it satisfies the test condition. If so, then that entire line of data is extracted from the input data file and stored in a temporary file. After this process is complete, the data from the temporary file are stored in the output file. When the data have been stored in the output file and execution is complete, a message states that the data have been sent to the output file. If no samples satisfy the test condition, then a message appears and no data are stored. In both cases, pressing any key returns control to the Row Extract menu.

As an example of the Row Extraction operation, the following output data file, Out2.dat, is generated when Demo1.dat is the specified input file and the test condition is: Easting .LE. Northing.

Out2.dat

Demo1.dat - ficticious data set 1

3

Easting feet

Northing feet

Arsenic ppm

119.0 119.0 .630

114.0 269.0 1.02

114.0 269.0 1.02

Each data record in Demo1.dat is examined by the program. If the test condition is true, then that line in the file is extracted and stored in the specified output file. Only those records satisfying the specified logical expression will be extracted and stored in the output file.
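The record-by-record test described above can be sketched in Python as follows. This is an illustration only: the function name row_extract is hypothetical, and of the FORTRAN-style operators in the table only .LE. is taken from the text; the others are assumed for the sake of the example.

```python
import operator

# FORTRAN-style relational operators. Only .LE. appears in the text;
# the remaining entries are assumptions made for illustration.
OPS = {".LT.": operator.lt, ".LE.": operator.le, ".GT.": operator.gt,
       ".GE.": operator.ge, ".EQ.": operator.eq, ".NE.": operator.ne}


def row_extract(names, rows, operand1, op, operand2):
    """Keep the records for which 'operand1 op operand2' is true.

    operand1 is a variable name; operand2 is a variable name or a constant.
    """
    test = OPS[op]
    i = names.index(operand1)
    if isinstance(operand2, str):            # Operand2 is a variable
        j = names.index(operand2)
        return [r for r in rows if test(r[i], r[j])]
    return [r for r in rows if test(r[i], operand2)]   # Operand2 is a constant


names = ["Easting", "Northing", "Arsenic"]
demo1 = [[320.0, 311.0, 0.85], [119.0, 119.0, 0.63], [115.0, 111.0, 0.56],
         [114.0, 269.0, 1.02], [114.0, 269.0, 1.02], [431.0, 137.0, 0.67]]

out2 = row_extract(names, demo1, "Easting", ".LE.", "Northing")
for row in out2:
    print(row)
```

With the condition Easting .LE. Northing the sketch keeps the same three records that appear in Out2.dat.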

Compress

Upon selection of this option the Compress menu is displayed. The Compress menu provides the options necessary to compress a Geo-EAS data file. This operation eliminates any duplicate data records that appear in a specified input Geo-EAS data file and stores the results in the specified output Geo-EAS data file. An example is shown in the Execute option. The menu line appears as follows:

Files Execute Quit

Files - This option is used to specify two data file names. The first, an input file, is a Geo-EAS data file. The second is an output file. After the two files have been specified, the variable names stored in the input file are displayed.

Execute - Upon selection of this option, the data from the input file are stored in a temporary file. Duplicate records are deleted and the data are stored in the specified output file. When the entire process is complete, a message stating that the data have been sent to the output file is displayed. Upon pressing any key, control is returned to the Compress menu.

As an example of the Compress operation, the following output file, Out5.dat, was generated when the input file, Demo1.dat, was compressed. Out5.dat follows.

Out5.dat

Demo1.dat - ficticious data set 1

3

Easting feet

Northing feet

Arsenic ppm

320.0 311.0 .850

119.0 119.0 .630

115.0 111.0 .560

114.0 269.0 1.02

431.0 137.0 .670

Note that all duplicate records have been deleted.
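A minimal Python sketch of the Compress operation follows. It is not the Dataprep implementation; the function name compress is hypothetical, and the sketch assumes that two records are duplicates only when every value is identical and that the first occurrence is the one kept.

```python
def compress(rows):
    """Return the records with exact duplicates removed, order preserved."""
    seen = set()
    out = []
    for row in rows:
        key = tuple(row)          # whole record must match to be a duplicate
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


demo1 = [[320.0, 311.0, 0.85], [119.0, 119.0, 0.63], [115.0, 111.0, 0.56],
         [114.0, 269.0, 1.02], [114.0, 269.0, 1.02], [431.0, 137.0, 0.67]]

out5 = compress(demo1)
for row in out5:
    print(row)
```

Applied to the Demo1.dat records, the sketch drops the second copy of the record 114.0 269.0 1.02, leaving the five records listed in Out5.dat.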

ID Variable

Upon selection of this option the ID Variable menu is displayed. The ID Variable menu provides the options to create the variable "Sequence #". The Sequence # denotes the sequential position of the data in the input file. An example of this is shown in the Execute option. The menu line appears as follows:

Files Execute Quit

Files - This option is used to specify two Geo-EAS data file names. The first is an input data file. The second is an output file. After the two files have been specified, the variable names stored in the input file are displayed.

Execute - This option appends a variable called "Sequence #" to the current listing of variables displayed on the screen. The variable name "Sequence #" is appended to the variable names read from the input file. These variable names are then stored in the specified output file. An example of such an output file is Out6.dat.

Out6.dat

Demo1.dat - ficticious data set 1

4

Easting feet

Northing feet

Arsenic ppm

Sequence #

320.0 311.0 .850 1.

119.0 119.0 .630 2.

115.0 111.0 .560 3.

114.0 269.0 1.02 4.

114.0 269.0 1.02 5.

431.0 137.0 .670 6.
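The ID Variable operation amounts to numbering the records in file order, which can be sketched in Python as follows. The function name add_sequence is hypothetical and the sketch is an illustration rather than the program's code.

```python
def add_sequence(names, rows):
    """Append a 'Sequence #' variable holding each record's position."""
    out_names = list(names) + ["Sequence #"]
    out_rows = [list(row) + [float(k)] for k, row in enumerate(rows, start=1)]
    return out_names, out_rows


names = ["Easting", "Northing", "Arsenic"]
demo1 = [[320.0, 311.0, 0.85], [119.0, 119.0, 0.63], [115.0, 111.0, 0.56],
         [114.0, 269.0, 1.02], [114.0, 269.0, 1.02], [431.0, 137.0, 0.67]]

out_names, out6 = add_sequence(names, demo1)
print(out_names)
for row in out6:
    print(row)
```

Duplicate records receive distinct sequence numbers (4 and 5 in Out6.dat), so the new variable can serve as a record identifier.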

Merge

Upon selection of this option the Merge menu is displayed. The Merge menu provides the options necessary to merge two Geo-EAS data files. The details of this operation are discussed in the Execute option. The menu line appears as follows:

Files Execute Quit

Files - This option is used to specify the names of two Geo-EAS data files which are to be merged, and the name of an output file, which will contain the results. The files to be merged cannot have the same file name, or an error message appears.

Execute - When this option is selected, the Merge process is initiated. After this process is complete the variables from the output file are displayed and a message indicating that the data have been sent to the output file appears. Pressing any key returns control to the Merge menu.

To describe the merge process, assume that File1 is the first file name entered and File2 is the second file name entered. The records of the two files are combined side by side, so that the first record of File1 is joined with the first record of File2, and so on. If one input file has fewer records than the other, then the smaller file has missing values added to its variables. Variables that appear in both File1 and File2 are combined into one variable, which takes on the values from File1 (see the note following the example). The specified output file stores the results of the merged files. The output file Out3.dat, which follows, shows the result of merging input files Demo1.dat as File1 and Demo2.dat as File2.

Out3.dat

Demo1.dat - ficticious data set 1

4

Easting feet

Northing feet

Arsenic ppm

Lead ppm

320.0 311.0 .850 .300

119.0 119.0 .630 .360

115.0 111.0 .560 .700

114.0 269.0 1.02 .500

114.0 269.0 1.02 .710

431.0 137.0 .670 .10E+32

*** NOTE *** The variables appearing in both input files are combined into one variable (Easting and Northing), and these variables took on the values from the first file.
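The Merge behavior shown in Out3.dat can be sketched in Python as follows. This is an illustration under stated assumptions, not the Dataprep source: the function name merge_files is hypothetical, records are paired by position, and when File1 is the shorter file its variables (including the shared ones) are padded with missing values.

```python
MISSING = 1.0e31  # Geo-EAS missing-value code (1.E+31)


def merge_files(names1, rows1, names2, rows2):
    """Join two files side by side; shared variables keep File1's values."""
    new_names = [n for n in names2 if n not in names1]
    new_cols = [names2.index(n) for n in new_names]
    out_rows = []
    for k in range(max(len(rows1), len(rows2))):
        # Pad whichever file has run out of records with missing values.
        left = list(rows1[k]) if k < len(rows1) else [MISSING] * len(names1)
        if k < len(rows2):
            right = [rows2[k][c] for c in new_cols]
        else:
            right = [MISSING] * len(new_cols)
        out_rows.append(left + right)
    return list(names1) + new_names, out_rows


demo1_names = ["Easting", "Northing", "Arsenic"]
demo1_rows = [[320.0, 311.0, 0.85], [119.0, 119.0, 0.63], [115.0, 111.0, 0.56],
              [114.0, 269.0, 1.02], [114.0, 269.0, 1.02], [431.0, 137.0, 0.67]]
demo2_names = ["Easting", "Northing", "Lead"]
demo2_rows = [[102.0, 164.0, 0.30], [122.0, 137.0, 0.36], [116.0, 119.0, 0.70],
              [150.0, 315.0, 0.50], [148.0, 291.0, 0.71]]

out_names, out3 = merge_files(demo1_names, demo1_rows, demo2_names, demo2_rows)
print(out_names)
for row in out3:
    print(row)
```

Because Demo2.dat has five records and Demo1.dat has six, the sixth output record carries a missing value for Lead, as in the last line of Out3.dat.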

Report

Upon selection of this option the Report menu is displayed. The Report menu provides the options necessary to generate a listing of specified variables in "Report" form. The "Report" listing is described in the Execute option. The menu line appears as follows:

Files Variables Execute Quit

Files - This option is used to specify two file names, the input and output files. After the file names have been entered, you are prompted for the sequence option from the toggle field. If "on" is selected for the sequence option, then the observation (row) number is indicated on the "Report" listing.

Variables - This option is used to specify those variables that are to be included in the "Report" listing. You are prompted for a variable to be selected from a toggle field. A yes/no prompt provides an opportunity to select another variable. Indicating no causes control to be returned to the Report menu. The list of variables is displayed on the screen after each selection.

Execute - The "Report" listing will be generated upon selection of this option. This listing is then stored in the specified output file. When this operation is complete a message appears indicating that the data have been sent to the output file. Upon pressing any key, control is returned to the Report menu. The following is a description of the "Report" listing.

The "Report" listing can be generated with or without the sequence option enabled. If the sequence option is enabled then each data record will be preceded by an observation number (record number). Each page lists up to four variables with 50 data records each. If more than four variables are selected then all the data records for the first four variables are printed. The page numbering is reset to one and the next four variables are printed. An example report with the sequence option enabled follows.

Example output from the Report option:

                                                              Page      1

Demo1.dat - ficticious data set 1

        Easting     Northing    Lead        Arsenic
Obs.    feet        feet        ppm         ppm
  1.    320.0000    311.0000    .8500000    1.000000
  2.    119.0000    119.0000    .6300000    2.000000
  3.    115.0000    111.0000    .5600000    3.000000
  .     .           .           .           .
  .     .           .           .           .
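The page layout rule described above (four variables per page, 50 data records per page, page numbering restarted for each group of four variables) can be sketched in Python as follows. The function name report_pages and its return structure are hypothetical; the sketch shows only how the pages are divided, not the exact print format.

```python
def report_pages(names, rows, vars_per_page=4, rows_per_page=50):
    """Split a report into pages: (page number, variable names, records)."""
    pages = []
    for v in range(0, len(names), vars_per_page):
        cols = list(range(v, min(v + vars_per_page, len(names))))
        page_no = 0               # numbering restarts for each variable group
        for r in range(0, len(rows), rows_per_page):
            page_no += 1
            chunk = [[row[c] for c in cols] for row in rows[r:r + rows_per_page]]
            pages.append((page_no, [names[c] for c in cols], chunk))
    return pages


# Hypothetical data: six variables and 120 records.
names = ["V1", "V2", "V3", "V4", "V5", "V6"]
rows = [[float(i)] * 6 for i in range(120)]

pages = report_pages(names, rows)
for page_no, page_names, chunk in pages:
    print("Page", page_no, page_names, len(chunk), "records")
```

For six variables and 120 records the sketch yields pages 1 to 3 for the first four variables, followed by pages 1 to 3 for the remaining two.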

Sort

Upon selection of this option the Sort menu is displayed. The Sort menu provides the options necessary to sort a variable in ascending order. If a selected variable's sample has to be relocated, then the entire record associated with that sample is also moved. An example of the Sort operation is shown in the Execute option. The menu line appears as follows:

Files Variable Execute Quit

Files - This option is used to specify two Geo-EAS file names, the first is the input file and the second is an output file.

Variable - This option is used to specify the variable to be sorted. You are prompted for a variable which is selected from a toggle field. After the selection, pressing any key returns control to the Sort menu.

Execute - Upon selection of this option, the specified variable is sorted in ascending order. When the process is complete, the data are stored in the specified output file. A message appears informing you that the data have been sent to the output file. Upon pressing any key, control is returned to the Sort menu.

As an example the following output file, Out4.dat, was generated when the variable Easting of the input file, Demo1.dat, was sorted.

Out4.dat

Demo1.dat - ficticious data set 1

3

Easting feet

Northing feet

Arsenic ppm

114.0 269.0 1.02

114.0 269.0 1.02

115.0 111.0 .560

119.0 119.0 .630

320.0 311.0 .850

431.0 137.0 .670

The variable Easting appears in ascending order. Note that not only have the values of the variable Easting been relocated, but all values (the line) associated with Easting have been moved.
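Sorting whole records on one variable can be sketched in Python as follows. The function name sort_by is hypothetical. Python's sorted is stable, so records with equal values keep their original relative order; whether Dataprep behaves the same way on ties is an assumption.

```python
def sort_by(names, rows, variable):
    """Return the records ordered by ascending value of one variable."""
    i = names.index(variable)
    # Each record moves as a unit, so the other variables stay attached.
    return sorted(rows, key=lambda row: row[i])


names = ["Easting", "Northing", "Arsenic"]
demo1 = [[320.0, 311.0, 0.85], [119.0, 119.0, 0.63], [115.0, 111.0, 0.56],
         [114.0, 269.0, 1.02], [114.0, 269.0, 1.02], [431.0, 137.0, 0.67]]

out4 = sort_by(names, demo1, "Easting")
for row in out4:
    print(row)
```

The result has the same record order as Out4.dat.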

SECTION 6: TRANS

6.1 WHAT TRANS DOES

Trans was designed to create, delete, or modify Geo-EAS data file variables. Refer to the section on Geo-EAS data files for more information on input data. The operations may be unary (one operand, one operator), binary (two operands, one operator), or an indicator transform operation, described below. An operand may be either a variable or a constant. The operator may be an operation such as addition or finding the square root. The results generated by the specified operation may replace the contents of an existing variable or a new variable may be created. The variable specified to accept the results is called the result variable. Missing values may be generated in two circumstances: when an operand is a missing value, or when an operation is undefined (as in division by zero).
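The two circumstances that produce missing values can be sketched in Python as follows. This is an illustration, not the Trans source: the function names apply_unary and apply_binary are hypothetical, and the missing-value code 1.E+31 is the one quoted in the Dataprep section.

```python
import math
import operator

MISSING = 1.0e31  # Geo-EAS missing-value code (1.E+31)


def apply_unary(op, a):
    """Apply a one-operand operation, returning MISSING when undefined."""
    if a == MISSING:
        return MISSING                      # operand is a missing value
    try:
        return op(a)
    except (ZeroDivisionError, ValueError):
        return MISSING                      # operation is undefined


def apply_binary(op, a, b):
    """Apply a two-operand operation, returning MISSING when undefined."""
    if a == MISSING or b == MISSING:
        return MISSING
    try:
        return op(a, b)
    except (ZeroDivisionError, ValueError):
        return MISSING


arsenic = [0.85, 0.63, MISSING]
print([apply_unary(math.sqrt, v) for v in arsenic])
print(apply_binary(operator.truediv, 1.0, 0.0))   # division by zero
print(apply_unary(math.sqrt, -4.0))               # undefined square root
```

The sketch deliberately does not trap numeric overflow.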

Trans uses a temporary file called a Scratch File to store the data read from a Geo-EAS data file. The temporary file is called ZZSCTCH1.FIL. The Read option in the Main menu (described below) reads the Geo-EAS data file and stores the data in the temporary file. Each time an operation is performed, the required data are retrieved from the temporary file and the newly generated data are stored in the temporary file. A list of variable names is displayed on the screen to indicate which variables reside in the scratch file. If a variable is deleted using the Delete option from the Main menu, then its data are deleted from the temporary file. The variable name is also deleted from the screen. The Save option from the Main menu is used to move the data from the temporary file to the specified Geo-EAS output data file.

6.2 DATA LIMITS

Trans reads as well as generates Geo-EAS data files containing a maximum of 48 variables and 10,000 samples. If a file should contain more than 10,000 samples then only the first 10,000 samples are read and a warning message is displayed. If an attempt is made to create a 49th variable then an error message is displayed.

6.3 THE MENU HIERARCHY

Trans --+-- Prefix
        |
        +-- Read
        |
        +-- Title
        |
        +-- Create --+-- New Variable --+--+-- Unary Operation ->
        |            |                  |  |
        |            +-- Old Variable --+  +-- Binary Operation ->
        |            |                     |
        |            |                     +-- Indicator Transform ->
        |            |                     |
        |            |                     +-- Quit
        |            |
        |            +-- Quit
        |
        +-- Delete
        |
        +-- Save
        |
        +-- Quit

-> Unary Operation --+-- Operation --+-- Constant --+--+-- Execute
                     |               |              |  |
                     |               +-- Variable --+  +-- Quit
                     |               |
                     |               +-- Quit
                     |
                     +-- Quit

-> Binary Operation --+-- Constant --+--+-- Operation --+-- Constant --+--+-- Execute
                      |              |  |               |              |  |
                      +-- Variable --+  |               +-- Variable --+  +-- Quit
                      |                 |               |
                      |                 |               +-- Quit
                      |                 |
                      |                 +-- Quit
                      |
                      +-- Quit

-> Indicator Transform --+-- Variable -- Cutoff
                         |
                         +-- Execute
                         |
                         +-- Quit

6.4 THE MAIN MENU

The Main screen and menu (Figure 6-1) provide the options necessary to read and save Geo-EAS data files and to create, delete, or modify variables. The menu line appears as follows:

Prefix

The Prefix option is used to enter the prefix for the data file and scratch file.

Data

The Data option is used to enter the name of a Geo-EAS data file.

Title

The Title option is used to specify a descriptive title for the output Geo-EAS data file. The title can be up to 66 alphanumeric characters. No error checking is performed.

Create

The Create option provides access to the Create menu. The Create menu (described below) is the first in a series of menus that provide options used to specify a new or existing variable as the result variable and to perform a specified operation (unary, binary, indicator transform).

Delete

The Delete option is used to select an existing variable that is to be deleted. The variable is selected from a toggle field. Upon making the selection the message "Do you really want to delete this variable?...(Y/N)" is displayed. If Y is pressed the variable is deleted from the temporary file and from the list of variable names on the screen. Subsequent to entering your choice, control is returned to the Main menu.

Save

When the Save option is selected, the data stored in the temporary file are written to a specified data file. A prompt is issued for the output Geo-EAS data file name. If blanks are entered then an error message is displayed. If the output file already exists then a Yes/No prompt will appear asking if the file should be overwritten. The data in the temporary file is copied to the output Geo-EAS data file. If an error should occur while opening the output file then an error message is displayed. While the file is being saved the message "Writing data..." appears. After the data has successfully been written a message is displayed (press any key to continue).

6.5 THE CREATE MENU

The Create menu provides the options necessary to perform transformations to a Geo-EAS data file. The menu line appears as follows:

New Variable Old Variable Quit

New Variable

When the New Variable option is selected a new variable is created in the scratch file. The new variable takes on values generated from the specified operation (unary, binary, indicator transform). You are prompted for the new variable name. If the given variable name is blank then a message to this effect appears (pressing any key returns control to the Create menu). If the variable name already exists then a warning message appears (pressing any key passes control to the Operation menu). If the variable name is unique to the scratch file then you are prompted for a description of the measurements. The field for the measurements description can accept up to 10 alphanumeric characters. After entering the measurements description, control is passed to the Operation menu (described below).

Old Variable

This option is used to specify an existing variable whose contents are replaced by the results of the specified operation (unary, binary, indicator transform). You are prompted for a variable from the toggle field. After making the selection, you are asked if you wish to change the variable's current name or measurements description. When indicating yes, you may retain or change the name by entering up to 10 alphanumeric characters into the field. If blanks are entered then an error message is displayed (you are then returned to the alphanumeric field and prompted for another variable name). If the given variable name is the same as the existing variable, then a message appears and you are prompted for a new variable name. You are then prompted for a measurements description. The measurements description of the old variable is displayed by default. You may retain or enter another description. After the measurements description has been entered, the Operation menu (described below) is activated.

6.6 THE OPERATION MENU

The Operation menu provides a selection of transformation operations. The menu line appears as follows:

Unary Operation Binary Operation Indicator Transform Quit

Unary Operation

This option provides access to the Unary Operation menu. The Unary Operation menu (described below) is the first of three menus that provide the options needed to complete a unary operation. After a selection is made from one menu, control is passed to the succeeding menu. The Unary Operation menu provides a selection of unary operators. Next, the Unary Operand menu provides the choice of a constant or an existing variable for the operand. Finally, the Execute menu provides the Execute option.

The unary operations perform an operation on one operand. The operations are shown below as they would appear in the toggle field:

The Unary Operation menu provides an option to select a unary operation. The menu line appears as follows:

Operation - This option is used to specify the unary operator. You are prompted for a unary operator from the toggle field. Upon making the selection, control is passed to the Unary operand menu (described below).

The Unary Operand menu provides the option to select an operand for the unary operation. The menu line appears as follows:

Constant - This option is used to assign a constant value to the unary operand. You are prompted for the constant value. The constant value is entered into a numeric (floating point) field. The default value is initially zero. If the value entered results in a numeric overflow then a message is displayed. Upon pressing any key, control is returned to the Unary Operand menu. If the constant value entered is acceptable then control is passed to the Execute menu (described below).

Variable - This option is used to select an existing variable for use as the unary operand from a toggle field. Upon making the selection, control is passed to the Execute menu (described below).

The Execute menu provides the option necessary to initiate any unary operation. The menu line appears as follows:

Execute - This option is used to initiate the unary operation. During processing the message "Processing data..." is displayed on the screen. After processing is complete the minimum and maximum values of the variable will be displayed on the message line.

After pressing any key, you are asked for the number of significant digits for the output format. The range of digits is 1 through 12 inclusive. A FORTRAN format (which is the output format) is constructed as Gw.x, where x is the number of significant digits you entered and the field width w is x + 7. For example, if 7 is entered then G14.7 is the output format. For more information on FORTRAN formats refer to a FORTRAN reference manual. If the digit for x is outside the acceptable range then 9 will be used by default. If the given format is not appropriate for writing the variable to the output file then a message is displayed and you are prompted for a new digit for x.

You can best select x by determining (using the minimum and maximum values displayed on the message line) the maximum number of digits to the left of the decimal point. Then decide the maximum number of digits to the right of the decimal point. The sum of the two values is x. Increase the sum by one if the value is negative.

If any missing values are generated then the number of missing values and the name of the variable are displayed on the screen. After processing is complete, control is returned to the Main menu. If an operand is a constant and contains a missing value then the result of that operation will be a missing value.

If the operand is a variable as opposed to a constant, and a sample from that variable results in an undefined operation, then a missing value is generated. Whenever an error message is displayed, processing is halted and control is returned to the Operation menu. As noted previously, the operation (exp) may result in a numeric overflow causing the program to abnormally terminate (crash).
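The rule for building the output format from the number of significant digits can be written out as a short Python sketch. The function name output_format is hypothetical; the arithmetic (field width equal to the digit count plus 7, and a default of 9 digits when the entry is outside 1 through 12) is taken from the Execute description above.

```python
def output_format(digits):
    """Build the FORTRAN G format string for a significant-digit count."""
    if not 1 <= digits <= 12:
        digits = 9                      # default used for out-of-range entries
    return "G%d.%d" % (digits + 7, digits)


for x in (7, 1, 12, 0, 15):
    print(x, "->", output_format(x))
```

An entry of 7 gives G14.7, matching the example in the text, while an out-of-range entry such as 0 or 15 falls back to 9 digits and gives G16.9.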

Binary Operation

This option provides access to the Binary Operation menu. The Binary Operation menu is the first of four menus that provides options needed to complete a binary operation. After a selection is made from one menu, control is passed to the succeeding menu. First, the Binary Operand One menu (described below) provides a selection of a constant or an existing variable for operand one. Second, the Binary Operation menu provides a selection of binary operators. Next, the Binary Operand Two menu is similar to the Binary Operand One menu with the exception that the selection is for operand two. Finally, the Execute menu provides the Execute option.

The binary operations perform an operation requiring two operands. The operations are shown below as they would appear in the toggle field:

The Binary Operand One menu provides the options to select operand1 for the binary operation. The menu line appears as follows:

Constant - This option is used to assign a constant value to binary operand one. This option is similar to the Constant option noted in the Unary Operand menu section (described earlier). The exceptions are that the wording in the messages refers to binary operand one and not the unary operand, and upon entering a valid constant value, control is passed to the Binary Operation menu (described below). If the constant value is unacceptable then control is returned to the Binary Operand One menu.

Variable - This option is used to assign an existing variable to the binary operand one. This option is similar to the variable option noted in the Unary Operand menu section (described earlier). The exceptions are that the wording refers to binary operand one and not the unary operand, and upon making the selection, control is passed to the Binary Operation menu (described below).

The Binary Operation menu provides the option to select a binary operator. The menu line appears as follows:

Operation - This option is used to specify the binary operator. You are prompted for a binary operator from a toggle field. Upon making the selection control is passed to the Binary Operand Two menu (described below).

The Binary Operand Two menu provides the options to select operand2 for the binary operation. The menu line appears as follows:

Constant - This option is used to specify a constant for the binary operand two. This option is similar to the binary operand one constant option described previously in the Binary Operand One menu. The exceptions are that the wording refers to operand two and not operand one, and upon entering an acceptable value, control is passed to the Execute menu (described below).

Variable - This option is used to specify an existing variable for binary operand two. It is similar to the binary operand one variable option described previously in the Binary Operand One menu. The exceptions are that the wording refers to operand two and not operand one, and upon making a selection, control is passed to the Execute menu (described below).

The Execute menu provides the option needed to initiate the binary operation. The menu line appears as follows:

Execute - This option initiates the binary operation. During processing the message "Processing data..." is displayed on the screen. The binary Execute option causes the same prompts as the Execute option of the unary operation. Refer to the Execute option of the Execute menu discussed previously in the Unary Operation section. As noted previously, the exponentiation operation may result in a numeric overflow causing the program to abnormally terminate (crash).

Indicator Transform

The indicator transform is an operation requiring two operands. The first operand is an existing variable and the second operand is a constant called the threshold value. The result variable takes on the value 1.0 if the input variable is greater than or equal to the threshold value. The result variable takes on the value 0.0 if the input variable is less than the threshold value. The Indicator Transform option provides access to the Indicator Transform menu.
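The transform itself is a simple comparison against the threshold. The sketch below illustrates the rule as stated; the function name is hypothetical, and the handling of missing values is deliberately left out because it is not specified here.

```python
def indicator_transform(values, threshold):
    """Return 1.0 where value >= threshold and 0.0 where value < threshold.

    Hypothetical helper; missing-value handling is not modelled.
    """
    return [1.0 if v >= threshold else 0.0 for v in values]


print(indicator_transform([0.5, 2.0, 3.1], 2.0))  # [0.0, 1.0, 1.0]
```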

The Indicator Transform menu provides the options to select the operands and to execute the operation. The menu line appears as follows:

Variable - This option is used to specify two operands. The first operand is an existing variable and the second operand is the threshold value, which is a constant. The message "Select variable for operation (use bar)" prompts for the variable. After the variable is selected, the message "Enter constant value for threshold value" is displayed. If the constant entered causes numeric overflow then an error message is displayed.

Execute - This option is used to initiate the indicator transform operation. During processing the message "Processing data..." is displayed. If after processing any missing values were generated then the number of missing values and the variable name are displayed on the message line. After processing is complete the message "Processing is complete...(press any key)" is displayed. Upon pressing any key control is returned to the Main menu.

SECTION 7: STAT1

7.1 WHAT STAT1 DOES

Stat1 is an interactive program which computes basic univariate statistics and displays histograms or probability plots for variables in a Geo-EAS data set. Options are available for calculating statistics on the natural log of the selected variable, for specifying a variable to be used as a "weighting factor", and for performing calculations on subsets of the input data through the use of upper and lower limits. A "Batch Statistics" option has been included which will produce a report of statistics for all variables in the specified data file.

7.2 DATA LIMITS

Stat1 requires that the maximum number of variables in the input data file not exceed 48. The data file may contain up to 10,000 samples. If the data file contains more than 10,000 samples, then only 10,000 will be used by Stat1.

7.3 THE MENU HIERARCHY

Stat1
    Prefix
    Data
    Variable
    Limits
    Execute
        Histogram
            Type
            Class Limits
            Axes
            Titles
            Results
            View Graph
            Quit
        Probability
        Examine
        Quit
    Batch Statistics
    Quit

7.4 THE MAIN MENU

The main menu and screen (Figure 7-1) has options to allow specification of the data file names, the selection of the variable to be used, the selection of upper and lower limits for the variable, calculation of statistics (univariate), and generation of a batch statistics report. The menu line appears as follows:

Prefix Data Variable Limits Execute Batch Statistics Quit

Prefix

The prefix option is used to enter the prefix for the data file name.

Data

The Data option is used to enter the name of a Geo-EAS data file.

Variable

The variable option allows the selection of a variable for which univariate statistics are to be generated. The "weighting factor" variable can be selected at this time. The choices available for variables are the variable names specified in the data file. The natural log transform may be chosen to compute log statistics. Both the weight and log parameters may be used simultaneously. The screen fields accessed from this option are:

Variable - A toggle field for selecting the variable name whose values are to be used to compute the univariate statistics. The default value is the first variable in the data set.

Weight - A toggle field for selecting an optional variable name whose values are to be used as the weighting factor. If the weighting factor variable is chosen, then the resulting univariate statistics are "weighted" (the statistics are calculated for the Weight value multiplied by the Variable value). The default weighting factor variable is "None" (in which case the weighting factor is 1).

Log Option - A two valued toggle (On/Off) field to enable or disable the Log option. If the Log option is enabled and a weighting factor variable selected then the statistics are calculated for the Weight value multiplied by the (natural log of the Variable value). Whenever the Log option is enabled sample values less than or equal to zero are counted but not used in the computation. The default value is "Off".
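The interaction of the Weight and Log Option fields can be sketched as follows. This is a literal reading of the field descriptions (weight multiplied by the value, or by its natural log, with values less than or equal to zero counted but skipped); the function name and return shape are hypothetical, and the actual program may organize the calculation differently.

```python
import math


def stat1_inputs(values, weights=None, log_option=False):
    """Return (values used in the statistics, count of samples skipped).

    Hypothetical helper illustrating the Weight and Log Option fields.
    """
    if weights is None:
        # "None" selected: the weighting factor is 1 for every sample.
        weights = [1.0] * len(values)
    used = []
    skipped = 0
    for v, w in zip(values, weights):
        if log_option:
            if v <= 0:
                # Counted but not used in the computation.
                skipped += 1
                continue
            v = math.log(v)
        used.append(w * v)
    return used, skipped


used, skipped = stat1_inputs([1.0, 10.0, -2.0, 0.0], [2.0, 3.0, 1.0, 1.0], log_option=True)
print(used, skipped)
```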

Limits

The Limits option allows computation of statistics for a subset of the data which lie between the specified minimum and maximum sample values. You may specify the upper and lower limits placed on the values used for computing the basic statistics.

Minimum - A numeric field which contains the lower limit on the sample values used in the computation. The default value is the minimum value of the selected variable.

Maximum - A numeric field which contains the upper limit on the sample values used in the computation. The default value is the maximum value of the selected variable.

Execute

The Execute option provides access to the Results screen and menu. When the Log option is enabled the number of samples less than or equal to zero is displayed. If this occurs, pressing any key provides access to the Results screen. See the section on the Results menu below for more information.

Batch Statistics

The Batch Statistics option allows generation of a report of univariate statistics for all variables in the data set with no need for interaction. The statistics can be saved to a file or printed. A two valued toggle field ("printer" or "file") is used to make this selection and appears on the message line. If the selection is "printer" then the printer must be on and "online". If "file" is selected then you are prompted for a file name. The field accepts up to 14 alphanumeric characters. If the file exists then a yes/no prompt asks if you wish to overwrite. Indicating no will return you to the menu. If the field is blank then an error message is displayed. In such a case pressing any key returns you to the Main menu.

*** NOTE *** The file produced by the Batch Statistics option is not a Geo-EAS data file.

7.5 THE RESULTS MENU

The Results screen and menu provides options to display a probability plot (described below), to display a ranked listing of data values and order statistics (shown below as "Examine Data"), and to display the Histogram screen and menu. The Results Screen displays univariate statistics for the selected variable. The menu line appears as follows:

Histogram Probability Plot Examine Quit

Histogram

The Histogram option provides access to the Histogram screen and menu. First a default histogram is displayed. See the section on the Histogram menu below for more information.

Probability Plot

When the Probability Plot option is selected a probability plot (Figure 7-3) is computed and displayed on the screen. The plot is a graph of the ranked variable values, plotted against their cumulative percentiles. The vertical axis is scaled in units of the variable and the horizontal axis is scaled in units of cumulative percent. A boxplot appears at the right side of the plot area along with univariate statistics, the quartiles, and the minimum and maximum values.

The boxplot (Figure 7-4) is a graph which depicts the limits, quartiles, median, and mean of a set of values. Boxplots are used in the Probability Plot and Histogram displays. A Boxplot is comprised of a rectangle containing an "X" and a dividing line. A line extends outward from each end of the rectangle. The rectangle represents the interquartile range (the range of values between the 1st and 3rd quartiles). The dividing line marks the position of the median in the interquartile range, and the "X" marks the arithmetic mean. The endpoints of the outward extending lines depict the minimum and the maximum values.
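The quantities a boxplot depicts can be computed as in the sketch below. It uses Python's standard statistics module; the quartile interpolation rule ("inclusive") is an assumption, since the convention used by Stat1 is not stated here, and the function name is hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Return the limits, quartiles, median and mean that a boxplot depicts.

    Hypothetical helper; the quartile convention is an assumption.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),              # end of the lower outward line
        "q1": q1,                        # lower edge of the rectangle
        "median": median,                # dividing line
        "q3": q3,                        # upper edge of the rectangle
        "max": max(values),              # end of the upper outward line
        "mean": statistics.fmean(values),  # the "X" mark
    }


print(boxplot_summary([1, 2, 3, 4, 5]))
```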

Examine

The Examine option allows access to a screen that displays a ranked listing of data values and order statistics. An example of this screen is displayed in Figure 7-5. The RecNo column indicates the sample's sequence in the data file. Keyboard keys may be used to scroll the display to the desired position, and a further key causes the Results screen and menu to appear.

7.6 THE HISTOGRAM OPTIONS MENU

The Histogram Options screen and menu (Figure 7-6) provides the options necessary to generate a histogram (frequency distribution plot) of the data. You can examine the histogram results or view the graph. A histogram is displayed before the Histogram menu is accessed. Pressing any key will access the Histogram menu from the histogram display. The histogram plot is discussed below in the View Graph option of the Histogram menu. The menu line appears as follows:

Type Class Limits Axes Titles Results View Graph Quit

Type

The Type option is used to select the Frequency type. The Frequency type is selected from a two valued toggle field containing the choices "Absolute" and "Relative". The choice "Absolute" will generate a traditional histogram; "Relative" will cause the frequencies to be displayed as a percentage of the total number of samples (or weight of the samples) retained.

Class Limits

The Class Limits option allows the specification of the class upper and lower limits. You will be prompted for the minimum value, the class width, and the number of classes. The upper limit of the first class is the sum of the minimum value and the class width. The upper limit for all classes is the sum of the minimum value and (class width multiplied by the number of classes). The screen fields accessed from this option are:

Minimum - A numeric field whose default value is the minimum value which was specified in the Limits option of the Main menu discussed earlier.

Class Width - A numeric field whose default value is calculated. This value must be greater than zero. If it is not then an error message appears and you are prompted for a new value.

# Classes - A numeric field for which the value entered must be in the range of 1 to 100. If the entry is erroneous then an error message appears and the default value is set to 100.
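The class limits follow directly from the three fields above, and the Type option only changes how the class counts are reported. The sketch below illustrates both; the function names are hypothetical, and two details are assumptions: which edge of a class is inclusive, and that "Relative" percentages are taken over the samples that fall inside the class limits.

```python
def class_limits(minimum, class_width, n_classes):
    """Return the n_classes + 1 class boundaries (hypothetical helper)."""
    if class_width <= 0:
        raise ValueError("class width must be greater than zero")
    if not 1 <= n_classes <= 100:
        raise ValueError("number of classes must be in the range 1 to 100")
    return [minimum + k * class_width for k in range(n_classes + 1)]


def frequencies(values, limits, relative=False):
    """Count samples per class; report percentages if relative is True."""
    counts = [0] * (len(limits) - 1)
    width = limits[1] - limits[0]
    for v in values:
        if limits[0] <= v <= limits[-1]:
            # Assumption: lower edge inclusive; the top limit joins the last class.
            k = min(int((v - limits[0]) / width), len(counts) - 1)
            counts[k] += 1
    if not relative:
        return counts
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]


limits = class_limits(0.0, 2.0, 5)
print(limits)                                          # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(frequencies([1.0, 3.0, 3.5, 10.0, 11.0], limits))  # [1, 2, 0, 0, 1]
```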

Axes

The Axes option allows the specification of the coordinate limits for the horizontal (X) and vertical (Y) axes. Tic spacing for the X and Y axes is also specified at this time. The screen fields accessed from this option are:

Minimum - Two numeric fields used for entering the minimum coordinate values to be used on the X and Y axes. The default value for X is determined from the data file for the variable for which univariate statistics are computed and from the upper and lower class limits. The default value for Y is zero.

Maximum - Two numeric fields used for entering the maximum coordinate values to be used on the X and Y axes. If the maximum value does not exceed the minimum value then an error message is displayed and the default value is the previous field entry. The default values displayed are determined from the data file for the variable for which univariate statistics are computed and from the upper and lower class limits.

Tic Spacing - Two numeric fields used for entering the tic spacing to be used on the X and Y axes. The default values displayed are determined from the data file for the variable for which univariate statistics are computed and from the upper and lower class limits.

Titles

The Titles option allows you to enter the title and labels for the graph. The Hershey character sets of 33 fonts are used for plotting alphanumeric labels. The file HERSHY.BAR contains this information and is included with the software.

Main Title - An alphanumeric field which may contain up to 60 characters for the title on the graph. The default title is "Histogram". When a weighting factor has been selected then the title is "Weighted Histogram".

Subtitle - An alphanumeric field which may contain up to 60 characters for the subtitle of the graph. The default subtitle is "Data file: ".

X Axis - An alphanumeric field which may contain up to 60 characters for the X axis. The default label is the variable name followed by the measurements description in parentheses, both taken from the variable selected for univariate statistics. If the Log option is on then "LN" precedes the variable name.

Y Axis - An alphanumeric field which may contain up to 60 characters for the Y axis. The default label is "Frequency".

Results

The Results option allows access to the Histogram Results screen and menu (Figure 7-7). This screen displays the histogram results in tabular form. The display may be scrolled as in the Examine option (described above).

View Graph

The View Graph option allows access to a screen that displays the histogram (Figure 7-8). The resulting histogram is accompanied by a box plot (described in the Probability Plot Option of the Results menu), and some univariate statistics.

SECTION 8: SCATTER

8.1 WHAT SCATTER DOES

Scatter produces scatter plots of variable pairs in a Geo-EAS data file. Options allow for log and semi-log plots and for a regression line to be calculated. Scaling and numeric tickmark labeling for the axes, and titles are computed automatically.

8.2 DATA LIMITS

Scatter requires that the input data file contain at least three but not more than 48 variables. These should consist of an X and Y coordinate and a third variable which will be posted. The data file may contain up to 10000 samples. If the data file contains more than 10000 samples, only 10000 will be used by Scatter.

8.3 THE MENU HIERARCHY

Scatter
    Prefix
    Data
    Variables
    Options
    Execute
    Quit

8.4 THE MAIN MENU

The Main menu and screen (Figure 8-1) has the options to allow specification of the data file name, the selection of the variables to be used and other program options. The menu line appears as follows:

Prefix Data Variables Options Execute Quit

Prefix

The Prefix option is used to enter the prefix for file names.

Data

The Data option is used to enter the name of a Geo-EAS data file.

Variables

The Variables option allows the selection of variables that are to be used as the X and Y coordinate values, and the sample values to post. The choices available are the variable names as specified in the data file. The screen fields accessed from this option are:

X Variable - A toggle field for selecting the variable name whose values will be used as the X-coordinates. The default X Variable is the first variable in the data file.

Y Variable - A toggle field for selecting the variable name whose values will be used as the Y-coordinates. The default Y Variable is the second variable in the data file.

Log - Two two-valued toggle fields to enable or disable a logarithmic transformation of the X and/or Y coordinate values. The choices available are "On" (use logarithmic scaling), and "Off". If the log option is set and the number of missing data (data values less than zero, or the missing value) is equal to the number of data records, an error message is displayed. The default value is "Off".

Options

The Options option allows selection of linear regression, and a scaling option. The screen fields accessed from this option are:

Regression - A two-valued (Yes/No) toggle field to enable or disable the calculation of a regression line. The regression line and the coefficients are plotted on the graph. The coefficients are the slope and intercept of the line represented by the equation Y = Slope * X + Intercept. The Slope, Intercept and R Squared value (a measure of correlation) are displayed to the right of the graph. The default value for Regression is "Yes".

Equal Scaling - A two-valued (Yes/No) toggle field to enable or disable the use of equal scaling on the plot. If Equal Scaling is selected the true X and Y proportions are maintained on the screen. If this option is not enabled, the graph will be scaled to fill the screen. The default value is "No" (disabled).
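The Slope, Intercept and R Squared values reported by the Regression option can be reproduced with an ordinary least-squares fit, as sketched below. That Scatter uses ordinary least squares of Y on X is an assumption (the method is not named here), and the function name is hypothetical.

```python
def linear_regression(x, y):
    """Fit Y = Slope * X + Intercept by ordinary least squares.

    Hypothetical helper; returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Squared correlation coefficient between X and Y.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


print(linear_regression([1, 2, 3, 4], [3, 5, 7, 9]))
```

The sketch assumes that neither variable is constant; a constant X or Y would make one of the divisors zero.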

Execute

The Execute option is used to display the plot on the screen. After the graph has been displayed, you can clear the screen and return to the Main menu from the keyboard. Figure 8-2 displays an example scatter plot.

SECTION 10: VARIO

10.1 WHAT VARIO DOES

Vario is a two-dimensional variogram analysis and modeling program. Vario uses a pair comparison file (PCF) produced by Prevar to calculate variogram values and other statistics for a specified set of pair distance intervals (lags). Tolerances may be specified for pair direction and lag distance intervals. Plots of variogram values vs. distance may be displayed. Several graphs of the individual lag results may also be viewed, such as lag-histograms, box plots, postplots and lag-scatter plots. Variograms may be fitted with a model of up to 4 nested (additive) variogram structures. Lag results for individual lags may be saved in a Geo-EAS data file for analysis.

10.2 DATA LIMITS

Vario requires that the pair comparison file contain no more than 48 variables and 1000 samples. If there are more than 48 variables the data file may not be used. If there are more than 1000 samples only the first 1000 will be used. Up to 24 lag intervals may be defined. As many as 2000 pairs may be used for an individual lag. If more than 2000 pairs exist per lag, only the first 2000 are used.

10.3 THE MENU HIERARCHY

Vario
    Prefix
    Data
    Variable
    Limits
    Options/Execute
        Direction
        New Lags
        Change Lags
        Post Plot
        Execute
            Type
            Plot
            BoxPlot
            Lag Results
                Histogram
                Scatter Plot
                Examine
                Write
                Quit
            Model
                Model
                Plot
                Options
                    Titles
                    Tic Spacing
                    Limits
                    Quit
                Quit
            Quit
        Quit
    Quit

10.4 THE MAIN MENU

The Main screen and menu (shown in Figure 10-1) has options to allow specification of the pair comparison file, the selection of the variable to be used with associated options, and the variable limits. The menu line appears as follows:

Prefix Data Variable Limits Options/Execute Quit

Prefix

The Prefix option is used to enter the prefix for file names.

Data

The Data option is used to enter the pair comparison file name. The pair comparison file contains distances, directions and pair pointers for pairs of (2D) sample points in a Geo-EAS data file. This file is produced by the program Prevar and is a binary (non-readable) file, to conserve disk space.

Variable

The Variable option allows for the selection of the variable that is to be used to compute the variogram. The choices available are the variable names as stored in the pair comparison file. When the variable is selected the input data are read from the data file and defaults are computed for sample value limits and lag spacing. If an error occurs while reading the file an error message is displayed. The screen fields accessed from this option are:

Variable - A toggle field for selecting the variable whose values are used as the sample values. The default Variable is the third variable in the pair comparison file.

Log Option - A two valued toggle field to enable or disable a logarithmic transformation of the samples. The choices available are "On" (use logarithmic scaling), and "Off". The default is "Off".

Limits

The Limits option allows you to enter the values that specify the limits for the sample values. The screen fields accessed from this option are:

Minimum - A numeric field for entering the minimum variable value to use in computation. The default value is the minimum value of the variable selected.

Maximum - A numeric field for entering the maximum variable value to use in computation. The default value is the maximum value of the variable selected.

Options/Execute

The Options/Execute option provides access to the Options screen and menu (Figure 10-2), described below.

10.5 THE OPTIONS MENU

The Options screen and menu (Figure 10-2) provides a means to specify variogram options, view a post plot of the data and compute the variogram results. The menu line appears as follows:

Direction New Lags Change Lags Post Plot Execute Quit

Direction

The Direction option allows you to specify the pair orientation (selection) criteria. Figure 10-3 illustrates how these parameters affect the grouping of pairs within a lag interval. The screen fields accessed from this option are:

Direction - A numeric field for entering the pair direction in trigonometric degrees. Acceptable values range from 0 to 180 degrees (excluding 180). The default is zero degrees, which is a direction parallel to the X axis.

Tolerance - A numeric field for entering the direction tolerance in trigonometric degrees. Acceptable values range from zero to 90 degrees inclusive. The default is 90 degrees. The tolerance is plus or minus. For example, a variogram computed with a direction of 90 degrees and a tolerance of 10 degrees will include all pairs with an orientation between 80 and 100 degrees.

Max Bandwidth - A numeric field for entering the maximum bandwidth. The maximum bandwidth is the maximum perpendicular distance from the direction centerline to the second point in a pair. The default value is MAX, meaning that no such constraint is imposed.
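The three direction parameters can be read as a test applied to each pair of sample points, as in the sketch below. It is an interpretation of the field descriptions, not the program's code: the function name is hypothetical, the tolerance bounds are treated as inclusive, and a pair of coincident points (whose orientation is undefined) is simply accepted.

```python
import math


def pair_selected(p1, p2, direction=0.0, tolerance=90.0, max_bandwidth=None):
    """Decide whether the pair (p1, p2) meets the direction criteria.

    Hypothetical helper; points are (x, y) tuples, angles are in degrees.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return True  # assumption: orientation undefined, so no test applied
    # Pair orientation folded into [0, 180), measured from the X axis.
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    diff = abs(angle - direction) % 180.0
    diff = min(diff, 180.0 - diff)
    if diff > tolerance:
        return False
    if max_bandwidth is not None:
        # Perpendicular distance from the direction centerline to the second point.
        perpendicular = dist * math.sin(math.radians(diff))
        if perpendicular > max_bandwidth:
            return False
    return True


print(pair_selected((0, 0), (0.1, 1), direction=90, tolerance=10))  # True
print(pair_selected((0, 0), (1, 1), direction=90, tolerance=10))    # False
```

Passing max_bandwidth=None corresponds to the default MAX setting, where no bandwidth constraint is imposed.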

New Lags

The New Lags option allows you to choose new pair distance intervals. Pairs are included in a lag if the distance for the pair is greater than the previous cutoff value and less than the cutoff value for that lag. The screen fields accessed from this option are:

Minimum - A numeric field for entering the minimum inter-pair distance. The first lag will contain pairs strictly greater than this value.

Maximum - A numeric field for entering the maximum inter-pair distance.

Increment - A numeric field for entering the increment between lag cutoff values.

The defaults for these fields are calculated as: Minimum = 0.0, Maximum = one half the maximum interpair distance, and Increment = Maximum divided by 10.0. These values may not be appropriate for the data configuration. Once these parameters are specified, the lag cutoff distances are displayed in columns on the Options screen.
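The default lag parameters and the assignment of a pair to a lag can be sketched as follows. The function names are hypothetical. Treating a distance exactly equal to a cutoff as belonging to the lower lag is an assumption, and the cap of 24 lags is taken from the data limits stated for Vario.

```python
def default_lag_parameters(max_pair_distance):
    """Return the default (minimum, maximum, increment) described above."""
    maximum = max_pair_distance / 2.0
    return 0.0, maximum, maximum / 10.0


def lag_cutoffs(minimum, maximum, increment, max_lags=24):
    """Return the upper cutoff distance of each lag (hypothetical helper)."""
    cutoffs = []
    k = 1
    # A small tolerance keeps the last cutoff despite floating-point rounding.
    while len(cutoffs) < max_lags and minimum + k * increment <= maximum + 1e-9 * increment:
        cutoffs.append(minimum + k * increment)
        k += 1
    return cutoffs


def lag_number(distance, minimum, cutoffs):
    """Return the 1-based lag containing distance, or None if in no lag."""
    previous = minimum
    for number, cutoff in enumerate(cutoffs, start=1):
        # Strictly greater than the previous cutoff; upper bound inclusive
        # (the inclusive upper bound is an assumption).
        if previous < distance <= cutoff:
            return number
        previous = cutoff
    return None


minimum, maximum, increment = default_lag_parameters(100.0)
cutoffs = lag_cutoffs(minimum, maximum, increment)
print(cutoffs)
print(lag_number(7.5, minimum, cutoffs))  # 2
```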

Change Lags

The Change Lags option allows you to change the lag cutoff distances on the Options screen. This provides a means of specifying unequal lag intervals. The screen fields accessed from this option are a group of numeric fields for entering new lag cutoff values. After these have been entered the program will sort the values by increasing distance and re-display them, if necessary.

Post Plot

The Post Plot option allows you to view a post plot of the data. This plot shows the actual locations of the sample points. Each point is labeled with a "+" character. The X and Y axes are automatically scaled and labeled. This graph is useful in determining the lag cutoff distances for the New Lags option. Figure 10-4 displays an example post plot for Example.dat.

Execute

When the Execute option is selected the program displays the Results screen and computes the lag results before providing access to the Results screen and menu (Figure 10-5). This menu is described below.

10.6 THE RESULTS MENU

The Results screen and menu (Figure 10-5) has options to select the type of estimator to be computed, to view a variogram and box plots, to recompute detailed results for a specific lag, and to display the Modeling screen and menu. The menu line appears as follows:

Type Plot Box Plot Lag Results Model Quit

Type

The Type option allows for the selection of the type of estimator to display and model. The screen field accessed from this option is a toggle field. The choices available are: "Variogram", "Relative", "Madogram", and "Non-Ergodic". The default is "Variogram". See the glossary for a definition of these terms.
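For orientation, the conventional variogram estimate for a single lag is half the mean squared difference of the paired sample values. The sketch below shows that textbook formula only; it is an assumption that the "Variogram" choice corresponds to it, the other three estimator types are not shown, and the function name is hypothetical. The glossary remains the authority for the definitions Vario uses.

```python
def semivariogram(pairs):
    """Return half the mean squared difference over (first, second) value pairs.

    Hypothetical helper; returns None when the lag contains no pairs.
    """
    n = len(pairs)
    if n == 0:
        return None
    return sum((a - b) ** 2 for a, b in pairs) / (2.0 * n)


print(semivariogram([(1.0, 3.0), (2.0, 2.0), (5.0, 1.0)]))
```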

Plot

The Plot option allows you to view a variogram plot of the selected estimator. The distance (h) is plotted along the horizontal axis, and the variogram is plotted on the vertical axis. The type of estimator and the variable name are displayed as the graph title. Vertical and horizontal axes scaling and the tick mark spacing are calculated by the program. Displayed on the right side of the graph are the number of pairs, the lag minimum and maximum, the direction, tolerance and maximum bandwidth, the sample limits, and the mean and variance of the sample values. An example graph is displayed in Figure 10-12 near the Plot option in the Modeling screen.

Box Plot

The Box Plot option allows you to view a variogram boxplot, displayed in Figure 10-6. This is a plot which displays statistical information about each lag. The vertical lines represent the range of values in the lag, with the minimum value at the bottom and the maximum value at the top. The rectangle superimposed over the vertical range line is the inter-quartile range. The bottom of this rectangle is the first quartile, and the top is the third quartile. Therefore, 50% of the data falls within the range represented by the rectangle. The mean is represented by "x" and the median is represented by the horizontal line through the rectangle. The distance (h) is plotted along the horizontal axis, and the difference squared is plotted on the vertical axis. The variable name is displayed in the title of the graph. Vertical and horizontal axes scaling and the tick mark spacing are calculated by the program. Displayed on the right side of the graph are the number of pairs, the lag minimum and maximum, the direction, tolerance and max bandwidth, the variable minimum and maximum, and the mean and variance of the data.

Lag Results

When the Lag Results option is selected you will be prompted to select a lag number before access is provided to the Lag Results screen and menu. This is accomplished by moving the cursor bar to the desired lag number and confirming the selection. See the section on the Lag Results menu below for more information.

Model

The Model option provides access to the Model screen and menu. See the section on the Modeling menu below for more information.

10.7 THE LAG RESULTS MENU

The Lag Results screen and menu has options to allow you to view detailed results for a specific lag. The menu line appears as follows:

Histogram Scatter Examine Write Quit

Histogram

The Histogram option allows you to view a lag-histogram plot. The bars on the histogram represent the number of squared differences (squares of the increments Z(x)-Z(x+h)) in each histogram class. A box plot appears at the top of the graph to display the frequency distribution of the entire set of differences. The variable name is displayed in the title of the graph. Vertical and horizontal axes scaling and the tick mark spacing are calculated automatically by the program. Displayed on the right side of the graph are the lag number, the number of pairs, the variogram value, the minimum and maximum distance used in computing the lag, the direction, tolerance and maximum bandwidth, and the sample value limits. An example lag-histogram plot is displayed in Figure 10-8.

Scatter

The Scatter option allows you to view a lag-scattergram. This is a plot of pairs of sample values. Every pair of sample values is represented as a point in the scatter plot, where the X coordinate is the value of the first point in the pair and the Y coordinate is the value of the second point in the pair. Points are plotted for all pairs in the lag, subject to the limits criteria (direction, tolerance, bandwidth, sample value limits and interpair distance) which have been specified. Displayed on the right side of the graph is the lag number, the number of pairs, the variogram value, the minimum and maximum distance, the direction, tolerance, and maximum bandwidth, and the sample value limits. Figure 10-9 displays an example lag-scattergram.

Examine

The Examine option provides access to the Examine Lag Results screen, displayed in Figure 10-10. The Examine Lag Results screen displays in tabular form a pair index number, the first and second values, the distance, the direction, and the difference squared for each pair in the lag. These are sorted in order of difference squared. You can use the arrow keys or the <1> to <9> keys to scroll to a position on the screen. Separate keys position the list at the largest squared difference and at the smallest, and a further key clears the screen and returns you to the Lag Results menu.

Write

The Write option allows you to save the lag results in a Geo-EAS data file. The Variable, From___, To___ Distance, Direction, Difference and Difference^2 will be saved in a file. The file name is entered on the message line when the Write option is selected. The default file name is "LagResult.dat". If the file already exists, a Yes/No prompt provides an alternative to quit or proceed. The lag results file may be used with other Geo-EAS programs for a more detailed analysis of the pair information.

10.8 THE VARIOGRAM MODELING MENU

The Variogram Modeling screen and menu has options to allow you to specify the variogram model to display, the graph options, and to plot the variogram estimates and the specified model. This screen is displayed in Figure 10-11. The menu line appears as follows:

Model Plot Options Quit

Model

The Model option allows you to edit or enter the parameters for the variogram model. Up to four nested variogram structures and a "nugget" component can be included in the model. The screen fields accessed from this option are:

Nugget - A numeric field for entering the nugget effect. The value entered must be greater than or equal to 0.

Type - Four toggle fields used for selecting the type of variogram structure. The choices available for each are "Spherical", "Gaussian", "Exponent", "Linear", and " ". If " " is selected the structure will be ignored.

Sill - Four numeric fields for entering the sill for each structure.

Range - Four numeric fields for entering the range of influence of each structure. The range in a spherical model is the distance at which the model curve becomes horizontal. In a Gaussian or exponential model the range parameter entered is a "practical range" at which the model attains 95% of its maximum value. In the linear model, the range and sill are used to define the slope of a linear structure. The practical effect of this is that the model type can be changed without changing the "apparent range" of the model curve.
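The effect of the "practical range" convention can be illustrated with a short sketch. The formulas below are the commonly used textbook forms, scaled so that the Gaussian and exponential structures reach about 95% of the sill at the entered range; they are an assumption, not a transcription of the program's internal code, and the function names are hypothetical.

```python
import math

def structure_value(kind, h, sill, rng):
    """Value of a single variogram structure at distance h (h >= 0)."""
    if kind == "Spherical":
        if h >= rng:
            return sill
        r = h / rng
        return sill * (1.5 * r - 0.5 * r ** 3)
    if kind == "Exponential":
        # The factor 3 makes the curve reach about 95% of the sill at h = rng.
        return sill * (1.0 - math.exp(-3.0 * h / rng))
    if kind == "Gaussian":
        return sill * (1.0 - math.exp(-3.0 * (h / rng) ** 2))
    if kind == "Linear":
        # Sill and range only define the slope; the curve keeps rising.
        return (sill / rng) * h
    raise ValueError("unknown structure type: " + kind)

def model_value(h, nugget, structures):
    """Nugget plus the sum of nested structures given as (kind, sill, range)."""
    if h == 0:
        return 0.0
    return nugget + sum(structure_value(k, h, s, a) for k, s, a in structures)

for kind in ("Spherical", "Gaussian", "Exponential", "Linear"):
    print(kind, round(structure_value(kind, 10.0, 2.0, 10.0), 4))
print(model_value(5.0, 0.5, [("Spherical", 2.0, 10.0), ("Exponential", 1.0, 30.0)]))
```

With a sill of 2 and a range of 10, every structure type is at or very near 2 when the distance equals 10, which is why the type can be switched without visibly moving the apparent range of the plotted curve.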

Plot

The Plot option allows you to view a plot of the estimates with the specified model superimposed. An example plot is displayed in Figure 10-12. The distance is plotted along the horizontal axis, and the variogram is plotted along the vertical axis. The type of estimator and the variable name are displayed as the graph title. Vertical and horizontal axis scaling and tick mark spacing are calculated by the program. Displayed on the right side of the graph are the total number of pairs, the minimum and maximum distances, the direction, tolerance and maximum bandwidth, the sample value limits, and the mean and variance of the sample values. Press to clear the screen and access the Variogram Modeling menu.

Options

The Options option provides access to the Graph Options screen and menu, displayed in Figure 10-13.

10.9 THE GRAPH OPTIONS MENU

The Graph Options screen and menu (Figure 10-13) has options to select the graph titles and labels, the spacing of the tick marks, and the graph limits for the graph produced with the Plot option of the Variogram Modeling menu. The menu line appears as follows:

Titles Tic Spacing Limits Quit

Titles

The Titles option allows you to enter the title and labels for the graph. The screen fields accessed from this option are:

Title - An alphanumeric field which may contain up to 60 characters for the title on the graph. The default title contains the type of estimator and the variable name.

Subtitle - An alphanumeric field which may contain up to 60 characters for the graph subtitle.

X Label - An alphanumeric field which may contain up to 60 characters for the X axis label. The default label is "Distance".

Y Label - An alphanumeric field which may contain up to 60 characters for the Y axis label. The default label is the type of estimator selected.

Tic Spacing

The Tic Spacing option allows the specification of the spacing of tick marks on the X and Y axes. The screen fields accessed from this option are:

X Tickmark Spacing - A numeric field for entering the spacing between the X axis tickmarks.

Y Tickmark Spacing - A numeric field for entering the spacing between the Y axis tickmarks.

Limits

The Limits option allows you to specify the limits for the X and Y axes. The screen fields accessed from this option are:

X Axis Minimum - A numeric field for entering the minimum coordinate value to be used on the X axis. The default value is zero.

X Axis Maximum - A numeric field for entering the maximum coordinate value to be used on the X axis. The default value is the maximum distance calculated by the program.

Y Axis Minimum - A numeric field for entering the minimum coordinate value to be used on the Y axis. The default value is zero.

Y Axis Maximum - A numeric field for entering the maximum coordinate value to be used on the Y axis. The default value is the maximum variogram value calculated by the program.

SECTION 11: XVALID

11.1 WHAT XVALID DOES

The name Xvalid stands for "cross-validation". Cross-validation involves estimating values at each sampled location in an area by kriging with the neighboring sample values (excluding the value of the point being estimated). The estimates are compared to the original observations in order to test if the hypothetical variogram model and neighborhood search parameters will accurately reproduce the spatial variability of the sampled observations. The estimated values, associated kriging errors, residuals, and other useful statistics are displayed on a summary screen. Scatter plots and histograms may be obtained for a quick summary of these results. Results may be stored in a Geo-EAS data file for further analysis.
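The leave-one-out procedure described above can be sketched in a few lines. This is an illustrative simplification, not the Xvalid implementation: it uses ordinary point kriging with every remaining sample as a neighbor (no search ellipse), a single spherical structure, and hypothetical function names.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical variogram with nugget; zero at h = 0 by convention."""
    if h == 0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def ordinary_krige(points, values, target, gamma):
    """Return (estimate, kriging standard deviation) at target."""
    n = len(points)
    # Kriging system in variogram form, with a Lagrange multiplier row.
    a = [[gamma(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]
    estimate = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, math.sqrt(max(variance, 0.0))

def cross_validate(points, values, gamma):
    """Krige each sample location from all the other samples."""
    results = []
    for i in range(len(points)):
        others = points[:i] + points[i + 1:]
        other_values = values[:i] + values[i + 1:]
        est, sd = ordinary_krige(others, other_values, points[i], gamma)
        results.append((values[i], est, sd, est - values[i]))
    return results

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
vals = [1.0, 2.0, 2.0, 3.0, 2.1]
model = lambda h: spherical(h, 0.0, 1.0, 10.0)
for observed, est, sd, diff in cross_validate(pts, vals, model):
    print(observed, round(est, 3), round(sd, 3), round(diff, 3))
```

Each output row holds the observed value, the estimate, the kriging standard deviation, and the difference (estimate minus observed), which are the quantities summarized on the Xvalid results screen.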

11.2 DATA LIMITS

Xvalid requires that the input data file contains at least 3 but no more than 48 variables. Two of these variables must represent the coordinates of sample locations. No more than 1000 samples may reside in the data file. If more than this number are encountered, only the first 1000 values will be used for cross-validation.

11.3 THE MENU HIERARCHY

Xvalid
    Prefix
    Data
    Variables
    Options/Execute
        Type
        Search
        Model
        Execute
            Error Map
            Scatter Plot
            Histogram
            Write
            Examine
            Quit
        Debug
        Quit
    Quit

11.4 THE MAIN MENU

The Main screen and menu for Xvalid, shown in Figure 11-1, provides options to specify the file prefix, the input data file and variables, and to access the Options/Execute menu. The menu line appears as follows:

Prefix Data Variables Options/Execute Quit

Prefix

The Prefix option is used to enter the prefix for the data file name.

Data

The Data option is used to specify the name of the Geo-EAS data file to be used for cross-validation. This file must contain at least three variables consisting of two coordinates and a sampled value.

Variables

The Variables option is used to select the coordinate and sample value variables. They are selected from toggle fields which contain the variable names from the specified data file. The data file is accessed when these variables are selected. If an error occurs while reading the data file an error message will be generated. The screen fields accessed from this option are:

X Coordinate - A toggle field for selecting the variable whose values represent the X coordinates for sample points. The default X Coordinate is the first variable in the data file.

Y Coordinate - A toggle field for selecting the variable whose values will be used as the Y coordinates for sample points. The default Y Coordinate is the second variable in the file.

Variable to Krige - A toggle field for selecting the variable to be estimated. The default is the third variable in the data file.

Log Option - A two-valued (On/Off) toggle field for enabling or disabling the Log Option. If the Log Option is set to "On", kriging will be performed on the natural log of the sample values. If it is set to "Off", no log transformation will occur. The default value for the Log Option is "Off".

Options/Execute

The Options/Execute option provides access to the Options screen and menu. See the section below for more information.

11.5 THE OPTIONS MENU

The Options screen and menu, shown in Figure 11-2, has the options to specify the parameters to be used for kriging, and to initiate the cross-validation process. The menu line appears as follows:

Type Search Model Execute Debug Quit

Type

The Type field is a toggle field used for selecting the type of kriging used in cross-validation. The choices are "Ordinary" and "Simple". If "Ordinary" is chosen, ordinary point kriging will be performed. If "Simple" is chosen, simple kriging is performed and a value must be entered for the Global Mean when the Model option is selected.

Search

The Search option provides a means of controlling the neighborhood search used during kriging. Parameters may be specified to define an elliptical search area. Constraints may be placed upon the number of sectors and the number of samples to be retained in each sector of the search area, and the type of distance measure to use when eliminating neighbors from a search sector. Figure 11-3 depicts the search parameters which define the shape of the search ellipse. The screen fields accessed from this option are:

R Major - a numeric field for indicating the length of the major radius (half the length of the longest axis) of the search ellipse.

R Minor - a numeric field for indicating the length of the minor (shortest) radius of the search ellipse. This value must be less than or equal to R Major, non-zero, and non-negative. The default value is the value given for R Major. If R Major is equal to R Minor, the search area will be a circle.

Angle - a numeric field for indicating the orientation of the search ellipse. It is given in trigonometric degrees in the range from zero up to (but not including) 180, and indicates the angle between the longest axis of the ellipse (specified by R Major) and the sample coordinate X axis. If R Minor is equal to R Major, a circle search is used and the Angle parameter is ignored.

Min. Dist. - a numeric field for specifying the minimum distance from the estimated sample location to the nearest neighbor sample. If a minimum distance of zero (the default) is specified, then any neighboring sample will be used, subject to the complete set of search constraints.

Distance Type - a two-valued toggle field for selecting the type of distance measure to use when eliminating neighbors. The choices available are "Euclidean" (the default), and "Variogram". Neighboring samples are eliminated from consideration when the Max Pts/Sector (Maximum points per sector) criterion is exceeded in a given sector. If this should occur, only the "closest" neighbors are kept. If "Euclidean" distance type is chosen, neighbors are eliminated based upon the Euclidean distance from the point to be estimated (the ellipse center). If "Variogram" distance is chosen, the variogram function value (as specified by the Model parameters) for the computed distance is used as the criterion for elimination of neighbors.

Num. Sectors - a toggle field for selecting how many sectors in which to divide the search ellipse. The choices available are "1" (the default), "4", and "8". The combination of the Number of Sectors, and the Max Points per Sector parameters indicate the maximum number of samples to be used for kriging. This parameter also serves to indicate the number of groups to use for classification of neighbors. The search ellipse is divided into the chosen number of equally-sized sectors. If a sample is found to be within the search ellipse, it is flagged with a sector number. These sector numbers and sample distances are used for elimination of samples which exceed the Max Points per Sector criterion.

Max. Pts/Sector - a toggle field for selecting the maximum number of points which a sector may contain. The choices range from "1" to a maximum which depends on the number of sectors chosen. If one sector is specified, up to 24 neighboring points may be used. If four or eight sectors are selected, the choices are constrained such that a maximum of 64 neighbors may be retained. If the number of neighbors in a sector exceeds the specified value, the "farthest" samples (as determined by Distance Type) are eliminated from consideration.

Min Pts. to Use - a toggle field for selecting the minimum number of neighboring samples to use for kriging. The default value for this parameter is "1". If fewer than the specified number are found, kriging is not performed and a missing value is generated for the estimate and kriging standard deviation.

Empty Sectors - a toggle field for selecting the maximum allowable number of consecutive sectors with no neighbors. The choices available are determined by the Number of Sectors parameter. If one sector is chosen, then this field is disabled. If more than the specified number of consecutive sectors are empty, no value is kriged; missing values are generated in place of an estimate and kriging standard deviation.
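The way the search parameters interact can be pictured with a short sketch. It is a simplified illustration under stated assumptions, not the program's search routine: sectors are assumed to be measured from the major axis of the ellipse, only the Euclidean distance type is shown, the Min. Dist., Min Pts. to Use, and Empty Sectors checks are left out, and the function name is hypothetical.

```python
import math

def select_neighbors(target, samples, r_major, r_minor, angle_deg,
                     n_sectors=1, max_per_sector=8):
    """Pick neighbors of target from a list of (x, y, value) samples."""
    t = math.radians(angle_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    sectors = {}
    for x, y, v in samples:
        dx, dy = x - target[0], y - target[1]
        # Rotate the offset into the ellipse's own axes.
        u = dx * cos_t + dy * sin_t
        w = -dx * sin_t + dy * cos_t
        if (u / r_major) ** 2 + (w / r_minor) ** 2 > 1.0:
            continue  # outside the search ellipse
        dist = math.hypot(dx, dy)
        bearing = math.atan2(w, u) % (2 * math.pi)
        width = 2 * math.pi / n_sectors
        sector = min(int(bearing / width), n_sectors - 1)
        sectors.setdefault(sector, []).append((dist, (x, y, v)))
    kept = []
    for members in sectors.values():
        # Keep only the closest samples when a sector is over-full.
        members.sort(key=lambda m: m[0])
        kept.extend(sample for _, sample in members[:max_per_sector])
    return kept

samples = [(1, 0, 1.0), (2, 0, 2.0), (0, 3, 3.0), (-1, 0, 4.0), (10, 0, 5.0)]
print(select_neighbors((0, 0), samples, 5.0, 2.0, 0.0,
                       n_sectors=4, max_per_sector=1))
```

In this example the sample at (0, 3) lies outside the ellipse because the minor radius is 2, and the sample at (2, 0) is dropped because a closer sample already fills its sector.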

Model

The Model option allows specification of the variogram model to use when kriging. Screen fields are provided for a nugget effect value and up to four nested variogram structures. Each structure is specified with a structure type, a sill value, and an ellipse of influence. If simple kriging is chosen an additional field is provided for entering the Global Mean. Each of the four structures has five associated screen fields. Selecting the Model option will cause a cursor bar to appear in the upper left corner of the models area. The arrow keys may be used to move the cursor bar to fields in the Model area. To exit the Model area, move the cursor bar out of the top or off to the left of the area, using , or . If any errors are made when entering variogram model parameters an error message will be displayed and the cursor bar will be placed at the problem field. The major and minor ranges, and the angles for the additive variogram structures defined in this parameter group are similar to the search ellipse ranges and angles. Figure 11-3 illustrates how these parameters define the shape of the ellipse. The screen fields accessed from this option are:

Nugget - a numeric field for entering the nugget value for the variogram model. Only values greater than or equal to zero may be entered. The default value is zero.

Global Mean - a numeric field for specifying the global mean for simple kriging. If ordinary kriging is chosen this field is disabled and cannot be accessed. The default value for the global mean is zero.

The following five fields are present for each of the four nested variogram structures:

Type - a toggle field for indicating the type of the structure. The toggle field choices for type are " " (none), "Spherical", "Gaussian", "Exponential", and "Linear". The default type for all four structures is "none". If a structure is entered and the type is subsequently changed to "none" the structure will be deleted from the variogram model. The order of the variogram structures on the screen is unimportant, and they do not need to occupy contiguous positions on the screen.

Sill - a numeric field for entering the sill value for a variogram structure. A non-zero, non-negative value is required here. In a linear variogram structure the "sill" must be chosen so that the corresponding "range" parameter value(s) will result in the desired slope(s); the actual model continues to increase indefinitely with distance.

Major Range - a numeric field for entering the longest range of influence of the variogram structure. The Major Range must be non-zero, and non-negative. This may be thought of as similar to the R Major parameter described in the Search option above. In fact, the variogram ellipse of influence is defined exactly as the search ellipse: with two ranges (radii) and an angle. Note that the search ellipse is used only to select a reasonable subset of neighbors for efficient kriging - it has no relationship to the variogram model ellipse(s).

Minor Range - a numeric field for indicating the length of the minor (shorter) range of the variogram ellipse. This value must be less than or equal to the Major Range, non-zero, and non-negative. The default value is the Major Range value. If the two ranges are equal, an isotropic variogram structure is defined. If they are not equal, the two ranges are used to determine the anisotropy ratio.

Ellipse Angle - a numeric field for indicating the orientation of the ellipse for the variogram structure. It is given in trigonometric degrees in the range from zero up to (but not including) 180, and indicates the angle between the longest axis of the ellipse (specified by the Major Range) and the sample coordinate X axis. If the two ranges are equal (isotropic structure) then the angle is ignored.
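One common way to apply a major range, minor range, and angle is to convert each separation vector into an equivalent distance along the major axis and then evaluate an isotropic model with the Major Range. The sketch below shows that idea only; it is an assumption about how such an ellipse is typically handled, not a description of the program's internals, and the function name is hypothetical.

```python
import math

def anisotropic_distance(dx, dy, major_range, minor_range, angle_deg):
    """Equivalent distance along the major axis for the offset (dx, dy)."""
    t = math.radians(angle_deg)
    # Components of the offset along the major (u) and minor (w) axes.
    u = dx * math.cos(t) + dy * math.sin(t)
    w = -dx * math.sin(t) + dy * math.cos(t)
    # Stretch the minor-axis component by the anisotropy ratio.
    ratio = major_range / minor_range
    return math.hypot(u, w * ratio)

# Major range 10 at 0 degrees, minor range 5.
print(anisotropic_distance(10.0, 0.0, 10.0, 5.0, 0.0))  # along the major axis
print(anisotropic_distance(0.0, 5.0, 10.0, 5.0, 0.0))   # along the minor axis
```

Both calls give an equivalent distance of 10, so a structure evaluated at the Major Range reaches its sill at 10 units along the major axis and at 5 units along the minor axis. When the two ranges are equal the ratio is 1 and the angle has no effect, which matches the isotropic case described above.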

Execute

The Execute option is used to initiate kriging. All parameters must be specified before kriging may begin. Several "Debug Options" (described below) are enabled or disabled with the "n", "w", and the "s" keys. These keys are used to toggle the debug displays on or off. If your keyboard has status lights for the keys, it is easy to determine the state of each key. It is important to disable all three Debug Options prior to using the Execute option, or intermediate results screens will be generated and kriging will proceed more slowly. If your personal computer is equipped with the proper graphics hardware, a graphics display is generated (Figure 11-4) when the Execute option is selected. In this display the original sample locations are represented by symbols. The symbol coding is used to classify the input data into their respective quartiles. As each point is kriged the estimates and associated results are displayed at the bottom of the screen and the original symbols are over-plotted by symbols which represent the value of the estimate. On EGA equipped computer systems, the symbols are also color coded. If your computer system has no graphics capability or has a Hercules graphics card, no graph is displayed, and the results are displayed at the bottom of the Options screen. During the kriging process the debug displays (described below) may be activated or de-activated to view intermediate kriging results. Once all sample values have been kriged, a tone signals that the kriging has been completed. Pressing at this time will cause the Results screen and menu (described below) to be displayed.

Debug

The Debug option is provided on the menu as a means of identifying the keys used for enabling or disabling the Debug displays. This option does not actually do anything, but moving the cursor bar to this menu option will display the Debug display names, and the corresponding keys which are used to activate them. These displays provide a means of looking at intermediate kriging results during the cross-validation process. If the "n", "w", or "s" keys are activated during kriging, the corresponding displays will be generated on the screen. To continue kriging without interruption, "q" is pressed. To disable the generation of such screens, the corresponding keys should be de-activated. Refer to the section on program Krige for a detailed discussion of the debug displays.

How to Cancel Kriging

At any time during the kriging process, kriging may be cancelled by pressing the "q" key. If this is done a message is displayed indicating that kriging has been terminated. This is useful when the debug screens reveal a problem with the search or variogram parameters and you wish to change them and re-start. It is important to remember that the "terminate kriging" key will not work when any of the three debug display keys is active: it turns the debug option off, but doesn't stop kriging.

11.6 THE RESULTS MENU

The Results menu and screen, shown in Figure 11-5, is displayed when kriging has been completed. The Results screen contains information about the data file, variables, and the type of kriging used. Additionally, descriptive statistics are provided for the original sample values, the kriged estimates, the kriging standard deviations, the differences (between estimate and observed), and the zscore. The "zscore" is computed as the ratio of the difference to the kriging standard deviation. The Results menu appears at the bottom of the screen, and provides options to display several graphs, to examine the individual results, and to save the results to a Geo-EAS data file. Graph scaling, titles, and labeling are performed automatically. On a non-graphics computer system the graphs are not generated. The menu line appears as follows:

Error Map Scatter Plot Histogram Write Examine Quit

Error Map

The Error Map option provides a graph of the kriging error, or "Differences". An example plot is displayed in Figure 11-6. Sample locations are marked with a "+" symbol for over-estimation (Estimate-Observed > 0), and an "x" symbol for negative differences. The size of the symbol is proportional to the error, so that large positive or negative differences are easily noticed. Descriptive statistics for the differences are displayed to the right of the graph.

Scatter Plot

The Scatter Plot option provides a choice of displaying one of two possible scatter plots. When this option is chosen a two-valued toggle field containing the choices "Observed vs. Estimate", and "Estimate vs. Error" is displayed on the message line. It is used to select the type of scatter plot to be displayed. Once this choice has been made the appropriate graph will be displayed. Box plots are drawn opposite to each coordinate axis to convey information about the frequency distributions of the observed values and estimates, or estimates and differences. In both types of graph "+" and "x" symbols are used to indicate positive and negative estimation errors (same as the Error Map option described above). An example plot is displayed in Figure 11-7.

Histogram

The Histogram option provides a histogram (frequency distribution graph) of the estimation error. An example plot is shown in Figure 11-8. Histogram class intervals are computed automatically by the program. A box plot of the differences appears at the top of the graph. Descriptive statistics are displayed to the right of the histogram.

Write

The Write option is used to store the result to a Geo-EAS data file. When this option is chosen a prompt for the file name appears on the message line. If the specified file exists, a Yes/No prompt will provide an option to overwrite, or quit. If any errors occur while saving the results to the file, appropriate error messages will be displayed on the message line. If results were successfully saved, a message will be displayed, and pressing any key will re-activate the Results menu. The file created by this option contains seven variables. The first three are the two sample coordinate variables and the sample value variable chosen for kriging. The remaining four variables are named "Zstar", "Zsdev", "Zstar-Z", and "Zscore". They contain, respectively, the estimate, the kriging standard deviation, the estimation error, and the zscore (estimation error divided by the kriging standard deviation). This file may be used with other Geo-EAS programs for further analysis.
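The four derived variables in the output file are related by simple arithmetic. The following sketch recomputes them from observed values, estimates, and kriging standard deviations; the function names are hypothetical, and the treatment of a zero standard deviation (a missing zscore) is an assumption.

```python
import math

def cross_validation_columns(observed, zstar, zsdev):
    """Return one dict per sample holding the four derived variables."""
    rows = []
    for z, est, sd in zip(observed, zstar, zsdev):
        diff = est - z  # estimate minus observed
        zscore = diff / sd if sd > 0 else float("nan")
        rows.append({"Zstar": est, "Zsdev": sd,
                     "Zstar-Z": diff, "Zscore": zscore})
    return rows

def mean_and_std(values):
    """Mean and sample standard deviation, skipping missing (NaN) values."""
    vals = [v for v in values if not math.isnan(v)]
    n = len(vals)
    mean = sum(vals) / n
    var = sum((v - mean) ** 2 for v in vals) / (n - 1) if n > 1 else 0.0
    return mean, math.sqrt(var)

rows = cross_validation_columns([1.0, 2.0, 3.0], [1.5, 2.0, 2.0], [0.5, 1.0, 0.5])
print([r["Zscore"] for r in rows])
print(mean_and_std([r["Zscore"] for r in rows]))
```

As a general rule of thumb in cross-validation, zscores with a mean near zero and a standard deviation near one suggest that the kriging standard deviations are consistent with the actual estimation errors.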

Examine

The Examine option provides a means of directly examining individual results of kriging on a scrolling display, called the Examine Results screen. This screen is shown in Figure 11-9. The observed values, estimates, errors, kriging standard deviations, and zscores are displayed in columns on this screen. These values are ranked in order of estimation error, so that the largest negative difference is at the top of the list, and the largest positive difference is at the bottom of the list. The leftmost column contains the sample sequence number in the input data file. The , , , , , and keys may be used to position the list on the screen. The numeric keys <1> through <9> are used to scroll the list in increments of 10% (e.g. pressing <5> would position the middle of the list on the screen). The key is used to exit the Examine Results screen and return to the Results screen and menu.

SECTION 12: KRIGE

12.1 WHAT KRIGE DOES

Krige is an interactive program which performs two-dimensional kriging. A rectangular grid of kriged estimates is created and stored in a Geo-EAS data file. Contour plots may be generated from these gridded estimates with the contour package of your choice. Options are provided to control the type of kriging, the neighborhood search area, the grid spacing and extents, and the variogram model for each variable kriged. Up to ten variables may be kriged in each program execution. The program parameters may be stored in a parameter file and retrieved for later use. During the kriging process, debug displays may be activated or de-activated for the purpose of viewing intermediate kriging results.

12.2 DATA LIMITS

Krige requires that the input data file contains at least three but no more than 48 variables. Two of these variables must represent the coordinates of sample locations. Up to ten variables may be selected for kriging. No more than 1000 samples may reside in the data file. If more than this number are encountered only the first 1000 values will be used for kriging.

12.3 THE MENU HIERARCHY

Krige
    Prefix
    Read Parameters
    Options/Execute
        Data
        Polygon
        Type
        Grid
        Search
        Variables/Models
            New Variable
            Edit
            Delete
            Quit
        Title
        Execute
        Quit
    Save Parameters
    Quit

12.4 THE MAIN MENU

The Main screen and menu for Krige, shown in Figure 12-1, provides options to specify the file prefix, to read or save program parameter values in a parameter file, and to access the Krige Options menu, where program parameter values are specified and kriging is initiated. The menu line appears as follows:

Prefix Read Parameters Options/Execute Save Parameters Quit

Prefix

The Prefix option is used to enter the prefix for the data file name.

Read Parameters

The Read Parameters option provides a means of loading program parameter values from a parameter file. When this option is selected a prompt is issued for the Input Parameter File name. A default name is constructed from the most recently used Geo-EAS data file name using the file extension "kpf" (krige parameter file). Once the name has been specified the parameter file is accessed and the parameters are retrieved from the file. The data file is also accessed at this time so that the coordinate variable values may be loaded by the program. If an error occurs while trying to access or read the parameter or data file, a message will be displayed indicating the problem, and any key may be pressed to return to the Main menu. If parameters and data are successfully loaded a message will be displayed, and pressing any key will cause the Krige Options screen and menu to be displayed.

Options/Execute

The Options/Execute option provides access to the Krige Options screen and menu. See the section below for more information.

Save Parameters

The Save Parameters option provides a means of storing program parameter values in a parameter file for later use. When this option is selected a prompt is issued for the Output Parameter File name. A default name is constructed with a "kpf" file extension as described above for the Read Parameters option. If the named file already resides on disk, a Yes/No prompt provides a means of overwriting the old file or exiting the option. If an error occurs while trying to create or write to the Output Parameter File a message will be displayed, and pressing any key causes re-activation of the Main menu. If the parameters are successfully stored in the file, a message is displayed and a keystroke re-activates the Main menu.

12.5 THE OPTIONS MENU

The Options screen and menu (Figure 12-2) has the options to specify the parameters to be used for kriging, and to initiate the kriging process. The menu line appears as follows:

Data Polygon Type Grid Search Variables/Models Title Execute Quit

Data

The Data option is used to specify the name of the Geo-EAS data file whose values will be used for kriging, and the name of the Geo-EAS output file of gridded estimates. The screen fields accessed through the Data option are:

Data File - A 14 character alphanumeric field in which the name of the input data file is entered. The most recently used Geo-EAS data file name is the default. Once the name is given the program reads the variable names from the file into several toggle fields used for selecting the coordinate variables and the variables to krige. If the data file cannot be located or an error occurs while accessing the file an error message is generated. If the variable names are loaded successfully the Krige Options menu will be re-activated.

Output File - A 14 character alphanumeric field for entering the name of the output file of gridded estimates. A default name is constructed which consists of the input data file name with a ".grd" extension (signifying gridded data). If the specified file already resides on disk, a Yes/No prompt provides the alternative of overwriting the file or exiting the option.

Polygon

The Polygon option is used to specify the name of a file containing polygonal boundaries which limit the area in which estimates are produced. This file should contain one or more lists of polygon vertices which form closed polygons. Each vertex list is preceded by a flag which indicates if it is to be treated as an inclusive or exclusive polygon. No estimates may be kriged within an exclusive polygon boundary, and none may be produced outside an inclusive boundary. The polygon file is read when the Execute option is used to initiate kriging. If an error occurs while reading the Polygon file, a message will be generated, the file will be ignored and an attempt to krige all grid cells will be made. See the appendices for a detailed description of the polygon file format.
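The inclusive and exclusive rules can be illustrated with a standard ray-casting test. This sketch works on polygons held in memory rather than on a polygon file, and it assumes that when several inclusive polygons are present a grid cell only needs to fall inside one of them; that rule and the function names are hypothetical.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test; vertices is a list of (x, y) for a closed polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def may_estimate(x, y, polygons):
    """polygons is a list of (is_inclusive, vertices) pairs."""
    inclusive = [v for inc, v in polygons if inc]
    exclusive = [v for inc, v in polygons if not inc]
    if any(point_in_polygon(x, y, v) for v in exclusive):
        return False  # inside an exclusive boundary
    if inclusive and not any(point_in_polygon(x, y, v) for v in inclusive):
        return False  # outside every inclusive boundary
    return True

site = [(0, 0), (10, 0), (10, 10), (0, 10)]
pond = [(4, 4), (6, 4), (6, 6), (4, 6)]
polygons = [(True, site), (False, pond)]
for point in [(1, 1), (5, 5), (11, 5)]:
    print(point, may_estimate(point[0], point[1], polygons))
```

Here the grid cell at (1, 1) would be kriged, while (5, 5) falls in the excluded area and (11, 5) lies outside the inclusive boundary.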

Type

The Type option is used to select the type of kriging to use and whether to krige point or block estimates. The screen fields accessed from this option are:

Type of Kriging - A toggle field containing the choices "Ordinary" and "Simple", which is used to specify the type of kriging to perform. If "Ordinary" is chosen, ordinary kriging will be performed. If "Simple" is chosen, simple kriging is performed and a value must be specified for the Global Mean when entering the variogram model parameters on the Variables/Models screen.

Point or Block - A toggle field containing the choices "Point", "Block 2x2", "Block 3x3", and "Block 4x4", which is used to indicate point or block kriging. If one of the "Block NxN" choices is selected, block estimates will be produced. The "N" refers to the number of discretization points used to approximate the area of the block. Large "N" values give somewhat better approximations of the blocks at the expense of increased computing time. The choice of point or block also determines the meaning of the Grid Origin parameters. If block kriging is chosen these parameters refer to the center of the lower left-hand block in the grid.
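The role of the NxN discretization points can be sketched as follows. The points are assumed to sit at the centers of an even NxN subdivision of the block, and the sample-to-block variogram value is approximated by averaging over those points; both the layout and the function names are illustrative assumptions rather than the program's documented behavior.

```python
import math

def block_points(cx, cy, size_x, size_y, n):
    """Centers of an n-by-n subdivision of a block centered at (cx, cy)."""
    points = []
    for i in range(n):
        for j in range(n):
            px = cx - size_x / 2 + (i + 0.5) * size_x / n
            py = cy - size_y / 2 + (j + 0.5) * size_y / n
            points.append((px, py))
    return points

def sample_to_block_gamma(sample, points, gamma):
    """Average variogram value between one sample and the block points."""
    return sum(gamma(math.dist(sample, p)) for p in points) / len(points)

pts = block_points(0.0, 0.0, 2.0, 2.0, 2)
print(pts)
# A linear variogram with unit slope, used purely as an example model.
print(sample_to_block_gamma((3.0, 0.0), pts, lambda h: h))
```

Moving from "Block 2x2" to "Block 4x4" raises the number of points per block from 4 to 16, so each sample-to-block term costs about four times as much to evaluate.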

Grid

The Grid option is used to specify the variables to use as coordinate values, the origin of the grid, the size of grid cells, and the number of cells in the X and Y directions. These are specified with four screen fields for each of the two directions. The screen fields for the X and Y directions are:

Variable - Two toggle fields containing the variable names from the specified input data file which are used to specify the variables to be used as the sample coordinate values when kriging. The default values for the X and Y variables are the first and second variables in the data file. Once the variables have been chosen the coordinate values are retrieved from the data file. If an error occurs during the retrieval of data, a message is displayed, and pressing any key will re-activate the Krige Options menu, but it is assumed that the file is corrupted and cannot be used. A new data file name must be specified. If no grid parameter values have been previously specified, default values for the Origin, Cell Size, and Number of Cells parameters are computed.

Origin - Two numeric fields used for entering the X and Y coordinate values for the origin of the kriging grid. If block kriging (the default) is selected, the origin is taken as the center of the lower left-hand grid block.

Cell Size - Two numeric fields for specifying the grid cell size. For point kriging these values indicate the distance between points in the grid. For block kriging these parameters indicate the size of the blocks to be kriged (the distance between block centers in each direction). Both values must be greater than zero.

# Cells - Two numeric fields for selecting the number of points or blocks to be produced in each of the two directions. These values must be greater than zero and may not exceed 100.
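As an illustration of how the Origin, Cell Size, and # Cells fields combine, the sketch below lists the locations that would be estimated. The function name and the row-by-row output order are assumptions made for the example, not a description of the Krige output file.

```python
def grid_nodes(origin_x, origin_y, cell_x, cell_y, n_x, n_y):
    """List grid point (or block center) coordinates from the Grid fields."""
    if cell_x <= 0 or cell_y <= 0:
        raise ValueError("cell sizes must be greater than zero")
    if not (1 <= n_x <= 100 and 1 <= n_y <= 100):
        raise ValueError("number of cells must be between 1 and 100")
    # Assumed ordering: X varies fastest, starting from the origin.
    return [
        (origin_x + i * cell_x, origin_y + j * cell_y)
        for j in range(n_y)
        for i in range(n_x)
    ]


# A 3 x 2 grid starting at (10, 20) with 5.0 by 2.5 cells.
nodes = grid_nodes(10.0, 20.0, 5.0, 2.5, 3, 2)
```

For block kriging the same coordinates are read as block centers, with the first one being the center of the lower left-hand block.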

Search

The Search option provides a means of controlling the neighborhood search used during kriging. Parameters may be specified to define an elliptical search area. Constraints may be placed upon the number of sectors and the number of samples to be retained in each sector of the search area, and the type of distance measure to use when eliminating neighbors from a search sector. The screen fields accessed from this option are:

R Major - A numeric field for indicating the length of the major radius (half the length of the longest axis) of the search ellipse.

R Minor - A numeric field for indicating the length of the minor (shortest) radius of the search ellipse. This value must be less than or equal to R Major, non-zero, and non-negative. The default value is the value given for R Major. If R Major is equal to R Minor, the search area will be a circle.

Angle - A numeric field for indicating the orientation of the search ellipse. It is given in trigonometric degrees in the range from zero up to (but not including) 180, and indicates the angle between the longest axis of the ellipse (specified by R Major) and the sample coordinate X axis. If R Minor is equal to R Major, a circle search is used and the Angle parameter is ignored.

Distance Type - A two-valued toggle field for selecting the type of distance measure to use when eliminating neighbors. The choices available are "Euclidean" (the default) and "Variogram". Neighboring samples are eliminated from consideration when the Max Pts/Sector (maximum points per sector) criterion is exceeded in a given sector. If this should occur, only the "closest" neighbors are kept. If the "Euclidean" distance type is chosen, neighbors are eliminated based upon the Euclidean distance from the point to be estimated (the ellipse center). If "Variogram" distance is chosen, the variogram function value (as specified by the Model parameters) for the computed distance is used as the criterion for elimination of neighbors.

Num. Sectors - A toggle field for selecting the number of sectors into which the search ellipse is divided. The choices available are "1" (the default), "4", and "8". Together, the Number of Sectors and Max Points per Sector parameters determine the maximum number of samples to be used for kriging. This parameter also serves to indicate the number of groups to use for classification of neighbors. The search ellipse is divided into the chosen number of equally-sized sectors. If a sample is found to be within the search ellipse, its sector number is stored. These sector numbers and sample distances are used for elimination of samples which exceed the Max Pts/Sector value.

Max. Pts/Sector - A toggle field for selecting the maximum number of points which a sector may contain. The choices range from "1" to a maximum which depends on the number of sectors chosen. If one sector is specified, up to 24 neighbors may be used. If four or eight sectors are selected, the choices are constrained such that a maximum of 64 neighbors may be retained. If the number of neighbors in a sector exceeds the specified value, the "farthest" samples (as determined by Distance Type) are eliminated from consideration.

Min Pts. to Use - A toggle field for selecting the minimum number of neighboring samples to use for kriging. The default value for this parameter is "1". If fewer than the specified number of samples are found, kriging is not performed and a missing value is generated for the estimate and kriging standard deviation.

Empty Sectors - A toggle field for selecting the maximum number of consecutive sectors with no neighbors. The choices available are determined by the Number of Sectors parameter. If one sector is chosen, then this input is ignored. If more than the specified number of consecutive sectors are empty, no value is kriged and missing values are generated in place of an estimate and kriging standard deviation.
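The way the Search fields interact can be sketched as follows. This is a simplified illustration, not the Krige source: the function name is hypothetical, only the "Euclidean" distance type is handled, sectors are assumed to be equal angles counted from the major axis of the ellipse, and the run of empty sectors is assumed to wrap around the ellipse.

```python
import math


def select_neighbors(target, samples, r_major, r_minor, angle_deg,
                     num_sectors=1, max_per_sector=8, min_pts=1, max_empty=0):
    """Return retained neighbors, or None when a missing value would result."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    sectors = [[] for _ in range(num_sectors)]
    for sx, sy in samples:
        dx, dy = sx - target[0], sy - target[1]
        # Rotate the offset so that u runs along the major axis of the ellipse.
        u = dx * cos_t + dy * sin_t
        v = -dx * sin_t + dy * cos_t
        if (u / r_major) ** 2 + (v / r_minor) ** 2 > 1.0:
            continue  # outside the search ellipse
        # Assumed convention: equal-angle sectors measured from the major axis.
        bearing = math.atan2(v, u) % (2 * math.pi)
        k = min(int(bearing / (2 * math.pi / num_sectors)), num_sectors - 1)
        sectors[k].append((math.hypot(dx, dy), (sx, sy)))

    kept = []
    for members in sectors:
        members.sort(key=lambda m: m[0])  # closest first
        kept.extend(point for _, point in members[:max_per_sector])

    if len(kept) < min_pts:
        return None  # Min Pts. to Use not satisfied
    if num_sectors > 1:
        # Longest run of consecutive empty sectors, scanning around twice
        # so that a run crossing the last/first sector is counted.
        run = longest = 0
        for is_empty in [not s for s in sectors] * 2:
            run = run + 1 if is_empty else 0
            longest = max(longest, run)
        if longest > max_empty:
            return None  # Empty Sectors limit exceeded
    return kept
```

With four sectors and one point per sector, a sample set that has two samples in the same quadrant keeps only the closer of the two, and a point with samples on one side only is left as a missing value unless the Empty Sectors limit allows it.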

Variables/Models

The Variables/Models option is used to access the Variables/Models screen and menu. This screen and menu are used for selecting the variables to krige and the variogram model to use for each variable. Up to ten variables may be selected for kriging. At least one must be specified prior to selection of the Execute option. The Variables/Models screen and menu are discussed below.

Title

The Title option is used to indicate the descriptive title to store in the output file of gridded estimates. It provides access to a 60 character alphanumeric field for storing the title. A default title is constructed from the data file name and the output file name. Any valid alphanumeric character string may be entered.

Execute

The Execute option is used to initiate kriging. All parameters must be specified before kriging may begin. Several "Debug Options" (described below) are enabled or disabled with the "n", "w", and "s" keys. These keys are used to toggle the debug displays on or off. It is important to disable all three Debug Options prior to using the Execute option, or intermediate results screens will be generated and kriging will proceed more slowly.

If your personal computer is equipped with the proper graphics hardware, a graphics display (shown in Figure 12-3) is generated when the Execute option is selected. In this display the original sample locations are represented by the "x" symbol. As each point is kriged, the estimates and associated results are displayed at the bottom of the screen (except in Hercules-equipped systems) and the point or block estimates are plotted with a symbol which indicates the quartile of the estimate. On EGA-equipped computer systems, these symbols and the sample values are also color coded. A legend is displayed at the right of the screen showing the symbols and corresponding quartile cutoff values. If your computer system has no graphics capability, no graph is displayed, and the results are displayed at the bottom of the Krige Options screen.

During the kriging process the debug displays (described below) may be activated or de-activated to view intermediate kriging results. Once all grid cells have been kriged, a tone signals that the kriging has been completed. Pressing will cause the Krige Options screen and menu to be displayed and a message will be generated to indicate that the results were successfully written to the output file. If an error occurs while attempting to write to the file, kriging will be halted and an error message will be displayed.

How to Cancel Kriging

At any time during the kriging process, kriging may be canceled by pressing the "q" key. If this is done a message is displayed indicating that kriging has been terminated. This is useful when the debug screens reveal a problem with the search or variogram parameters and you wish to change them and re-start. Note that in this situation the output file will not contain a completed grid of estimates. It is important to remember that the "terminate kriging" keys will not work when any of the three debug display keys is active.

12.6 THE VARIABLES/MODELS MENU

The Variables/Models screen and menu (Figure 12-4) are provided for selecting the variables to krige and specifying the variogram model for each of the selected variables. The Variables/Models screen is divided into three areas. The area on the left of the screen is used to display the list of variables selected for kriging, called the Kriging List. This list is used as a menu from which to select a variable when deleting or editing a variable/model specification. On the right, the top area is for selecting new variables to add to the Kriging List and the bottom area is for entering variogram model parameters. The Variables/Models menu provides options to add a variable and model to the Kriging List, to delete one from it, and to edit a set of variogram model parameters. The menu line appears as follows:

New Variable Edit Delete Quit

New Variable

The New Variable option is used to select a variable to add to the Kriging List. A toggle field is displayed in the top portion of the screen which contains the variable names from the input file. The default value for this field is the third variable name in the file. Once this selection is made the variable name is added to the Kriging List and the variogram model parameters (described below) must be entered.

*** Note *** it is possible to krige the same variable more than once; the output file will then contain duplicate variable names.

Edit

The Edit option allows you to edit the specified variogram model for the specified variable. When this option is selected the Kriging List menu is activated. A cursor may be moved to the appropriate member of the list with the or keys and selected for editing with . Once the variable has been specified the parameters will be loaded onto the screen and the Variogram Model Parameters area will be activated. In this screen area fields are provided for a nugget effect value and up to 4 additive variogram structures. Each structure is specified with a structure type, a sill value, and an ellipse of influence. If simple kriging is chosen an additional field is provided for entering the Global Mean. Each of the four structures has five associated screen fields. Selecting the Model option will cause a cursor bar to appear in the upper left corner of the models area. The arrow keys may be used to move the cursor bar to fields in the Model area. To exit the Model area, move the cursor bar out of the top or off to the left of the area, using the , or keys. If any errors are made when entering variogram model parameters an error message will be displayed and the cursor bar will be placed at the problem field. The screen fields accessed from this option are:

Nugget - A numeric field for entering the nugget value for the variogram model. Only values greater than or equal to zero may be entered. The default value is zero.

Global Mean - A numeric field for specifying the global mean for simple kriging. If ordinary kriging is chosen this field is disabled and cannot be accessed. The default value for the global mean is zero.

The following five fields are present for each of the four additive variogram structures:

Type - A toggle field for indicating the type of the structure. The toggle field choices for type are " " (none), "Spherical", "Gaussian", "Exponential", "Linear". The default type for all four structures is "none". If a structure is entered and the type is subsequently changed to "none" the structure will be deleted from the variogram model. The order of the variogram structures on the screen is unimportant; neither do they need to be in a contiguous order on the screen.

Sill - A numeric field for entering the sill value for a variogram structure. A non-zero, non-negative value is required here. If a linear variogram type is selected the sill value is used together with the variogram ellipse ranges to determine a slope for a given direction. In a linear variogram structure the sill must be chosen so that the corresponding range parameter values will result in the desired slope.

Major Range - A numeric field for entering the longest range of influence of the variogram structure. The Major Range must be greater than zero. This may be thought of as similar to the R Major parameter described in the Search option above. In fact, the variogram ellipse of influence is defined exactly as the search ellipse: with two ranges (radii) and an angle. Note, however, that the two ellipses have fundamentally different purposes, although the parameters which describe them are the same.

Minor Range - A numeric field for indicating the length of the minor (shorter) range of the variogram ellipse. This value must be non-zero, non-negative, and less than or equal to the Major Range. The default value is the Major Range value. If the two ranges are equal, an isotropic variogram structure is defined. If they are not equal, the two ranges are used to determine the ratio of anisotropy.

Ellipse Angle - A numeric field for indicating the orientation of the ellipse for the variogram structure. It is given in trigonometric degrees in the range from zero up to (but not including) 180, and indicates the angle between the longest axis of the ellipse (specified by the Major Range) and the sample coordinate X axis. If the two ranges are equal (isotropic structure) then the angle is ignored.
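The fields above (nugget, plus a type, sill, two ranges, and an angle for each structure) can be combined into a single variogram function. The sketch below is a hedged illustration only. The spherical formula is the standard one and the linear slope of sill divided by range follows the Sill description above, but the factor of 3 in the exponential and Gaussian forms and the use of a "reduced distance" for anisotropy are assumptions, not statements about the Krige code. The function names are hypothetical.

```python
import math


def structure_value(kind, sill, h_reduced):
    """Value of one structure at a distance already divided by its range."""
    if kind == "Spherical":
        if h_reduced >= 1.0:
            return sill
        return sill * (1.5 * h_reduced - 0.5 * h_reduced ** 3)
    if kind == "Exponential":
        return sill * (1.0 - math.exp(-3.0 * h_reduced))  # assumed convention
    if kind == "Gaussian":
        return sill * (1.0 - math.exp(-3.0 * h_reduced ** 2))  # assumed convention
    if kind == "Linear":
        return sill * h_reduced  # slope = sill / range in each direction
    raise ValueError("unknown structure type: " + kind)


def variogram(dx, dy, nugget, structures):
    """Sum the nugget and up to four structures for the lag (dx, dy).

    Each structure is (type, sill, major_range, minor_range, angle_degrees).
    """
    if dx == 0 and dy == 0:
        return 0.0
    total = nugget
    for kind, sill, major, minor, angle_deg in structures:
        theta = math.radians(angle_deg)
        # Rotate the lag into the ellipse frame, then scale by the two ranges.
        u = dx * math.cos(theta) + dy * math.sin(theta)
        v = -dx * math.sin(theta) + dy * math.cos(theta)
        total += structure_value(kind, sill, math.hypot(u / major, v / minor))
    return total


# Nugget 0.5 plus one spherical structure with ranges 10 (along X) and 5.
model = [("Spherical", 2.0, 10.0, 5.0, 0.0)]
gamma_along_x = variogram(5.0, 0.0, 0.5, model)
gamma_along_y = variogram(0.0, 5.0, 0.5, model)
```

In this example the structure reaches its sill at a lag of 5 along Y but only at a lag of 10 along X, which is the effect of the ratio of anisotropy.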

Delete

The Delete option is used to delete a variable and model from the Kriging List. When this option is selected, the Kriging List menu will be activated, and the variable to delete may be selected as described in the Edit option. Once this selection is made, a Yes/No prompt provides the alternative of canceling the deletion. If Yes is chosen, the variable and model will be deleted and the Kriging List and screen will be regenerated. This option is disabled when the Kriging List is empty.

12.7 THE DEBUG DISPLAYS

The debug displays are provided as a means of viewing intermediate kriging results during the kriging process. Since such displays slow the kriging process, the displays may be activated, or de-activated at any time during kriging. If the "n" (neighbors), "w" (weights), or "s" (system) keys are activated during kriging, the corresponding displays will be generated on the screen. To return to kriging, the "q" key is pressed. To disable the generation of these displays, the corresponding keys should be de-activated. The following is an explanation of each display, and the key which activates it:

"n" - This activates the Search Area display, shown in Figure 12-5. For each estimate, the search ellipse is displayed along with all sample locations in the sampled area. The neighbors which were chosen for the estimate are marked. The coordinates of the estimated point (center of the search ellipse) is displayed at the top of the screen. If more than one sector was chosen, the sector boundaries will be plotted. This display may be used to check if the search ellipse is of the proper size and orientation, and that the desired number of samples are used as neighbors. On a non-graphics system this display is a text display showing the list of neighbors, the sample locations, and the sector number for each neighbor.

"s" - This key activates a display (Figure 12-6) which shows the system of equations used to produce the estimate. A one-dimensional array of values on the left of the screen shows the covariances between the estimate location and the neighbors, and the matrix of values in the remaining portion of the screen shows the covariances between neighboring samples. If the number of neighbors used in the system of equations is more than eight, each row of the matrix will "wrap-around" to the next line and produce an undesirable result (oh well!). This display allows you to see the actual covariance values used for kriging, for comparison with other programs or for verification of results.

"w" - Similar to the system key, it shows the point at which kriging is taking place, the value at that point, the number of neighbors in the search area, and the weights that krige is assigning to the neighbors involved.

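For readers who want to check the values shown by the "s" and "w" displays against another program, the sketch below solves an ordinary kriging system written in covariance form. It is an illustration under stated assumptions, not the Krige code: the function names are hypothetical, the system is the textbook ordinary kriging system with one Lagrange multiplier, and a plain Gaussian elimination is used for the solution.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ordinary_kriging_weights(cov_neighbors, cov_target):
    """Return (weights, lagrange_multiplier).

    cov_neighbors: covariances between neighboring samples (the matrix).
    cov_target: covariances between the estimate location and each neighbor.
    """
    n = len(cov_target)
    # Border the matrix with ones so that the weights are forced to sum to 1.
    a = [list(row) + [1.0] for row in cov_neighbors] + [[1.0] * n + [0.0]]
    b = list(cov_target) + [1.0]
    solution = solve_linear_system(a, b)
    return solution[:n], solution[n]


# Two neighbors placed symmetrically about the estimate location.
weights, multiplier = ordinary_kriging_weights(
    [[1.0, 0.2], [0.2, 1.0]], [0.5, 0.5]
)
```

By symmetry the two weights in this example are equal, and they sum to one as ordinary kriging requires.
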
SECTION X: CORRES

X.1: WHAT Corres DOES

Corres is a correspondence analysis program. Correspondence analysis is a technique for displaying the rows and columns of a data matrix (primarily, a two-way contingency table) as points in dual low-dimensional vector spaces. Options allow selection of a subset of variables and supplementary variables (additional classifications of the samples) from a Geo-EAS data file, and the number of factors (reasons for variability) to consider in the correspondence analysis. Corres produces outputs for:

In order to produce output compatible with Geo-EAS programs, these outputs have been isolated into four separate Geo-EAS format result files. These files can be viewed graphically with the public domain program XGobi from within the program. An option allows a subset of any one of the four output files to be merged with data from an existing Geo-EAS file to produce a "custom" output file.

X.2 DATA LIMITS

Corres requires that the input data file contain at least two but not more than 48 variables. A maximum of 25 variables, 10 supplementary variables and 7 factors can be used in the analysis. The data file may contain up to 10000 samples. If the data file contains more than 10000 samples, only 10000 will be used by Corres.

Corres also requires that the sample values be greater than or equal to zero, and that there are no missing sample values.
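The limits above can be checked before running Corres. The sketch below is a hypothetical pre-check, not part of the program: the function name is invented, each sample is assumed to be a list of numbers, and a missing value is represented here by None purely for the example.

```python
MAX_FILE_VARIABLES = 48
MAX_ANALYSIS_VARIABLES = 25
MAX_SUPPLEMENTARY = 10
MAX_FACTORS = 7
MAX_SAMPLES = 10000


def check_corres_input(rows, n_variables, n_supplementary, n_factors):
    """Return (samples that would be used, list of problems found)."""
    problems = []
    n_file_variables = len(rows[0]) if rows else 0
    if not 2 <= n_file_variables <= MAX_FILE_VARIABLES:
        problems.append("data file must contain 2 to 48 variables")
    if n_variables > MAX_ANALYSIS_VARIABLES:
        problems.append("at most 25 variables can be analyzed")
    if n_supplementary > MAX_SUPPLEMENTARY:
        problems.append("at most 10 supplementary variables can be used")
    if n_factors > MAX_FACTORS:
        problems.append("at most 7 factors can be used")
    # Samples beyond the first 10000 are ignored rather than rejected.
    used = rows[:MAX_SAMPLES]
    for index, row in enumerate(used):
        if any(value is None or value < 0 for value in row):
            problems.append(
                "sample %d has a missing or negative value" % (index + 1)
            )
    return used, problems


used, problems = check_corres_input([[1.0, 2.0], [0.0, 3.0]], 2, 0, 1)
```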

X.3 THE MENU HIERARCHY

Corres --Prefix
       | Data  --File
       |       |-Variables
       |       |-Options/Execute --File Names
       |       |                 |-Execute  --Samples  ------- 3-d-spin
       |       |                 |          |-Variables  ----- 3-d-spin
       |       |                 |          |-Reconstructed  --- 3-d-spin
       |       |                 |          |-Residuals  ----- 3-d-spin
       |       |                 |          |-3-d-spin             
       |       |                 |          |-Write  ----------Results File
       |       |                 |          |                |-Secondary File
       |       |                 |          |                |-Write
       |       |                 |          |                |-Quit
       |       |                 |          |-3-D-Spin
       |       |                 |          |-Quit
       |       |                 |-3-d-spin
       |       |                 |-Quit
       |       |-Quit
       |-Quit

X.4 THE MAIN MENU

The Main screen menu (Figure X-1) has the options to allow specification of the file prefix and access to the Data screen. The menu line appears as follows:

Prefix Data Quit


Menu Option and Comments:


Prefix Data

X.5 THE DATA MENU

The Data screen menu (Figure X-2) has the options to allow specification of the data file name and selection of a subset of variables to be used. The menu line appears as follows:

File Variables Options/Execute Quit


Menu Option and Comments


File Variables Options/Execute

X.6 THE OPTIONS MENU

The Options screen menu (Figure X-3) is used to view the Nontrivial Eigenvalues, Percent of Variation and Cumulative Variation, and to specify the result file names and the number of factors for which results are computed. The menu line appears as follows:

File Names Execute Quit


Menu Option and Comments


File Names Execute

X.7 THE RESULTS MENU

The Results screen menu (Figure X-4) is used to view the Reconstruction and Representation Errors for the input variables, and to provide access to specific results. The menu line appears as follows:

Write 3-D-Spin Quit


Menu Option and Comments:


Write 3-d-spin

X.8 THE WRITE MENU

The Write screen menu (Figure X-5) provides options to store subsets of result files permanently, and to optionally select variables from a second file which will be "carried along" with the results. The menu line appears as follows:

Results File Secondary File Write Quit


Menu Option and Comments:


Results File Secondary File Write

SECTION Y: COKRIG

0. NOTICE

COKRIG has been written to operate in the same mode as the programs in Geo-EAS and was produced with support from EPA. It has not yet been officially accepted or released by EPA; no official status or approval should be inferred, nor is it supported as part of the Geo-EAS package. It does use the same Screen Management Utilities as Geo-EAS, and it reads and writes Geo-EAS files.

1. GENERAL INFORMATION

COKRIG is the cokriging analogue of KRIGE in the Geo-EAS package. Because the program has been constructed and functions in a manner very similar to the programs in Geo-EAS, the user should consult the Geo-EAS manual for questions pertaining to file formats, the utilities in Geo-EAS, and the general mode of operation of those programs. The file COKRIG.SCR must be in the same directory as the code COKRIG.EXE; this file contains the screen definitions for the different menus.

The program is based on the work of Myers (Math. Geology 1982, 1983), Myers (NATO ASI 1983), and Myers (Sciences de la Terre 1988), and is an extension of the program given in Carr, Myers and Glass (Computers and Geosciences 1985). The main cokriging component was written by Gerald Jalkanen; the adaptation to the Geo-EAS format was the joint work of Nabil Chbouki, Renduo Zhang and Gerald Jalkanen.

For further information contact:

2. WHAT COKRIG DOES

COKRIG is an interactive program which performs three dimensional cokriging and is the cokriging counterpart of KRIGE. All variables are cokriged, with automatic adjustment for undersampled variables. Except for missing values, all data at all locations in the search neighborhood are used for the estimation of all variables. The program incorporates universal cokriging, the drift(s) being represented as polynomials in the position coordinates. A rectangular grid of cokriged estimates and cokriging standard deviations is created and stored in a Geo-EAS file; either point or block cokriging may be used. The grid file is suitable for contouring or 3-d viewing. Options are provided to specify the dimension, the number of variables, the type (point or block), the neighborhood search area, the grid spacing, the variogram model for each variable and the cross-variogram model for each pair of variables. The program parameters may be saved in a parameter file for later use. If the Listing option is activated, an ASCII file *.OUT is generated which echoes all the parameter choices and tabulates all the output; if Listing is not activated, only the parameter choices are echoed. If the Debug option is activated, debug information is written to the *.OUT file. If the Dual option is activated, the Dual weight matrices are generated and saved in addition to the cokriged estimates.

3. DATA LIMITS

COKRIG reads a standard Geo-EAS file except that it allows for three dimensions, the third dimension being denoted as Elevation. Missing data is indicated as in Geo-EAS files and the program automatically uses the undersampled form when required. The program is designed for at most four variables, but any four variables in the data set may be used. A grid of up to 100 x 100 may be created.

4. THE MENU HIERARCHY

Cokriging --Prefix
          |-Read Parameters
          |-Options/Execute --Data
          |                 |-Type
          |                 |-Grid
          |                 |-Search
          |                 |-Variograms --Select
          |                 |            |-Add
          |                 |            |-Edit
          |                 |            |-Variable
          |                 |            |-Quit
          |                 |-Drift --Edit
          |                 |       |-Quit
          |                 |-Title
          |                 |-Execute
          |                 |-Quit
          |-Save Parameters
          |-Quit

Prefix

The Prefix option is used to enter the prefix for the data file name.

Read Parameters

The Read Parameters option provides a means for loading program parameter values from a parameter file. When this option is selected a prompt is issued for the Input Parameter file name. A default name is constructed from the most recently used Geo-EAS data file name together with the extension 'kpf'.

Options/Execute

The Options/Execute option provides access to the COKRIGING OPTIONS menu which is described below.

Save Parameters

The Save Parameters option provides a means for saving program parameter values in a parameter file for later use.

5. The COKRIGING OPTIONS menu

Data

The Data option is used to specify the number of variables (1, 2, 3 or 4), the dimension (1, 2 or 3), the names of the data, *.OUT and *.GRD files and to indicate whether Debug and Listing will be activated. The screen fields accessed through the Data option are:

.No. Variables - a single digit integer specifying the number of variables used in the analysis. Allowable choices are 1, 2, 3 or 4.

.No. Dimensions - a single digit integer specifying the number of dimensions for the space of the data locations. Allowable choices are 1,2 or 3 and it is assumed that the position coordinate names are Easting, Northing, Elevation.

.Data file - a 14 character alphanumeric field in which the name of the input data file is entered. Once the name is given the program reads the header of the data file. If the data file cannot be located or an error occurs while accessing the file, an error message is displayed.

.Out file - a 14 character alphanumeric field in which the name of the output file is entered. A default name is constructed consisting of the name of the data file with the extension '.OUT'. This will be an ASCII file suitable for printing which will echo the parameter choices made in executing COKRIG. If Debug is activated it will also contain the listing of the locations used for each point/block cokriged as well as the cokriging matrices and the resulting cokriging weights. If Listing is activated then the cokriging results will be echoed to the *.OUT file.

.Grd file - a 14 character alphanumeric field in which the name of the Geo-EAS output file is entered. A default name is constructed from the name of the data file together with the extension '.GRD'. This file is the counterpart of the *.GRD produced by KRIGE in Geo-EAS and contains, for each grid point/block, the coordinates of the point/block, the cokriged value for each variable and the cokriging standard deviation. This is a Geo-EAS format file and can be read by other utilities in Geo-EAS.

.Debug - a single digit integer field indicating whether the Debug results are to be saved in the *.OUT file.

Type

The Type option is used to specify whether point or block cokriging is used and whether Dual Cokriging is used. In the case where block cokriging is selected it is also necessary to specify the number of grid points to be used in each block for the numerical integration used in computing the average values of variograms and cross-variograms. In the case of point cokriging the default values are shown. The screen fields accessed through the Type selection are:

.Type of kriging - a single digit integer field indicating whether the Dual Cokriging weights are to be computed and saved. If the cokriging estimator is written in Dual form (see Myers, Sciences de la Terre, 1988) then for each sample location there is a (Dual) weight matrix; these will appear in the *.OUT file. If non-unique neighborhoods are used for cokriging then there is a Dual weight matrix for each sample location for each neighborhood.

.Point or Block - a single digit integer field indicating whether point or block cokriging is desired. If block cokriging is selected then the program must compute average values of the variograms and cross-variograms; this is done numerically. The numerical integration uses the values of the variograms and cross-variograms on a grid superimposed on the block (possibly in three dimensions). The grid is specified by indicating the number of points on the grid in each of the directions.

.X - a three character integer field specifying the number of points on the numerical integration grid in the Easting dimension.

.Y - a three character integer field specifying the number of points on the numerical integration grid in the Northing dimension.

.Z - a three character integer field specifying the number of points on the numerical integration grid in the Elevation dimension.

If point cokriging is selected then the default values for X,Y,Z are all '1'.
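The numerical integration described for block cokriging amounts to averaging a variogram (or cross-variogram) over a grid of points laid on the block. The sketch below shows the idea for the average between a point and a block only. It is an illustration under assumptions: the function names are hypothetical, the integration points are assumed to sit at the centers of equal sub-cells, and the variogram is passed in as any function of the lag vector.

```python
import math
from itertools import product


def integration_nodes(center, size, counts):
    """Grid of points on a block; counts are the X, Y, Z field values."""
    axes = []
    for c, s, n in zip(center, size, counts):
        # Assumed placement: centers of n equal sub-cells along this axis.
        axes.append([c - s / 2 + (k + 0.5) * s / n for k in range(n)])
    return list(product(*axes))


def point_to_block_average(gamma, point, center, size, counts):
    """Average of gamma(lag) between one point and the nodes of a block."""
    nodes = integration_nodes(center, size, counts)
    total = 0.0
    for node in nodes:
        lag = tuple(p - q for p, q in zip(point, node))
        total += gamma(lag)
    return total / len(nodes)


def linear_gamma(lag):
    """Example variogram: unit-slope linear model of the lag length."""
    return math.sqrt(sum(component ** 2 for component in lag))


# Block of size 4 x 1 x 1 centered on the point, with X=2, Y=1, Z=1.
average = point_to_block_average(
    linear_gamma, (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (4.0, 1.0, 1.0), (2, 1, 1)
)
```

With X, Y and Z all equal to 1 the block collapses to its center, which matches the point cokriging defaults mentioned above.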

Grid

The Grid option is used to specify the variables to use as coordinate values, the origin of the grid, the size of the grid cells and the number of cells in the X,Y and Z directions. These are specified with four screen fields for each of the three directions. If in the Data option the dimension selected is 1 or 2 then default values are given for the four screen fields for the other dimension(s). Default values are read into the fields based on the data file information when the Grid option is selected. The screen fields for the X,Y and Z directions are:

.Variable - a 10 character alphanumeric field denoting the name used in the data file for the position coordinate. Ordinarily these are Easting, Northing and Elevation respectively for X, Y, Z.

.Origin - a 10 character real field prescribing the X, Y, Z coordinates of the origin of the grid wherein the estimates are to be made.

.Spacing - a 10 character real field prescribing the grid mesh size in the X, Y, Z directions respectively.

.Begin Num - a three character integer field specifying the number of the cell in the X, Y, Z directions respectively at which cokriging is to start.

.End number - a three character integer field specifying the number of the last cell in the X, Y, Z directions respectively at which cokriging is to occur.

Search

The Search option is used to specify the size and orientation of the three dimensional search neighborhood and to specify the minimum and maximum numbers of sample locations to be used in cokriging each point. If a search neighborhood does not contain the specified minimum number of sample locations then that grid point/block will not be cokriged. The Search neighborhood is assumed to be an ellipsoid centered at the point to be cokriged. Default values for the screen fields are entered after the Search option is selected. The screen fields for the Search neighborhood are:

.Radiusa - a 10 character real field specifying the length of the semi-axis in the X direction for the ellipsoidal search neighborhood.

.Radiusb - a 10 character real field specifying the length of the semi-axis in the Y direction for the ellipsoidal search neighborhood.

.Radiusc - a 10 character real field specifying the length of the semi-axis in the Z direction for the ellipsoidal search neighborhood.

.Dip angle - a 10 character real field specifying the angle of rotation of the ellipsoid in the vertical plane.

.Azimuth angle - a real field specifying the angle of rotation of the ellipsoid in the horizontal plane.

.Min. # of data - a three character integer field specifying the minimum number of sample locations to be used in the cokriging, these locations to be inside the search neighborhood. If for a particular grid point, the search neighborhood does not contain the prescribed minimum number of sample locations that grid point will not be cokriged.

.Max. # of data - a three character integer field specifying the maximum number of sample locations to be used in the cokriging, these locations to be inside the search neighborhood. If the search neighborhood contains more than the prescribed maximum then only the closest locations will be used.

Variograms

The Variograms option provides access to the VARIOGRAMS AND CROSS-VARIOGRAMS menu described below. The program does not compute or model variograms or cross-variograms, nor does it run complete checks on the validity of the models. If the cokriging coefficient matrix is found to be non-invertible for a particular grid point, that point is omitted but no error message is given. If ANY negative cokriging variances are found, then one or more of the variogram/cross-variogram models is invalid. In that case all results are invalid, even those for which the cokriging variances are positive, since the positive definiteness condition is not satisfied.

Drift

The Drift option provides access to the VARIABLES AND DRIFT SELECTION menu described below.

Execute

This option starts the cokriging process. As each grid point/block is cokriged, an echo of the results is shown on the screen. The results are written to the *.GRD file and the *.OUT file. If the parameters are to be saved, the user should return to the initial menu and use the Save Parameters option before quitting the program. Cokriging will be somewhat slow if a coprocessor is not present; the presence of a coprocessor matters more than the clock speed of the CPU.

6. The VARIOGRAMS AND CROSS-VARIOGRAMS menu

The cokriging equations are written in terms of a symmetric matrix-valued variogram of size m x m, where m is the number of variables. The diagonal entries are variograms and the off-diagonal entries are the cross-variograms. To simplify the process of entering the models, the entries are numbered instead of being indexed by a pair of subscripts (row-column) corresponding to the variable numbering.

Examples

          m=1, then there is only one entry and it is #1

          m=2, then there are four entries but using the symmetry
          only three entries need be specified, two variograms and
          one cross-variogram.

                    1  2
                       3

          m=3, then there are nine entries but only six need be
          specified, three variograms and three cross-variograms

                    1  2  4  
                       3  5
                          6

          m=4, then there are sixteen entries but only ten need be
          specified, four variograms and six cross-variograms

                    1  2  4  7
                       3  5  8
                          6  9
                            10
In these layouts, 1 denotes the variogram for variable 1, 2 the cross-variogram for variables 1 and 2, 3 the variogram for variable 2, 4 the cross-variogram for variables 1 and 3, 5 the cross-variogram for variables 2 and 3, 6 the variogram for variable 3, etc. Variograms can be entered as nested structures (positive linear combinations) using standard models. Cross-variograms can also be entered as linear combinations, but the coefficients may be negative, depending on which of the four ways of relating the cross-variogram to standard models is used. Note that the validity of a cross-variogram representation depends on the models used for each of the associated variograms; simply modeling a cross-variogram by a valid variogram model is not sufficient to ensure a valid cross-variogram model.
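The numbering shown in the examples fills the upper triangle column by column, so the entry for variables i and j (with i no greater than j) is j(j-1)/2 + i. A minimal Python sketch of this mapping and its inverse follows; the function names `entry_number` and `entry_indices` are hypothetical and are not part of the program.

```python
def entry_number(i, j):
    """Entry number in the variogram matrix for variables i and j (1-based)."""
    if i < 1 or j < 1:
        raise ValueError("variable numbers start at 1")
    if i > j:
        i, j = j, i  # the matrix is symmetric, so only the upper triangle is numbered
    return j * (j - 1) // 2 + i

def entry_indices(n):
    """Inverse mapping: the (row, column) pair for entry number n (1-based)."""
    if n < 1:
        raise ValueError("entry numbers start at 1")
    j = 1
    while j * (j + 1) // 2 < n:  # find the column whose last entry is >= n
        j += 1
    return n - j * (j - 1) // 2, j

# Entries with equal indices are variograms; all others are cross-variograms.
for n in range(1, 7):
    i, j = entry_indices(n)
    kind = "variogram" if i == j else "cross-variogram"
    print(n, (i, j), kind)
```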

Select

The Select option initiates the process of filling the variogram matrix by first specifying which of the entries (as illustrated above) is to be entered. For m=1 the Select option is used only once, for m=2 it is used three times, for m=3 it is used six times, and for m=4 it is used ten times. The screen fields accessed from the Select option are:

.Variogram matrix entry - a two character integer field specifying the entry in the variogram matrix whose representation is to be entered.

.Cross variogram type - a single digit integer field specifying the form of the cross-variogram representation or indicating that the entry is a variogram.

.Number of nested models - a single digit integer field specifying the number of terms (models) to appear in the linear combination for the specified entry; the maximum is 5.

.Total sill - a 10 character real field specifying the total sill, i.e., the sum of the sills of the separate terms in the linear combination. After this field is entered, a message will be displayed indicating that the Edit option should be chosen. For each entry (Variogram matrix entry), the Edit option will have to be used as many times as specified by the 'Number of nested models'.

Edit

The Edit option allows one of the standard variogram model types (there are 17 to choose from) to be prescribed and the parameters of the model to be specified. It also allows prescribing whether the sign of the term in the linear combination is to be positive or negative. In the case of a variogram (a diagonal entry in the matrix) all signs must be positive; the default sign is +. The screen fields accessed by the Edit option are:

.Model number - a single digit integer field specifying the number of the term in the linear combination; the maximum is fixed by the field 'Number of nested models'.

.Model type - a two character integer field specifying the model type as shown on the right hand side of the menu.

.Model sign - this is toggled from + to - by the space bar.

.Model sill - a 10 character real field specifying the sill of this particular model, i.e., term in the linear combination.

The program allows for geometric anisotropic models in three dimensions.

.Model range a - a 13 character real field specifying the range in the X direction.

.Model range b - a 13 character real field specifying the range in the Y direction.

.Model range c - a 13 character real field specifying the range in the Z direction.

.Azimuth - a 13 character field specifying the angle of anisotropy in the horizontal plane.

.Dip - a 13 character field specifying the angle of anisotropy in the Z direction.

.Variable - a 10 character alphanumeric field specifying the variable for which a variogram is being entered (no entry is required in this field when a cross-variogram is being entered).
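As an illustration of how the Edit fields combine, the following Python sketch evaluates a nested structure as a signed sum of terms, each with its own sill and three ranges. It rests on several assumptions: the spherical model is used as the only model type (the program's 17 model types are not listed here), the azimuth and dip are both taken as zero so the ranges apply directly along X, Y and Z, and the function names are hypothetical.

```python
import math

def spherical(h):
    """Spherical model with unit sill and unit range (h is a reduced distance)."""
    return 1.5 * h - 0.5 * h ** 3 if h < 1.0 else 1.0

def nested_value(terms, hx, hy, hz):
    """Value of a nested structure at the lag vector (hx, hy, hz).

    Each term is (sign, sill, (range_a, range_b, range_c)). Geometric
    anisotropy is handled by scaling each lag component by its range.
    Assumption: azimuth and dip are zero, so no rotation is applied.
    """
    total = 0.0
    for sign, sill, (ra, rb, rc) in terms:
        h = math.sqrt((hx / ra) ** 2 + (hy / rb) ** 2 + (hz / rc) ** 2)
        total += sign * sill * spherical(h)
    return total

# Two nested terms for a variogram: all signs must be +.
terms = [(+1, 2.0, (10.0, 10.0, 10.0)), (+1, 1.0, (50.0, 25.0, 5.0))]
print(nested_value(terms, 5.0, 0.0, 0.0))
```

For a variogram with all signs positive, the value beyond all of the ranges equals the total sill, here 2.0 + 1.0 = 3.0.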

Add

The Add option allows modification of the form of the linear combination representation of a variogram or cross-variogram by adding additional terms. It allows increasing the value of 'Number of Nested models' entered in the Select option. The only screen field accessed directly by Add is:

.Number of nested models - a single digit integer field specifying the number of terms (models) to appear in the linear combination for the specified entry; the maximum is 5.

After this option is used it will be necessary to 'Edit' to enter the model types and model parameters for the additional terms in the linear combination.

Variable

The Variable option allows updating the name of the variable corresponding to a new selection of the 'Variogram matrix entry'. The only screen field accessed by the Variable option is:

.Variable - a 10 character alphanumeric field specifying the variable for which a variogram is being entered (no entry is required in this field when a cross-variogram is being entered).

7. The VARIABLES AND DRIFT SELECTION menu

Edit

The Edit option allows specifying for each variable the name, the number of terms in the polynomial representation of the drift for the variable and within each term the exponents on the X, Y and Z coordinates. The Edit option must be used as many times as there are variables. The screen fields accessed from the Edit option are:

.Variable - a 10 character alphanumeric field specifying the variable for which the drift is being represented.

.Variable number - a single digit integer field specifying the variable for which the drift is being represented.

.Number of drift terms - a single digit integer field specifying the number of terms in the polynomial (including the constant term which corresponds to all exponents equal to zero).

.Drift Term No. - a single digit integer field specifying the number of the term in the polynomial for which the exponents are to be given by the fields X Power, Y Power, Z Power.

.X Power - a single digit integer field specifying the exponent on the X coordinate in the term indicated by the field 'Drift Term No.'

.Y Power - a single digit integer field specifying the exponent on the Y coordinate in the term indicated by the field 'Drift Term No.'

.Z Power - a single digit integer field specifying the exponent on the Z coordinate in the term indicated by the field 'Drift Term No.'
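Each drift term described by these fields is a monomial in the coordinates, X raised to 'X Power' times Y raised to 'Y Power' times Z raised to 'Z Power'. The short Python sketch below evaluates such terms at a location; the function name `drift_values` and the list-of-tuples layout are illustrative assumptions, not the program's parameter file format.

```python
def drift_values(exponents, x, y, z):
    """Evaluate each drift monomial x**px * y**py * z**pz at one location.

    `exponents` holds one (X Power, Y Power, Z Power) triple per drift term.
    The constant term is the triple (0, 0, 0), which evaluates to 1.
    """
    return [x ** px * y ** py * z ** pz for px, py, pz in exponents]

# A linear drift in X and Y: three terms, including the constant term.
linear_xy = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
print(drift_values(linear_xy, 2.0, 3.0, 5.0))
```

For this variable the field 'Number of drift terms' would be 3, and the Edit screen would be filled in once per term.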

8. VARIOGRAM and CROSS-VARIOGRAM estimation and modeling

As indicated above in the description of the VARIOGRAMS AND CROSS-VARIOGRAMS menu, cokriging requires modeling a variogram for each variable AND a cross-variogram for each pair of variables. Moreover, these must be modeled in such a way that the matrix function satisfies the appropriate positive definiteness condition. While the condition on the variograms can be tested on each variogram separately, the same is not true for cross-variograms, and there is no simple test.

The problem of modeling and of testing can be attacked simultaneously by considering the variograms of the SUM and the DIFFERENCE for each pair of variables. For a given pair of variables, these two special variograms are particular linear combinations of (1) the variograms of the two variables and (2) the cross-variogram. That is, the cross-variogram can be obtained either from the SUM variogram and the two separate variograms or from the DIFFERENCE variogram and the two separate variograms. By modeling four variograms and by ensuring that the same cross-variogram is obtained from both representations, the problem of estimating/modeling cross-variograms, as well as that of testing for positive definiteness, is reduced to the techniques used in univariate geostatistics, including the software. Variograms (the separate variables, the SUM and the DIFFERENCE) can be estimated using any standard variogram estimating technique and software; moreover, they can be cross-validated in the usual way. One only has to check that the same cross-variogram is obtained from both representations. If each of the four variograms is constructed as a nested structure using the same model types (including the ranges and anisotropy characteristics), allowing a model type to appear with a zero coefficient, it is sufficient to check the two representations of the cross-variogram term by term. Computing and plotting the sample variogram can be a useful adjunct to the modeling process, but it is not sufficient to try to fit a standard model to such a plot.

Since this process of using the SUM and DIFFERENCE variograms as proxies for the cross-variogram requires that their representations not be the same, when the data for the two variables is quite disparate it may be necessary to re-scale the variables to enhance the difference. The scaling may be additive or multiplicative.

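Let $Z_i(x)$, $Z_j(x)$ be two variables of interest. Let $\gamma_{ij}(h)$ be the cross-variogram for variables $Z_i(x)$, $Z_j(x)$, and let $\gamma^{+}_{ij}(h)$, $\gamma^{-}_{ij}(h)$ be the variograms of $Z_i(x) + Z_j(x)$, $Z_i(x) - Z_j(x)$ respectively. Then

        $\gamma^{+}_{ij}(h) = \gamma_{ii}(h) + \gamma_{jj}(h) + 2\gamma_{ij}(h)$

        $\gamma^{-}_{ij}(h) = \gamma_{ii}(h) + \gamma_{jj}(h) - 2\gamma_{ij}(h)$

If

        $\gamma^{-}_{ij}(h) = \sum_k d_{ijk}\,\gamma_k(h)$

        $\gamma^{+}_{ij}(h) = \sum_k c_{ijk}\,\gamma_k(h)$

        $\gamma_{ii}(h) = \sum_k a_{iik}\,\gamma_k(h)$

where the $\gamma_k(h)$ are valid variogram models and $d_{ijk}$, $c_{ijk}$, $a_{iik}$ are NON-negative coefficients, then it is sufficient that

        $d_{ijk} + c_{ijk} = 2\{a_{iik} + a_{jjk}\}$  for each k.
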
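The term-by-term check can be sketched in Python. The cross-variogram coefficient for each model k can be computed either from the SUM variogram or from the DIFFERENCE variogram, and the two results agree exactly when the coefficient condition above holds. The function name `cross_coefficients` and the tolerance are assumptions made for this illustration.

```python
def cross_coefficients(c, d, a_i, a_j, tol=1e-9):
    """Cross-variogram coefficients, one per nested model k.

    c, d     : coefficients of the SUM and DIFFERENCE variograms
    a_i, a_j : coefficients of the two separate variograms
    All four lists must use the same models k in the same order.
    Raises ValueError if the two representations disagree for some k.
    """
    coeffs = []
    for k, (ck, dk, ai, aj) in enumerate(zip(c, d, a_i, a_j), start=1):
        from_sum = (ck - ai - aj) / 2.0    # from the SUM variogram
        from_diff = (ai + aj - dk) / 2.0   # from the DIFFERENCE variogram
        # Equal exactly when d_k + c_k = 2 * (a_i_k + a_j_k).
        if abs(from_sum - from_diff) > tol:
            raise ValueError("representations disagree for model %d" % k)
        coeffs.append(from_sum)
    return coeffs

# Made-up coefficients for two nested models.
print(cross_coefficients([2.1, 2.2], [0.9, 3.8], [1.0, 2.0], [0.5, 1.0]))
```

In this made-up example the second cross-variogram coefficient is negative, which is permitted for a cross-variogram but not for a variogram.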
If scaling is used there are two possible approaches. The first is to re-scale the data solely for the purpose of estimating and modeling the variograms and cross-variograms, and then solve for the variograms and cross-variograms of the original variables. For example, if a, b, c, d are constants with a, c positive and $Z_i(x)$, $Z_j(x)$ are replaced by $aZ_i(x)+b$, $cZ_j(x)+d$ respectively, then $\gamma_{ii}(h)$, $\gamma_{jj}(h)$, $\gamma_{ij}(h)$ are replaced by $a^2\gamma_{ii}(h)$, $c^2\gamma_{jj}(h)$, $ac\,\gamma_{ij}(h)$. Alternatively, the variograms and cross-variograms can be used with the transformed data and the cokriged values re-transformed.
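The effect of re-scaling can be checked numerically. The Python sketch below computes a sample cross-variogram at a single lag for two short, made-up series on a regular one-dimensional grid, and confirms that the additive constants drop out while the multiplicative constants enter as products. The function name `cross_variogram` and the data values are illustrative only.

```python
def cross_variogram(u, v, lag=1):
    """Sample cross-variogram of two equally spaced series at one lag.

    Passing the same series twice gives the ordinary sample variogram.
    """
    n = len(u) - lag
    return sum((u[k + lag] - u[k]) * (v[k + lag] - v[k]) for k in range(n)) / (2.0 * n)

zi = [1.0, 3.0, 2.0, 5.0, 4.0]   # made-up values for the first variable
zj = [2.0, 2.5, 4.0, 3.0, 6.0]   # made-up values for the second variable
a, b, c, d = 2.0, 10.0, 0.5, -3.0

si = [a * val + b for val in zi]  # re-scaled first variable
sj = [c * val + d for val in zj]  # re-scaled second variable

# The multipliers enter as a*a, c*c and a*c; the shifts b and d have no effect.
print(cross_variogram(si, si), a * a * cross_variogram(zi, zi))
print(cross_variogram(sj, sj), c * c * cross_variogram(zj, zj))
print(cross_variogram(si, sj), a * c * cross_variogram(zi, zj))
```

Dividing the modeled variograms of the re-scaled data by these factors recovers the variograms and cross-variogram of the original variables.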

APPENDIX A: REFERENCES

Carr, J., Myers, D.E. and Glass, C., 1985, "Co-kriging: A Computer Program," Computers and Geosciences, 11, 111-

Clark, I., 1979, Practical Geostatistics, Applied Science Publishers, London.

David, M., 1984, Geostatistical Ore Reserve Estimation, Elsevier, Amsterdam.

Journel, A. G. and C. H. Huijbregts, 1978, Mining Geostatistics, Academic Press, London.

Myers, D.E., 1982, "Matrix Formulation of Cokriging," Math. Geology, 14, 249-257.

Myers, D.E., 1983, "Estimation of Linear Combinations and Cokriging," Math. Geology, 15, 633-637.

Myers, D.E., 1984, "Cokriging: New Developments," in Geostatistics for Natural Resource Characterization, G. Verly et al. (eds.), D. Reidel Pub. Co., Dordrecht.

Myers, D.E., 1985, "Co-kriging: Methods and Alternatives," in The Role of Data in Scientific Progress, P. Glaeser (ed.), Elsevier Scientific Pub.

Myers, D.E., 1988, "Some Aspects of Multivariate Analysis," in Quantitative Analysis of Mineral and Energy Resources, C.F. Chung et al. (eds.), D. Reidel Publishing Co., Dordrecht, 669-687.

Myers, D.E., 1988, "Multivariate Geostatistics for Environmental Monitoring," Sciences de la Terre.

Rendu, J.M., 1981, An Introduction to Geostatistical Methods of Mineral Evaluation, South African Institute of Mining and Metallurgy, Johannesburg.

Srivastava, R.M., 1988, "A Non-ergodic Framework for Variograms and Covariance Functions", SIMS Technical Report No. 114, Dept. of Applied Earth Sciences, Stanford University.