ESE 502 Home Page



Instructor: Tony E. Smith



250 Towne (898-9647)

tesmith@seas.upenn.edu



Office Hours: By Appointment





TABLE OF CONTENTS

COURSE DESCRIPTION

(return to contents)

PREREQUISITES

(return to contents)

READING MATERIALS

Primary Text

Interactive Spatial Data Analysis, T.C. Bailey and A.C. Gatrell

(All relevant sections of the text will be provided in class)

Also Recommended:

Applied Spatial Statistics for Public Health Data, L.A. Waller and C.A. Gotway



Statistics for Spatial Data, N.A.C. Cressie

Spatial Econometrics, L. Anselin

JMP Start Statistics [JMP], J. Sall and A. Lehman



(return to contents)

COURSE TOPICS

Spatial Point Pattern Analysis

Nearest-Neighbor Methods

K-Function Methods

Continuous Spatial Data Analysis

Variogram Methods

Kriging Methods

Areal Data Analysis

Spatial Regression Models

Maximum Likelihood Estimation

Spatial Diagnostics

COURSE GRADING

There are no exams in this course. Grading is based entirely on the six homework assignments, each weighted equally in the final evaluation. In borderline cases for final grades, however, I do look for improvement over the semester.

(return to contents)

COURSE TIME AND LOCATION



Tuesdays and Thursdays, 4:30-6 PM



319 Towne Building





(return to contents)



CETS Labs



All SOFTWARE is available on the machines in the CETS Labs located in M62 Towne and 100 Moore Suite.



All students need ENIAC accounts to use the labs, which can be obtained from the CETS office (164 Levine, 9-5, M-F). Information on lab hours is also available there. You can also set up a lab account online at https://www.seas.upenn.edu/accounts

TENTATIVE SCHEDULE FOR SPRING 2020

Lectures  Day/Date    Topic                           Homework
INTRO     Th/Jan.16   Introduction
1         Tu/Jan.21   Point Pattern Data
2         Th/Jan.23   CSR Hypothesis
3         Tu/Jan.28   Nearest-Neighbor Methods
4         Th/Jan.30   Data Applications
5         Tu/Feb.4    K-Function Analysis             PS1 due
6         Th/Feb.6    Simulation Testing Methods
7         Tu/Feb.11   Bivariate K-Functions
8         Th/Feb.13   Tests of Pattern Similarity
9         Tu/Feb.18   Space-Time Patterns
10        Th/Feb.20   Continuous Spatial Data
11        Tu/Feb.25   Spatial Variograms              PS2 due
12        Th/Feb.27   Variogram Estimation
13        Tu/Mar.3    Simple Kriging Model
14        Th/Mar.5    Kriging Predictions
          Tu/Mar.10   SPRING BREAK
          Th/Mar.12   SPRING BREAK
15        Tu/Mar.17   Simple Regression Model
16        Th/Mar.19   Generalized Least Squares       PS3 due
17        Tu/Mar.24   Universal Kriging Model
18        Th/Mar.26   Universal Kriging Estimation
19        Tu/Mar.31   Data Applications
20        Th/Apr.2    Data Applications
21        Tu/Apr.7    Regional Spatial Data
22        Th/Apr.9    Spatial Autocorrelation
23        Tu/Apr.14   Spatial Concentration           PS4 due
24        Th/Apr.16   Spatial Autoregression
25        Tu/Apr.21   Spatial Lag Model
26        Th/Apr.23   Spatial Diagnostics
27        Tu/Apr.28   Additional Regression Topics    PS6
          Mon/May 4   Assignment 5 Due (5PM)          PS5 due

HOMEWORK ASSIGNMENTS

When doing the lab procedures for these assignments, you might wish to work together with other class members. This is often a more efficient way to learn, and is certainly more fun. My only requirement is that your written reports be individual work.

When submitting assignments, I would like you to email me a PDF copy for my files and submit a HARD copy for grading purposes. You can submit your assignments in class, or leave them in the box on the floor outside my office door (by 5PM on the due date).

(return to contents)



EXTRA MATERIALS

Using LeSage Models. This gives a brief introduction to the package of MATLAB programs available on Jim LeSage's web site.



ArcView 10 Manual. This manual, written by Amy Hillier, provides a brief introduction to some of the more useful procedures in ArcView 10.





Using MATHTYPE. These notes show you how to access MATHTYPE in WORD, and use it to write both mathematical equations and in-line expressions in your reports.



Matrix Regression. These slides give an introduction to MATRIX ALGEBRA in the context of MULTIPLE REGRESSION.



Reference Materials. This is a secure site containing additional reference materials for the course that are copyrighted.



Lab Data Access. These notes are intended for those students who do not have a class account, but would still like to access the class data sets and use the software in the lab.



Remote Data Access. These notes are intended for those students who have access to the ARCMAP, JMP, and MATLAB software elsewhere, and want to access the class data sets from this remote location.

(return to contents)





SPATIAL DATA WEB SITES



PENN CAMPUS RESOURCES

http://guides.library.upenn.edu/data

http://www.cml.upenn.edu/



GENERAL SPATIAL DATA RESOURCES

https://www.opendataphilly.org/

http://www.census.gov/geo/www/cob/index.html

http://www.dcrp.ced.berkeley.edu/research/footprint

http://fisher.lib.virginia.edu/index.html

http://www.gisdatadepot.com/

http://gis.about.com/

http://www.esri.com/data/download/index.html

http://www.maproom.psu.edu/dcw/

http://www.lib.ncsu.edu/stacks/gis/dcw.html

http://www.fedstats.gov/mapstats/

http://factfinder.census.gov/servlet/BasicFactsServlet

http://homer.ssd.census.gov/cdrom/lookup

http://www.nationalgeographic.com/maps/

http://nationalatlas.gov/natlas/natlasstart.asp

http://www3.cancer.gov/atlasplus/

http://www.wakanow.com/ng/pages/a-wakanow-guide-to-geography





(return to contents)



NOTEBOOK ON SPATIAL DATA ANALYSIS

NOTE: To cite this material, use:

Smith, T.E. (2020) Notebook on Spatial Data Analysis [online] http://www.seas.upenn.edu/~ese502/#notebook



INTRODUCTION

I. SPATIAL POINT PATTERN ANALYSIS

1. Examples of Point Patterns

1.1 Clustering versus Uniformity

1.2 Comparisons between Point Patterns



2. Complete Spatial Randomness

2.1 Spatial Laplace Principle

2.2 Complete Spatial Randomness

2.3 Poisson Approximation

2.4 Generalized Spatial Randomness

2.5 Spatial Stationarity



3. Testing Spatial Randomness

3.1 Quadrat Method

3.2 Nearest-Neighbor Methods

3.2.1 Nearest-Neighbor Distribution under CSR

3.2.2 Clark-Evans Test

3.3 Redwood Seedling Example

3.3.1 Analysis of Redwood Seedlings using JMPIN

3.3.2 Analysis of Redwood Seedlings using MATLAB

3.4 Bodmin Tors Example

3.5 A Direct Monte Carlo Test of CSR



4. K-Function Analysis of Point Patterns



4.1 Wolf-Pack Example

4.2 K-Function Representations

4.3 Estimation of K-Functions

4.4 Testing the CSR Hypothesis

4.5 Bodmin Tors Example

4.6 Monte Carlo Testing Procedures

4.6.1 Simulation Envelopes

4.6.2 Full P-Value Approach

4.7 Nonhomogeneous CSR Hypotheses

4.7.1 Housing Abandonment Example

4.7.2 Monte Carlo Tests of Hypotheses

4.7.3 Lung Cancer Example

4.8 Local K-Function Analysis

4.8.1 Construction of Local K-Functions

4.8.2 Local Tests of Homogeneous CSR Hypotheses

4.8.3 Local Tests of Nonhomogeneous CSR Hypotheses



5. Comparative Analyses of Point Patterns



5.1 Forest Example

5.2 Cross K-Functions

5.3 Estimation of Cross K-Functions

5.4 Spatial Independence Hypothesis

5.5 Random-Shift Approach to Spatial Independence

5.5.1 Spatial Independence Hypothesis for Random Shifts

5.5.2 Problem of Edge Effects

5.5.3 Random Shift Test

5.5.4 Application to the Forest Example

5.6 Random-Labeling Approach to Spatial Independence

5.6.1 Spatial Indistinguishability Hypothesis

5.6.2 Random Labeling Test

5.6.3 Application to the Forest Example

5.7 Analysis of Spatial Similarity

5.7.1 Spatial Similarity Test

5.7.2 Application to the Forest Example

5.8 Larynx and Lung Cancer Example

5.8.1 Overall Comparison of the Larynx and Lung Cancer Populations

5.8.2 Local Comparison in the Vicinity of the Incinerator

5.8.3 Local Cluster Analysis of Larynx Cases



6. Space-Time Point Processes



6.1 Space-Time Clustering

6.2 Space-Time K-Functions

6.3 Temporal Indistinguishability Hypothesis

6.4 Random Labeling Test

6.5 Application to the Lymphoma Example





APPENDIX TO PART I



A1.1. Poisson Approximation of the Binomial

A1.2. Distributional Properties of Nearest-Neighbor Distances under CSR

A1.3. Distribution of Skellam's Statistic under CSR

A1.4. Effects of Positively Dependent Nearest-Neighbor Samples

A1.5. The Point-in-Polygon Procedure

A1.6. A Derivation of Ripley's Correction

A1.7. An Alternative Derivation of P-Values for K-Functions

A1.8. A Grid Plot Procedure in MATLAB

II. CONTINUOUS SPATIAL DATA ANALYSIS





1. Overview of Spatial Stochastic Processes



1.1 Standard Notation

1.2 Basic Modeling Framework



2. Examples of Continuous Spatial Data



2.1 Rainfall in the Sudan

2.2 Spatial Concentration of PCBs

3. Spatially-Dependent Random Effects



3.1 Random Effects at a Single Location

3.1.1 Standardized Random Variables

3.1.2 Normal Distribution

3.1.3 Central Limit Theorems

3.1.4 CLT for the Sample Mean

3.2 Multi-Location Random Effects

3.2.1 Multivariate Normal Distribution

3.2.2 Linear Invariance Property

3.2.3 Multivariate Central Limit Theorem

3.3 Spatial Stationarity

3.3.1 Example: Measuring Ocean Depths

3.3.2 Covariance Stationarity

3.3.3 Covariograms and Correlograms



4. Variograms



4.1 Expected Squared Differences

4.2 The Standard Model of Spatial Dependence

4.3 Non-Standard Spatial Dependence

4.4 Pure Spatial Dependence

4.5 The Combined Model

4.6 Explicit Models of Variograms

4.6.1 The Spherical Model

4.6.2 The Exponential Model

4.6.3 The Wave Model

4.7 Fitting Variogram Models to Data

4.7.1 Empirical Variograms

4.7.2 Least-Squares Fitting Procedure

4.8 The Constant-Mean Model

4.9 Example: Nickel Deposits on Vancouver Island

4.9.1 Empirical Variogram Estimation

4.9.2 Fitting a Spherical Variogram

4.10 Variograms versus Covariograms

4.10.1 Biasedness of the Standard Covariance Estimator

4.10.2 Unbiasedness of Empirical Variogram for Exact-Distance Samples

4.10.3 Approximate Unbiasedness of General Empirical Variograms



5. Spatial Interpolation Models



5.1 A Simple Example of Spatial Interpolation

5.2 Kernel Smoothing Models

5.3 Local Polynomial Models

5.4 Radial Basis Function Models

5.5 Spline Models

5.6 A Comparison of Models using the Nickel Data

6. Simple Spatial Prediction Models



6.1 An Overview of Kriging Models

6.1.1 Best Linear Unbiased Predictors

6.1.2 Model Comparisons

6.2 The Simple Kriging Model

6.2.1 Simple Kriging with One Predictor

6.2.2 Simple Kriging with Many Predictors

6.2.3 Interpretation of Prediction Weights

6.2.4 Construction of Prediction Intervals

6.2.5 Implementation of Simple Kriging Models

6.2.6 An Example of Simple Kriging

6.3 The Ordinary Kriging Model

6.3.1 Best Linear Unbiased Estimation of the Mean

6.3.2 Best Linear Unbiased Predictor of Y

6.3.3 Implementation of Ordinary Kriging

6.3.4 An Example of Ordinary Kriging

6.4 Selection of Prediction Sets by Cross Validation

6.4.1 Log-Nickel Example

6.4.2 A Simulated Example

7. General Spatial Prediction Models



7.1 The General Linear Regression Model

7.1.1 Generalized Least Squares Estimation

7.1.2 Best Linear Unbiasedness Property

7.1.3 Regression Consequences of Spatially Dependent Random Effects

7.2 The Universal Kriging Model

7.2.1 Best Linear Unbiased Prediction

7.2.2 Standard Error of Predictions

7.2.3 Implementation of Universal Kriging

7.3 Geostatistical Regression and Kriging

7.3.1 Iterative Estimation Procedure

7.3.2 Implementation of Geo-Regression

7.3.3 Implementation of Geo-Kriging

7.3.4 Cobalt Example of Geo-Regression

7.3.5 Venice Example of Geo-Regression and Geo-Kriging

APPENDIX TO PART II

A2.1. Covariograms for Sums of Independent Spatial Processes

A2.2. Expectation of the Sample Estimator under Sample Dependence

A2.3. A Bound on the Binning Bias of Empirical Variogram Estimators

A2.4. Some Basic Vector Geometry

A2.5. Differentiation of Functions

A2.6. Gradient Vectors

A2.7. Unconstrained Optimization of Smooth Functions

7.1 First-Order Conditions

7.2 Second-Order Conditions

7.3 Application to Ordinary Least Squares Estimation

A2.8. Constrained Optimization of Smooth Functions

8.1 Minimization with a Single Constraint

8.2 Minimization with Multiple Constraints

8.3 Solution for Universal Kriging

III. AREAL DATA ANALYSIS

1. Overview of Areal Data Analysis



1.1 Extensive versus Intensive Data Representations

1.2 Spatial Pattern Analysis

1.3 Spatial Regression Analysis



2. Modeling the Spatial Structure of Areal Units



2.1 Spatial Weights Matrices

2.1.1 Point Representations of Areal Units

2.1.2 Spatial Weights based on Centroid Distances

2.1.3 Spatial Weights based on Boundaries

2.1.4 Combined Distance-Boundary Weights

2.1.5 Normalizations of Spatial Weights

2.2 Construction of Spatial Weights Matrices

2.2.1 Construction of Spatial Weights based on Centroid Distances

2.2.2 Construction of Spatial Weights based on Boundaries

3. The Spatial Autoregressive Model



3.1 Relation to Time Series Analysis

3.2 The Simultaneity Property of Spatial Dependencies

3.3 A Spatial Interpretation of Autoregressive Residuals

3.3.1 Eigenvalues and Eigenvectors of Spatial Weights Matrices

3.3.2 Convergence Conditions in Terms of Rho

3.3.3 A Steady-State Interpretation of Spatial Autoregressive Residuals

4. Testing for Spatial Autocorrelation



4.1 Three Test Statistics

4.1.1 Rho Statistic

4.1.2 Correlation Statistic

4.1.3 Moran Statistic

4.1.4 Comparison of Statistics

4.2 Asymptotic Moran Tests of Spatial Autocorrelation

4.2.1 Asymptotic Moran Test for Regression Residuals

4.2.2 Asymptotic Moran Test in ARCMAP

4.3 Random Permutation Test of Spatial Autocorrelation

4.3.1 SAC-Perm Test

4.3.2 Application to English Mortality Data

5. Tests of Spatial Concentration



5.1 A Probabilistic Interpretation of G*

5.2 Global Tests of Spatial Concentration

5.3 Local Tests of Spatial Concentration

5.3.1 Random Permutation Test

5.3.2 English Mortality Example

5.3.3 Asymptotic G* Test in ARCMAP

5.3.4 Advantage of G* over G for Analyzing Spatial Concentration

6. Spatial Regression Models for Areal Data Analysis



6.1 The Spatial Errors Model (SEM)

6.2 The Spatial Lag Model (SLM)

6.2.1 Simultaneity Structure

6.2.2 Interpretation of Beta Coefficients

6.3 Other Spatial Regression Models

6.3.1 The Combined Model

6.3.2 The Durbin Model

6.3.3 The Conditional Autoregressive (CAR) Model

7. Spatial Regression Parameter Estimation



7.1 The Method of Maximum-Likelihood Estimation

7.2 Maximum-Likelihood Estimation for General Linear Regression Models

7.2.1 Maximum-Likelihood Estimation for OLS

7.2.2 Maximum-Likelihood Estimation for GLS

7.3 Maximum-Likelihood Estimation for SEM

7.4 Maximum-Likelihood Estimation for SLM

7.5 An Application to the Irish Blood Group Data

7.5.1 OLS Residual Analysis and Choice of Spatial Weights Matrices

7.5.2 Spatial Regression Analyses

8. Parameter Significance Tests for Spatial Regression



8.1 A Basic Example of Maximum Likelihood Estimation and Inference

8.1.1 Sampling Distribution by Elementary Methods

8.1.2 Sampling Distribution by General Maximum-Likelihood Methods

8.2 Sampling Distributions for General Linear Models with Known Covariance

8.2.1 Sampling Distribution by Elementary Methods

8.2.2 Sampling Distribution by General Maximum-Likelihood Methods

8.3 Asymptotic Sampling Distributions for the General Case

8.4 Parameter Significance Tests for SEM

8.4.1 Parametric Tests for SEM

8.4.2 Application to the Irish Blood Group Data

8.5 Parameter Significance Tests for SLM

8.5.1 Parametric Tests for SLM

8.5.2 Application to the Irish Blood Group Data

9. Goodness-of-Fit Measures for Spatial Regression



9.1 The R-Squared Measure for OLS

9.1.1 The Regression Dual

9.1.2 Decomposition of Total Variation

9.1.3 Adjusted R-Squared

9.2 Extended R-Squared Measures for GLS

9.2.1 Extended R-Squared for SEM

9.2.2 Extended R-Squared for SLM

9.3 The Squared Correlation Measure for GLS Models

9.3.1 Squared Correlation for OLS

9.3.2 Squared Correlation for SEM and SLM

9.3.3 A Geometric View of Squared Correlation

10. Comparative Tests among Spatial Regression Models



10.1 A One-Parameter Example

10.2 Likelihood-Ratio Tests against OLS

10.3 The Common-Factor Hypothesis

10.4 The Combined-Model Approach

APPENDIX TO PART III

A3.1. The Geometry of Linear Transformations

3.1.1 Nonsingular Transformations and Inverses

3.1.2 Orthonormal Transformations

A3.2. Singular Value Decomposition Theorem

3.2.1 Inverses and Pseudoinverses

3.2.2 Determinants and Volumes

3.2.3 Linear Transformations of Random Vectors

A3.3. Eigenvalues and Eigenvectors

A3.4. Spectral Decomposition Theorem

3.4.1 Eigenvalues and Eigenvectors of Symmetric Matrices

3.4.2 Some Consequences of SVD for Symmetric Matrices

3.4.3 Spectral Decomposition of Symmetric Positive Semidefinite Matrices

3.4.4 Spectral Decompositions with Distinct Eigenvalues

3.4.5 General Spectral Decomposition Theorem

A3.5. Nonnegative Matrices

3.5.1 Strongly Connected Matrices

3.5.2 Perron-Frobenius Theorem

3.5.3 Application to Spatial Autoregressive Kernels

3.5.4 Geometry of Complex Eigenvalues

A3.6. Geometry of Correlation in Regression

3.6.1 Deviation Space

3.6.2 Regression in Deviation Space

3.6.3 Application to Squared Correlation for OLS and GLS

A3.7. Large Sample Properties of Maximum Likelihood Estimators

3.7.1 Some Useful Preliminary Results

3.7.2 Consistency of Maximum Likelihood Estimators

3.7.3 Asymptotic Normality of Maximum Likelihood Estimators

IV. SOFTWARE

1. ARCMAP

1.1 Opening ARCMAP

1.2 Tips for Using ARCMAP

1.2.1 Importing Text Files to ARCMAP

1.2.2 Changing Path Directories in Map Documents

1.2.3 Making a Column of Row Numbers in an Attribute Table

1.2.4 Masking in ARCMAP

1.2.5 Making Spline Contours in Spatial Analyst

1.2.6 Excluding Values from Map Displays

1.2.7 Importing ARCMAP Images to the Web

1.2.8 Adding Areas to Map Polygons

1.2.9 Adding Centroids to Map Polygons

1.2.10 Adding Coordinate Fields to Attributes of Point Shapefiles

1.2.11 Converting Strings to Numbers in ARCMAP

1.2.12 Displaying Proper Distance Units

1.2.13 Editing Point Styles in ARCMAP

1.2.14 Exporting Maps from ARCMAP to WORD

1.2.15 Making Legends for Exported Maps

1.2.16 Making Voronoi Tessellations in ARCMAP as Shapefiles

1.2.17 Running Local G* Tests of Concentration in ARCMAP

1.2.18 Joining Point Data to Polygon Shapefiles in ARCMAP

1.2.19 Saving Map Documents with Relative Paths

1.2.20 Increasing Unique Values for Editing Raster Outputs (in Version 9.3)

2. JMP

2.1 Opening JMP

2.2 Tips for using JMP

2.2.1 Printing Results from JMP

2.2.2 Making a Random Reordering of Row Numbers

3. MATLAB

3.1 Opening MATLAB



3.2 Tips for using MATLAB

3.2.1 Exporting Graphics from MATLAB to WORD

3.2.2 Making Boundary-Share Weight Matrices in MATLAB

3.2.3 Making Boundary-Share Weight Matrices using ARCMAP and MATLAB

3.2.4 Clipping Grids in ARCMAP for use in MATLAB

3.2.5 Exporting Data from MATLAB to ARCMAP

3.2.6 Converting Boundary Shapefiles to MATLAB format



REFERENCES





(return to contents)



