Sample Data: Leukaemia NW England

Locations of leukaemia in N.W. England from 1982 to 1998, annotated with age, gender, deprivation index, and subject type marks

Details

Locations of leukaemia in N.W. England from 1982 to 1998 (inclusive) in the observation region that is the polygon of N.W. England, annotated with age, gender, deprivation index (higher values indicating less affluence), and subject type (case or control) marks.

Examples

Basic Examples (1)

In[1]:=

ResourceData["Sample Data: Leukaemia NW England", "Data"]

Out[1]=

Summary of the spatial point data:

In[2]:=

ResourceData["Sample Data: Leukaemia NW England", "Data"]["Summary"]

Out[2]=

Visualizations (3)

Plot the spatial point data:

In[3]:=

Show[Graphics[{Opacity[.1], ResourceData["Sample Data: Leukaemia NW England", "ObservationRegion"]}], ListPlot[ResourceData["Sample Data: Leukaemia NW England", "Data"], AspectRatio -> Full], Axes -> True]

Out[3]=

Plot the data annotated with gender (3rd annotation) marks:

In[4]:=

PointValuePlot[ResourceData["Sample Data: Leukaemia NW England", "Data"], {1 -> None, 2 -> None, 3 -> Automatic, 4 -> None}, PlotLegends -> Automatic]

Out[4]=

Plot the data annotated with type (4th annotation) marks:

In[5]:=

Out[5]=

Analysis (4)

Compute the probability of finding a point within a given radius of an existing point. NearestNeighborG is the CDF of the nearest-neighbor distance distribution:

In[6]:=

nnG = NearestNeighborG[ResourceData["Sample Data: Leukaemia NW England", "Data"]]

Out[6]=

In[7]:=

Out[7]=

In[8]:=

Out[8]=

Because NearestNeighborG is the CDF of the nearest-neighbor distance distribution, it can be used to compute the mean distance between a typical point and its nearest neighbor: the mean of a distribution with positive support can be approximated by a Riemann sum of 1 - CDF. To apply the approximation, partition the support interval from 0 to maxR into 100 parts and evaluate NearestNeighborG at the midpoint of each subinterval:

In[9]:=

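step = maxR/100; (* width of each of the 100 subintervals of [0, maxR] *)
middles = Subdivide[step/2, maxR - step/2, 99]; (* midpoints of the 100 subintervals *)
values = nnG[middles]; (* NearestNeighborG evaluated at each midpoint *)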
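
The approximation behind this step can be written out explicitly. In the sketch below, D denotes the distance from a typical point to its nearest neighbor, G is its CDF (the function nnG), h corresponds to step, and m_i to the entries of middles; these symbols are introduced here for illustration only. The sketch assumes that maxR is large enough that G(maxR) is close to 1, so the part of the integral beyond maxR can be neglected:

$$
\mathbb{E}[D] = \int_0^{\infty} \bigl(1 - G(r)\bigr)\,dr \;\approx\; h \sum_{i=1}^{100} \bigl(1 - G(m_i)\bigr), \qquad h = \frac{\mathrm{maxR}}{100}, \quad m_i = \left(i - \tfrac{1}{2}\right) h .
$$

The midpoints m_i are spaced exactly h apart, which is why Subdivide is called with 99 subdivisions to produce 100 points.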

Now compute the Riemann sum to find the mean distance between a typical point and its nearest neighbor:

In[10]:=

Out[10]=

Test for complete spatial randomness:

In[11]:=

SpatialRandomnessTest[ResourceData["Sample Data: Leukaemia NW England", "Data"], {"PValue", "TestConclusion"}] // Column

Out[11]=

Bibliographic Citation

Gosia Konwerska, "Sample Data: Leukaemia NW England" from the Wolfram Data Repository (2022)

Data Resource History

Date Created: 23 August 2021

Source Metadata

Citation:
- Henderson, R., Shimakura, S. and Gorst, D. (2002). Modeling spatial variation in leukaemia survival. Journal of the American Statistical Association, 97, 965-972.

Publisher Information

Prepared for the Wolfram Data Repository By: Gosia Konwerska
(Wolfram Research, Inc.)
Publisher of Record: Gosia Konwerska