In this section, the necessary libraries are loaded,

1. The arcgisbinding library provides access to the ArcGIS REST API, enabling interaction with ArcGIS Online services and ArcGIS Server services.
2. The sf library provides simple features access, including reading and writing to various data formats.
4. The raster library provides functions to work with raster data in R, and the scales library provides functions for adding scales to ggplot2 plots.
# Load necessary libraries
library(arcgisbinding)
## *** Please call arc.check_product() to define a desktop license.
library(sf)
## Linking to GEOS 3.9.3, GDAL 3.5.2, PROJ 8.2.1; sf_use_s2() is TRUE
library(ggplot2)
library(stars)
## Loading required package: abind
library(raster)
## Loading required package: sp
library(scales)

The arc.check_product() function is part of the arcgisbinding library and is used to check the product license. This step is necessary to ensure that the correct license is available for use with the ArcGIS API.

# Load ArcGIS License
arc.check_product()
## product: ArcGIS Pro (13.0.3.36057)
## license: Advanced
## version: 1.0.1.300

This code loads two datasets - a remote ESRI Feature Class and a local Raster data set. The url variable holds the URL to the remote ESRI Feature Class and the url2 variable holds the path to the local Raster data set. The arc.open function is used to read the metadata for both datasets. The metadata is stored in the states_meta variable for the Feature Class and the lst_meta variable for the Raster data set. Finally, the metadata is printed to the console using the print function.

# Task 1 & 2

# Load the ESRI Feature Class - Remote
url <- 'https://services2.arcgis.com/bB9Y1bGKerz1PTl5/arcgis/rest/services/US_States_Feature_Local/FeatureServer/0'

#Read the metadata using arc.open
states_meta <- arc.open(url)

#Print the feature class metadata
print(states_meta)
## dataset_type    : FeatureClass
## path            : https://services2.arcgis.com/bB9Y1bGKerz1PTl5/arcgis/rest/services/US_States_Feature_Local/FeatureServer/0 
## fields          : FID, OBJECTID, NAME, STATE_ABBR, STATE_FIPS, ORDER_ADM, 
## fields          : MONTH_ADM, DAY_ADM, YEAR_ADM, TYPE, POP, SQ_MILES, 
## fields          : PRIM_MILES, Shape_Leng, Shape__Are, Shape__Len, Shape__Area, 
## fields          : Shape__Length, Shape
## extent          : xmin=-179.1473, ymin=17.6744, xmax=179.7784, ymax=71.38921
## geometry type   : Polygon
## WKT             : GEOGCS["GCS_North_American_1983",DATUM["D_North_American_198...
## WKID            : 4269
# Load the Raster data set -  Local
url2 <- 'F:\\CourseWork\\AdvProg\\FinalProject\\USLST_Cel.tif'

#Read the metadata using arc.open
lst_meta <- arc.open(url2)

#Print the raster data set metadata
print(lst_meta)
## dataset_type    : RasterDataset
## path            : F:\CourseWork\AdvProg\FinalProject\USLST_Cel.tif 
## format          : TIFF
## pixel_type      : F32 (32bit)
## compression_type: None
## nrow            : 614
## ncol            : 1641
## extent          : xmin=164.3917, ymin=17.33748, xmax=311.8052, ymax=72.49404
## WKT             : GEOGCS["GCS_North_American_1983",DATUM["D_North_American_198...
## WKID            : 4269
## bands           : 1
##                  ncol nrow        nodata       min      max     mean   stddev
## LST_Day_1km_mean 1641  614 -3.402823e+38 -40.28999 46.14392 5.428114 14.17401

This code performs the first part of task 3, which is to create a simple map of the US states with a single color fill. The arc.select function is used to select and convert the Feature Class data to a standard data frame. The arc.data2sf function is used to convert the data frame to a simple feature. A single color state_color is set for all states using the state_color variable. Finally, the visualization is created using ggplot2 and the geom_sf function is used to set the color of the states. The theme_void function is used to make the plot background transparent.

# Task 3 Part 1

# Select and convert feature class to an sf data frame
# Load data set to a standard data frame
states_select <- arc.select(states_meta)

# Convert the feature class to simple feature
states_sf <- arc.data2sf(states_select)


# Set a single color for all states
state_color <- "lightblue"

# Create a visualization using ggplot
ggplot(states_sf) + #Set simple feature
  geom_sf(fill = state_color) + #Set the color
  labs(title = "Map of States", #Set the plot Title
       subtitle = "Visualizing State Boundaries with a Single Color", #Optional: Add subtitle
       caption = "Data source: ArcGIS Online") + #Optional: Source as Caption
  theme_void() #make the plot background transparent

This code performs the second part of task 3, which is to create a visualization of the Raster data set. The arc.raster function is used to convert the Raster data set to an Arc Raster. The as.raster function is used to convert the Arc Raster to a Raster object in R. A color scheme color_scheme is defined using the colorRampPalette function. The plot function is used to plot the Raster object, using the defined color scheme.

# Task 3 Part 2

# Convert to Arc Raster
lst_arc <- arc.raster(lst_meta)

# Load the Arc Raster as Raster
lst_R <- as.raster(lst_arc)

# Define a red color scheme
color_scheme <- colorRampPalette(c("blue","lightblue", "red"))

# Plot the raster 
plot(lst_R, col = color_scheme(100), main = "Land Surface Temperature", sub = "Data source: MODIS LST")

This code performs the first part of task 4, which is to create a bar plot of the population distribution by state. The population count is converted to millions and stored in a new column POP_Millions. The ggplot function is used to initialize the plot and set the data frame and axis. The geom_bar function is used to create the bar plot and the fill argument is used to set the fill color. The x and y axis labels are set using the xlab and ylab functions.

#Task 4 Part 1 - Plot Vector Data

# Convert population count to millions
states_sf$POP_Millions <- states_sf$POP / 1000000 #Simplify the data to millions

# Create the bar plot
plot <- ggplot(states_sf, aes(x = NAME, y = POP_Millions)) + #Initialize ggplot, data frame and axis
  geom_bar(stat = "identity", fill = "lightblue") + # Set the bar height as the values of population count
  xlab("") + # No axis label for x axis
  ylab("Population Count (in millions)") + #y axis title
  ggtitle("Population Distribution by State") + # Plot Title
  scale_y_continuous(labels = comma) + #Format y-axis labels with commas
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1), #Rotate the axis title
        plot.title = element_text(hjust = 0.5), #Set the title position
        plot.margin = unit(c(1,2,1,1), "cm"), #Set plot margins
        legend.position = "none") #Disable the legend

plot  #Display the plot

This code performs the second part of task 4. First, values from the raster object are extracted and stored in a variable. Next, the extracted values are converted into a data frame, and the mean and standard deviation are calculated while ignoring NA values. The ggplot package is used to create the histogram with density, and the normal distribution curve is added using the stat_function function. The resulting plot displays the distribution of LST data and its relationship to a normal distribution.

#Task 4 Part 2 - Plot Raster Data

# Extract the values from the raster object
lst_values <- getValues(lst_R)

# Convert the raster values to a data frame
lst_df <- data.frame(value = lst_values)

# Calculate mean and standard deviation of the values
mean_val <- mean(lst_values, na.rm = TRUE) #ignoring NA values
sd_val <- sd(lst_values, na.rm = TRUE)

# Plot the histogram with density (geom_histogram with aes(y = ..density..))
ggplot(lst_df, aes(x = value)) +
  geom_histogram(aes(y = ..density..), bins = 30, fill = "lightblue", color = "black") + # Add
  stat_function(fun = dnorm, args = list(mean = mean_val, sd = sd_val), color = "red", lwd = 1) +
  labs(title = "LST Histogram with Normal Distribution Curve",
       x = "Surface Temperatures (Degree Celsius)",
       y = "Density") +
  theme_minimal()
## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: Removed 619298 rows containing non-finite values (`stat_bin()`).

This code performs task 5, which is to extract the mean raster values for each state polygon. The Raster data set is read as a stars object using the read_stars function and stored in the lst_R_stars variable. The mean raster values are extracted for each state polygon using the st_extract function and stored in the extracted_values variable. A data frame mean_values_df is created with the state names and mean raster values.

# Task 5 - Extracting Zonal Data
# Read the raster data as a stars object
lst_R_stars <- read_stars("F:\\CourseWork\\AdvProg\\FinalProject\\USLST_Cel.tif")

# Extract mean raster values for each state polygon using sf_extract
extracted_values <- st_extract(lst_R_stars, states_sf, fun = mean, na.rm = TRUE)

# Create a data frame with state names and mean raster values
mean_values_df <- data.frame(NAME = states_sf$NAME, MeanValue = extracted_values)

# Print the resulting data frame with state names and mean raster values
print(mean_values_df)
##                    NAME             MeanValue.geometry MeanValue.USLST_Cel.tif
## 1              Arkansas POLYGON ((-94.29159 36.4992...               20.743596
## 2               Arizona POLYGON ((-110.4938 37.0037...               32.311393
## 3              Colorado POLYGON ((-106.3756 41.0007...               21.638479
## 4                  Iowa POLYGON ((-91.24676 43.5008...               17.432519
## 5              Illinois POLYGON ((-90.4371 42.50713...               18.627012
## 6               Indiana POLYGON ((-85.93071 41.7597...               18.869256
## 7                Kansas POLYGON ((-101.4523 40.0026...               23.482757
## 8              Missouri POLYGON ((-91.72747 40.6114...               19.541702
## 9              Nebraska POLYGON ((-103.249 43.00075...               21.198956
## 10           New Mexico POLYGON ((-103.0135 37.0002...               29.584687
## 11               Nevada POLYGON ((-114.5356 41.9971...               27.105202
## 12             Oklahoma POLYGON ((-100.106 37.00229...               24.572522
## 13            Tennessee POLYGON ((-87.84958 36.6636...               19.680069
## 14                 Utah POLYGON ((-111.0467 42.0015...               23.236934
## 15        West Virginia POLYGON ((-80.519 40.63334,...               17.456693
## 16                Idaho POLYGON ((-116.0492 49.0008...               15.425984
## 17              Montana POLYGON ((-114.3619 49.0011...               15.446485
## 18         North Dakota POLYGON ((-97.55732 49.0005...               14.715341
## 19         South Dakota POLYGON ((-102.2701 45.9448...               17.859549
## 20              Vermont POLYGON ((-72.77007 45.0158...               11.644919
## 21              Wyoming POLYGON ((-109.1034 45.0059...               17.926762
## 22          Puerto Rico MULTIPOLYGON (((-66.52846 1...               28.244854
## 23  U.S. Virgin Islands MULTIPOLYGON (((-64.77167 1...               27.070824
## 24              Florida MULTIPOLYGON (((-81.97063 2...               26.393349
## 25                Texas MULTIPOLYGON (((-95.08751 2...               29.469853
## 26              Alabama MULTIPOLYGON (((-88.08472 3...               21.756170
## 27              Georgia MULTIPOLYGON (((-81.48121 3...               22.568881
## 28            Louisiana MULTIPOLYGON (((-90.73379 2...               23.184331
## 29          Mississippi MULTIPOLYGON (((-88.49208 3...               21.724110
## 30       South Carolina MULTIPOLYGON (((-80.90451 3...               22.015652
## 31           California MULTIPOLYGON (((-117.1046 3...               27.117184
## 32          Connecticut MULTIPOLYGON (((-73.42085 4...               15.902644
## 33 District of Columbia MULTIPOLYGON (((-77.05355 3...               21.203339
## 34             Delaware MULTIPOLYGON (((-75.54026 3...               19.046556
## 35             Kentucky MULTIPOLYGON (((-89.48247 3...               19.058016
## 36        Massachusetts MULTIPOLYGON (((-70.80749 4...               15.045070
## 37             Maryland MULTIPOLYGON (((-76.08272 3...               18.630555
## 38       North Carolina MULTIPOLYGON (((-78.45643 3...               20.276333
## 39           New Jersey MULTIPOLYGON (((-74.58208 3...               18.026402
## 40             New York MULTIPOLYGON (((-73.8339 40...               14.278430
## 41                 Ohio MULTIPOLYGON (((-83.00718 4...               18.642377
## 42         Pennsylvania MULTIPOLYGON (((-75.29471 3...               17.098477
## 43         Rhode Island MULTIPOLYGON (((-71.56816 4...               15.988938
## 44             Virginia MULTIPOLYGON (((-75.9663 36...               18.792783
## 45            Wisconsin MULTIPOLYGON (((-87.49371 4...               12.602551
## 46             Michigan MULTIPOLYGON (((-83.45478 4...               14.281505
## 47               Hawaii MULTIPOLYGON (((-155.8498 2...               27.952575
## 48                Maine MULTIPOLYGON (((-70.61216 4...                9.703732
## 49            Minnesota MULTIPOLYGON (((-89.65538 4...               11.775165
## 50        New Hampshire MULTIPOLYGON (((-70.72972 4...               11.863261
## 51               Oregon MULTIPOLYGON (((-124.1691 4...               19.105077
## 52           Washington MULTIPOLYGON (((-123.3442 4...               15.462358
## 53               Alaska MULTIPOLYGON (((-156.1573 7...                1.532837

This code creates a bar plot of the statewise mean LST values. The ggplot function is used to initialize the plot and set the data frame and axis. The geom_bar function is used to create the bar plot and the fill argument is used to set the fill color. The x and y axis labels and the plot title are set using the labs function. The x-axis tick labels are rotated for better visibility using the theme function. The reorder function is used to reorder the states based on the mean LST values.

# Task 5 Optional- Plotting Results
ggplot(mean_values_df, aes(x = reorder(NAME, MeanValue.USLST_Cel.tif), y = MeanValue.USLST_Cel.tif)) +
  geom_bar(stat = "identity", fill = "blue") +
  
  # Set the axis labels and title
  labs(x = "State", y = "LST (Degree Celcius)",
       title = "Statewise Mean LST") +
  
  # Rotate the x-axis tick labels for better visibility
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

This code performs task 6, which is to write the sf data frame to a geodatabase. A new column meanLST is added to the data frame with values from the zonal statistics. The arc.write function is used to write the states_sf data frame to the geodatabase. The overwrite argument is set to TRUE to overwrite any existing data in the geodatabase.

# Task 6 - Writing SF dataframe to geodatabase

# Add the mean raster values to the states_sf object as a new attribute
states_sf['meanLST'] <- extracted_values


# Writing the SF dataframe to GDB
arc.write('F:\\CourseWork\\AdvProg\\FinalProject\\GDB_Container.gdb\\US_States_LST', states_sf, overwrite=T)