Hands-On Geospatial Analysis with R and QGIS

A beginner's guide to manipulating, managing, and analyzing spatial data using R and QGIS 3.2.2

Product type: Paperback
Published in: Nov 2018
Publisher: Packt
ISBN-13: 9781788991674
Length: 354 pages
Edition: 1st Edition
Authors (2): Brad Hamson, Shammunul Islam

Looping, functions, and apply family in R

Looping allows us to do repetitive tasks in a couple of lines of code, saving us much effort and time. Functions allow us to write a block of instructions whose behavior changes according to the arguments they are called with. Combining looping, functions, and the apply family in R allows us to go through the elements of a vector, list, or similar data type and apply a function or a block of instructions to each of them.

Looping in R

Suppose we want to loop through all the values of the jan_price column inside all_prices4 and print the square of each. We can do so in the following way:

jan = all_prices4$jan_price
for(price in jan){
  print(price^2)
}

This prints the square of each January price, one value per line.
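Since all_prices4 is defined earlier in the chapter, the sketch below uses a small stand-in data frame (prices_demo, with made-up values) to show the same loop, plus a variant that stores the squares instead of printing them. The names prices_demo and squares are assumptions introduced only for this illustration.

```r
# Hypothetical stand-in for all_prices4; the values are illustrative only
prices_demo <- data.frame(
  items     = c("potato", "rice", "oil"),
  jan_price = c(10, 20, 30)
)

jan <- prices_demo$jan_price

# Same loop as above: print each squared price
for (price in jan) {
  print(price^2)
}

# Variant: collect the squares in a pre-allocated vector
squares <- numeric(length(jan))
for (i in seq_along(jan)) {
  squares[i] <- jan[i]^2
}
squares
```

Because arithmetic in R is vectorized, jan^2 would give the same vector without any loop; the explicit loop is shown only to mirror the text.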

Functions in R

We can also achieve the previous result by using a function. Let's name this function square:

square = function(data){
  for(price in data){
    print(price^2)
  }
}

Now call the function as follows:

square(all_prices4$jan_price)

This again prints the squared values of jan_price.

Now suppose we want to have the ability to take elements to any power, not just square. We can attain it by making a little tweak to the function:

power_function = function(data, power){
  for(price in data){
    print(price^power)
  }
}

Now suppose we want to raise the June prices to the fourth power. We can do the following:

power_function(all_prices4$june_price, 4)

This prints each value of the june_price column raised to the fourth power.
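One thing worth noticing is that power_function() only prints its results; the value it returns is NULL, so nothing can be stored for later use. The sketch below contrasts it with a hypothetical variant, power_vec(), that returns the powers instead. The name power_vec, its default power of 2, and the values in june_demo are assumptions made for illustration.

```r
# power_function() as defined above: it prints, but its return value is NULL
power_function <- function(data, power) {
  for (price in data) {
    print(price^power)
  }
}

# Hypothetical variant that returns the powers so they can be stored and reused
power_vec <- function(data, power = 2) {
  data^power   # ^ is vectorized, so no explicit loop is needed
}

june_demo <- c(20, 25, 33)                 # illustrative values only

printed  <- power_function(june_demo, 4)   # prints three numbers
is.null(printed)                           # the for loop itself returns NULL
returned <- power_vec(june_demo, 4)        # numeric vector of fourth powers
returned
```

Returning a value rather than printing it is usually the more flexible choice, because the caller can still print the result when needed.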

Apply family – lapply, sapply, apply, tapply

The apply family lets us avoid writing explicit loops, which reduces our workload. We will discuss four functions from this family: apply, lapply, sapply, and tapply.

apply

apply works on arrays or matrices and gives us an easier way to compute something row-wise or column-wise. For the apply() function, this row- or column-wise consideration is denoted by a margin. The apply() function takes the following form: apply(data, margin, function). The data has to be an array or a matrix, and for a matrix the margin can be either 1 or 2, where 1 stands for a row-wise operation and 2 stands for a column-wise operation. We will work with the matrix all_prices.

Here, we have a record of prices of three different items in three different months (January, March, and June), where a row represents the prices of an item in three different months and a column represents the prices of three different items in any single month. Now, if we want to know which item's price fluctuated most over these three months, we would have to compute a standard deviation row-wise for each row. We can do this very easily using margin = 1 in apply().

apply(all_prices, 1, sd)

This returns one standard deviation per item.

Now suppose we want to know the month-wise total cost of all three items. As every column corresponds to a different month, we can use apply() with margin = 2 and the function sum to achieve this:

apply(all_prices, 2, sum)

This returns a vector with one total per month.

We see that the total prices were the highest in June (the third column), totaling 78.

Note that the function passed to apply() is written without parentheses: we pass the function itself by name (sum, not sum()).
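The two apply() calls above can be tried on a stand-in matrix. The sketch below builds one called prices_mat; its values are an assumption, chosen to be consistent with the June total of 78 mentioned above, and the last line uses an anonymous function to show that apply() is not limited to named functions.

```r
# Assumed stand-in for the all_prices matrix (rows = items, columns = months)
prices_mat <- matrix(
  c(10, 20, 30,    # January prices
    11, 22, 33,    # March prices
    20, 25, 33),   # June prices
  nrow = 3,
  dimnames = list(c("potato", "rice", "oil"), c("jan", "mar", "june"))
)

item_sd      <- apply(prices_mat, 1, sd)    # row-wise: spread of each item's price
month_totals <- apply(prices_mat, 2, sum)   # column-wise: total cost per month

item_sd
month_totals

# An anonymous function works too, e.g. the price range of each item
apply(prices_mat, 1, function(x) max(x) - min(x))
```

Because the matrix has row and column names, the results are named vectors, which makes the output easier to read.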

lapply

In the previously mentioned power_function() function, we had to use a for loop to go through all the values of the june_price column of the all_prices4 data frame. lapply applies a function (one we define or an already existing one) to every element of a list or vector and returns a list. Let's redefine power_function() without the loop and then use lapply to raise each element of a vector to a given power. lapply() has the following format:

lapply(data, function, arguments_of_the_function)

power_function2 = function(data, power){
  data^power
}
lapply(all_prices4$june_price, power_function2, 4)

All the prices of june_price are taken to the fourth power and returned as a list.

What we get in return is a list. We can use unlist() to get a simple vector for our convenience.

unlist(lapply(all_prices4$june_price, power_function2, 4))

Now we are returned the fourth power of the june_price column as a vector.
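The difference between the list returned by lapply() and the vector produced by unlist() can be inspected on a small example. The sketch below reuses power_function2() with made-up June prices (june_demo is an assumption, not the book's data) and also shows that the extra argument can be passed by name.

```r
power_function2 <- function(data, power) {
  data^power
}

june_demo <- c(20, 25, 33)   # illustrative values only

as_list   <- lapply(june_demo, power_function2, 4)          # list of length 3
as_vector <- unlist(as_list)                                # numeric vector
named_arg <- lapply(june_demo, power_function2, power = 4)  # same call, argument passed by name

str(as_list)
as_vector
```

Passing power = 4 by name tends to be easier to read once a function takes more than one extra argument.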

Now we will again work with the combined array, which holds the prices of different items in three different months for each of 2017 and 2018.

In that array, the first matrix corresponds to prices for 2017 and the second matrix corresponds to 2018. We will now rebuild it as a list of matrices in the following way (note that the list below puts the 2018 matrix first):

combined2 = list(matrix(c(jan_2018, mar_2018, june_2018), nrow = 3),
                 matrix(c(jan_2017, mar_2017, june_2017), nrow = 3))
combined2

This returns a list of two matrices.

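Each matrix is filled column by column, so its columns are months and its rows are items. If we want the prices for March for both 2017 and 2018, we therefore need the second column of each matrix, which lapply() can extract in the following way:

lapply(combined2, "[", , 2)

This selects the second column from each matrix in the list. Writing lapply(combined2, "[", 2, ) instead selects the second row, that is, the prices of the second item in all three months.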

Now we can modify it further to select a column, row, or any element according to our needs.

lapply() can be used with data frames, lists, and vectors.
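As a rough illustration of these uses, the sketch below applies lapply() to a list of matrices and to a data frame. The objects mats and df_demo and all of their values are hypothetical stand-ins, since jan_2018 and the other vectors are defined elsewhere in the chapter.

```r
# Hypothetical list of two 3 x 3 price matrices (rows = items, columns = months)
mats <- list(
  y2018 = matrix(c(10, 20, 30, 11, 22, 33, 20, 25, 33), nrow = 3),
  y2017 = matrix(c(10, 18, 25, 13, 25, 32, 21, 24, 40), nrow = 3)
)

second_rows <- lapply(mats, "[", 2, )   # second row of each matrix
second_cols <- lapply(mats, "[", , 2)   # second column of each matrix
col_totals  <- lapply(mats, colSums)    # column totals of each matrix

second_rows
second_cols
col_totals

# A data frame is a list of columns, so lapply() visits one column at a time
df_demo   <- data.frame(jan = c(10, 20, 30), mar = c(11, 22, 33))
col_means <- lapply(df_demo, mean)
col_means
```

The empty argument in "[", , 2 plays the same role as the empty row index in m[, 2].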

sapply

What we have got by using unlist(lapply(data, function, arguments_of_the_function)) can be obtained simply by using sapply(data, function, arguments_of_the_function).

sapply(all_prices4$june_price, power_function2, 4)

This again returns a vector.
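sapply() simplifies its result where it can, so the shape of the output depends on what the function returns. The sketch below, using made-up named prices (june_demo is an assumption), shows a function returning one value per element, which gives a vector, and a function returning two values per element, which gives a matrix.

```r
june_demo <- c(potato = 20, rice = 25, oil = 33)   # illustrative values only

# One value per element: the result is a named numeric vector
fourth <- sapply(june_demo, function(x) x^4)

# Two values per element: the result is a 2 x 3 matrix
both <- sapply(june_demo, function(x) c(square = x^2, fourth = x^4))

fourth
both
```

When the output shape matters, it is worth checking it with str() or dim() rather than assuming a vector.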

Now let's go back to the example of the all_prices3 data frame.

tapply

Now, suppose instead of prices for 2018 only, we have prices for these items for 2017, 2016, and 2015 as well. This new data frame is defined as follows:

all_prices = data.frame(items = rep(c("potato", "rice", "oil"), 4),
                        jan_price = c(10, 20, 30, 10, 18, 25, 9, 17, 24, 9, 19, 27),
                        mar_price = c(11, 22, 33, 13, 25, 32, 12, 21, 33, 15, 27, 39),
                        june_price = c(20, 25, 33, 21, 24, 40, 17, 22, 27, 13, 18, 23)
)
all_prices

This prints a data frame with 12 rows: one row per item for each of the four years.

Now suppose we want to take the mean March price of each item across all years. We can do this by using tapply(numerical_variable, categorical_variable, function). Here, the items column of the all_prices data frame acts as the categorical variable that defines the groups, so we pass it to tapply() as a factor.

tapply(all_prices$mar_price, factor(all_prices$items), mean)

This gives us the mean March price across all years: 34.25 for oil, 12.75 for potato, and 23.75 for rice.

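Note the use of factor() to convert the items column to a factor variable. tapply() would also coerce a character column to a factor on its own, but the explicit conversion makes the grouping visible.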
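The tapply() call can be run end to end with the data frame defined above. The sketch below repeats that definition so it is self-contained, and then compares the call with and without factor(); the variable names mar_means and mar_means_chr are introduced here only for illustration.

```r
all_prices <- data.frame(
  items      = rep(c("potato", "rice", "oil"), 4),
  jan_price  = c(10, 20, 30, 10, 18, 25, 9, 17, 24, 9, 19, 27),
  mar_price  = c(11, 22, 33, 13, 25, 32, 12, 21, 33, 15, 27, 39),
  june_price = c(20, 25, 33, 21, 24, 40, 17, 22, 27, 13, 18, 23)
)

# Mean March price per item, grouping by an explicit factor
mar_means <- tapply(all_prices$mar_price, factor(all_prices$items), mean)

# Same call without factor(): tapply() coerces the grouping variable itself
mar_means_chr <- tapply(all_prices$mar_price, all_prices$items, mean)

mar_means
mar_means_chr
```

The groups appear in the order of the factor levels, which by default is alphabetical (oil, potato, rice) rather than the order in which the items first occur.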

There are other apply functions, but that's it for now, folks. We will introduce new functions as and when they become necessary as we proceed to new chapters for geospatial analysis.

To install a new package, we need to write install.packages("package_name"), and to use an installed package, we need to load it with library(package_name).
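A minimal sketch of this install-then-load workflow follows. The package name sp in the commented lines is only an example chosen for illustration, and those lines are commented out because installation needs an internet connection; the runnable part uses stats, which ships with R, so it should work on any installation.

```r
# install.packages("sp")   # run once per R installation; needs internet access
# library(sp)              # run in every session that uses the package

# Check that a package is installed before attaching it
pkg <- "stats"             # "stats" ships with R, so this check should succeed
if (requireNamespace(pkg, quietly = TRUE)) {
  library(pkg, character.only = TRUE)   # character.only lets us pass the name as a string
}
```

Checking with requireNamespace() first avoids an error in scripts that may run on machines where a package has not been installed yet.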
