---
title: "The Last Giants"
subtitle: "A data-driven safari into the savannah"
format: html
code-fold: true
editor: visual
---
# High Level Goal
The goal of this project is to visualize African savannah elephant population trends and the ivory trade in Africa.
# Background
The African savannah elephant (*Loxodonta africana*) faces an existential crisis driven by poaching and the loss of its natural habitat. Despite global conservation efforts, its population sizes and trends remain uncertain across much of the African continent. The survival of the species is threatened by the hunting of elephants for their tusks to supply the ivory trade, as well as by rampant deforestation.
In response to this conservation challenge, this project examines the population trends and distribution patterns of the savannah elephant over the years 1995-2023. We also study ivory trade activity, which may offer insight into its consequences for the elephant population. In doing so, we aim to contribute to a better understanding of the interplay between human activities, conservation efforts, and the survival of these animals in the wild.
# Dataset
```{r}
#| label: load-dataset
#| message: false
# Load the required packages (they must already be installed)
library(readxl)
library(tidyverse)
# Loading the dataset
elephant <- read_excel("data/dataset.xlsx")
#glimpse(elephant)
```
The research paper titled ["Continent-wide Survey: Massive Decline in African Savannah Elephants"](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5012305/) provides a dataset on African savannah elephants that spans the years 1995 to 2014. The data file is stored in the `data` folder. The dataset was compiled to track and document the population trends of African savannah elephants, a species that declined severely during this period.
The dataset describes elephant populations across different regions of Africa, including the country, ecosystem, and stratum in which each survey took place. It has 322 rows and 7 columns. The variables and their descriptions are as follows.
| Column | Data Type | Description |
|-----------|-------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Country | `character` | Name of the country |
| Ecosystem | `character` | Name of the ecosystem as used in the manuscript and supplemental material. |
| Stratum   | `character` | Where the stratum name is the same as the ecosystem name, this indicates that the entire ecosystem was the study area for the historical data. See Figure S7 for map of strata. |
| Year | `integer` | Year of survey |
| Estimate | `integer` | Estimated number of elephants present |
| SE | `numeric` | Standard error: 0 values indicate total counts; blank indicates missing value for a sample count. |
| Source | `character` | Source of data. "GEC report" indicates that data were taken from the GEC report for that ecosystem; most GEC reports are not publicly available. |
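
The coding of the `SE` column described above can be made explicit before any analysis. The following is a minimal sketch in base R on made-up rows; the data frame `toy`, the `count_type` labels, and the normal-approximation 95% interval are illustrative assumptions, not part of the source dataset or paper.

```r
# Toy rows with made-up values; column names follow the table above.
toy <- data.frame(
  Country  = c("A", "A", "B"),
  Year     = c(2005L, 2014L, 2014L),
  Estimate = c(1200L, 950L, 300L),
  SE       = c(0, 110.5, NA)
)

# Classify each survey by how the SE column is coded:
# 0 means a total count, NA means a sample count with a missing SE.
toy$count_type <- ifelse(
  is.na(toy$SE), "sample count, SE missing",
  ifelse(toy$SE == 0, "total count", "sample count")
)

# Approximate 95% interval (normal approximation, an assumption here);
# the bounds are NA wherever the SE is missing.
toy$lower <- toy$Estimate - 1.96 * toy$SE
toy$upper <- toy$Estimate + 1.96 * toy$SE
```

For total counts the interval collapses to the estimate itself, which is one reason to keep the two survey types distinguishable in later plots.
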
The ivory trade data was obtained from the [Elephant Trade Information System (ETIS)](https://cites.org/eng/prog/etis). The reports published on the website include country-wise ivory trade data from 2008-2019. The dataset has 88 rows and 12 columns. The variables in the data are `country`, `year`, and `total no. of seizure cases`.
| Column | Description |
|---------|------------------------------|
| Country | Name of the country |
| Year | Years spanned from 2008-2019 |
| Total | Total no. of seizure cases |
# Objectives
- Visualize the country-wise variation in elephant population over the years in Africa.
- Predict the elephant population for the years 2015 to 2023 based on the available data.
- Analyze the trend in various human activities influencing the elephant population.
# Questions
Q1. How has elephant population varied over the years across several African countries?
Q2. What is the trend of ivory trade in Africa?
# Analysis plan
## Question 1
The present dataset contains data from 1995-2015, which can be used to predict the elephant population for the period 2016-2023. Producing these predictions is therefore an important stage of the analysis.
In the question, "How has elephant population varied over the years across several African countries?", we are interested in understanding the distribution of the elephants. The important variables to be considered here are `country`, `estimate` and `year`. This task involves a well-structured analysis plan as presented below.
- Data Preprocessing:
    - Clean and preprocess the data to ensure consistency and accuracy.
    - Handle missing values and outliers appropriately.
- Data Exploration:
    - Perform initial exploratory data analysis to understand the dataset's structure and characteristics.
    - Calculate basic statistics, visualize distributions, and identify trends or patterns in the data.
- Data Prediction:
    - Predict the data for 2016-2023 using several regression techniques, such as ordinary least squares, polynomial regression, lasso regression, and elastic net regression.
- Model Evaluation:
    - Select the regression technique that provides the best-fit model.
- Country-Level Analysis:
    - Group the data by country and region to analyze elephant populations in different areas of Africa.
    - Create map plots that illustrate population distribution using the packages `rnaturalearth`, `sf` and `plotly`.
    - The `rayshader` package may also be used to implement a 3-D map plot.
- Shiny App Development, which includes:
    - A dynamic timeline for selecting the year of interest.
    - Interactive maps of African countries with color-coded markers or choropleths to represent elephant populations.
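
The prediction and model evaluation steps above could be sketched as follows. This is a minimal illustration in base R on a made-up series for one hypothetical country; the hold-out split, the RMSE criterion, and the names `toy`, `models`, `rmse`, and `future` are assumptions for the example rather than the project's final method. Only ordinary least squares and a quadratic polynomial are shown; lasso and elastic net would need an additional package such as `glmnet`.

```r
# Made-up yearly estimates for one hypothetical country.
t_index <- 0:20
toy <- data.frame(
  Year     = 1995:2015,
  Estimate = 5000 - 120 * t_index + 3 * t_index^2
)

# Hold out the last four surveyed years to check forecast accuracy.
train <- subset(toy, Year <= 2011)
test  <- subset(toy, Year >  2011)

# Candidate models: a straight line (OLS) and a quadratic polynomial.
models <- list(
  ols  = lm(Estimate ~ Year, data = train),
  poly = lm(Estimate ~ poly(Year, 2), data = train)
)

# Root mean squared error of each model on the held-out years.
rmse <- sapply(models, function(m) {
  sqrt(mean((test$Estimate - predict(m, newdata = test))^2))
})

# Refit the model with the lowest hold-out error on all years,
# then project it forward to 2016-2023.
best_name <- names(which.min(rmse))
final     <- lm(formula(models[[best_name]]), data = toy)
future    <- data.frame(Year = 2016:2023)
future$Predicted <- predict(final, newdata = future)
```

Comparing models on held-out years rather than on the training fit guards against picking a flexible model that merely memorizes the observed surveys.
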
## Question 2
To address the question "What is the trend of ivory trade in Africa?", the following is a well-structured analysis plan.
- Data Preprocessing:
    - Clean and preprocess the data to ensure consistency and accuracy.
    - Handle missing values and outliers appropriately.
- Data Exploration:
    - Perform initial exploratory data analysis to understand the dataset's structure and characteristics.
    - Calculate basic statistics, visualize distributions, and identify trends or patterns in the data.
- Data Prediction:
    - Predict the data for 2020-2023 using several regression techniques, such as ordinary least squares, polynomial regression, lasso regression, and elastic net regression.
- Model Evaluation:
    - Select the regression technique that provides the best-fit model.
- Data Visualization:
    - Visualize the trends in ivory trade with a time-series plot.
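
As a rough illustration of the visualization step, the sketch below sums seizure cases per year and draws a simple line chart. It uses base R and made-up counts for two hypothetical countries; the long layout with `Country`, `Year`, and `Total` columns is an assumption based on the variable table in the Dataset section, and the final plot could equally be built with `ggplot2`.

```r
# Made-up seizure counts for two hypothetical countries.
toy <- data.frame(
  Country = rep(c("A", "B"), each = 4),
  Year    = rep(2008:2011, times = 2),
  Total   = c(3, 5, 4, 8, 1, 0, 2, 6)
)

# Sum seizure cases across countries for each year.
yearly <- aggregate(Total ~ Year, data = toy, FUN = sum)

# Line chart of the yearly total across all countries.
plot(yearly$Year, yearly$Total, type = "b",
     xlab = "Year", ylab = "Total seizure cases",
     main = "Ivory seizure cases per year (toy data)")
```
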
# Plan of Attack
| Week | Weekly Tasks | Team members involved |
|:----------------------------:|:------------------------------------------------------------------------------------:|:---------------------:|
| Till November 8^th^ | Explore and finalize the dataset and the problem statements | Everyone |
| \- | Complete the proposal and assign some high-level tasks | Everyone |
| Nov. 9^th^ - 15^th^ | Understanding `rnaturalearth`, `sf` and `plotly` packages along with geospatial data | Everyone |
| \-                           |                                  Data preprocessing                                  |   Harshit, Jothish    |
| Nov. 16^th^ - 22^nd^ | Data prediction and Model Evaluation | Kashyap, Partha |
| Nov. 23^rd^ - 29^th^ | Data visualisation | Kashyap, Jothish |
| \- | Shiny app development | Harshit, Partha |
| Nov. 30^th^ - December 6^th^ | Review the code | Everyone |
| Dec. 7^th^ - 13^th^ | Write-up and presentation for the project | Everyone |
# Repo Organization
The project repository contains the following folders.
- **data/:** Used for storing any necessary data files for the project, such as input files.
- **images/:** Used for storing image files used in the project.
- **presentation_files/:** Used for storing presentation-related files.
- **\_extra/:** Used for brainstorming analyses that do not affect the project workflow.
- **\_freeze/:** Used for storing the files generated during the build process. These files represent the frozen state of the website at a specific point in time.
- **\_site/:** Used for storing the static website files generated after the site generator processes the Quarto documents.
- **.github/:** Used for storing GitHub templates and workflows.