Projects Utilizing R

I have used R in multiple scenarios, mainly for data analysis. The following are a few basic "mock" projects built upon the experience I have gained through the classroom and client projects.

Aggregating and Displaying Rainfall Data 

This project reads in publicly available data downloaded from a USGS rainfall monitoring gauge located in Moanalua, HI (USGS Site 212359157502601).

I read in a .txt file downloaded directly from the USGS website and relied heavily on the lubridate and tidyverse packages to organize and aggregate the data.
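As a minimal sketch of this read-and-aggregate step (not the project's actual code), the example below parses a small inline stand-in for a USGS tab-delimited file and totals the rainfall by day and by month. The sample rows, the column name precip_in, and the Pacific/Honolulu time zone are assumptions made for illustration; real USGS files name the value column with a parameter code.

```r
library(dplyr)
library(lubridate)

# Inline stand-in for a downloaded USGS tab-delimited (RDB) file.
# All values and the "precip_in" column name are made up for illustration.
raw_text <- paste(
  "# illustrative header comment",
  "agency_cd\tsite_no\tdatetime\ttz_cd\tprecip_in",
  "5s\t15s\t20d\t6s\t14n",
  "USGS\t212359157502601\t2018-08-22 23:45\tHST\t0.10",
  "USGS\t212359157502601\t2018-08-23 00:00\tHST\t0.25",
  "USGS\t212359157502601\t2018-08-23 00:15\tHST\t0.05",
  "USGS\t212359157502601\t2018-09-01 12:00\tHST\t0.02",
  sep = "\n"
)

# Skip "#" comment lines, read everything as text, then drop the
# column-format row ("5s", "15s", ...) that follows the header.
rain <- read.delim(text = raw_text, comment.char = "#",
                   colClasses = "character") %>%
  slice(-1) %>%
  mutate(
    datetime  = ymd_hm(datetime, tz = "Pacific/Honolulu"),
    precip_in = as.numeric(precip_in)
  )

# Daily totals
daily <- rain %>%
  group_by(date = as_date(datetime)) %>%
  summarise(precip_in = sum(precip_in), .groups = "drop")

# Monthly totals
monthly <- rain %>%
  group_by(year = year(datetime), month = month(datetime)) %>%
  summarise(precip_in = sum(precip_in), .groups = "drop")
```

For a real download, the text argument would be replaced by the path to the .txt file.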

Once the data were formatted as needed, I converted them into visuals using ggplot2, which is loaded as part of the tidyverse.

Here is a link to a GitHub repository where the original data and the code for the project are uploaded. Some graphs generated from the data are shown below.

Further analyses of rain data might include trend analyses, incorporation of other types of data, and classification of years on whether they might be impacted by El Niño or La Niña climate patterns.

Histogram: Rain Events by Size

The rain events identified using the code are placed into bins and displayed on a histogram. As expected, a large majority of rain events are of a smaller size. This plot was generated using R's native plotting commands.
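The page does not describe how rain events were delineated, so the sketch below assumes one common approach: consecutive wet readings belong to the same event unless they are separated by a dry gap of at least six hours. The gap threshold, the sample readings, and the bin width are all hypothetical. The histogram itself uses base R's hist().

```r
# Hypothetical wet readings (zero-rain intervals already removed); depths in inches.
times <- as.POSIXct("2018-01-01 00:00", tz = "UTC") +
  3600 * c(0, 1, 2, 12, 13, 30, 50, 51, 52)
depth <- c(0.05, 0.10, 0.02, 0.30, 0.40, 0.01, 0.20, 0.60, 0.40)

# Assumed rule: a dry gap of at least this many hours starts a new event.
min_dry_gap_hours <- 6

gap_hours <- c(Inf, diff(as.numeric(times)) / 3600)
event_id  <- cumsum(gap_hours >= min_dry_gap_hours)

# Total depth per event
event_totals <- as.numeric(tapply(depth, event_id, sum))

# Bin the event totals and draw the histogram with base graphics
h <- hist(event_totals,
          breaks = seq(0, 1.5, by = 0.25),
          main = "Rain Events by Size",
          xlab = "Event total (in)",
          ylab = "Number of events")
```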

Comparison of Monthly Rain Between Two Years

Rain data from two complete years were superimposed to show differences in monthly rain at the site. The months are entered into the program once, which then generates the graph. This plot was generated using the ggplot2 package, which is included in the tidyverse.
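A comparison like this can be sketched in ggplot2 by mapping year to color so that both years share one set of month axes. The years and monthly totals below are made-up placeholders, not values from the gauge.

```r
library(ggplot2)

# Made-up monthly totals (inches) for two hypothetical years.
monthly <- data.frame(
  year    = factor(rep(c(2017, 2018), each = 12)),
  month   = factor(rep(month.abb, times = 2), levels = month.abb),
  rain_in = c(3.1, 4.0, 5.2, 2.8, 1.9, 0.7, 1.1, 1.4, 1.6, 3.3, 4.8, 5.0,
              4.2, 6.5, 3.9, 7.1, 2.2, 1.0, 1.3, 9.8, 2.5, 3.0, 4.1, 3.6)
)

# One line per year, drawn over the same month axis
p <- ggplot(monthly, aes(x = month, y = rain_in, color = year, group = year)) +
  geom_line() +
  geom_point() +
  labs(x = "Month", y = "Monthly rain (in)", color = "Year",
       title = "Monthly Rain, Two Years Superimposed")

print(p)
```

Storing month as a factor with levels set to month.abb keeps the axis in calendar order instead of alphabetical order.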

Chart of Event Sizes

Rain events were classified into a specific size class depending on the total amount of rainfall. This chart is a simple graphic showing the distribution of events based on size, generated using the ggplot2 package.
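One way to sketch this classification is with cut(), which assigns each event total to a labeled interval. The class boundaries, the class names, and the event totals below are hypothetical; the page does not state the thresholds used.

```r
library(ggplot2)

# Made-up event totals in inches
events <- data.frame(
  total_in = c(0.02, 0.08, 0.15, 0.30, 0.45, 0.70, 1.20, 2.60, 0.05, 0.11)
)

# Assumed size classes; intervals are right-closed, e.g. (0.1, 0.5]
events$size_class <- cut(
  events$total_in,
  breaks = c(0, 0.1, 0.5, 1, Inf),
  labels = c("Trace", "Small", "Medium", "Large")
)

class_counts <- table(events$size_class)

# Bar chart of the number of events in each size class
p <- ggplot(events, aes(x = size_class)) +
  geom_bar() +
  labs(x = "Event size class", y = "Number of events")

print(p)
```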

Aggregating and Displaying Streamflow Data 

This project applies many of the same principles used in the rainfall project. It uses publicly available data from a USGS streamflow monitoring gauge located in Moanalua, HI (USGS Site 16227500).

I read in data directly from USGS using their dataRetrieval package, and heavily used the lubridate and tidyverse packages to organize and aggregate data.
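A retrieval along these lines might look like the sketch below, which wraps dataRetrieval's readNWISuv() and renameNWISColumns() in a small helper. The helper name fetch_flow, the default date range, and the time zone are assumptions for illustration; parameter code 00060 is the USGS code for discharge. The call needs network access, so the usage line is left commented out.

```r
# Hypothetical helper: download instantaneous discharge for one site.
fetch_flow <- function(site = "16227500",
                       start = "2018-08-20",
                       end = "2018-08-30") {
  flow <- dataRetrieval::readNWISuv(
    siteNumbers = site,
    parameterCd = "00060",        # discharge, cubic feet per second
    startDate   = start,
    endDate     = end,
    tz          = "Pacific/Honolulu"
  )
  # Replace parameter-code column names with readable ones (e.g. Flow_Inst)
  dataRetrieval::renameNWISColumns(flow)
}

# Usage (requires the dataRetrieval package and an internet connection):
# flow <- fetch_flow()
```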

Once the data were formatted as needed, I converted them into visuals using ggplot2, which is loaded as part of the tidyverse. I closely followed code by Chuliang Xiao, published here.

Here is a link to a GitHub repository where the original data and the code for the project are uploaded. Some graphs generated from the data are shown below.

Further analysis of similar hydrologic data might include parameters such as flashiness, flow timing, durations, and so on. The specific analyses performed would be based on the goals of characterizing the flow regime of the stream, and what the final product might be used for.

Hydrograph: First Storm Example

The rain events identified using the code were utilized to identify the first storm in the series, which was used to produce a hydrograph.

Hydrograph: Largest Rain Event

The largest rain event was identified using the rainfall depth of each event. Rain events were identified in the same way as in the previous project. This larger event, which spans a few days, corresponds to rainfall from Hurricane Lane in 2018.

Hydrograph: Largest Flow Peak of Largest Rain Event

Suppose the largest flow peak during the Hurricane Lane rainfall is of particular interest. Specific date bounds were chosen (arbitrarily) around that peak, and the graph was generated within those bounds.
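Restricting a hydrograph to a date window can be sketched as a filter followed by a line plot. The flow series below is synthetic (a single smooth peak over a low baseline), and the bounds and column names are placeholders chosen for illustration, not the ones used in the project.

```r
library(ggplot2)

# Synthetic hourly flow record: 120 hours with one peak near hour 60.
flow <- data.frame(
  dateTime = seq(as.POSIXct("2018-08-22 00:00", tz = "Pacific/Honolulu"),
                 by = "1 hour", length.out = 120)
)
flow$Flow_Inst <- 5 + 400 * exp(-((seq_len(120) - 60) / 8)^2)

# Arbitrary date bounds around the peak of interest
bounds <- as.POSIXct(c("2018-08-24 00:00", "2018-08-25 00:00"),
                     tz = "Pacific/Honolulu")

flow_window <- dplyr::filter(flow,
                             dateTime >= bounds[1],
                             dateTime <= bounds[2])

# Hydrograph within the chosen bounds
p <- ggplot(flow_window, aes(x = dateTime, y = Flow_Inst)) +
  geom_line() +
  labs(x = "Date and time", y = "Discharge (cfs)",
       title = "Hydrograph Within Selected Date Bounds")

print(p)
```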

Summarization and Descriptive Statistics

This project reads in publicly available data, the "mpg" dataset available through the ggplot2 package in R. The data consist of vehicle parameters and mileage values for various vehicle models from the years 1999 and 2008. This project uses the ggplot2 package for visualization and the psych package for descriptive statistics.

I ran some hypothetical summarization scenarios, such as obtaining the average mileage for specific vehicle models and the change in mileage for compact cars between 1999 and 2008.
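The two scenarios can be sketched with dplyr as follows. The choice of highway mileage (hwy) for the compact-car comparison and the output column names are assumptions; the page does not say which mileage column was used.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# Average city and highway mileage for each vehicle model
model_means <- mpg %>%
  group_by(model) %>%
  summarise(mean_cty = mean(cty), mean_hwy = mean(hwy), .groups = "drop")

# Change in mean highway mileage for compact cars, 2008 minus 1999
compact_change <- mpg %>%
  filter(class == "compact") %>%
  group_by(year) %>%
  summarise(mean_hwy = mean(hwy), .groups = "drop") %>%
  summarise(change_hwy = mean_hwy[year == 2008] - mean_hwy[year == 1999])

model_means
compact_change
```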

I used the psych package to run descriptive statistics and obtain correlation values between specific parameters. This portion of code references course material provided by Dr. Jennifer Koran at Southern Illinois University. Based on parameters that seemed to correlate well and were likely not related to one another, I converted the data into visuals using ggplot2.
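A minimal sketch of this step, assuming the numeric columns displ, cyl, cty, and hwy are the parameters of interest (the page does not list them), uses psych::describe() for the descriptive statistics and psych::corr.test() for the correlation matrix. This is an illustration rather than the course material's code.

```r
library(ggplot2)  # provides the mpg dataset

# Assumed set of numeric parameters to summarize
num_vars <- as.data.frame(mpg[, c("displ", "cyl", "cty", "hwy")])

# Descriptive statistics (n, mean, sd, median, skew, ...) per variable
desc <- psych::describe(num_vars)
print(desc)

# Pairwise Pearson correlations with significance tests
cors <- psych::corr.test(num_vars)
print(round(cors$r, 2))
```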

Here is a link to a GitHub repository where the original data and the code for the project are uploaded. Some graphs generated from the data are shown below.

Scatterplot: Engine Displacement versus City Mileage

City mileage and engine displacement were observed to correlate strongly with each other. A scatterplot was generated, with a loess curve expressing the approximate relationship between the two variables.
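A plot of this kind can be sketched with geom_point() plus geom_smooth(method = "loess"). The axis labels and point transparency are illustrative choices, not necessarily those in the original figure.

```r
library(ggplot2)

# Scatterplot of city mileage against engine displacement with a loess curve
p <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "loess", formula = y ~ x, se = TRUE) +
  labs(x = "Engine displacement (L)", y = "City mileage (mpg)",
       title = "Engine Displacement versus City Mileage")

print(p)
```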

Box plot: Highway Mileage by Vehicle Class

Vehicles were grouped by class and ordered by their relative cargo capacity. Coupes and sedans were shown to have the highest mileage values, with SUVs and pickup trucks having the lowest mileage values.
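Ordering the boxes comes down to setting the factor levels of the class column before plotting. The order below is a guess at "relative cargo capacity" and uses the class labels that ship with the mpg dataset; the original figure may have used a different order or relabeled the classes.

```r
library(ggplot2)

# Assumed ordering from least to most cargo capacity (illustrative only)
class_order <- c("2seater", "subcompact", "compact", "midsize",
                 "minivan", "suv", "pickup")

mpg_ordered <- as.data.frame(mpg)
mpg_ordered$class <- factor(mpg_ordered$class, levels = class_order)

# Box plot of highway mileage for each class, in the order set above
p <- ggplot(mpg_ordered, aes(x = class, y = hwy)) +
  geom_boxplot() +
  labs(x = "Vehicle class", y = "Highway mileage (mpg)",
       title = "Highway Mileage by Vehicle Class")

print(p)
```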