Posts

Building Your Own R Package

I would like to make a package that lets culinary students, chefs, or even people cooking at home analyze and filter thousands of recipes more simply. It would allow them to find specific recipes based on ingredients, preparation time, type of meal, etc. It would also generate visualizations to easily compare different types of recipes based on their characteristics. This package will help them save time, explore new recipes, and view trends. Key functions that I would include are finding a specific recipe (find_recipe), outputting a random recipe (random_recipe), filtering the recipes by ingredients you have (filter_ingredient), and creating graphs to compare recipes by different characteristics (plot_time). I will think of more after further exploring the dataset. I hope this proposal meets the final project requirements. I was struggling to come up with an idea and realized I love to cook, but I often struggle with new things to make or what...
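A minimal sketch of how three of the proposed functions might look. The toy data frame and its column names (name, prep_minutes, ingredients) are assumptions made for illustration, since the real dataset has not been explored yet.

```r
# Toy data standing in for the real recipe dataset (columns are assumptions).
recipes <- data.frame(
  name = c("Tomato Soup", "Pancakes", "Veggie Stir Fry"),
  prep_minutes = c(30, 20, 25),
  ingredients = c("tomato, onion, garlic",
                  "flour, egg, milk",
                  "broccoli, carrot, garlic"),
  stringsAsFactors = FALSE
)

# Case-insensitive substring match on the recipe name.
find_recipe <- function(data, name) {
  hits <- grepl(tolower(name), tolower(data$name), fixed = TRUE)
  data[hits, , drop = FALSE]
}

# Return one row chosen at random.
random_recipe <- function(data) {
  data[sample.int(nrow(data), 1), , drop = FALSE]
}

# Keep recipes whose ingredient text mentions the given ingredient.
filter_ingredient <- function(data, ingredient) {
  hits <- grepl(tolower(ingredient), tolower(data$ingredients), fixed = TRUE)
  data[hits, , drop = FALSE]
}

filter_ingredient(recipes, "garlic")$name
```

Substring matching keeps the sketch short, but it would also match "egg" inside "eggplant", so a real version would probably split the ingredient text into separate items first.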

Comparing Visualization Systems in R: Base Graphics, Lattice, and ggplot2

How do the syntax and workflow differ between base, lattice, and ggplot2? Base R has a dedicated function for each type of graph; you just supply the variables and labels to create an easy-to-read basic graph. Lattice uses a formula-based syntax and is good for comparing multiple variables by group, with graphs side by side. ggplot2 combines multiple functions to build a more advanced graph, and it makes it much easier to customize a wider range of aspects of the visualization, including background, color, and shape. Which system gave you the most control or produced the most “publication‑quality” output with minimal code? The ggplot2 system gave me the most control and produced higher-quality output than the other two. It gives clearer visuals with color comparison and automatically adds legends with minimal code. Any challenges or surprises you encountered when switching between systems? My main challenge was with ggplot2, which has a lot more functions to memorize com...
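To make the syntax differences concrete, here is a rough sketch of the same scatter plot in all three systems. It uses the built-in mtcars dataset, which is an assumption for illustration and not necessarily the data from the post; the wrapper function names are hypothetical too.

```r
# Base R: one plotting function per chart type; it draws as soon as it is called.
base_scatter <- function() {
  plot(mtcars$wt, mtcars$mpg,
       main = "MPG vs. Weight",
       xlab = "Weight (1000 lbs)",
       ylab = "Miles per gallon")
}

# lattice: formula interface; `| factor(cyl)` draws one panel per cylinder group.
lattice_scatter <- function() {
  lattice::xyplot(mpg ~ wt | factor(cyl), data = mtcars,
                  main = "MPG vs. Weight by Cylinders")
}

# ggplot2: layers are added with `+`; the colour mapping creates the legend.
ggplot_scatter <- function() {
  ggplot2::ggplot(mtcars,
                  ggplot2::aes(x = wt, y = mpg, colour = factor(cyl))) +
    ggplot2::geom_point() +
    ggplot2::labs(title = "MPG vs. Weight", colour = "Cylinders")
}
```

Calling base_scatter() draws right away, while the lattice and ggplot2 versions return plot objects that are drawn when printed, for example print(ggplot_scatter()). Both of those packages need to be installed first.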

Input/Output, String Manipulation and Plyr

First, I had to install the plyr package in RStudio. install.packages("plyr") library(plyr) For step one, I downloaded the assignment 6 dataset text file to my computer and imported it into R. At first, I had "<FileName>.txt" in the read line, but I was getting the error "cannot open file 'Assignment 6 Dataset.txt': No such file or directory", so I changed it to file.choose(). Then R was reading the text as only one column, so looking up the object Sex resulted in "Error in FUN(X[[i]], ...) : object 'Sex' not found." To fix this, I had to make sure the read line had sep = "," because the file's columns are comma-separated. #Step 1 student_assignment_6 <- read.table(file.choose(), header = TRUE, sep = ",") Using the ddply function, I calculated the mean by the Sex category. We can see that the new file is now organized with the females in rows 2-17 and the males in rows 18-21, with a new column ...
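The one-column problem can be reproduced with a small sketch. The inline text below is made-up stand-in data, not the actual assignment file, and the grouped mean uses plyr's summarise and base R's aggregate as assumptions about one way to do it, not necessarily the exact call from the post.

```r
# Made-up stand-in for the assignment file (the real data is not shown here).
csv_text <- "Name,Age,Sex,Grade
Raul,25,Male,80
Booker,18,Male,83
Lauri,21,Female,90
Leonie,21,Female,91"

# Default sep = "" splits on whitespace, so every line lands in a single column
# and there is no separate Sex column to group by.
wrong <- read.table(text = csv_text, header = TRUE)
ncol(wrong)

# sep = "," splits on commas and gives one column per field.
right <- read.table(text = csv_text, header = TRUE, sep = ",")
ncol(right)

# Mean grade per Sex with base R.
base_means <- aggregate(Grade ~ Sex, data = right, FUN = mean)

# The same grouping with plyr, if it is installed.
if (requireNamespace("plyr", quietly = TRUE)) {
  plyr_means <- plyr::ddply(right, "Sex", plyr::summarise,
                            mean_grade = mean(Grade))
}
```

Checking ncol() or names() right after importing is a quick way to catch a wrong separator before any grouping code fails.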

S3 vs. S4

Using the built-in dataset iris in R, which contains measurements of iris flowers, I tried a few generic functions to see if they could be applied to the dataset. data("iris") class(iris) print(iris) summary(iris) head(iris, 3) The generic functions print, summary, and head all work on the dataset iris because it has the data.frame class. Methods already exist for this class, so the matching method is called automatically. Since class(iris) returns "data.frame" but isS4(iris) returns FALSE, I know that iris is an S3 object. It can be used with S4 by creating an S4 class and converting it. 1. How do you tell what OO system (S3 vs. S4) an object is associated with? isS4( ) will test if an object is S4, resulting in TRUE or FALSE. If FALSE, then do is.object( ), and if that is TRUE, the object is S3. 2. How do you determine the base type (like integer or list) of an object? typeof(object) will return the base type of an obj...
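The isS4() and is.object() checks described above can be wrapped in a small helper. This is only a sketch: oo_system and the IrisS4 class with its data slot are hypothetical names made up for illustration, and the helper does not try to tell reference classes apart from S4.

```r
library(methods)
data("iris")

# Hypothetical helper following the isS4() / is.object() rule.
oo_system <- function(x) {
  if (isS4(x)) {
    "S4"
  } else if (is.object(x)) {
    "S3"
  } else {
    "base"
  }
}

oo_system(iris)   # data frame: has a class attribute and is not S4
oo_system(1:3)    # plain integer vector: no class attribute
typeof(iris)      # base type underneath a data frame

# One way to wrap iris in an S4 object (class and slot names are illustrative).
setClass("IrisS4", representation(data = "data.frame"))
iris_s4 <- new("IrisS4", data = iris)
oo_system(iris_s4)
```

The data inside the wrapper is still reachable with iris_s4@data, so the S3 data frame methods keep working on that slot.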

Doing Math in R Part 2

1. Finding the sum and difference of these two matrices. A = matrix(c(2,0,1,3), ncol=2) and B = matrix(c(5,2,4,-1), ncol=2) a) Find A + B First, assign the matrices to variables, then assign the sum of both matrices to a new variable. Run that variable to print the new matrix A + B. A <- matrix(c(2,0,1,3), ncol=2) B <- matrix(c(5,2,4,-1), ncol=2) AB_sum <- A + B AB_sum Result:      [,1] [,2] [1,]    7    5 [2,]    2    2 b) Find A - B First, assign the matrices to variables, then assign the difference between the two matrices to a new variable. Run that variable to print the new matrix A - B. A <- matrix(c(2,0,1,3), ncol=2) B <- matrix(c(5,2,4,-1), ncol=2) AB_dif <- A - B AB_dif Result:      [,1] [,2] [1,]   -3   -3 [2,]   -2    4 2. Using the diag() function to build a matrix of size 4 with the follo...
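For part 1, a short sketch of why the results are laid out the way they are: matrix() fills column by column, and addition and subtraction work element by element on matrices of the same size. The checks below are extra sanity checks added for illustration, not part of the assignment.

```r
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)

# matrix() fills column by column, so the first column of A is (2, 0).
A[, 1]

# Element-wise addition and subtraction need matching dimensions.
stopifnot(identical(dim(A), dim(B)))
AB_sum <- A + B
AB_dif <- A - B

# Sanity check: adding B back to the difference recovers A.
all.equal(AB_dif + B, A)
```

Passing byrow = TRUE to matrix() would fill row by row instead, which changes both matrices and therefore both results.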

Doing Math on Matrices

I used the following values to find the inverse and determinant of two matrices. A = matrix(1:100, nrow=10) B = matrix(1:1000, nrow=10) First, I assigned the matrices in R. A <- matrix(1:100, nrow=10) B <- matrix(1:1000, nrow=10) To obtain the determinant of each matrix, I used the det function. det(A) Result: 0 det(B) Result: Error in determinant.matrix(x, logarithm = TRUE, ...) : 'x' must be a square matrix Then, to obtain the inverse of each matrix, I used the solve function. solve(A) Result: Error in solve.default(A) : Lapack routine dgesv: system is exactly singular: U[3,3] = 0 solve(B) Result: Error in solve.default(B) : 'a' (10 x 100) must be square Looking at the matrices, we can see that A is a square matrix because it has the same number of rows and columns. There are 100 values in the matrix; dividing that by 10 rows gives me 10 columns. This makes matrix A a 10x10 square matrix. With B, there are 1000 values in the matrix, a...
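Both errors can be checked for before calling solve(). The sketch below uses a hypothetical helper, safe_inverse, that returns NULL with a message when the matrix is not square or cannot be inverted. A is singular because each of its columns is the first column plus a constant, so the columns are not independent.

```r
A <- matrix(1:100, nrow = 10)
B <- matrix(1:1000, nrow = 10)

# Hypothetical helper: returns the inverse, or NULL with a message.
safe_inverse <- function(m) {
  if (nrow(m) != ncol(m)) {
    message("not square: ", nrow(m), " x ", ncol(m))
    return(NULL)
  }
  tryCatch(
    solve(m),
    error = function(e) {
      message("not invertible: ", conditionMessage(e))
      NULL
    }
  )
}

dim(A)             # rows and columns of A
dim(B)             # rows and columns of B
safe_inverse(A)    # square but singular
safe_inverse(B)    # not square
qr(A)$rank         # numerical rank of A, lower than 10 for a singular matrix

# A small invertible matrix for comparison.
C <- matrix(c(2, 0, 0, 2), nrow = 2)
safe_inverse(C)
```

Returning NULL instead of stopping lets a script keep going through several matrices and report which ones have no inverse.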

Results of Patient Blood Pressure and MD Decision

Using the data collected for ten patients by a local hospital, including the frequency of their visits, blood pressure, first assessment, second assessment, and final decision, I was able to create five vectors. Freq <- c(0.6, 0.3, 0.4, 0.4, 0.2, 0.6, 0.3, 0.4, 0.9, 0.2) bloodp <- c(103, 87, 32, 42, 59, 109, 78, 205, 135, 176) first <- c(1, 1, 1, 1, 0, 0, 0, 0, NA, 1) second <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1) finaldecision <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 1) With these vectors, I made a side-by-side box plot comparing the low and high blood pressures, then a histogram of the frequency of each blood pressure range. R Code: boxplot(bloodp ~ finaldecision, main = "Patients BPs & MD’s Ratings", names = c("Low","High"), xlab = "Final Decision", ylab = "Blood Pressure") hist(bloodp, main = "Histogram of Patient's Blood Pressur...