Module # 8 Input/Output, string manipulation and plyr package

 

String Manipulation and the plyr Package


1) There are three steps to cover for this week's assignment. Step #1: Import the data, then compute the mean of Grade using Sex as the category (using the plyr package for this operation), and write the resulting output to a file. Step #2: Filter the data set down to the students whose name contains the letter "i" and store those rows as a new data set. Step #3: Write the filtered data set to a comma-separated (CSV) file. With all of that laid out, let's write some code to make it all happen.


> #Module 8 assignment: Input/Output, string manipulation and plyr package
> # Install the necessary packages only if they are not already installed
> for (pkg in c("plyr", "data.table")) if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
> library(plyr)
> library(data.table)
> 
> # Step 1: Read the data from the file
> students <- fread("CENSORED/Assignment 6 Dataset.txt", header = TRUE, sep = ",")  # Using fread from data.table package
> print(students)
         Name   Age    Sex Grade
       <char> <int> <char> <int>
 1:      Raul    25   Male    80
 2:    Booker    18   Male    83
 3:     Lauri    21 Female    90
 4:    Leonie    21 Female    91
 5:   Sherlyn    22 Female    85
 6:   Mikaela    20 Female    69
 7:   Raphael    23   Male    91
 8:      Aiko    24 Female    97
 9:  Tiffaney    21 Female    78
10:    Corina    23 Female    81
11: Petronila    23 Female    98
12:    Alecia    20 Female    87
13:   Shemika    23 Female    97
14:    Fallon    22 Female    90
15:   Deloris    21 Female    67
16:    Randee    23 Female    91
17:     Eboni    20 Female    84
18:   Delfina    19 Female    93
19: Ernestina    19 Female    93
20:      Milo    19   Male    67
         Name   Age    Sex Grade
> 
> # Calculate the mean grade for each sex category
> students_gendered_mean <- ddply(students, "Sex", summarise, Grade.Average = mean(Grade))
> 
> # Step 1: Write the output to a file
> write.table(students_gendered_mean, "Students_Gendered_Mean.txt", row.names = FALSE, sep = "\t")
> 
> # Step 2: Filter the dataset for names containing the letter "i"
> i_students <- subset(students, grepl("i", Name, ignore.case = TRUE))
> print(i_students)
         Name   Age    Sex Grade
       <char> <int> <char> <int>
 1:     Lauri    21 Female    90
 2:    Leonie    21 Female    91
 3:   Mikaela    20 Female    69
 4:      Aiko    24 Female    97
 5:  Tiffaney    21 Female    78
 6:    Corina    23 Female    81
 7: Petronila    23 Female    98
 8:    Alecia    20 Female    87
 9:   Shemika    23 Female    97
10:   Deloris    21 Female    67
11:     Eboni    20 Female    84
12:   Delfina    19 Female    93
13: Ernestina    19 Female    93
14:      Milo    19   Male    67
> 
> # Step 3: Write the filtered data to a CSV file
> write.csv(i_students, "i_students.csv", row.names = FALSE)


2) Just like that, we have a new comma-separated file containing the rows for the students whose names contain the letter "i". Not much more to talk about this week.
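
Since the original data file is not included here, below is a minimal sketch of the same three steps that can be run on its own. It assumes a small inline data frame (six rows copied from the data set printed above), uses base R's aggregate() as a stand-in for the ddply() call, and writes to a temporary file (a hypothetical location, not the file names used above) so that it can read the CSV back and confirm the round trip. It is only an illustration under those assumptions, not the assignment's required method.

```r
# Small inline sample: six rows copied from the data set printed above.
students <- data.frame(
  Name  = c("Raul", "Booker", "Lauri", "Sherlyn", "Aiko", "Milo"),
  Age   = c(25L, 18L, 21L, 22L, 24L, 19L),
  Sex   = c("Male", "Male", "Female", "Female", "Female", "Male"),
  Grade = c(80L, 83L, 90L, 85L, 97L, 67L),
  stringsAsFactors = FALSE
)

# Step 1 (base-R stand-in for ddply): mean Grade for each Sex category.
gendered_mean <- aggregate(Grade ~ Sex, data = students, FUN = mean)
names(gendered_mean)[2] <- "Grade.Average"

# Step 2: keep rows whose Name contains "i" in either case.
i_students <- subset(students, grepl("i", Name, ignore.case = TRUE))

# Step 3: write the filtered rows to a CSV file.
# tempfile() is an assumption made so the sketch does not touch real files.
csv_path <- tempfile(fileext = ".csv")
write.csv(i_students, csv_path, row.names = FALSE)

# Read the file back to check that what was written matches the filter.
round_trip <- read.csv(csv_path, stringsAsFactors = FALSE)

print(gendered_mean)
print(round_trip)
```

Reading the file back with read.csv() is a quick way to check that the header row and comma separators came out as expected before handing the file in.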







