R Markdown and the LaTeX Mathematical Typesetting Language
1) This week we are going to make an R Markdown file in RStudio and then post that code on GitHub. I have decided to focus on a topic that also helps me in another class: predicting passenger survival on board the Titanic.
2) We are now going to make our R Markdown file and add some pieces to it rather than just leaving the default content in place. We are then going to look at it on the source side, the visual side, and the compiled side. The following is the source side:
---
title: "Final Project: Titanic Survival Prediction"
author: "Moeen Khan"
date: "2024-11-17"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Introduction
This document provides an overview of the main functions being developed for the Titanic survival prediction project. We aim to predict the likelihood of passenger survival using machine learning models, including GLM, Decision Tree, and KNN.
## Main Functions
### 1. `clean_data()`
This function preprocesses the Titanic dataset by:
- Handling missing values.
- Encoding categorical variables.
- Normalizing numerical features.
```{r cleaning}
clean_data <- function(data) {
  data <- na.omit(data)
  data$Sex <- as.numeric(factor(data$Sex))
  data$Class <- as.numeric(factor(data$Class))
  return(data)
}
```
### 2. `fit_glm_model()`
This function builds a generalized linear model (GLM) for predicting survival.
```{r models}
fit_glm_model <- function(data) {
  model <- glm(Survived ~ Class + Sex + Age, data = data, family = binomial)
  return(model)
}
```
### 3. `predict_survival()`
This function generates survival predictions for new data using the trained GLM.
```{r predict_survival}
predict_survival <- function(model, new_data) {
  predictions <- predict(model, new_data, type = "response")
  return(ifelse(predictions > 0.5, 1, 0))
}
```
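As a rough check that the three functions in the source fit together, here is a minimal sketch that runs them end to end. It assumes R's built-in `Titanic` table as stand-in data, which is not necessarily the dataset used in the project: that table happens to have `Class`, `Sex`, `Age`, and `Survived` columns, but `Age` is only Child/Adult, and the counts have to be expanded into one row per passenger first. The function definitions are repeated so the block runs on its own.

```r
# Functions repeated from the R Markdown source so this block is self-contained.
clean_data <- function(data) {
  data <- na.omit(data)
  data$Sex <- as.numeric(factor(data$Sex))
  data$Class <- as.numeric(factor(data$Class))
  return(data)
}

fit_glm_model <- function(data) {
  glm(Survived ~ Class + Sex + Age, data = data, family = binomial)
}

predict_survival <- function(model, new_data) {
  predictions <- predict(model, new_data, type = "response")
  ifelse(predictions > 0.5, 1, 0)
}

# Assumption: R's built-in Titanic contingency table stands in for the real data.
counts <- as.data.frame(Titanic)

# Expand the counts into one row per passenger and drop the Freq column.
passengers <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]

cleaned <- clean_data(passengers)
model <- fit_glm_model(cleaned)
predicted <- predict_survival(model, cleaned)

# Cross-tabulate predicted classes (0/1) against the recorded outcomes.
print(table(predicted = predicted, actual = cleaned$Survived))
```

Predicting on the same rows the model was fitted on only shows that the pieces connect; it says nothing about how well the model would do on new passengers.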
2) To look at the visual side of this code, we just need to hit a button at the top of the editor, right below the save button. Doing that leaves us with the same document. The only difference is that it is now clear which parts actually run code: only the parts surrounded by "```". Everything else is just written as plain text, like it shows up here in this blog.
3) Compiling the document to a docx file leaves us with the following:
Final Project: Titanic Survival Prediction
Moeen Khan
2024-11-17
This document provides an overview of the main functions being developed for the Titanic survival prediction project. We aim to predict the likelihood of passenger survival using machine learning models, including GLM, Decision Tree, and KNN.
1. clean_data()
This function preprocesses the Titanic dataset by: - Handling missing values. - Encoding categorical variables. - Normalizing numerical features.
clean_data <- function(data) {
  data <- na.omit(data)
  data$Sex <- as.numeric(factor(data$Sex))
  data$Class <- as.numeric(factor(data$Class))
  return(data)
}
This function builds a generalized linear model (GLM) for predicting survival.
fit_glm_model <- function(data) {
  model <- glm(Survived ~ Class + Sex + Age, data = data, family = binomial)
  return(model)
}
This function generates survival predictions for new data using the trained GLM.
predict_survival <- function(model, new_data) {
  predictions <- predict(model, new_data, type = "response")
  return(ifelse(predictions > 0.5, 1, 0))
}
4) As shown above, the compiled docx file is just slightly more legible and is almost exactly like the visual version, with the main difference being the title layout of the document. If I were ever presenting the code formally to someone who was not going to work on it, I would give it to them in docx format. Otherwise I would use the visual version, since it is clear where to write the code.
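The docx can also be produced from the R console instead of the RStudio toolbar. The following is a minimal sketch, assuming the `rmarkdown` package and pandoc are installed: `rmarkdown::render()` with `output_format = "word_document"` compiles an `.Rmd` file to Word. The file name `titanic_functions.Rmd` and the tiny stand-in document are hypothetical and are written to a temporary folder only so the example runs on its own.

```r
# Hypothetical file name; replace with the path to your own .Rmd file.
rmd_path <- file.path(tempdir(), "titanic_functions.Rmd")

# Build the chunk fence from single backticks to keep this listing readable.
fence <- strrep("`", 3)

# Write a tiny stand-in document (not the real project file).
writeLines(c(
  "---",
  "title: \"Final Project: Titanic Survival Prediction\"",
  "output: html_document",
  "---",
  "",
  "## Introduction",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_path)

# Render to Word only if rmarkdown and pandoc are actually available.
docx_path <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  docx_path <- rmarkdown::render(
    rmd_path,
    output_format = "word_document",  # overrides the html_document in the YAML
    quiet = TRUE
  )
}
```

If the render step runs, `docx_path` holds the path of the generated Word file. Otherwise it stays `NULL`.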
5) Making the R Markdown file was pretty simple: there is an option for it in the New File menu of RStudio. I then just filled it out with some rough code on the same topic as my final project in another class. Since both are being done in RStudio, I figured that I could just make an R Markdown file about it. I want to note here, however, that the code displayed for this assignment is not the code being used for my final project in the other class. They are completely different.
6) If you want to download this code or check out my other work, the following is a link to my GitHub, where you will eventually be able to see how I predict the survival percentage of someone on board the Titanic.
GITHUB.