To analyze the percentage of "Strongly Agree" responses over time with 95% confidence intervals, you can use the tidyverse together with the binom package in R. The workflow is: filter your data, group it by date, calculate the proportion of "Strongly Agree" responses in each group, and compute a binomial confidence interval for each proportion. Finally, you can visualize the results with ggplot2, which is loaded as part of the tidyverse.
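As a rough sketch of what the interval step computes, assume the normal-approximation (Wald) form. This is only one option; the binom package used in the code also provides alternatives such as Wilson and exact intervals. For a single date with $n$ responses, of which $x$ are "Strongly Agree":

$$
\hat{p} = \frac{x}{n}, \qquad \hat{p} \pm 1.96 \sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n}}
$$

For example, with made-up counts $n = 12$ and $x = 3$, $\hat{p} = 0.25$ and the standard error is $\sqrt{0.25 \times 0.75 / 12} = 0.125$, so the interval is about $0.25 \pm 0.245$, or roughly 0.5% to 49.5%. Daily groups this small give very wide intervals, and the normal approximation is unreliable for them, which is one reason to prefer a Wilson or exact interval, or to aggregate by week instead of by day.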
Here's a step-by-step guide and example code:
```R
# Step 1: Install the required packages if they are missing, then load them
if (!requireNamespace("tidyverse", quietly = TRUE)) install.packages("tidyverse")
if (!requireNamespace("binom", quietly = TRUE)) install.packages("binom")
library(tidyverse)
library(binom) # For calculating binomial confidence intervals

# Step 2: Create reproducible sample data (skip this if you already have data)
set.seed(123)
dates <- seq(as.Date("2023-01-01"), as.Date("2023-03-31"), by = "day")
sample_data <- tibble(
  Date = sample(dates, 1000, replace = TRUE),
  Sentiment = sample(
    c("Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree", NA),
    1000, replace = TRUE, prob = c(0.15, 0.25, 0.30, 0.15, 0.10, 0.05)
  ),
  Value = rnorm(1000, mean = 10, sd = 2) # Numeric column for potential filtering
)
# Inject some invalid (negative) and missing values so the filter step has work to do
sample_data$Value[sample(1:1000, 50)] <- -1
sample_data$Value[sample(1:1000, 50)] <- NA
df <- sample_data # Replace sample_data with your own data frame here

# Step 3: Filter out rows that should not be counted
df_filtered <- df %>%
  drop_na(Sentiment) %>% # Removes rows where Sentiment is NA
  filter(Value >= 0 | is.na(Value)) # Keeps rows where Value is non-negative or NA
# filter() drops rows where the condition evaluates to NA, so to exclude
# NA values in 'Value' too, use filter(Value >= 0) on its own

# Step 4: Count responses per date and compute the percentage and its interval
sentiment_summary <- df_filtered %>%
  group_by(Date) %>%
  summarise(
    total_responses = n(),
    strongly_agree_count = sum(Sentiment == "Strongly Agree", na.rm = TRUE)
  ) %>%
  mutate(
    percentage_strongly_agree = (strongly_agree_count / total_responses) * 100
  ) %>%
  rowwise() %>% # Apply binom.confint row by row
  mutate(