Wouter van Atteveldt
Session 5: Sentiment Analysis and Machine Learning
Thursday: Introduction to R
Friday: Corpus Analysis & Topic Modeling
Saturday:
Sunday:
“The man who leaked cell-phone coverage of Saddam Hussein's execution was arrested”
library(slam)
reviews$npos = row_sums(dtm[, colnames(dtm) %in% pos_words])
tokens$sent[tokens$lemma %in% pos_words] = 1
Lexical Sentiment Analysis
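A minimal sketch of the dictionary-counting idea, using base R only. The toy matrix, the two word lists, and the score formula are made up for illustration and are not from the slides; `rowSums` stands in for `slam::row_sums`, which does the same job on a sparse DTM.

```r
# Toy document-term matrix: rows are documents, columns are terms (made-up data)
dtm <- matrix(c(2, 0, 1, 0,
                0, 1, 0, 3,
                1, 1, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("d1", "d2", "d3"),
                              c("good", "bad", "great", "awful")))

# Hypothetical word lists; a real analysis would load a published lexicon
pos_words <- c("good", "great")
neg_words <- c("bad", "awful")

# Count dictionary hits per document
# (drop = FALSE keeps a matrix even if only one column matches)
npos <- rowSums(dtm[, colnames(dtm) %in% pos_words, drop = FALSE])
nneg <- rowSums(dtm[, colnames(dtm) %in% neg_words, drop = FALSE])

# One possible score in [-1, 1]; documents without any hit get 0
total <- npos + nneg
sentiment <- ifelse(total > 0, (npos - nneg) / total, 0)

print(data.frame(npos, nneg, sentiment))
```

Normalising by the number of hits is only one option; dividing by document length is another common choice.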
(1) Create 'container' from DTM + coded classes
library(RTextTools)
c = create_container(dtm, classes,
    trainSize=train, testSize=test, virgin=FALSE)
(2) Train and test model
SVM <- train_model(c,"SVM")
SVM_CLASSIFY <- classify_model(c, SVM)
(3) Evaluate
analytics <- create_analytics(c, SVM_CLASSIFY)
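The evaluation step comes down to comparing predicted classes with the hand-coded ones. The base R sketch below computes the usual metrics by hand; the two label vectors are made up, and this is not the output format of `create_analytics`.

```r
# Made-up gold-standard codes and classifier output for eight test documents
actual    <- factor(c("pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"))
predicted <- factor(c("pos", "neg", "neg", "neg", "pos", "pos", "pos", "neg"),
                    levels = levels(actual))

# Confusion matrix: rows are predictions, columns are the coded classes
cm <- table(predicted, actual)

# Per-class precision, recall and F1, plus overall accuracy
precision <- diag(cm) / rowSums(cm)
recall    <- diag(cm) / colSums(cm)
f1        <- 2 * precision * recall / (precision + recall)
accuracy  <- sum(diag(cm)) / sum(cm)

print(cm)
print(round(rbind(precision, recall, f1), 2))
print(accuracy)
```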
is_coded = !is.na(classes)
c = create_container(dtm, classes,
    trainSize=is_coded, virgin=TRUE)
SVM <- train_model(c,"SVM")
SVM_CLASSIFY <- classify_model(c, SVM)
analytics <- create_analytics(c, SVM_CLASSIFY)
head(analytics@document_summary)
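A small base R sketch of how a partly coded label vector splits into coded and uncoded documents. The label vector and the names `train_idx` and `virgin_idx` are hypothetical. Published RTextTools examples typically pass integer row ranges (such as `1:75`) as `trainSize` and `testSize`, so explicit indices like these may be the safer input; that is an assumption, not something the slides state.

```r
# Made-up label vector: NA marks documents that have not been hand-coded
classes <- c("pos", NA, "neg", "pos", NA, NA, "neg")

is_coded <- !is.na(classes)

# Row positions of coded (training) and uncoded (to-be-classified) documents
train_idx  <- which(is_coded)
virgin_idx <- which(!is_coded)

print(train_idx)
print(virgin_idx)

# Sanity check: every document falls into exactly one of the two groups
stopifnot(length(train_idx) + length(virgin_idx) == length(classes))
```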
Purpura & Wilkerson 2007: Active Learning for Agenda Coding
Some online resources:
Text Classification for Sentiment Analysis
Break
Handouts: