Basic Text Analysis
Chris Bail, Duke University
SICSS, Day 3
Character Encoding
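As a rough illustration of why character encoding matters before any text analysis, the sketch below encodes a short string as UTF-8 and then decodes the same bytes with the wrong codec. The example word and the choice of Python are assumptions made for illustration; they do not come from the slides.

```python
# A minimal sketch, assuming UTF-8 input that is misread as Latin-1.
text = "caf\u00e9"  # "café": the last letter is a single non-ASCII character

# Encoding turns characters into bytes; the accented letter needs two bytes in UTF-8.
utf8_bytes = text.encode("utf-8")

# Decoding with the wrong codec does not fail here; it silently garbles the text.
garbled = utf8_bytes.decode("latin-1")

# Decoding with the codec that was used to encode recovers the original string.
recovered = utf8_bytes.decode("utf-8")

print(len(text), len(utf8_bytes))
print(garbled)
print(recovered == text)
```

Garbled accents of this kind in a corpus usually suggest that a file was read with a different encoding than the one it was written in.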
Tokenization
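One simple way to picture tokenization is splitting a string into word-like units and counting them. The sketch below uses a regular expression for this; the `tokenize` helper, its pattern, and the sample sentence are hypothetical choices for illustration, not the method used in the slides.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into word tokens, keeping simple contractions like "isn't" together."""
    # \w+ matches a run of letters, digits, or underscores;
    # the optional group keeps an apostrophe followed by more word characters.
    return re.findall(r"\w+(?:'\w+)?", text)


sentence = "Text analysis isn't magic, it's counting."
tokens = tokenize(sentence)
counts = Counter(token.lower() for token in tokens)

print(tokens)
print(counts.most_common(3))
```

Real tokenizers handle many more cases (hyphens, URLs, emoji, languages without spaces), so this pattern should be read as a starting point only.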
TEXT PRE-PROCESSING
Text Pre-processing: PUNCTUATION
Text Pre-processing: WORD-CASE
Text Pre-processing: NUMBERS
Text Pre-processing: STEMMING
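The pre-processing steps named above (punctuation, word case, numbers, stemming) can be sketched as a small pipeline. Everything here is an assumption made for illustration: the function names are hypothetical, the order of steps is one reasonable choice among several, and `crude_stem` is a deliberately naive suffix stripper, not the Porter stemmer or any other established algorithm.

```python
import re
import string


def remove_punctuation(text):
    """Delete ASCII punctuation characters."""
    return text.translate(str.maketrans("", "", string.punctuation))


def lowercase(text):
    """Fold all letters to lower case."""
    return text.lower()


def remove_numbers(text):
    """Delete runs of digits."""
    return re.sub(r"\d+", "", text)


def crude_stem(token):
    """Strip one common English suffix, leaving at least three characters.

    Illustration only: a real stemmer applies many more rules.
    """
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Apply the four steps in order and return a list of stemmed tokens."""
    text = remove_punctuation(text)
    text = lowercase(text)
    text = remove_numbers(text)
    return [crude_stem(token) for token in text.split()]


print(preprocess("In 2016, 3 voters voted; voting matters!"))
```

Each step discards information (for example, lowercasing merges "US" and "us"), so whether to apply it depends on the research question.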