Chris Bail, Duke University
SICSS, Day 2
A vast amount of data has already been compiled (e.g. Facebook, Twitter, The New York Times, Reuters, Google, Wikipedia).
Here is a crowd-sourced list of datasets I curate
-Pre-packaged data (e.g. Google Trends)
-Screen-scraping/Browser Automation/Crowd-Sourced Scraping
-Application Programming Interfaces (APIs)
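Screen scraping refers to a type of computer program that reads a web page, finds specific information on it, and saves that information in a structured format such as a spreadsheet.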
Once upon a time you could collect virtually any piece of information from the internet by screen scraping.
We are no longer in the “Wild, Wild West” of the internet.
Screen-scraping many sites now violates their terms of service and, in some cases, may be against the law.
Most sites have become very difficult to scrape because they are designed to prevent screen-scraping.
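One practical consequence is to check a site's stated crawling policy before scraping it. The sketch below uses Python's standard-library urllib.robotparser to read a robots.txt file and ask whether a given path may be fetched. The robots.txt content, the user-agent name, and the example.com URLs are all made up for illustration, and a robots.txt file is only a stated preference: it is not a substitute for reading a site's terms of service.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for illustration.
# In practice this text would be downloaded from <site>/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "my-scraper" is a placeholder user-agent name.
public_ok = is_allowed(ROBOTS_TXT, "my-scraper", "https://example.com/public/page.html")
private_ok = is_allowed(ROBOTS_TXT, "my-scraper", "https://example.com/private/data.html")
print(public_ok, private_ok)
```

With these made-up rules, the path under /private/ is refused and the other path is permitted.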
Please open this link
Or you can google “Wikipedia” and “World Health Organization Ranking of Health Systems”
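To make the idea of a screen scraper concrete, here is a minimal Python sketch that reads HTML, finds the cells of a table, and collects them into rows. It assumes the target page presents its data as an HTML table. The HTML string is a made-up stand-in for a downloaded page (the country names and ranks are placeholders, not real data), and the class and function names are hypothetical. Only the standard library is used.

```python
from html.parser import HTMLParser

# Made-up HTML standing in for a downloaded page; placeholder values only.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><th>Country</th><th>Rank</th></tr>
  <tr><td>Country A</td><td>1</td></tr>
  <tr><td>Country B</td><td>2</td></tr>
</table>
</body></html>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table rows found in html as a list of lists of strings."""
    scraper = TableScraper()
    scraper.feed(html)
    return scraper.rows


rows = scrape_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

A real page would first have to be downloaded, and its markup is usually messier than this sample, so a scraper like this typically needs adjusting for each site.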