More Scraping with Rvest

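For a quick-and-dirty solution, you can also copy the content you want to scrape and read your clipboard into R with the read.delim() function. This works best for tabular content, because read.delim() expects tab-separated text with a header row by default.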

Windows

clipboard <- read.delim("clipboard")

Mac

clipboard <- read.delim(pipe("pbpaste"))
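If you switch between machines, the two calls above can be wrapped in one helper. The following is only a minimal sketch: the function names clipboard_source() and read_clipboard() are hypothetical (they are not part of base R or rvest), and platforms other than Windows and macOS are deliberately left unhandled. The last lines run the same parsing step on a literal string, so you can see what read.delim() does with tab-separated text without touching the clipboard.

```r
# Hypothetical helper: choose the clipboard source for the current platform.
clipboard_source <- function() {
  sysname <- Sys.info()[["sysname"]]
  if (sysname == "Windows") {
    "clipboard"        # special file name that R understands on Windows
  } else if (sysname == "Darwin") {
    pipe("pbpaste")    # macOS command that prints the clipboard contents
  } else {
    stop("No clipboard source defined for this platform: ", sysname)
  }
}

# Hypothetical wrapper: extra arguments (e.g. header = FALSE) go to read.delim().
read_clipboard <- function(...) {
  read.delim(clipboard_source(), ...)
}

# The parsing step itself, shown on a literal string instead of the clipboard.
# This mimics a two-column table copied from a web page (tab-separated).
copied <- "name\tscore\nAda\t90\nLin\t85"
scores <- read.delim(text = copied)
print(scores)
```

If the copied content has no header row, passing header = FALSE keeps the first line as data instead of using it for column names.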

Try scraping a website you like

With just these few techniques, you may be surprised how much data you can scrape from the web. Wiki pages, news articles, and even social media posts are all fair game.

Learn more

If you wish to learn more about web scraping, we recommend learning about

  1. the general structure of HTML and CSS,
  2. XPath and CSS selectors,
  3. the rvest package in R, and
  4. web crawling and spiders to scrape a network of pages.
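To give a feel for the second item, here is a small sketch that selects the same elements once with a CSS selector and once with XPath. The HTML snippet, its class names, and the variable names are made up for illustration; only read_html(), html_elements(), and html_text2() come from rvest.

```r
library(rvest)

# Made-up page: two posts, each with a heading and a summary paragraph.
html <- '<html><body>
  <div class="post"><h2>First</h2><p class="summary">Alpha</p></div>
  <div class="post"><h2>Second</h2><p class="summary">Beta</p></div>
</body></html>'
page <- read_html(html)

# CSS selector: <p class="summary"> elements inside <div class="post">.
css_hits <- html_text2(html_elements(page, "div.post p.summary"))

# XPath expression that targets the same elements.
xpath_hits <- html_text2(
  html_elements(page, xpath = "//div[@class='post']/p[@class='summary']")
)

print(css_hits)
print(xpath_hits)
```

CSS selectors are usually shorter to write, while XPath can express conditions that CSS cannot, such as selecting an element based on its text.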

The core idea of web crawling is to follow links and scrape the content of the linked pages. This is a more advanced topic and is not covered in this course, but feel free to explore it on your own.
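The follow-the-links idea can be sketched in a few lines of rvest. This is a toy illustration under stated assumptions, not a production crawler: the "website" is a hypothetical in-memory list of three HTML strings so the code runs without a network connection, and the names site, fetch_page(), and crawl() are invented for this example. For live pages you would replace the body of fetch_page() with read_html(url).

```r
library(rvest)

# Hypothetical stand-in for a website: page URL -> HTML source.
site <- list(
  "https://example.org/" =
    '<html><body><h1>Home</h1><a href="/a">A</a> <a href="b">B</a></body></html>',
  "https://example.org/a" =
    '<html><body><h1>Page A</h1><a href="/">Home</a> <a href="/b">B</a></body></html>',
  "https://example.org/b" =
    '<html><body><h1>Page B</h1><a href="/a">A</a> <a href="https://elsewhere.example/x">Out</a></body></html>'
)

# Assumption: pages come from the list above; use read_html(url) for real sites.
fetch_page <- function(url) read_html(site[[url]])

# Breadth-first crawl: scrape each page's <h1>, then queue its unseen links.
crawl <- function(start_url, max_pages = 10) {
  queue   <- start_url
  visited <- character()
  titles  <- character()

  while (length(queue) > 0 && length(visited) < max_pages) {
    url   <- queue[1]
    queue <- queue[-1]
    if (url %in% visited) next

    page        <- fetch_page(url)
    visited     <- c(visited, url)
    titles[url] <- html_text2(html_element(page, "h1"))

    # Collect the links and resolve relative ones against the current URL.
    hrefs <- html_attr(html_elements(page, "a"), "href")
    links <- xml2::url_absolute(hrefs, url)

    # Stay on the known site and skip pages already seen or queued.
    same_site <- links[links %in% names(site)]
    queue     <- c(queue, setdiff(same_site, c(visited, queue)))
  }
  titles
}

result <- crawl("https://example.org/")
print(result)
```

The max_pages argument caps how many pages are visited, which keeps a crawl from running away when pages link to each other in cycles.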

These sources are a good starting point: