Another common place for data to reside is on the internet. When these data are in files, we can download them and then import them, or even read them directly from the web. For example, because our dslabs package is on GitHub, the file we downloaded with the package has a URL:
url <- "https://raw.githubusercontent.com/rafalab/dslabs/master/inst/extdata/murders.csv"
The read_csv function can read these files directly:
dat <- read_csv(url)
If you want to have a local copy of the file, you can use the download.file function:
download.file(url, "murders.csv")
This will download the file and save it on your system with the name murders.csv. You can use any name here, not necessarily murders.csv.
Note that when using download.file you should be careful, as it will overwrite existing files without warning.
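One way to guard against accidental overwrites is to check whether the destination file already exists before downloading. The following is a minimal sketch, not a function from base R or dslabs: the name safe_download and its overwrite argument are hypothetical and introduced here only for illustration.

```r
# Hypothetical helper: only call download.file when the destination
# does not exist yet, unless the caller explicitly asks to overwrite.
safe_download <- function(url, destfile, overwrite = FALSE) {
  if (file.exists(destfile) && !overwrite) {
    message("Skipping download: ", destfile, " already exists.")
    return(invisible(FALSE))  # nothing was downloaded
  }
  download.file(url, destfile)
  invisible(TRUE)             # a download was attempted
}
```

With this helper, safe_download(url, "murders.csv") would leave an existing murders.csv untouched, while safe_download(url, "murders.csv", overwrite = TRUE) would replace it.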
Two functions that are sometimes useful when downloading data from the internet are tempdir and tempfile. The first returns the path of the temporary directory for the current R session, which has a random name that is very likely to be unique. Similarly, tempfile creates a character string, not a file, that is likely to be a unique filename. So you can run commands like the following, which download the data to a temporary file, import it, and then erase the temporary file:
tmp_filename <- tempfile()
download.file(url, tmp_filename)
dat <- read_csv(tmp_filename)
file.remove(tmp_filename)
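If the import step fails, the file.remove line above is never reached and the temporary file is left behind. A possible way to make the cleanup more robust is to wrap the steps in a function and register the removal with on.exit. This is only a sketch under stated assumptions: read_remote_csv and its read_fun argument are hypothetical names, and base R's read.csv is used as the default reader so that the example runs without extra packages.

```r
# tempfile() only generates a name; no file exists until something writes to it.
tmp_filename <- tempfile()
file.exists(tmp_filename)  # FALSE

# Hypothetical helper: download to a temporary file, import it, and always
# clean up, even if the download or the import throws an error.
read_remote_csv <- function(url, read_fun = utils::read.csv) {
  tmp_filename <- tempfile(fileext = ".csv")
  # unlink() does not warn if the file was never created,
  # whereas file.remove() would.
  on.exit(unlink(tmp_filename), add = TRUE)
  download.file(url, tmp_filename, quiet = TRUE)
  read_fun(tmp_filename)
}
```

To keep using readr for the import, the reader can be passed in explicitly, for example read_remote_csv(url, read_fun = readr::read_csv).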