Converting file formats using the rio package
As we saw in the previous recipe, rio is an R package developed by Thomas J. Leeper that makes importing and exporting data really easy. You can refer to the previous recipe for more on its core functionalities and logic.
Besides the import() and export() functions, rio also offers a well-conceived and straightforward file conversion facility through the convert() function, which we are going to leverage in this recipe.
Getting ready
First of all, we need to install the rio package and make it available by running the following code:
install.packages("rio")
library(rio)
In the following example, we are going to convert the world_gdp_data dataset, which is stored in a local .csv file. This dataset is provided within the RStudio project related to this book, in the data folder.
You can download it by authenticating your account at http://packtpub.com.
How to do it...
- The first step is to convert the file from the .csv format to the .json format:
convert("world_gdp_data.csv", "world_gdp_data.json")
This will create a new file without removing the original one.
- The next step is to remove the original file:
file.remove("world_gdp_data.csv")
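Since file.remove() is not reversible, it can be worth checking the converted file before deleting the original. The following is only a minimal sketch of that idea, not part of the recipe itself: it works in a temporary directory, uses a small made-up data frame (toy_gdp, a hypothetical stand-in for the book's dataset), and assumes the jsonlite package is installed, since rio relies on it for JSON files.

```r
library(rio)

# Work in a temporary directory so the sketch does not touch real project files
work_dir <- tempfile("rio_convert_")
dir.create(work_dir)
csv_path  <- file.path(work_dir, "world_gdp_data.csv")
json_path <- file.path(work_dir, "world_gdp_data.json")

# Hypothetical stand-in for the book's dataset (made-up values)
toy_gdp <- data.frame(country = c("A", "B", "C"), gdp = c(1.5, 2.5, 3.5))
export(toy_gdp, csv_path)

# Convert the .csv file to .json; the original file is left in place
convert(csv_path, json_path)

# Re-import both files and compare their dimensions before deleting anything
original   <- import(csv_path)
converted  <- import(json_path)
same_shape <- identical(dim(original), dim(converted))

# Remove the original only if the new file exists and has the same shape
original_removed <- FALSE
if (file.exists(json_path) && same_shape) {
  original_removed <- file.remove(csv_path)
}
```

Comparing dimensions is a deliberately light check; depending on the data, you may prefer to compare column names or values as well.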
There's more...
As documented in the rio vignette (which you can find at https://cran.r-project.org/web/packages/rio/vignettes/rio.html), the following formats are supported for import and export:
Format | Import | Export |
---|---|---|
Tab-separated data | Yes | Yes |
Comma-separated data | Yes | Yes |
CSVY (CSV + YAML metadata header) | Yes | Yes |
Pipe-separated data | Yes | Yes |
Fixed-width format data | Yes | Yes |
Serialized R objects | Yes | Yes |
Saved R objects | Yes | Yes |
JSON | Yes | Yes |
YAML | Yes | Yes |
Stata | Yes | Yes |
SPSS and SPSS portable | Yes | Yes |
XBASE database files | Yes | Yes |
Excel | Yes | |
Excel | Yes | Yes |
Weka Attribute-Relation File Format | Yes | Yes |
R syntax | Yes | Yes |
Shallow XML documents | Yes | Yes |
SAS | Yes | |
SAS XPORT | Yes | |
Minitab | Yes | |
Epiinfo | Yes | |
Systat | Yes | |
Data Interchange Format | Yes | |
OpenDocument Spreadsheet | Yes | |
Fortran data (no recognized extension) | Yes | |
Google Sheets | Yes | |
Clipboard | | |
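Because convert() infers both the input and the output format from the file extensions, one source file can be converted into several of the formats that support export simply by changing the target name. The following is a minimal sketch under stated assumptions: toy_data is a made-up data frame, and the .tsv and .rds extensions are our own choice of targets, picked because they should not require any additional packages.

```r
library(rio)

# Temporary working directory for the illustration
work_dir <- tempfile("rio_formats_")
dir.create(work_dir)

# Hypothetical source file built from made-up values
source_path <- file.path(work_dir, "toy_data.csv")
toy_data <- data.frame(id = 1:3, value = c(10.5, 20.5, 30.5))
export(toy_data, source_path)

# Target extensions chosen for this sketch; both support import and export
target_exts  <- c("tsv", "rds")
target_paths <- file.path(work_dir, paste0("toy_data.", target_exts))

# convert() picks the output format from each target file's extension
for (target in target_paths) {
  convert(source_path, target)
}

# Check which target files were actually written
created <- file.exists(target_paths)
names(created) <- target_exts
print(created)
```

Formats that rely on optional packages may prompt you to install them first, so it is safer to try a new target format on a small file before converting a large one.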
Since rio is still a growing package, I strongly suggest that you follow its development on its GitHub repository at https://github.com/leeper/rio, where you will easily find out when new formats are added.