diff --git a/episodes/06-saving.md b/episodes/06-saving.md
index 2617fa9d..9dc934b1 100644
--- a/episodes/06-saving.md
+++ b/episodes/06-saving.md
@@ -6,54 +6,104 @@ exercises: 5
 
 ::::::::::::::::::::::::::::::::::::::: objectives
 
-- Save an OpenRefine project.
 - Export cleaned data from an OpenRefine project.
+- Save an OpenRefine project as a shareable file.
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
 :::::::::::::::::::::::::::::::::::::::: questions
 
-- How can we save and export our cleaned data from OpenRefine?
+- How can we get our cleaned data out of OpenRefine?
+- How can we save the whole project with all history as a file?
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
-## Saving and Exporting a Project
+## Exporting Cleaned Data
+
+When you have completed the cleaning steps, you probably want to save the cleaned
+dataset as a new file, so that you can further analyse the data using other
+applications.
+OpenRefine allows you to do so by *exporting* the data in various file formats.
+
+1. Click `Export` in the top right and select the file type you want to export
+   the data in. `Tab-separated values` (`tsv`) or `Comma-separated values`
+   (`csv`) would be good choices.
+2. OpenRefine creates a file whose name is based on the project name and asks
+   the browser to download it.
+   Depending on your browser settings, this file is automatically saved in the
+   default location for downloaded files, or you see a dialog window to choose
+   where you want to save the file.
+
+The downloaded file can then be opened in a spreadsheet program or imported into
+programs written in R or Python, for example.
+
+Remember from our lesson on Spreadsheets that using widely-supported,
+non-proprietary file formats like `tsv` or `csv` improves the ability of
+yourself and others to use your data.
 
-In OpenRefine you can save or export the project. This means you're saving the
+::::::::::::::::::::::::::: callout
+
+### Only matching rows are exported
+
+OpenRefine only operates on rows that match all enabled filters.
+This is also true for exporting data.
+So if you want to export a selection from a larger dataset, you can use filters
+and facets to select what data you want to export.
+
+However, if you want to export all data and forget to reset all facets and filters,
+the exported dataset may appear to be incomplete.
+OpenRefine does not provide a warning about enabled filters when you export data.
+
+:::::::::::::::::::::::::::::::::::
+
+
+## Saving a Project as a File
+
+Besides exporting the data, you can export the project as well.
+When you export the project, OpenRefine creates a single file that includes the
 data and all the information about the cleaning and data transformation steps
-you've done. Once you've saved a project, you can open it up again and be just
-where you stopped before.
+that you have taken.
+
+You can use this file as a project backup, transfer it to another computer to
+continue working on the data or share it with a collaborator who can open it
+to see what you did and continue the work.
 
-### Saving
+::::::::::::::::::::::::::: callout
+
+### Saving happens automatically
+
+By default OpenRefine is saving your project continuously while you work on it.
+If you close OpenRefine and open it up again, you can see a list of your
+projects when you select "Open Project" on the start screen.
+You can open an existing project by clicking on its title.
+
+:::::::::::::::::::::::::::::::::::
 
-By default OpenRefine is saving your project continuously. If you close
-OpenRefine and open it up again, you'll see a list of your projects. You can
-click on any one of them to open it up again.
 
 ::::::::::::::::::::::::: challenge
 
-### Exporting the project
+### Exporting and examining the project
 
-You can also export a project. This is helpful, for instance, if you wanted to
-send your raw data and cleaning steps to a collaborator, or share this
-information as a supplement to a publication.
+In this exercise, we will export the project and examine the contents of the
+exported file.
 
 1. Click the `Export` button in the top right and select `OpenRefine project archive to file`.
-2. A `tar.gz` file will download to your default `Download` directory.
-   Depending on your browser you may have to confirm that you want to save the
-   file. The `tar.gz` extension tells you that this is a compressed file. The
-   downloaded `tar.gz` file is actually a folder of files which have been
-   compressed. Linux and Mac machines will have software installed to
-   automatically expand this type of file when you double-click on it. For
-   Windows based machines you may have to install a utility like '7-zip' in
-   order to expand the file and see the files in the folder.
-3. After you have expanded the file look at the files that appear in this
-   folder. What files are here? What information do you think these files
-   contain?
+2. OpenRefine then presents a `tar.gz` file for download.
+   Depending on your browser you may have to specify where you want to save the
+   file, or it may be downloaded to your default directory for downloaded files.
+   The `tar.gz` extension tells you that this is a compressed file. The
+   downloaded `tar.gz` file is actually a folder of files which have been
+   compressed. Linux and Mac machines will have software installed to
+   automatically expand this type of file when you double-click on it. For
+   Windows based machines you may have to install a utility like '7-zip' in
+   order to expand the file and see the files in the folder.
+3. After you have expanded the file, look at the files that appear in this
+   folder. What files are here? What information do you think these files
+   contain?
 
 ::::::::::::::: solution
 
-## Solution
+### Solution
 
 You should see:
 
@@ -69,33 +119,18 @@ You should see:
 
 :::::::::::::::::::::::::::::::::::
 
-You can import an existing project into OpenRefine by clicking `Open...` in the
-upper right > `Import Project` and selecting the `tar.gz` project file. This
-project will include all of the raw data and cleaning steps that were part of
-the original project.
-
-## Exporting Cleaned Data
-
-You can also export just your cleaned data, rather than the entire project.
-
-1. Click `Export` in the top right and select the file type you want to export
-   the data in. `Tab-separated values` (`tsv`) or `Comma-separated values`
-   (`csv`) would be good choices.
-2. That file will be exported to your default `Download` directory. That file
-   can then be opened in a spreadsheet program or imported into programs like R
-   or Python, which we'll be discussing later in our workshop.
-
-Remember from our lesson on Spreadsheets that using widely-supported,
-non-proprietary file formats like `tsv` or `csv` improves the ability of
-yourself and others to use your data.
+### Importing a Project
 
+You can import an existing project into OpenRefine by clicking `Open...` in the
+upper right, then opening the `Import Project` tab and selecting the `tar.gz`
+project file.
 
 
 :::::::::::::::::::::::::::::::::::::::: keypoints
 
-- Cleaned data or entire projects can be exported from OpenRefine.
-- Projects can be shared with collaborators, enabling them to see, reproduce and check all data cleaning steps you performed.
+- Cleaned data, or selected data, can be exported from OpenRefine
+  for use in other applications.
+- Projects can be exported to files that contain the original data
+  and all data cleaning steps you performed.
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
-
-
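
A note alongside this change: the exercise asks learners to expand the exported `tar.gz` file and look at what it contains. For readers without an archive utility, the same inspection could be sketched in Python using the standard-library `tarfile` module. This is only an illustration and not part of the lesson. The function name `list_archive_members` and the file name `my-project.openrefine.tar.gz` are hypothetical; substitute the name of the file your browser actually downloaded.

```python
import tarfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return the names of all entries in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return archive.getnames()


if __name__ == "__main__":
    # Hypothetical file name; replace it with the archive you downloaded.
    exported = Path("my-project.openrefine.tar.gz")
    if exported.exists():
        for name in list_archive_members(exported):
            print(name)
    else:
        print(f"{exported} not found; export a project archive first.")
```

Listing the entries does not unpack or modify the archive, so it is a safe way to see what the export contains before deciding whether to expand it.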