Data Publishing Guide for Organizations

1. Overview
2. Playground
3. Getting Started
4. Managing Groups
5. Managing Datasets
6. Advanced Topics

1. Overview

This document is a publishing guide for organizations that want to publish their data on the OpenColorado data catalog. It outlines the steps required to get started and explains how to complete common administrative tasks.

About the OpenColorado Data Catalog

OpenColorado provides a data sharing platform that allows Colorado government organizations to make public data available and accessible to all Colorado constituents.

The OpenColorado catalog is powered by CKAN, an open-source data portal platform that makes it easy to publish, share and find open datasets. The CKAN platform powers large data catalogs around the world including data.gov.uk.

Terminology

Resource A single file or endpoint linked to a URL
Dataset (aka package) A collection of resources. A dataset is usually divided into multiple resources to provide the same data in multiple formats (ex. Shapefile, KML and CSV) or to provide separate files for different subsets (for example, monthly summaries).
Group A collection of datasets. Groups are typically used to control access as only the owners of a group can add or edit datasets in a group.
Tag A keyword attached to a dataset used to help describe the dataset and to assist users in searching for relevant content.
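
The relationships among these terms can be sketched as a small data model. The Python below is illustrative only: the class names, field names and example.org URLs are assumptions made for this guide and are not CKAN's internal schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """A single file or endpoint linked to a URL."""
    url: str
    format: str  # for example "CSV", "KML" or "Shapefile"


@dataclass
class Dataset:
    """A collection of resources, described by tags."""
    name: str   # short identifier used in URLs
    title: str
    resources: List[Resource] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


@dataclass
class Group:
    """A collection of datasets managed by the group's owners."""
    name: str
    title: str
    datasets: List[Dataset] = field(default_factory=list)


# One dataset offered in three formats (the URLs are placeholders).
parks = Dataset(
    name="city-and-county-of-denver-parks",
    title="City and County of Denver: Parks",
    resources=[
        Resource("http://example.org/parks.zip", "Shapefile"),
        Resource("http://example.org/parks.kml", "KML"),
        Resource("http://example.org/parks.csv", "CSV"),
    ],
    tags=["parks", "recreation"],
)
denver = Group(name="denver", title="City and County of Denver", datasets=[parks])
```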

Search

To search the catalog, use the search form available on the home page or on the package search page. The search system performs a full-text search of most of the dataset fields, including name, title, notes and tags. The search returns matching datasets along with a list of matching tags.

For more advanced search capabilities please refer to http://wiki.ckan.org/Searching_Packages2.
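
To make the search behaviour described above concrete, here is a toy sketch in Python. It is not how CKAN implements search: simple case-insensitive substring matching stands in for real full-text search, and the sample catalog entries are invented for illustration.

```python
def search(datasets, query):
    """Return (names of matching datasets, matching tags) for a query."""
    q = query.lower()
    matched = []
    matched_tags = set()
    for ds in datasets:
        # Search across name, title, notes and tags, as the guide describes.
        text = " ".join([ds["name"], ds["title"], ds["notes"]] + ds["tags"]).lower()
        if q in text:
            matched.append(ds["name"])
        matched_tags.update(t for t in ds["tags"] if q in t.lower())
    return matched, sorted(matched_tags)


# Invented sample data.
catalog = [
    {"name": "denver-parks", "title": "City and County of Denver: Parks",
     "notes": "Park boundaries.", "tags": ["parks", "recreation"]},
    {"name": "denver-trails", "title": "City and County of Denver: Trails",
     "notes": "Trails inside city parks.", "tags": ["trails"]},
    {"name": "denver-budget", "title": "City and County of Denver: Budget",
     "notes": "Annual budget.", "tags": ["finance"]},
]

datasets, tags = search(catalog, "parks")
```

In this sketch the query "parks" matches the first dataset by name, title and tag and the second by its notes, and the only matching tag is "parks".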

2. Playground

The production OpenColorado data catalog is located at http://data.opencolorado.org.  A test catalog is provided at http://test.opencolorado.org (also accessible at http://54.188.89.6) that you can use as a playground for getting familiar with the platform.  We highly recommend that you work out the details of your publishing process in the playground before setting up your catalog group and datasets in production.

Please note that we will occasionally refresh the test catalog from production backups.  This resets all settings (datasets, permissions, accounts, etc.) to the state of the production database.  If you need us to refresh the test environment for a specific purpose please contact us.
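
If you script any part of your publishing process, one way to follow the playground-first recommendation is to make the test catalog the default target. The snippet below is a minimal sketch; the names CATALOG_URLS and catalog_url are hypothetical.

```python
# Base URLs taken from this guide.
CATALOG_URLS = {
    "test": "http://test.opencolorado.org",
    "production": "http://data.opencolorado.org",
}


def catalog_url(environment="test"):
    """Return the base URL for an environment, defaulting to the playground."""
    try:
        return CATALOG_URLS[environment]
    except KeyError:
        raise ValueError("unknown environment: %r" % environment)
```

With this arrangement a script only touches production when "production" is requested explicitly.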

3. Getting Started

Publishing data on OpenColorado is designed to be very simple and only two steps are required for an organization to get started:
(1) Register a user account and
(2) Create a catalog group.

Register with OpenColorado

To register a new user account click the “Login” link from any page in the catalog and select the “Register Account” tab or visit http://data.opencolorado.org/user/register. If you are at the OpenColorado home page click the “Data” link to access the catalog.

This account will be used as your primary administrative account in the catalog and it will become the owner of the catalog group for your organization. We recommend using a generic organization name for this initial account. You can keep it as a master account for your organization and create additional accounts for individual publishers. See Managing Groups for more information.

Create a Group

OpenColorado uses groups to allow organizations to manage their own content within the larger catalog.  Groups are used to manage permissions and to organize data within the catalog.

To see the existing groups in the catalog click on Groups (or visit http://data.opencolorado.org/group/).  Each group has a title (ex. City and County of Denver) and short name that is used in URLs (ex. denver).

Before creating the group for your organization, please review the titles and URLs of the existing groups.

To create a new group click the “Add a Group” tab.

Fields

Title The full name for the group, typically the full name of your organization.
ex: City and County of Denver
URL A unique identifier for the group that is used in URLs and in the CKAN API.
Important: Use lowercase characters without punctuation and use hyphens (-) to represent spaces if needed.
ex: denver
Description A short description of your organization.
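
The URL rule above can be checked before you submit the form. The following Python is a sketch under stated assumptions: the function names are hypothetical, digits are assumed to be acceptable, and the catalog itself may apply additional rules.

```python
import re


def suggest_identifier(title):
    """Lowercase the title, drop punctuation, and join words with hyphens."""
    cleaned = re.sub(r"[^a-z0-9\s-]", "", title.lower())
    return re.sub(r"[\s-]+", "-", cleaned).strip("-")


def is_valid_identifier(value):
    """True if value uses only lowercase letters, digits and single hyphens."""
    return re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", value) is not None
```

For example, suggest_identifier("Boulder County, Colorado") gives "boulder-county-colorado", while is_valid_identifier("City of Denver") is False because of the capital letters and spaces. A shorter hand-picked identifier such as denver is usually preferable to the full suggestion.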

4. Managing Groups

Update a Group

To update a group click on “Groups” (or visit http://data.opencolorado.org/group), select the group to update, and then click the “Edit” tab. Once you have completed updating the group click the “Save” button at the bottom of the page.

Permissions

To update group permissions click on “Groups” (or visit http://data.opencolorado.org/group), select the group to update, and then click the “Authorization” tab.

Roles

reader Can view the group.
editor Can edit the group.
admin Has full administrative control of the group.
anon_editor Anonymous users can read and edit the group.

User Roles

A user role is a combination of a user account and their role. By default, all anonymous visitors and logged in users can read the contents of the group and only the owner can administer the group.
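
The roles and default user roles described above can be modelled as follows. This is an illustrative sketch, not CKAN's authorization code: the action names, the "visitor" and "logged_in" placeholder users, and the assumption that an editor can also read are ours.

```python
# Actions permitted by each role, following the role descriptions above.
ROLE_ACTIONS = {
    "reader": {"read"},
    "editor": {"read", "edit"},  # assumes editing implies reading
    "admin": {"read", "edit", "administer"},
    "anon_editor": {"read", "edit"},
}


def default_user_roles(owner):
    """Default (user, role) pairs: everyone can read, only the owner administers."""
    return [("visitor", "reader"), ("logged_in", "reader"), (owner, "admin")]


def can(user, action, user_roles):
    """True if any of the user's roles permits the action."""
    return any(u == user and action in ROLE_ACTIONS[r] for u, r in user_roles)


roles = default_user_roles("denver-admin")
```

Here can("visitor", "read", roles) is True and can("visitor", "edit", roles) is False, which matches the default behaviour described above.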

Update an Existing User Role

To update an existing role, select the appropriate roles for a user and then click “Save” to save the changes.

Important: Be careful not to remove your own admin user role as this is required to administer the group!  If you accidentally do this and no longer have access to your group please contact us at [email protected].

Create a New User Role

To create a new user role, select a “User” and a “Role” and click the “Add Role” button at the bottom of the page.

Important: Additional user accounts must be registered before you can assign them a user role.

5. Managing Datasets

Add a Dataset

To add a dataset to the catalog click on “Add a dataset” (or visit http://data.opencolorado.org/dataset/new).

Title A short descriptive name for the dataset.
Important: Prefix the dataset title with your organization name, as the catalog contains packages for multiple organizations.
ex: City and County of Denver: Parks
The convention used on OpenColorado is the full organization name, followed by a colon and a space (: ), followed by the actual dataset name. Be sure to use the same title prefix for all datasets in the group. If you are using a front-end (see Catalog Front-Ends) to publish your catalog group on your own website, this prefix can be hidden using the front-end configuration settings.
URL A unique identifier for the dataset that is used in the URL. A suggestion is automatically generated from the title. If the title is entered as noted above, the generated URL should be sufficient.
ex: city-and-county-of-denver-parks
License Select the license under which the data is released. OpenColorado recommends either of the following licenses:
OKD Compliant::Creative Commons CCZero
OKD Compliant::Creative Commons Attribution
Groups Select the group for your organization to add this dataset to your group.
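
If you prepare many datasets, the title convention and the resulting URL can be generated consistently with a few helper functions. The sketch below uses hypothetical function names, and suggested_url only approximates the suggestion made by the web form; the form's exact rules may differ.

```python
import re


def dataset_title(organization, dataset_name):
    """Join the organization name and dataset name with a colon and a space."""
    return "%s: %s" % (organization.strip(), dataset_name.strip())


def suggested_url(title):
    """Approximate the URL suggestion: lowercase, no punctuation, hyphens."""
    cleaned = re.sub(r"[^a-z0-9\s-]", "", title.lower())
    return re.sub(r"[\s-]+", "-", cleaned).strip("-")


def strip_prefix(title, organization):
    """Hide the organization prefix, as a front-end might when listing titles."""
    prefix = organization + ": "
    return title[len(prefix):] if title.startswith(prefix) else title


title = dataset_title("City and County of Denver", "Parks")
url = suggested_url(title)
```

With the example values above, title is "City and County of Denver: Parks" and url is "city-and-county-of-denver-parks".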

Add Resources

To add resources to the dataset add them in the resources section. Once you add the first resource, you can add additional resources by clicking “New resource…”.

“Upload a file” is an option that allows you to upload files to the OpenColorado data catalog, instead of maintaining the infrastructure to host them yourself.

You can also link to a dataset hosted on your own website. Linking to a dataset still helps build the OpenColorado central data catalog across jurisdictions.

Important: Be sure to click the “Add Dataset” button at the bottom of the page when you are finished as changes are not saved until this button is clicked.

Update a Dataset

To update a dataset, find the dataset in the catalog (by searching or by browsing the datasets) and open it.

To edit the general dataset information click the “Settings” tab.  To edit dataset resources click the “Resources” tab.

Be sure to click the “Save Changes” button at the bottom of the page to save your edits.

Remove a Dataset

To remove a dataset, select the “Settings” tab and then select the “Delete” tab on the left. To confirm the removal, click the “Yes!” button and then select the “deleted” status in the dropdown menu.

Note: Marking a dataset as deleted does not physically remove it from the catalog, so you will not be able to use the dataset URL for another dataset until the dataset is completely removed from the system. If you need a dataset to be permanently removed, please contact us at [email protected].
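
The difference between marking a dataset as deleted and removing it permanently can be illustrated with a toy model. The Catalog class below is hypothetical and only mimics the behaviour described in the note; it is not CKAN code.

```python
class Catalog:
    """Toy catalog in which deleting a dataset only changes its status."""

    def __init__(self):
        # Maps each dataset URL identifier to "active" or "deleted".
        self._status_by_url = {}

    def add(self, url_name):
        # A URL stays reserved even when its dataset is marked as deleted.
        if url_name in self._status_by_url:
            raise ValueError("URL already in use: %s" % url_name)
        self._status_by_url[url_name] = "active"

    def mark_deleted(self, url_name):
        self._status_by_url[url_name] = "deleted"

    def purge(self, url_name):
        """Permanent removal, which in practice means contacting OpenColorado."""
        del self._status_by_url[url_name]

    def visible(self):
        return sorted(n for n, s in self._status_by_url.items() if s == "active")
```

In this model, calling add() again with the URL of a dataset that was only marked as deleted raises an error, and the URL becomes available again only after purge().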

6. Advanced Topics

Catalog Front-Ends

OpenColorado provides two open source front-ends that can be used to publish your content (your catalog group) from OpenColorado on your organization’s web site.

Front-ends are available in PHP and .NET. The front-ends can be skinned and customized to provide a tailored interface for your users while leveraging the power of a shared state-wide catalog.

For more information visit https://github.com/opencolorado.

Catalog API

This document provides instructions for maintaining your content through the OpenColorado/CKAN web interface.

The CKAN platform additionally provides an application programming interface that can be used to access and manage your content in a machine-automated manner (scripting etc.).

For more information about the CKAN API visit http://wiki.ckan.org/API#API_Reference_Documentation.
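
As a rough illustration of scripted access, the Python below builds (but does not send) a request for a dataset's metadata. It assumes the catalog exposes CKAN's action API under /api/3/action/ and accepts an API key in the Authorization header; whether that holds depends on the CKAN version deployed, so check the API documentation linked above first. The helper name build_action_request is hypothetical.

```python
import json
import urllib.request


def build_action_request(base_url, action, payload, api_key=None):
    """Build a POST request for a CKAN action without sending it."""
    # Assumed path layout: <base>/api/3/action/<action name>.
    url = "%s/api/3/action/%s" % (base_url.rstrip("/"), action)
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Needed only for actions that require a logged-in user.
        headers["Authorization"] = api_key
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Target the test catalog while experimenting.
request = build_action_request(
    "http://test.opencolorado.org",
    "package_show",
    {"id": "city-and-county-of-denver-parks"},
)
```

To send the request, pass it to urllib.request.urlopen() and decode the JSON response. We suggest trying this against the test catalog before pointing any script at production.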

GIS Data Publishing

OpenColorado provides a Python publishing script that can be used to automatically publish GIS datasets from ESRI geodatabases (file geodatabases or ArcSDE). For more information, see https://github.com/opencolorado/OpenColorado-Tools-and-Utilities/tree/master/Scripts/ArcGIS/10.0.