Data


What is data?

Please check out the Toolkit section on "What is open data" for information about different data types, what to prioritise, and how to publish unit record data in an appropriate way for public consumption.

Clean data

Make sure at least one of the data files you upload to your dataset is as clean as possible. This will make it easier for others to reuse. As outlined above, the best option for this is a CSV file with as few mistakes or bad data entries as possible. For instance, make sure all the dates are in a common format, that there are no missing entries, and that no non-text data is embedded. For tabular data you just want a spreadsheet with a single header row followed by rows of data. KML data has its own format that must be followed.

CKAN specifically understands tabular and spatial data types and generates API access to such datasets if the data files are clean. Clean means the data is raw and appropriately structured. More information on clean data can be found at http://www.clean-sheet.org/

For example:

This is an example of raw clean tabular data:
Date        Age  Gender  Postcode
20/10/2013  12   M       2580
10/01/2013  15   F       1462
02/11/2011  22   M       3652
12/05/2012  45   F       1464
19/01/2010  75   F       1800

This is not raw clean data:
Copyright of Dept. X

Date         Age      Gender  Postcode
01/20/2013   Fifteen  M       Barton
10th Dec 11  15       Fem
02/11/2011   xx       Male    3652
12/05/2012   45       F       1464

Important Note
Due to the nature of our database, if you would like to allow API access to your dataset you will need to keep your column headings under 63 characters.
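
Some of the checks described in this section can be automated before you upload. The following is a minimal sketch only, not the validation data.gov.au itself performs: the function name find_problems, the "Date" column name and the DD/MM/YYYY date format are assumptions chosen to match the example tables. It checks heading length, the number of fields per row, missing entries and date format, and nothing else (for instance, it will not notice "Fifteen" in an Age column).

```python
import csv
import io
from datetime import datetime

MAX_HEADER_LENGTH = 63  # column headings must stay under this many characters


def find_problems(csv_text, date_column="Date", date_format="%d/%m/%Y"):
    """Return a list of human-readable problems found in the CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header, data = rows[0], rows[1:]

    # Headings of 63 characters or more would break API access.
    for name in header:
        if len(name) >= MAX_HEADER_LENGTH:
            problems.append(f"heading too long: {name[:20]}...")

    date_index = header.index(date_column) if date_column in header else None

    # Data starts on line 2 because line 1 is the single header row.
    for line_number, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_number}: expected {len(header)} fields, found {len(row)}"
            )
            continue
        if any(cell.strip() == "" for cell in row):
            problems.append(f"line {line_number}: missing entry")
        if date_index is not None:
            try:
                datetime.strptime(row[date_index].strip(), date_format)
            except ValueError:
                problems.append(f"line {line_number}: date not in DD/MM/YYYY format")
    return problems


clean = "Date,Age,Gender,Postcode\n20/10/2013,12,M,2580\n10/01/2013,15,F,1462\n"
messy = "Date,Age,Gender,Postcode\n01/20/2013,Fifteen,M,Barton\n10th Dec 11,15,Fem,\n"

print(find_problems(clean))
for problem in find_problems(messy):
    print(problem)
```

An empty list means none of these particular checks failed; it does not guarantee the file is clean in every respect.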


Removing Author Metadata
Some filetypes may contain additional author metadata. You can remove this metadata using the Windows interface.

Spatial data

At the end of 2013 we expanded the geospatial functionality of data.gov.au. These changes established additional and improved support specifically for spatial data services. This means data.gov.au can now present a spatial API endpoint for certain types of spatial files. An example of the new functionality can be seen in the Geelong Roofprints dataset.

When any of the supported filetypes (outlined below) are uploaded through the CKAN platform they will automatically be added to the GeoServer. There should be no need for user intervention, but it can take up to 24 hours for the file to be ingested. At the moment data is ingested daily. We are looking at ways to make ingestion more immediate.

Spatial data on data.gov.au is also viewable through the National Map. To access the visualisation, click the Data button on the left, click Data Providers and then click data.gov.au. It may take up to 24 hours for newly added datasets to become available on the National Map.

Supported Filetypes

Currently supported vector filetypes are KML, KMZ and SHP. Currently supported raster filetypes are GeoTIFF and ESRI ArcGRID (ASCII and binary).

For the moment, other formats can be converted to a supported format using the Feature Manipulation Engine (FME) or the Geospatial Data Abstraction Library (GDAL). There is currently no limit on the size of an uploaded file, but whenever possible please let the data.gov.au team know if you need to upload a large file.
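
As an illustration of the GDAL route, the sketch below wraps GDAL's ogr2ogr command-line tool to convert a vector file to KML. It assumes GDAL is installed and ogr2ogr is on your PATH; the function names and the file names in the usage note are hypothetical.

```python
import shutil
import subprocess


def build_convert_command(source_path, target_path, driver="KML"):
    """Build the argument list: ogr2ogr -f <driver> <target> <source>."""
    # ogr2ogr takes the destination before the source.
    return ["ogr2ogr", "-f", driver, target_path, source_path]


def convert(source_path, target_path, driver="KML"):
    """Run ogr2ogr, raising an error if it is missing or the conversion fails."""
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr not found; install GDAL first")
    subprocess.run(build_convert_command(source_path, target_path, driver), check=True)
```

For example, convert("suburbs.geojson", "suburbs.kml") would write a KML copy of a GeoJSON file. Passing driver="ESRI Shapefile" would produce a shapefile instead.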

Uploading spatial data

At this time data.gov.au only supports a single spatial file per dataset. If a dataset contains multiple spatial resources, the dataset will be ignored by the GeoServer ingestor. Once the spatial file is successfully ingested into the GeoServer, data.gov.au will automatically generate a number of additional resources for the dataset.

KML / KMZ files

data.gov.au will accept either KML or KMZ files. Simply upload the file using the normal process as outlined in the ‘posting a dataset’ section.

Shape files

Before uploading a shape file you should zip all the relevant files into a single archive. When adding the zip file as a resource, set the format to ‘shp’.
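
The zipping step can be scripted. This is a minimal sketch using Python's standard zipfile module; the function name zip_shapefile is hypothetical, and the list of extensions covers only the commonly used companion files, so add any others your shapefile relies on.

```python
import zipfile
from pathlib import Path

# Common shapefile parts; .shp, .shx and .dbf are the mandatory ones.
SHAPEFILE_EXTENSIONS = {".shp", ".shx", ".dbf", ".prj", ".cpg"}


def zip_shapefile(shp_path, zip_path=None):
    """Zip a .shp file together with the sibling files that share its base name."""
    shp_path = Path(shp_path)
    zip_path = Path(zip_path) if zip_path else shp_path.with_suffix(".zip")

    # Collect files such as roads.shp, roads.shx, roads.dbf from the same folder.
    parts = sorted(
        p for p in shp_path.parent.glob(shp_path.stem + ".*")
        if p.suffix.lower() in SHAPEFILE_EXTENSIONS
    )

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for part in parts:
            # Store each file at the top level of the archive, without folders.
            archive.write(part, arcname=part.name)
    return zip_path
```

Calling zip_shapefile("roads.shp") on a hypothetical roads.shp would create roads.zip next to it, ready to be added as a resource with the format set to ‘shp’.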

Colors/Styles/Symbology

The WMS preview may show spatial data in an incorrect display style. Note that this does not modify the data as it is served via the WMS/WFS APIs.

To provide a custom display style, attach another data resource in SLD (Styled Layer Descriptor) format. See the GeoServer SLD cookbook: http://docs.geoserver.org/latest/en/user/styling/sld/cookbook/index.html#sld-cookbook

It may be possible to convert ESRI .style/.lyr files to SLD: https://github.com/nyalldawson/slyr

Any user who has admin access to the GeoServer can also modify the default display style for the data:

  1. Point your browser to http://data.gov.au/geoserver/web/
  2. If required, log in using the form at the top of the page.
  3. From the admin screen select the Layers link.
  4. From here select the layer you’d like to change by clicking the Layer Name.
  5. Select the Publishing tab.
  6. From the Default Style dropdown select the style you’d like to use.
  7. Scroll to the bottom of the page and click the Save button.

Spatial API Access

Web Map Service (WMS) and Web Feature Service (WFS) API links will be created with the dataset when a supported filetype is added. Information about available methods can be found at http://docs.geoserver.org/stable/en/user/services/wfs/index.html and http://docs.geoserver.org/stable/en/user/services/wms/index.html.
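
To give a feel for what a WFS request looks like, here is a minimal sketch that builds a standard WFS 2.0.0 GetFeature URL asking for GeoJSON. The endpoint address is an assumption based on the GeoServer address mentioned earlier on this page, and the layer name in the usage note is a placeholder; use the WFS link shown on your own dataset instead.

```python
from urllib.parse import urlencode

# Assumed endpoint; replace it with the WFS link generated for your dataset.
WFS_ENDPOINT = "http://data.gov.au/geoserver/wfs"


def wfs_getfeature_url(layer_name, max_features=10, endpoint=WFS_ENDPOINT):
    """Build a WFS 2.0.0 GetFeature URL that requests GeoJSON output."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": layer_name,   # the layer to query
        "count": max_features,     # limit on the number of features returned
        "outputFormat": "application/json",
    }
    return endpoint + "?" + urlencode(params)
```

For example, wfs_getfeature_url("workspace:layer", max_features=5) returns a URL that can be opened in a browser or fetched with any HTTP client.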