Data Integration - Interim accreditation process for Integrating Authorities
1. This paper outlines an interim accreditation process for integrating authorities undertaking high risk (Endnote 1) data integration projects involving Commonwealth data for statistical and research purposes, along with the criteria that must be met by agencies to gain accreditation. Accreditation is a key element in the governance and institutional arrangements for data integration. The aim of these arrangements is to ensure a safe and effective environment for data integration projects involving Commonwealth data.
2. The interim accreditation process is an administrative scheme that does not override legislation. All legal obligations (for example, obligations resulting from the Privacy Act 1988 (Privacy Act) or privacy and secrecy clauses in agency-specific legislation) must still be met.
3. The High Level Principles for Data Integration involving Commonwealth Data for Statistical and Research Purposes stipulate that an integrating authority will be nominated for every data integration project (even the low risk ones) involving Commonwealth data for statistical and research purposes. Jointly with the data custodian(s) (owners of the datasets to be integrated), the integrating authority is accountable for the data integration project. The integrating authority, along with the data custodian(s), is responsible for achieving an appropriate balance between: maximising the inherent value of Commonwealth data sources; minimising privacy concerns and confidentiality concerns associated with the use of data before, during and after it has been integrated; and facilitating the use of this data within the constraints of privacy, confidentiality and the law.
4. The four main roles of an integrating authority are to:
- negotiate and implement arrangements with data custodians to achieve adequate control and manage risk appropriate to their datasets;
- implement a safe and effective environment for data integration projects involving Commonwealth data;
- manage datasets for the project, including providing suitable access for data users and ensuring that the agreed data retention/data destruction policies are carried out; and
- be transparent in their operation.
5. An accredited Integrating Authority must be used for high risk data integration projects involving Commonwealth data. The data custodian(s) will determine the risk rating of each project or family of projects.
6. Using an accredited Integrating Authority does not guarantee access to data for a particular data integration project. Access to data needs to be negotiated with data custodians for each project (or for a family of data integration projects using the same source datasets, for similar purposes, with the same integrating authority).
Process for interim accreditation of integrating authorities to undertake high-risk data integration projects involving Commonwealth data
7. The process for interim accreditation of integrating authorities, which picks up key elements of Australian accreditation processes in the education and health sectors, involves:
a) Self-assessment. Integrating authorities apply for accreditation by preparing a self-assessment report explaining how they meet the criteria for accreditation. A template for the report is given at attachment 1. The criteria are presented in paragraphs 16 and 17 of this paper. The assessment must be signed off by the head of the agency (Endnote 2) or the application will not be considered.
Summarised versions of the completed applications for agencies that have been granted accreditation against the interim arrangements are available on the National Statistical Service (NSS) website at https://statistical-data-integration.govspace.gov.au. These completed applications may be a useful resource for agencies intending to apply for accreditation against the interim arrangements.
Generally, the self-assessment will be no more than 20 pages long. Commonwealth data integration accreditation has recently undergone a period of review. The interim Data Integration Accreditation Subcommittee Secretariat can provide information on the governance and application process. Please email email@example.com
b) Audit. An independent third party audits the integrating authority’s self-assessment against the criteria, in line with the ANAO Auditing Standards. If an integrating authority can demonstrate that a suitable program of audits has been done recently (in the two years prior to the application), these audits can be used to reduce the scope of the integrating authority accreditation audit.
c) Decision. The Cross Portfolio Data Integration Oversight Board (Endnote 3) will make the final decision on interim accreditation, based on the self-assessment and results of the audit. Once a decision is made, a full report explaining the compliant and non-compliant criteria, with recommendations for what needs to change, will be supplied to the applicant.
d) Publication of list of accredited agencies. The Secretariat will publish a list of accredited Integrating Authorities, together with a summarised version of the integrating authority’s application and a summary of the audit report (see https://statistical-data-integration.govspace.gov.au).
8. The Oversight Board meets 3-4 times a year and will consider applications for accreditation at those meetings.
9. The applicant for interim accreditation will pay the audit costs, which are expected to be in the order of $15,000 to $20,000 per audit.
Who can apply for accreditation?
10. The interim accreditation arrangements will be tested on Commonwealth government agencies first. While this does not preclude state government agencies applying now for accreditation against the interim arrangements (provided that they meet all the requirements for accreditation), it will not be possible for any state government agencies to be accredited in the short term, as this would not allow time for sufficient testing and evaluation of the arrangements with Commonwealth agencies. The system is not yet mature enough to ensure that adequate safeguards apply to private firms. State government agencies and private firms can continue to apply for access to Commonwealth data under existing arrangements.
11. The Cross Portfolio Data Integration Oversight Board will only consider applications for accreditation, against the interim arrangements, for those institutions covered by privacy legislation (either the Privacy Act 1988 or state/territory equivalent).
Legal framework – project level requirements
12. When considering whether to apply for accreditation, integrating authorities will need to consider whether legislation allows them to undertake the data integration projects they are contemplating.
13. Accreditation applies to an agency. Assessment of the legal framework, on the other hand, is a decision that must be made for each individual project, or family of projects. The data custodian, in partnership with the proposed integrating authority, will assess whether the proposed integrating authority:
a) is authorised by legislation or consent to receive identifiable data from the custodian(s); and
b) has appropriate legal protections in place prohibiting the disclosure of identifiable data, other than where allowed by law.
Review of the process, to move it from an ‘interim’ process to a final process
14. The interim accreditation process will be reviewed by late 2014 to determine whether any changes are necessary to the process or criteria. The review will also determine the process and timing for ongoing reviews of accredited agencies to ensure continued compliance with the accreditation criteria. Once the review is complete, and the recommendations have been implemented, the interim accreditation process will become final.
15. If the accreditation arrangements change, then agencies granted accreditation against the interim arrangements will need to demonstrate that they comply with any new requirements (e.g. additional accreditation criteria) for accreditation.
Accreditation criteria for integrating authorities wishing to undertake high risk data integration projects involving Commonwealth data
16. The accreditation criteria are embedded in the requirements of the high level principles and the governance and institutional arrangements. There are eight criteria integrating authorities must meet to gain accreditation:
i. ability to ensure secure data management;
ii. integrating authorities must demonstrate that information that is likely to enable identification of individuals or organisations is not disclosed to external users;
iii. availability of appropriate skills;
iv. appropriate technical capability;
v. lack of conflict of interest;
vi. culture and values that ensure protection of confidential information and support the use of data as a strategic resource;
vii. transparency of operation; and
viii. appropriate governance and administrative framework.
17. i) Ability to ensure secure data management
Integrating authorities seeking accreditation must demonstrate that they have secure data management systems in place to protect data both during and after integration, including systems for the safe exchange of sensitive data across agencies. This may include secure management of metadata or software programs to protect intellectual property, as negotiated with the data custodian(s). Agencies who demonstrate they meet Australian Government standards for security practices as set out in the Australian Protective Security Policy Framework would automatically be rated suitable on this criterion, provided that they can also demonstrate that they adhere to the separation principle and that they have an ongoing program of audits to ensure the continued security of the data. Agencies who cannot meet all the requirements in the Framework would need to comply with particular aspects, including control of access to the agency’s premises and police checks for staff.
ii) Integrating authorities must demonstrate that information that is likely to enable identification of individuals or organisations is not disclosed to external users
Integrating authorities seeking accreditation must be able to demonstrate that information that is likely to enable identification of individuals or organisations is not disclosed to external users. Removal of identifying information will not be sufficient. Integrating authorities must ensure that information is only released in a way that is not likely to enable identification, either directly or indirectly, of individuals or organisations. Examples of different ways this criterion can be met include:
- use of formal confidentiality algorithms; and/or
- use of statistical disclosure control techniques such as cell suppression and perturbation; and/or
- providing access to data that are not likely to enable identification of individuals or organisations via on-site data laboratories; and/or
- providing access to data that are not likely to enable identification of individuals or organisations via secure remote access facilities; and/or
- manual review of data by staff with appropriate skills prior to any data release.
As an additional protective measure, integrating authorities may restrict access to data that are not likely to enable identification of individuals or organisations to approved applicants.
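As an illustration of the cell suppression technique listed above, the following is a minimal sketch only, not a method prescribed by these arrangements. It tabulates records and withholds any count that falls below a minimum cell size. The function name, the field names and the threshold of 5 are assumptions made for the example; the appropriate threshold is a matter for the integrating authority and the data custodian(s).

```python
from collections import Counter


def suppress_small_cells(records, keys, threshold=5):
    """Count records by the given keys and suppress small cells.

    Counts below `threshold` are replaced with None so that they are
    not released. The default of 5 is an illustrative assumption.
    """
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return {
        cell: (n if n >= threshold else None)
        for cell, n in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical records: six in one cell, two in another.
    sample = [{"region": "A", "sex": "F"}] * 6 + [{"region": "B", "sex": "M"}] * 2
    for cell, count in suppress_small_cells(sample, ("region", "sex")).items():
        print(cell, "suppressed" if count is None else count)
```

This sketch performs primary suppression only. If row or column totals are also released, further (secondary) suppression is generally needed so that a withheld count cannot be recovered by subtraction.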
iii) Availability of appropriate skills
An integrating authority seeking accreditation will need to have a high level of relevant skills to undertake high risk data integration projects or be able to show how they can gain these skills (e.g. secondment provisions, training). Relevant skills include:
- expertise in linkage and merging functions
- expertise in privacy (for example, the ability to conduct a Privacy Impact Assessment)
- expertise in confidentiality
- information management skills
- ability to provide useful metadata to data users
- appreciation of data quality issues to allow the integrating authority to provide advice to stakeholders.
This may be evident in the experience of staff undertaking the integration projects and in the provision of training and documentation to support the integration projects.
iv) Appropriate technical capability
To obtain accreditation an integrating authority must have the necessary technical expertise and infrastructure, including secure hardware and software systems and system support, to undertake high risk data integration projects. Two factors that the integrating authority’s technical infrastructure will need to handle are the size of an integrated dataset (use of administrative data can result in very large files) and its complexity (e.g. maintaining a link that may be longitudinal or cross-sectional). The expertise and infrastructure required also extends to data access arrangements to maximise the public benefit of data integration.
v) Lack of conflict of interest
The Commonwealth Statistical Integration Principles state that statistical data integration must be used for statistical and research purposes only. Agencies with a regulatory function or with responsibility for compliance monitoring must demonstrate how they will address a potential conflict of interest if linked datasets could help them with these non-statistical purposes. Possible ways an agency may demonstrate a lack of conflict of interest include the use of some legally enforceable obligation, policies, and separation principles (e.g. restricting access so that staff with regulatory/compliance roles cannot access data which would enable list matching).
vi) Culture and values that ensure protection of confidential information and support the use of data as a strategic resource
Integrating authorities seeking accreditation will need to demonstrate a consistently high standard of behaviour by all employees, commensurate with an agency statement equivalent to the APS Code of Conduct. Security needs to be part of the agency’s culture. Staff working on data integration also need to value data as a strategic resource. Examples of how this standard may be demonstrated include:
- a culture of protecting identifiable information
- adequate training on security/privacy/confidentiality matters
- appropriate mechanisms to consult with stakeholders (data custodians, data users and the public).
vii) Transparency of operation
To maintain public trust, use of government data, particularly in data integration projects for statistical and research purposes, must be open and transparent. Integrating authorities seeking accreditation will need to demonstrate the transparency of their operations, including the ability to apply sanctions. This may be evidenced by:
- their legislation and policies, particularly in relation to their implementation of Gov2.0 recommendations which focus on increased openness in government
- mechanisms to consult with and inform the public and key stakeholders about projects that are underway (e.g. via publications, presentations at conferences, focus groups)
- publishing relevant material on the web e.g. data retention statements.
viii) Appropriate governance and administrative framework
An integrating authority must have frameworks in place for management of cost recovery (if applicable), conducting investigations and handling of complaints. An integrating authority must also demonstrate that they have appropriate institutional and project governance in place. Examples include Chief Executive Instructions and a Control Framework.
Additional information to start an application for accreditation
The information below is a summary of some of the key concepts involved in the Commonwealth governance arrangements for statistical data integration. Before applying for accreditation, agencies should read the information on https://statistical-data-integration.govspace.gov.au.
Scope: See Scope of the Commonwealth arrangements.
Confidentiality: The wealth of information provided by integrated datasets can create additional risk by increasing the chance of identifying an entity (such as a person or business). Protecting the confidentiality of individuals or organisations in an integrated dataset is a key element in maintaining the ongoing trust of the Australian public. Removing identifying details, such as name, from a dataset does not necessarily protect identity, as other variables can be used to deduce the identity of an individual or organisation in the dataset. For example, the identity of a person with a very rare disease or health condition could be deduced even in highly aggregated data.
Confidentialising data involves two steps, needed to mitigate the risk that a particular person or organisation could be directly or indirectly identified in a dataset:
- de-identifying the data (removing direct identifiers such as name and address); and
- managing the risk of indirect identification, for example by removing or altering information, or collapsing detail within a dataset.
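The two steps above can be sketched in code. This is an illustrative example only: the field names (name, address, age), the ten-year age bands and the function names are assumptions made for the example, and a real project would choose its treatments in consultation with the data custodian(s).

```python
# Assumed names of fields that directly identify a person.
DIRECT_IDENTIFIERS = {"name", "address"}


def de_identify(record):
    """Step 1: remove direct identifiers such as name and address."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def collapse_age(age, width=10):
    """Collapse an exact age into a broad band, e.g. 37 becomes '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def confidentialise(record):
    """Apply both steps: de-identify, then collapse detail."""
    out = de_identify(record)
    if "age" in out:
        # Step 2: replace the exact age with a broad age group.
        out["age_group"] = collapse_age(out.pop("age"))
    return out
```

Collapsing age is only one way of managing indirect identification. Whether a given level of detail is safe depends on the other variables in the dataset and on how the data will be accessed.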
Information on confidentiality can be found in the Confidentiality Information Series, available at https://statistical-data-integration.govspace.gov.au.
It is the integrating authority's responsibility to confidentialise the integrated dataset. Information on the roles and responsibilities of integrating authorities (and other key stakeholders) can be found at https://statistical-data-integration.govspace.gov.au.
The privacy of individuals and organisations also needs to be considered during the actual linking process used to form the integrated dataset.
Separation principle: The separation principle, that is, the separation of identifying and content information, is one mechanism to protect the identities of individuals and organisations in datasets. Such separation means that no one can see the identifying or demographic information used to work out which records relate to the same person or organisation (e.g. name, address, date of birth) in conjunction with the content data (e.g. clinical information, benefit information, company profits). Instead, staff can see only the information they need to do the linking or analysis. So, rather than someone being able to see that John Smith has a rare medical condition, or the profits earned by Company X, the person doing the linking sees only the information needed to do the linking (e.g. John Smith’s name and address) and the analyst sees only a record, with no identifying information, showing that a person has a rare medical condition together with any other variables needed for analysis (e.g. broad age group and sex).
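A minimal sketch of the separation principle follows, under stated assumptions. Each source record is split into a linking view, which holds only identifying fields, and an analysis view, which holds only content fields. A random key (record_key, a hypothetical name) ties the two views together without revealing identity. The field names are assumptions made for the example, and this is not a description of any particular integrating authority's system.

```python
import secrets

# Assumed names of the identifying fields used for linking.
IDENTIFYING_FIELDS = ("name", "address", "date_of_birth")


def separate(records):
    """Split records into a linking view and an analysis view.

    The linking view holds identifying fields only; the analysis view
    holds content fields only. Both carry the same random record_key.
    """
    linking_view, analysis_view = [], []
    for record in records:
        key = secrets.token_hex(8)  # random key, carries no meaning
        identifying = {f: record[f] for f in IDENTIFYING_FIELDS if f in record}
        content = {f: v for f, v in record.items() if f not in IDENTIFYING_FIELDS}
        linking_view.append({"record_key": key, **identifying})
        analysis_view.append({"record_key": key, **content})
    return linking_view, analysis_view
```

In practice the two views would also be held under separate access controls, so that staff doing the linking cannot open the analysis view and analysts cannot open the linking view.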
Integrating authorities must ensure secure data management and ensure that information that is likely to enable identification of individuals or organisations is not disclosed to external users. One approach to achieving this is carrying out both the linking and merging functions within the same organisation, with sufficient procedural and technical controls to ensure the separation principle is internally applied.
Any questions about the accreditation process should be emailed to firstname.lastname@example.org
Attachment 1: Proforma for applicants to complete
Endnote 1: Data custodians assess the risk of a project in accordance with the risk assessment framework.
Endnote 2: The ‘head of the agency’ is the person legally accountable for the activities of the organisation, and those of its staff and affiliates. In the case of a government department, this will be the Secretary of the department.
Endnote 3: The establishment of the Board was a key element of the governance and institutional arrangements endorsed by the Secretaries Board in October 2010. The Board is chaired by the Australian Statistician and comprises the Secretaries of the Department of Health, the Department of Human Services and the Department of Social Services.
Endnote 4: The separation principle involves ensuring that only the information from the datasets to be linked that is required to perform specific tasks is made available to the people performing those tasks. Specifically, it comprises linking separation (where the people linking the datasets can access only those parts of the datasets required to complete the linkage) and analysis separation (where the people analysing the linked datasets can access only those parts of the datasets required for the analysis).