Data Integration - Roles and responsibilities of data users
Roles and responsibilities series
1. This paper identifies the rights, responsibilities and roles of data users relative to those of the other key participants in data integration projects involving Commonwealth data for statistical and research purposes, namely data custodians and integrating authorities.
Who is a data user?
2. A data user is a person who accesses and investigates integrated datasets for statistical and research purposes (Endnote 1). This paper focuses on data users accessing integrated datasets created using at least one Commonwealth dataset.
3. Data users include academics working in research institutions and employees undertaking research in Commonwealth and state/territory agencies. The term also covers multiple data users working as part of a consortium, alliance or collaborative network.
4. ‘Data users’ differ from ‘end users’ of data. Data users are directly involved in analysing integrated datasets at the unit record level to conduct research. As such, they will work closely with integrating authorities. In contrast, end users examine research findings rather than produce outputs. End users include employees undertaking research in public and private sector organisations, representatives from media outlets and consumer advocacy groups, and members of the wider community.
5. The Commonwealth’s data integration arrangements are shown in the ‘Rights, responsibilities and roles of data custodians’ (Figure 1 – Appendix 1). This figure provides a stylised representation of how data users fit within the arrangements, and their interactions with data custodians and integrating authorities. The nomination of an integrating authority will be based on a consultative process led by the data custodian(s). Data custodians need to give in-principle approval for the project to proceed before an integrating authority is appointed. They remain individually accountable for the source data used in statistical data integration projects (refer to the Commonwealth’s Statistical Integration Principle 2 – Custodian’s Accountability) and must ultimately be satisfied with the chosen integrating authority.
For some data integration projects, a data user may have multiple roles, also acting as a data custodian (e.g., a Commonwealth agency) and/or the integrating authority.
When an entity has more than one role, appropriate internal governance and project documentation, consistent with the Commonwealth principles and governance arrangements for data integration, should be in place (for more information refer to the Best Practice Guidelines for Statistical Data Integration Involving Commonwealth Data).
Benefits of the Commonwealth arrangements for data users
6. The governance and institutional arrangements endorsed by the (Commonwealth) Secretaries Board for data integration projects involving the use of Commonwealth data for statistical and research purposes will deliver significant enhancements for data users.
7. The Commonwealth data integration arrangements will provide benefits to data users including greater transparency about how to access Commonwealth datasets for statistical and research purposes, and ultimately greater access to more Commonwealth data holdings. The arrangements will also provide opportunities for greater collaboration between data users, integrating authorities and data custodians at the pre-approval stage of a project. Data users will be supported by training opportunities.
The rights and responsibilities of data users
8. Data users have a number of ‘rights and responsibilities’ associated with accessing and using integrated datasets that involve the use of at least one Commonwealth dataset. These exist within the governance and institutional framework designed to minimise risks around the management of data, once data is received by the integrating authority and after it has been integrated.
9. The key rights and responsibilities of data users in relation to data custodians are:
- It is the right of data users to consult with data custodians and the integrating authority on any material changes or updates to a data integration project (regardless of whether changes originate from data custodians or integrating authorities). This will occur before data users start examining integrated datasets. The consultation is likely to include issues raised by integrating authorities such as the technical feasibility of the project or the limitations of data use.
- Data users are entitled to receive appropriate training covering high level statistical integration principles, governance and institutional arrangements, data protocols (e.g., ethical approval processes in the case of human-based health research), legislative frameworks and security requirements. This training will be facilitated by integrating authorities. Course material will also be determined by integrating authorities, with possible input, advice and assistance provided by data custodians. Data users can also expect to access a range of self-help tools to enhance their understanding of the Commonwealth arrangements.
- Data users must be aware of, and understand, the sanctions which apply for attempting to identify (or re-identify) individuals or organisations, disseminating outputs that enable the identification of individuals or organisations, or misusing data.
- It is the right of data users to be informed by data custodians of the project status (i.e. approval or disapproval). Data custodians have the right to approve or disapprove a project proposal, in whole or in part. This decision could take into consideration the technical feasibility assessment made by integrating authorities. Data custodians may also need to prioritise and schedule data extraction work associated with project proposals, taking into account the range of data extraction requests that may be outstanding at the time.
- Where the Cross Portfolio Data Integration Oversight Board advises on amendments to ‘high risk’ (Endnote 2) projects (or where a concern is raised), data users, data custodians and integrating authorities will need to collaborate on how to make improvements to such project(s).
- Data users have a responsibility to pay cost recovery payments to data custodians, where applicable.
- Data users and data custodians are responsible for ensuring that datasets are used for the approved purposes only. This is facilitated by practices which help avoid the misinterpretation of data. Examples include the testing of assumptions made in respect of the data by researchers and the supply of appropriate metadata by data custodians.
10. Data users also have rights and responsibilities in relation to integrating authorities:
- It is the right of data users to receive appropriate training from integrating authorities.
- Data users can expect to be informed by integrating authorities on the technical feasibility of a research proposal.
- It is the responsibility of integrating authorities to provide integrated datasets to data users, along with full information on cost recovery or fee-for-service policies (where applicable).
- Data users have a right to receive information on data access arrangements from integrating authorities (subject to written approval from all data custodians and in line with their requirements).
- Data users are responsible for paying data integration fees to integrating authorities (where cost recovery or fee-for-service charges apply). These costs need to be built into project funding proposals by data users.
- Data users may collaborate with data custodians and integrating authorities on how to make improvements to ‘high risk’ project(s), based on advice provided by the Cross Portfolio Data Integration Oversight Board.
The role of data users in data integration projects
11. The Commonwealth governance and institutional arrangements will bring about a major shift in the role of data users. Under the current approaches to data integration used in a number of Commonwealth agencies, researchers undertake data merging and data access functions. These roles will now be carried out by an integrating authority, which will be responsible for the end-to-end management of data integration projects.
12. The five main roles of data users are:
- Developing research proposals that produce significant overall benefits to the public;
- Collaborating with integrating authorities and data custodians at the pre-approval stage;
- Entering into agreements with integrating authorities to ensure the safe management and use of datasets;
- Accessing integrated datasets through secure arrangements; and
- Liaising with data custodians on the valid uses of integrated datasets.
(1) Developing research proposals that produce significant overall benefits to the public
13. A key role of data users is to develop research proposals that make the best use of integrated datasets. These proposals need to take into account the public good which can be derived from a data integration project (e.g., social, economic and scientific benefits) (Endnote 3). The public benefit derived from a project should outweigh the imposition on privacy and any risks to the trust and goodwill that the Australian public has in Commonwealth data collection activities.
(2) Collaborating with integrating authorities and data custodians at the pre-approval stage
14. Data users should seek in-principle agreement from data custodians on the project proposal during the pre-approval stage. Data users will then liaise with the integrating authority appointed by the data custodians. It is the role of integrating authorities to assess project feasibility, which includes an examination of the technical complexities that arise from attempting to combine multiple datasets and the provision of advice on such matters to data users. This includes, for example, advising data users of issues such as coding difficulties, missing variables and/or inaccurate records.
15. Following an assessment of the project proposal by data custodians and integrating authorities, data users will then need to implement any suggested modifications during the pre-approval stage (assuming that all data custodians choose to release the source datasets for the project).
(3) Entering into agreements with integrating authorities to ensure the safe management and use of datasets
16. Data users will need to enter into an agreement with an integrating authority for data integration projects. This agreement may take the form of a contract, Memorandum of Understanding or other arrangement as appropriate for the parties concerned. When the data custodian and integrating authority are the same agency, appropriate internal governance arrangements, rather than an agreement, will need to be in place. This agreement or arrangement will be administered by the integrating authority on behalf of every data custodian involved in a data integration project.
17. The agreement or arrangement will cover:
- Information on penalties for the identification (or re-identification) of individuals or businesses, the misuse of data or violation of data access arrangements.
- Details on the cost recovery or fee-for-service policies of integrating authorities, where applicable. Fees will be set at the discretion of integrating authorities and may reflect local practices and arrangements.
- Governance protocols that address how to investigate and resolve software issues, along with data anomalies, outliers and other quality concerns that arise from a data integration project. Data users may raise such issues with the integrating authority and, where applicable, have some input into the development of such protocols.
- Any special conditions which must be adhered to by data users, including, for example, training requirements and the signing of confidentiality agreements with data custodians.
18. The purpose of such agreements is to help ensure that datasets are managed in accordance with data custodian requirements while protecting privacy and ensuring that data is not likely to enable the identification (or re-identification) of individuals and businesses.
(4) Accessing integrated datasets through secure arrangements
19. Data users will only be able to access integrated datasets through secure arrangements provided by the integrating authority. The data will be in a form that is not likely to enable the identification of individuals or businesses, and will be subject to the requirements of the data custodians. The access arrangements used will also be dependent on custodian requirements (e.g., data users may be required to visit a data laboratory for ‘high risk’ projects to ensure the protection of sensitive datasets).
20. Data users will also be subject to audits conducted by integrating authorities to check their compliance with custodians’ access arrangements. This could include spot checks, vetting of data outputs by integrating authorities and monitoring of activities undertaken by data users (e.g., keystrokes, mouse movements and screen captures).
(5) Liaising with data custodians on the valid uses of integrated datasets
21. A key role for data users is to seek agreement from data custodians on valid uses of integrated datasets before the publication and dissemination of research findings. This could include data users checking with data custodians that data has been used and interpreted correctly. However, it should be noted that this is not a uniform requirement across Commonwealth agencies.
22. Any questions about the roles and responsibilities of data users should be emailed to firstname.lastname@example.org.
Endnotes
- Endnote 1: Research is systematic investigation or inquiry which adds to an existing body of knowledge.
- Endnote 2: Data custodians assess the risk of a project in accordance with the risk assessment framework.
- Endnote 3: Commonwealth’s Statistical Integration Principle 4 – Public Benefit: Statistical integration should only occur where it provides significant overall benefit to the public.