Generative AI for Scientific datasets

Fostering Data Sharing and Collaboration in Science: A Generative AI-Assisted Meta-Data Framework for Diverse and Large-Scale Datasets

Degree type

PhD

Closing date

1 June 2024

Campus

Hobart

Citizenship requirement

Domestic

About the research project

Data sharing is the cornerstone of scientific progress, fostering research collaboration and transparency. By openly sharing scientific data, the research community can build upon existing knowledge, validate findings, and accelerate the pace of discovery. This collective effort not only enhances the reliability of scientific outcomes but also maximises the utility of data, ultimately advancing our understanding of the world around us.


The exponential growth of scientific data challenges traditional data management and sharing practices and presents significant hurdles for researchers. Current methods of generating metadata often struggle to keep pace with the sheer volume and diversity of large-scale scientific data, leading to inefficiencies in data discovery and utilisation. Manual metadata creation is resource-intensive and error-prone, which impedes the full realisation of the data's potential. Automated approaches show promise, but they often lack context and fail to capture the rich information needed for a comprehensive understanding of data in the scientific domain.


The requirements for metadata search in the scientific domain are different from general search. Features specifically tailored to the scientific field carry significant importance in dataset retrieval and sharing. Furthermore, typical sources of metadata in scientific environments encompass relational schemas of datasets, fragmented metadata, labelled unstructured data, and semantic constraints, all of which are known for diversity and inconsistencies that pose a substantial barrier to sharing scientific data for progress and innovation.


Recent advancements in Generative AI offer a highly promising solution for automating the creation of metadata. Existing research in this field indicates that Generative AI has the potential to generate metadata more consistently. However, accuracy, domain knowledge, and linguistic preferences remain persistent obstacles to developing a fully functional Generative AI-assisted metadata generation framework. This brings us to the primary research question of this project:


Can Generative AI techniques be harnessed to improve the shareability of automatically generated metadata, enhancing both structural consistency and quality in the context of scientific research, while mitigating the limitations associated with accuracy, domain specificity, linguistic nuances, data variability, and data bias?

Primary Supervisor

Dr Muhammad Bilal Amin

Funding

Applicants will be considered for a Research Training Program (RTP) scholarship or Tasmania Graduate Research Scholarship (TGRS). A successful applicant receives:

  • a living allowance stipend of $32,192 per annum (2024 rate, indexed annually) for 3.5 years
  • a relocation allowance of up to $2,000
  • a tuition fees offset covering the cost of tuition fees for up to four years (domestic applicants only)

If successful, international applicants will receive a University of Tasmania Fees Offset for up to four years.

As part of the application process you may indicate if you do not wish to be considered for scholarship funding.

Other funding opportunities and fees

For further information regarding other scholarships on offer, and the various fees of undertaking a research degree, please visit our Scholarships and fees on research degrees page.

Eligibility

Applicants should review the Higher Degree by Research minimum entry requirements.

Ensure your eligibility for the scholarship round by referring to our Key Dates.

Additional eligibility criteria specific to this project/scholarship:

  • Applications are open to Domestic/International/Onshore applicants

Selection Criteria

The project is competitively assessed and awarded. Selection is based on academic merit and suitability to the project as determined by the College.

Additional essential selection criteria specific to this project:

  • Experience working with large-scale scientific datasets, especially diverse and heterogeneous data, is a plus
  • Proficiency in high-level programming languages (such as Python, C#, or Java) is a mandatory requirement
  • Passion for research, coupled with critical thinking and problem-solving abilities, is highly valued

Additional desirable selection criteria specific to this project:

  • Experience working with large-scale complex data such as climate datasets
  • Skill in Python, with an adequate understanding of applied machine learning and Generative AI models

Application process

  1. Select your project, and check that you meet the eligibility and selection criteria, including citizenship;
  2. Contact Dr Muhammad Bilal Amin to discuss your suitability and the project's requirements; and
  3. In your application:
    • Copy and paste the title of the project from this advertisement into your application. If you do not do this correctly, your application may be rejected.
    • Submit a signed supervisory support form, a CV including contact details of two referees, and your project research proposal.
  4. Apply prior to 1 June 2024.

Full details of the application process can be found under the 'How to apply' section of the Research Degrees website.

Following the closing date, applications will be assessed within the College. Applicants should expect to receive notification of the outcome by email by the advertised outcome date.


Why the University of Tasmania?

Worldwide reputation for research excellence

Quality supervision and support

Tasmania offers a unique study lifestyle experience