PHYS-E0549 - Introduction to Machine Learning for Materials Science D, Lecture, 5.9.2022-11.10.2022
-
Description - Introduction to machine learning in materials science
Machine learning (ML) techniques enable us to infer relationships from large amounts of seemingly uncorrelated input data. Their predictive power has made them central to product development in IT and we already use them in daily life (Amazon, Netflix, etc.). The physical sciences have been slow to capitalise on the promise of ML, even though the computational implementation of ML methods is well suited to modern simulation techniques. Materials science has recently benefited from a number of ML applications to materials discovery and design (featuring neural networks, genetic algorithms, regression methods, compressed sensing and Bayesian optimisation) that promise to accelerate the development of novel technologies. Machine learning for materials science is an exciting new discipline that is now being taught at Aalto University.
"Introduction to Machine learning in materials science" is a project-led lecture course for graduate students who wish to acquire key skills in this cross-disciplinary research field. Introductory lectures on materials science and machine learning will be followed by tutorial exercises. The course introduces different machine learning methods and provides examples for their application in materials science. The tutorials provide hands-on experience for the different methods. In the subsequent Project in Machine Learning for Materials Science course you will be able to apply the newly learned knowledge to your own data.
Course level
The course is intended for students who have completed their Bachelor's degree and have a basic understanding of machine learning or materials science and a keen interest in interdisciplinary science. Some programming experience or Python knowledge is required to take the course.
Credits
3 ECTS credits are awarded for the course.
Assessment
The course grade is pass/fail. The passing criterion is to attend at least 5 of the 6 tutorial sessions.
Course structure and workload
The course is taught in Period 1.
- 6 x 2 h lectures on machine learning in materials science
- 6 x 2 h hands-on tutorial sessions
There is no homework for the course and no final exam.
Learning outcomes
After completion of the course, you:
- have learned the importance of machine learning in materials science.
- have gained an overview of different machine learning methods.
- have hands-on experience with Python notebooks.
- have used different machine learning methods in Python.
- can approach a range of different problems with suitable machine learning methods.
- can follow a presentation (e.g. conference or seminar) on machine learning in materials science.
Teachers
Course dates
5.9-11.10.2022
-
In machine learning, we write programs that the machine executes to process data and to learn. Understanding machine learning therefore also means understanding how these instructions to the machine are composed. Python has evolved into the standard programming language in machine learning and we will use it in this course for the machine learning tutorials. The course will provide a gentle introduction to Python in the first tutorial, but it would be advantageous if you have some programming experience (not necessarily in Python) prior to taking the course. We have devised a short pre-assessment notebook for you to test whether you have sufficient programming knowledge. Please go through this pre-assessment before you decide to sign up for the course.
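To make the idea of a program that "learns" from data concrete, here is a minimal sketch in plain Python. It is not taken from the course tutorials or the pre-assessment notebook. The function names `fit_line` and `predict` and the toy numbers are illustrative assumptions. The program fits a straight line to a handful of data points by least squares and then uses the fitted line to make a prediction for an unseen input.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to the data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y and variance of x (both without the 1/n factor,
    # which cancels in the ratio below).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at a new input x."""
    slope, intercept = model
    return slope * x + intercept


# Made-up toy data that lie exactly on y = 2x + 1 (not real measurements).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

model = fit_line(xs, ys)      # "learning": parameters are inferred from data
print(model)                  # fitted slope and intercept
print(predict(model, 4.0))    # prediction for an input not in the data
```

If reading and modifying a short program like this feels comfortable, the Python used in the tutorials should be within reach. The methods covered in the course are more elaborate, but they follow the same pattern of fitting parameters to data and then predicting.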
Pre-assessment
Here is the link to a short Google Colab notebook that we have designed for you to test your Python knowledge. If you can easily complete the tasks, you have sufficient knowledge for the course. If you know how to complete the tasks in a different language (e.g. C, Fortran, Scala, Matlab), but are unsure about how to do them in Python, you can brush up on your Python knowledge before the course (see below) and sign up for the course. If you have no programming experience at all, it would be advisable to first acquire rudimentary Python skills and take this course next year instead.
Python and machine learning resources
The University of Helsinki has developed the Elements of AI free online course. This is an excellent resource to start familiarising yourself with machine learning and its practical aspects. The course can be taken in your own time. It is not a prerequisite for this course, but the second part, "Building AI", might be useful if you are not sure about your Python knowledge.
CSC - IT Center for Science provides a Beginner Python course (~10h to complete), which is also available as Jupyter Notebooks. You can also find many Python learning resources online and we encourage you to explore options that work best for you.
-
Textbooks
Good introductory books on machine learning are: An Introduction to Statistical Learning (with Applications in R) by G. James, D. Witten, T. Hastie, and R. Tibshirani; and Pattern Recognition and Machine Learning by C. Bishop.
Data sources
Nature Scientific Data is a scientific journal that specialises in publishing data sets.
Zenodo is an open access data platform on which you can find many data sets.
The article Data-Driven Materials Science: Status, Challenges, and Perspectives reviews data infrastructures in materials science and contains a list of the infrastructures available as of mid-2019.
The Open Catalyst Project provides computational data for catalysts and machine learning models that operate on this data.
Collection of data resources in materials science.
List of databases in inorganic chemistry by Information Resources on Inorganic Chemistry.
Machine learning in polymer informatics (2021) lists data sources in polymer science.
Recent advances and applications of deep learning methods in materials science (2022) reviews deep learning in materials science and provides suitable data sources.
Repositories of machine learning models:
DLHub: Simplifying publication, discovery, and use of machine learning models in science describes the DLHub repository of machine learning models.
Review and overview articles:
The following articles are more or less chronologically ordered.
Tutorial article, "Machine learning for quantum mechanics in a nutshell", M. Rupp, 2015 (includes dataset)
Big data and deep data in scanning and electron microscopies: deriving functionality from multidimensional data sets, 2015 (review focussing on microscopy)
Machine learning: Trends, perspectives, and prospects, 2015 (early review in Science)
Machine learning in materials informatics: recent applications and prospects, 2017
Nature Physics Editorial, "Machine learning: New tool in the box", 2017 (fundamental materials science applications)
Recent advances and applications of machine learning in solid-state materials science, 2019
Artificial Intelligence to Power the Future of Materials Science and Engineering, 2020 (review that includes material design, performance prediction, and synthesis)
Perspective article on digitalization (2021): Digital Transformation in Materials Science: A Paradigm Change in Material's Development
Gaussian Process Regression for Materials and Molecules (2021) - clear review of the mathematical foundation of Gaussian process regression
Toward autonomous design and synthesis of novel inorganic materials (2021)
The materials tetrahedron has a “digital twin”, 2022 (advocating for data science approach in materials science)
Perspective article on Machine Learning: A New Paradigm in Computational Electrocatalysis (2022)
Machine Learning for Electrocatalyst and Photocatalyst Design and Discovery review (2022)
Recent advances and applications of deep learning methods in materials science (2022) reviews deep learning in materials science and provides suitable data sources.
-
In this course we use Google Colab notebooks for the tutorials. We will post the link for the tutorial here on Tuesdays in the corresponding folders. We will also post a solution notebook here on Tuesday evening.
❗ NOTE ❗
- You might see a warning saying that the notebook is not authored by Google. Please ignore the warning.
- Please save the notebook to your Google Drive to ensure that your work is saved. To save the notebook, click File -> Save to Drive.
Some useful resources:
🔥 Colab Introduction: If you're not familiar with Colab, you can find a quick video introduction (approximately the first 10 minutes). It shows you how to run code in Colab and write text in the text cells.
📚 Here's also a link to the Colab documentation, which covers similar content to the video above.
🤔 Colab is like a Jupyter notebook, but runs on Google servers. A more in-depth introduction to Jupyter notebooks can be found in the following video.
-
Prof. Patrick Rinke (patrick.rinke@aalto.fi)
Dr. Armi Tiihonen (armi.tiihonen@aalto.fi)
Dr. Matthias Stosiek (matthias.stosiek@aalto.fi)