CS-E4880 - Machine Learning in Bioinformatics D, Lecture, 3.3.2023-2.6.2023
This course space end date is set to 02.06.2024
Topic outline
Learning diary
A learning diary is written about each guest lecture:
- a diary entry of around 450 words (ca. 1 page) for each lecture
- a concluding chapter of around 250 words (ca. 1/2 page)
See the learning diary instructions below on how to write it. The concluding chapter of the learning diary is returned at the same time as the final report.
Oral Presentation
The groups will make a short presentation (15 minutes presentation + 5 minutes questions) of a paper they have chosen.
- Avoid presenting too many slides: give the audience time to read and understand the content. Allocate at least 1.5 minutes per slide.
- Focus on the key parts of the paper: what do you want the audience to remember about it? Make sure that you understand the presented content yourself.
Further hints on how to give an oral presentation of a paper are given in the lecture slides How to Present a Paper.
Project
Two kinds of projects are possible:
- "Demo": Implementation of a prediction method and testing on a dataset.
- "Comparison": Pick a set of methods with existing implementations and rigorously test their performance.
Note: It is ok if the project does not in the end completely match the plan given in the abstract. After all, it is just a plan, and it is natural that some things will change during the project.
To propose the topic for your group, please write an abstract of ca. 1 page about the project that your group will be pursuing. Please submit the abstract in the MyCourses return box by the deadline, April 5.
Here is an example of the Project topic abstract style. Hint: the Oxford University Press LaTeX style file produces this layout if you use the documentclass options 'modern,large'.
What to write in the abstract:
- Briefly describe the biological application you are focusing on. Is your project of the type "Demo" or "Comparison"?
- Describe the machine learning methods that you are focusing on, including bibliographic reference(s) to the paper(s) describing the method(s). Describe the datasets that you will be using.
- What kind of experiments do you plan to conduct? What kind of evaluation metrics will you use? How will you present the results?
- Who will do what? Note that there are different tasks and your group members have different competences. Note that this is only a plan, you can change things during the project. Try to maintain a balanced workload.
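As an illustration of what an experiment plan for a "Comparison" project could look like, here is a minimal sketch of k-fold cross-validation that compares two methods with the same folds and reports mean accuracy with its standard deviation. Everything in it is an assumption made for illustration: the synthetic two-class data, the two toy methods (`majority_baseline`, `nearest_centroid`), the helper names, and the choice of accuracy as the metric. A real project would substitute its own dataset, methods, and metrics.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_baseline(train_X, train_y):
    """Hypothetical baseline: always predict the most common training label."""
    label = max(set(train_y), key=train_y.count)
    return lambda x: label


def nearest_centroid(train_X, train_y):
    """Hypothetical method: predict the label whose feature mean is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
        return min(centroids, key=sq_dist)

    return predict


def cross_validate(fit, X, y, k=5, seed=0):
    """Return the per-fold accuracies of `fit` under k-fold cross-validation."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(model(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


# Synthetic two-class data (an assumption; replace with the real dataset).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

# Same seed, hence the same folds, for every method being compared.
methods = {"majority_baseline": majority_baseline,
           "nearest_centroid": nearest_centroid}
results = {name: cross_validate(fit, X, y) for name, fit in methods.items()}

for name, scores in results.items():
    mean, sd = statistics.mean(scores), statistics.stdev(scores)
    print(f"{name}: accuracy {mean:.3f} +/- {sd:.3f}")
```

Using identical folds for all methods keeps the comparison fair, and including a trivial baseline makes it easier to judge whether a reported score is actually good.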
Poster Presentation
The projects will be presented at a poster exhibition on June 2.
- Poster should catch the eye from a few meters distance
- See examples of posters from ICML 2022 conference
- Emphasis on visualizations, avoid massive amounts of text
- The size of the poster is usually either A0 or A1
- Tools like PosteRazor can be used to print the large poster in small pieces
- It is recommended to use the Aalto poster template (LaTeX) to generate the poster, with the layout as in the Aalto poster example
- Don’t forget to put your name(s) on the poster :)
Final report
- Written in Bioinformatics journal (Oxford University Press) style. Please use the templates offered by the journal. If you use the OUP LaTeX template, please use the document class options ‘modern,large’ to achieve the Bioinformatics journal style.
- Length 6-10 pages
- For report structure, the canonical structure of a scientific paper is recommended:
- Abstract, Introduction, (Materials and) Methods, Results & Discussion, Conclusions
- For writing in LaTeX, you may use, for example, the Overleaf tool or any environment of your choice.
Hints on the writing style
Overarching aim: A scientific paper should convince other scientists.
A good scientific writing guide: Chris A. Mack: How to write a good scientific paper
- Communicate in an unambiguous way: there should be no room for multiple interpretations of what you have written (use of math helps here enormously)
- Always justify your claims: either refer to previous literature or demonstrate them in your own experiments; leave opinions to other kinds of writing
- Communicate in as simple and transparent a way as possible: minimize the effort of the reader; avoid overly complicated sentences; choose mathematical notation carefully (minimalism is good)
- Imitate the style of good papers: it is ok to copy the writing style (not the content!) from other papers
Presentation of material
- Text, mathematics and figures can all be used to convey scientific information. A good scientific paper in bioinformatics should contain all three.
- Think about the best way to communicate a given piece of information: math is the best tool to describe models and algorithms. Figures are good for showing the overall idea and relationships, and for summarising results. Text glues everything together.
Using literature
It is not enough to use one scientific article for the project paper!
- How many references should you use and cite? As a rule of thumb, "at least as many references as there are pages in the paper". This is applicable for longer works as well (MSc theses, PhD theses).
- Try to locate the best papers about the topic. You will probably end up reading more papers than you will eventually use.
- Try to make a synthesis of the literature. What is the main message of a set of papers about some topic? How do the individual papers relate to or deviate from this main message?
Sources of information
The quality of the literature is one of the important aspects of a research article. Below is a preference order for types of source literature:
- High-quality journals: e.g. Bioinformatics, PLoS Computational Biology, BMC Bioinformatics, Nature, Genome Research, ...
- Proceedings of high-quality conferences: e.g. Intelligent Systems for Molecular Biology (ISMB), International Conference on Machine Learning (ICML), Research in Computational Molecular Biology (RECOMB)
- Textbooks contain high-quality information. However, as the publication process of books takes very long, the information in textbooks is rarely the latest in science. Textbooks can be used as sources of information, but they should always be accompanied by journal and conference papers.
- Wikipedia contains a lot of information and is sometimes a good source for getting an overview of the seminar topic. However, the quality of Wikipedia articles varies. In particular, the peer-review process behind a Wikipedia article is typically not at the same level as that of high-quality scientific journals and conferences. You may use Wikipedia as a means to learn about some topic. However, avoid using Wikipedia as the only source of information. Always verify the facts using other sources of information.
- Online course material is widely available on the web. It should be used with even more caution than Wikipedia. Some courses are very good, some are not, and there is no peer-review process behind the material. Online courses should not be used as references in your seminar paper.
- The rest of the web. A random web page by some individual, organization or group about some subject typically has very little quality control behind it. This material is not suitable as seminar paper material.
Publication forum ranking
Publication Forum (Finnish acronym JUFO) is a Finnish initiative to rank journals and conferences by their status in each research field.
- basic level
- leading level
- highest level
Generally speaking, the higher the ranking the better, but all levels are considered "ok". You should, however, be careful with journals and conferences that are not ranked at all. You can search for the Publication Forum rank of a journal or a conference.
Finding information
- Google Scholar is perhaps the best search engine for finding literature on a given topic.
- Electronic journals: Aalto has subscriptions to a wide range of electronic journals; you can access these from the university computers.
- To access the electronic journals Aalto subscribes to via Google Scholar, remember to enable "Library Links" in Google Scholar preferences
A combination of two search strategies will lead to the best results:
- Google Scholar will give you well-referenced articles that match the keywords. These are often a bit older. Hint: from Scholar you can easily access the list of papers that cite a given paper; the most cited of these are probably important
- A systematic search through the tables of contents of the latest issues of good journals will give you the very latest work on the topic; Scholar will not give you these