    How to lie with statistics? (5cr)

    This is an advanced course in statistics. The course is aimed at doctoral students and master's students interested in statistics. Maturity in performing statistical analysis is needed and thus students should have taken at least one master's level statistics course before attending this course. There are no other prerequisites.



    During this course, students will talk about typical problems and faults in sample selection, choices of location measure, graphical presentation of data, forming questionnaires, statistical testing, regression analysis, and clinical trials. Students are assumed to be familiar with these methods before attending the course. The focus will be on examples about using these methods wrongly --- either accidentally or on purpose --- and on improving statistical analyses.


    Intended learning outcomes

    The objectives are to learn to evaluate statistical analyses critically, to learn to avoid typical pitfalls in simple statistical analyses and to learn to improve presentation of the results obtained in statistical analyses. The objective is not to learn to lie with statistics, but to learn to spot if there is something fishy in a statistical analysis. The ultimate goal is to learn to tell the truth with statistics.


    Lectures and assignments

    The course consists of 12 lectures, lecture assignments, project work, and a study journal. Lectures are on Mondays and Wednesdays from 10.15 to 12.00. The lectures are given on Zoom. Please note that the lectures are not recorded, and students are expected to attend them. Most of the lectures consist of discussion rather than traditional lecturing. Students find problematic data examples themselves, and their findings and ideas for improving the data analyses are discussed during the lectures. Students also learn to defend their ideas and discoveries through their project work, in which statistical analyses are used to justify opinions and claims. Students also write a study journal, in which they may note down their thoughts and reactions to what has been discussed. Writing and submitting a study journal on time is compulsory for completing the course!



    Lecture topics

    Lecture 1: Introduction. We talk about the project work, all the lecture assignments, and the common errors and problems related to the lecture assignment topics.

    Lecture 2: Getting ready for the project work

    Lecture 3: Selecting the sample

    Lecture 4: Measures of location

    Lecture 5: Graphics

    Lecture 6: Questionnaires

    Lecture 7: Testing

    Lecture 8: Regression analysis

    Lecture 9: Statistics related to the current pandemic

    Lecture 10: Miscellaneous

    Lecture 11: Project work presentations

    Lecture 12: Summary


    Lecture assignments

    There is an assignment related to almost every lecture. Submit your assignments on time! Late submissions are not accepted!

    For Lecture 2, every student has to come up with at least two possible project work topics. In Lecture 2, we will discuss the topics and every student selects a topic. Project work presentations take place in Lecture 11, so there is plenty of time to prepare.

    For Lectures 3 to 10, every student has to prepare the following:

    Lecture 3: one real data example or two invented examples that illustrate the problems related to a biased sample.

    Lecture 4: one real data example or two simulated examples where different location measures tell completely different stories.

    Lecture 5: one real data example or two simulated examples of misleading graphical presentation.

    Lecture 6: one real data example or two self-written examples of badly worded questionnaire questions or answer choices.

    Lecture 7: one real data example or two simulated examples where the results of statistical testing are false or misleading.

    Lecture 8: one real data example or two simulated examples where regression analysis gives misleading results.

    Lecture 9: one example of misleading interpretation, analysis, or comparison of data related to the COVID-19 pandemic, or a discussion of two possible problems related to the topic.

    Lecture 10: one real data example or two simulated examples of false statistical analyses.

    Examples and ways to improve statistical analyses are discussed during the lectures. 
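
    As an illustration of what a simulated example for Lecture 4 might look like, the following Python sketch draws right-skewed "incomes" and compares the mean with the median. It is only one possible approach, not a required format: the function name `simulate_incomes`, the log-normal distribution, and its parameters are assumptions chosen for illustration.

```python
import random
import statistics


def simulate_incomes(n=10_000, seed=1):
    """Draw n right-skewed 'incomes' from a log-normal distribution.

    The distribution and its parameters (mu=10, sigma=1) are illustrative
    assumptions, not values taken from the course material.
    """
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    return [rng.lognormvariate(10.0, 1.0) for _ in range(n)]


incomes = simulate_incomes()
mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

# Share of simulated people who earn less than the "average" income.
share_below_mean = sum(x < mean_income for x in incomes) / len(incomes)

print(f"mean income:   {mean_income:,.0f}")
print(f"median income: {median_income:,.0f}")
print(f"share below the mean: {share_below_mean:.0%}")
```

    In skewed data like this, the mean is pulled up by a few large values, so reporting "the average income" without saying which average is meant can suggest a typical income that most of the sample does not reach.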

    Study journal

    To complete the course, students have to keep a study journal (approximately 1/2 pages per lecture). Writing and submitting the study journal on time is compulsory for completing the course!


    The assessment is based on the lecture assignments, the compulsory study journal, and the project work. Writing and submitting the study journal on time is compulsory for completing the course! The final grade of the course is given by

    grade = 5 - 0.5ms - 0.5ma - 1md - 2ij,

    where ms is the number of lectures the student missed, ma is the number of lecture assignments the student missed, md is 1 if the student does not present their project work (and 0 if they do), and ij is 1 if the student's study journal is incomplete (and 0 if it is complete). The grades are rounded up to the nearest integer. For example, grade 5 may be obtained by full attendance, completing all but one of the lecture assignments, submitting a complete study journal on time, and presenting the project work. Grade 3 may be obtained by full attendance, completing all the lecture assignments, and submitting an incomplete study journal on time. Grade 1 may be obtained by attending all but 2 lectures, completing all but 2 lecture assignments, and submitting an incomplete study journal on time.
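
    The grading rule can be checked with a short Python sketch. The function name `final_grade` and its parameter names are hypothetical, and "rounded up" is assumed to mean the ceiling function. The weight of the ij term is exposed as the parameter `journal_weight`; its default of 2 is the value that reproduces the three worked examples in the text. The sketch does not clamp the result to a minimum grade, since the text does not state one.

```python
import math


def final_grade(missed_lectures, missed_assignments,
                presented_project, journal_complete,
                journal_weight=2.0):
    """Compute the course grade from the penalty terms ms, ma, md and ij.

    journal_weight is the weight of the ij term (assumption: 2 is the value
    that matches the worked examples in the text).
    """
    md = 0 if presented_project else 1   # 1 if the project work is not presented
    ij = 0 if journal_complete else 1    # 1 if the study journal is incomplete
    raw = (5
           - 0.5 * missed_lectures
           - 0.5 * missed_assignments
           - 1.0 * md
           - journal_weight * ij)
    return math.ceil(raw)  # "rounded up" interpreted as the ceiling


# The three worked examples from the text (project work assumed presented):
print(final_grade(0, 1, True, True))    # one missed assignment
print(final_grade(0, 0, True, False))   # incomplete study journal
print(final_grade(2, 2, True, False))   # 2 lectures and 2 assignments missed, incomplete journal
```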



    Most of the students' workload comes from independent assignments. Lecture assignments take on average 7*8 = 56 hours to complete. That includes finding representative data examples and observing problems in them. Writing the study journal takes on average 20-25 h in total. Project work takes on average about 15-20 h. Attending the lectures takes 24 h in total.
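
    The workload figures can be added up as follows. This is only a quick arithmetic sketch; the dictionary name `workload_hours` is hypothetical, and each entry holds the lower and upper estimate in hours as given in the text.

```python
# (low, high) estimates in hours, taken from the workload description
workload_hours = {
    "lecture assignments": (7 * 8, 7 * 8),
    "study journal": (20, 25),
    "project work": (15, 20),
    "lectures": (24, 24),
}

total_low = sum(low for low, _ in workload_hours.values())
total_high = sum(high for _, high in workload_hours.values())

print(f"estimated total workload: {total_low}-{total_high} h")
```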


    Learning materials

    The main materials for this course are the examples found by the students. The book "How to Lie with Statistics" by Darrell Huff (illustrated by Irving Geis) may also be used as study material, but there is no need for students to purchase this book for the course.