Please note! The course description is confirmed for two academic years, which means that in general, e.g. learning outcomes, assessment methods and key content stay unchanged. However, via the course syllabus, it is possible to specify or change the course execution in each realization of the course, such as how the contact sessions are organized, how assessment methods are weighted or which materials are used.

LEARNING OUTCOMES

Learning outcomes for this course, upon successful completion, include the ability to: 1) understand principles of programming using the Python programming language, 2) use Python to collect data from various sources for analysis, 3) employ Python for data cleaning, 4) implement statistical and predictive models in Python using business data, 5) understand how to choose the correct statistical or predictive model based on the available data and business context, and 6) understand how the information resulting from data analysis leads to improved business decision-making.

Credits: 6

Schedule: 04.01.2021 - 22.01.2021

Teacher in charge (valid 01.08.2020-31.07.2022): Joan Lofgren

Teacher in charge (applies in this implementation): Dustin White

Contact information for the course (valid 07.12.2020-21.12.2112):

Dustin R White, PhD

Assistant Professor of Economics, University of Nebraska at Omaha



I study sports, labor, and health economics, and teach lots of data-driven coursework. While I grew up in the Seattle, WA area, I also lived for two years in southern Brazil, and speak Spanish and Portuguese (though I am a bit out of practice!). I love learning languages, and I have caught EVERY SINGLE Pokémon!

Email: drwhite@unomaha.edu

Office hours/lab time will take place during the 60-90 minutes immediately following the daily lecture. All portions of the course will occur via Zoom. During lab, you will be able to break out into small groups to work together on the material for a given class period. I will spend the time answering questions and checking in with your groups as you work through exercises related to the course material.

CEFR level (applies in this implementation):

Language of instruction and studies (valid 01.08.2020-31.07.2022):

Teaching language: English

Languages of study attainment: English

CONTENT, ASSESSMENT AND WORKLOAD

Content
  • Valid 01.08.2020-31.07.2022:

    This course is intended to introduce the student to programming languages as tools for conducting data analysis, focusing on Python in particular. The course will cover basic principles of programming languages, as well as libraries useful in collecting, cleaning and analyzing data in order to answer research questions. Students will learn to use Python to apply forecasting tools and predictive models to business settings. The course will be divided between lecture and lab time, and labs will be focused on teaching students how to implement the programming techniques and statistical models discussed in lectures.

  • Applies in this implementation:

    Session 1 – 4 Jan 2021

    Introduction to using Python. We will cover opening notebooks and basic functions in Python.

    Class at 1200 UTC (1500 Finland time)

    No reading or assignments due

    Session 2 – 5 Jan 2021

    Loops and Conditions. We will focus on creating logical conditions for our programs to meet, as well as looping through code to streamline repeated processes.

    Class at 1200 UTC (1500 Finland time)

    Assignment 1 due one hour prior to the start of class (1100 UTC, 1400 Finland time).
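
    As a small illustration of the ideas in this session, the sketch below combines a loop with a condition to count how many days meet a sales target. The figures and the names daily_sales and target are invented for this example and are not taken from the course materials.

```python
# Hypothetical daily sales figures, invented purely for illustration.
daily_sales = [120, 85, 240, 60, 175]
target = 100  # assumed threshold for a "good" day

days_above_target = 0
for amount in daily_sales:      # loop: visit each day's figure in turn
    if amount >= target:        # condition: only count days that meet the target
        days_above_target += 1

print(days_above_target)
```

    The same pattern (loop over items, test each one, update a running result) carries over to most repeated processing tasks.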

    Wednesday, Jan 6: Epiphany

    No course activities

    Session 3 – 7 Jan 2021

    Functions. Creating functions in a programming language allows us to reuse code in many contexts and to solve new problems. We will explore how to do this in Python so that we better understand the code we will be using moving forward.

    Class at 1200 UTC (1500 Finland time)

    Assignment 2 due one hour prior to the start of class (1100 UTC, 1400 Finland time).
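
    To make the idea of reusable code concrete, here is a minimal sketch of a function. The name percent_change and the numbers passed to it are hypothetical and chosen only for illustration.

```python
def percent_change(old, new):
    """Return the percentage change from old to new."""
    if old == 0:
        # A percentage change from zero is undefined, so fail loudly.
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100


# The same function can be reused with any pair of values.
print(percent_change(200, 250))
```

    Once defined, the function can be called wherever the calculation is needed instead of rewriting the formula each time.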

    Session 4 – 8 Jan 2021

    Data Frames and Pandas. We will practice importing and utilizing data in Python. This is the basis for being able to conduct analysis in Python.

    Class at 1200 UTC (1500 Finland time)

    Assignment 3 due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Session 5 – 11 Jan 2021

    Regular expressions and text analysis. Sometimes it is advantageous to be able to process text into quantifiable information. Regex provides us the capability to transform text and quickly extract patterns from raw data.

    Class at 1200 UTC (1500 Finland time)

    Assignment 4 due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Read Numsense! Chapter 1
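
    The following sketch shows one way regular expressions can turn raw text into numbers, using Python's built-in re module. The sample sentence and both patterns are assumptions made up for this example.

```python
import re

# Hypothetical raw text, invented for illustration.
text = "Order 1041 shipped for $25.50; order 1042 shipped for $7.99."

# Capture dollar amounts with exactly two decimal places and convert to floats.
prices = [float(p) for p in re.findall(r"\$(\d+\.\d{2})", text)]

# Capture the digits that follow the word "Order" or "order".
order_ids = re.findall(r"[Oo]rder (\d+)", text)

print(prices)
print(order_ids)
```

    Note that re.findall returns strings, so numeric values need an explicit conversion before they can be used in calculations.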

    Session 6 – 12 Jan 2021

    Plotting in Python. We will create visuals using Python to supplement the stories that we tell with data.

    Class at 1200 UTC (1500 Finland time)

    Assignment 5 due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Session 7 – 13 Jan 2021

    Introducing Linear Regression and its implementation in Python. Linear regression provides a jumping-off point for statistical analysis, and gives us a chance to prepare our data for analysis.

    Class at 1200 UTC (1500 Finland time)

    Assignment 6 due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Read Numsense! Chapter 6
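
    For intuition, a one-variable linear regression can be computed by hand with ordinary least squares. The sketch below uses only plain Python; the course itself may well rely on a statistics library instead, and the function name fit_line and the data are hypothetical.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data that lie exactly on the line y = 1 + 2x.
ad_spend = [1, 2, 3, 4, 5]
revenue = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, revenue)
print(slope, intercept)
```

    With real data the points will not fall exactly on a line, and the fitted slope describes the average change in the outcome per unit change in the predictor.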

    Session 8 – 14 Jan 2021

    Classification and Regression Trees. Decision trees will give us a chance to discuss machine learning and why it differs from regression analysis.

    Class at 1200 UTC (1500 Finland time)

    Assignment 7 and project proposal due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Read Numsense! Chapter 9

    Session 9 – 15 Jan 2021

    Random Forests and ensemble methods. Ensemble methods provide improved accuracy and robustness relative to single machine learning models. We will explore these properties through random forest models.

    Class at 1200 UTC (1500 Finland time)

    Assignment 8 due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Read Numsense! Chapter 10

    Session 10 – 18 Jan 2021

    Clustering models. We will explore unsupervised learning through the k-means clustering algorithm, and learn to identify groups of observations within data, both as a tool for prediction and as a way to better understand the available data.

    Class at 1200 UTC (1500 Finland time)

    Assignment 9 due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Read Numsense! Chapter 2
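
    The core loop of k-means (assign each point to its nearest center, then move each center to the mean of its points) can be sketched in a few lines. This is a simplified one-dimensional version with fixed starting centers, written for illustration only; the function name kmeans_1d, the starting centers and the data are all assumptions, and library implementations handle initialization and multiple dimensions for you.

```python
def kmeans_1d(values, centers, iterations=10):
    """Run a basic k-means loop on one-dimensional data."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Invented customer spending values with two visible groups.
spend = [10, 12, 11, 50, 52, 49]
centers, clusters = kmeans_1d(spend, centers=[0, 60])
print(centers)
print(clusters)
```

    No labels are supplied anywhere in this process, which is what makes the method unsupervised.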

    Session 11 – 19 Jan 2021

    Cross-Validation. We want our models to work in the real world. Using cross-validation, we can use our data to mimic the real world and ensure that, to the best of our ability, our data practices represent the events that we expect to encounter as we implement our models.

    Class at 1200 UTC (1500 Finland time)

    Assignment 10 due one hour prior to the start of class (1100 UTC, 1400 Finland time).
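
    One common form of cross-validation is k-fold splitting, in which every observation is held out for testing exactly once. The sketch below shows only the splitting logic in plain Python; the helper name k_fold_indices is hypothetical, and it uses contiguous folds without shuffling, which is a simplification.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


folds = list(k_fold_indices(6, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

    A model would be fit on each train set and scored on the matching test set, and the scores averaged to estimate how it might perform on data it has not seen.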

    Session 12 – 20 Jan 2021

    Web scraping allows an analyst to collect data from nearly any resource that can be accessed online. This powerful tool allows for the examination of complex problems and the creative collection of resources to address many different needs.

    Class at 1200 UTC (1500 Finland time)

    Assignment 11 due one hour prior to the start of class (1100 UTC, 1400 Finland time).
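
    At its simplest, scraping means parsing a page's HTML and pulling out the pieces of interest. The sketch below uses Python's built-in html.parser on a hard-coded HTML string so that it runs without a network connection; the class name LinkCollector and the page content are invented for this example, and the course may use other scraping libraries.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page content standing in for a downloaded web page.
page = '<html><body><a href="/reports/q1">Q1</a> <a href="/reports/q2">Q2</a></body></html>'

collector = LinkCollector()
collector.feed(page)
links = collector.links
print(links)
```

    In practice the page text would come from an HTTP request, and real pages are usually messier than this example.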

    Session 13 – 21 Jan 2021

    Where possible, the use of Web APIs to streamline data collection is a valuable tool. Data collected by API is typically clean and standardized, unlike the data that is collected through web scraping.

    Class at 1200 UTC (1500 Finland time)

    Assignment 12 due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Read Numsense! Chapter 5
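
    Many web APIs return JSON, which maps directly onto Python dictionaries and lists. The sketch below parses a hard-coded response body rather than calling a real service; the field names results, city and temp_c and all of the values are made up for illustration.

```python
import json

# Hypothetical API response body; a real one would arrive from an HTTP request.
response_body = (
    '{"results": ['
    '{"city": "Helsinki", "temp_c": -3.5}, '
    '{"city": "Omaha", "temp_c": -8.0}'
    ']}'
)

payload = json.loads(response_body)

# Structured fields can be read directly, with no pattern matching required.
temps = {row["city"]: row["temp_c"] for row in payload["results"]}
print(temps)
```

    Compared with scraping, the structure is already explicit, which is why API data usually needs far less cleaning.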

    Session 14 – 22 Jan 2021

    Project presentations. Each student will present a brief summary of a research question they have answered during the term, and policy implications from the results that they have uncovered.

    Class at 1200 UTC (1500 Finland time)

    Assignment 13 due one hour prior to the start of class (1100 UTC, 1400 Finland time).

    Project presentation and writeup due one hour prior to the start of class (1100 UTC, 1400 Finland time).


Assessment Methods and Criteria
  • Applies in this implementation:

    Mimir Software:

    Coding exercises will form the entirety of the homework assignments. These assignments will be completed through the Mimir Classroom web application. This application provides access to a virtual machine that can run all of the code you will need to implement for this course. It will run on any machine that is able to use Google Chrome. Other browsers may also work, but my experience suggests that Google Chrome will be the most compatible.

    Mimir Classroom will give you near-instant feedback on your code exercises, and will also let you submit your code as many times as you would like, so that you can keep practicing until you get the problem right. This will help your grade almost as much as it will help you to learn to code!

    You will receive an email invitation just before the beginning of the course, so that you can register for our classroom. I will walk you through the application on the first day of class.

Workload
  • Applies in this implementation:

    Number of hours

    Faculty-led engagement (may include synchronous sessions and asynchronous interaction, e.g. viewing recorded lectures, distance teamwork and other peer interaction such as threaded discussions): 45

    Self-study hours (may include acquisition of content and assignment completion): 115

      Work with course materials, e.g. required reading: 40
      Exam preparation: 0
      Individual research & writing: 50
      Team projects (meetings, research, preparation, etc.): 25
      Other: 0

    Total of all student workload hours: 160


DETAILS

Prerequisites
  • Valid 01.08.2020-31.07.2022:

    none