Översikt

  • A warm welcome to the Data Analytics for Managers course page!

    This course consists of two modules of 3 ECTS each: 1) Principles of planning and leading with data; and 2) Decision making based on optimization and modelling.

    On this hands-on course you get to analyze data related to real-life business decisions and make recommendations based on them. The course is designed so that we have lectures on Monday morning at 9.15 -12 in which the assignment for that week is introduced. We will have an optional workshop on Friday afternoons (12.15 - 15; in person only) if you are having challenges with the data analysis assignment. The data analysis assignment for each week is due at 23.00 on Friday.

    All Monday lectures are offered in hybrid format via Zoom. The Zoom link is available below. Unfortunately our classroom (V002 SAASTAMOISEN SÄÄTIÖ) does not have enough capacity to accommodate the large number of students that have registered for this course. Please book a slot in advance below for attending the session in person. The classroom capacity is 60 students.

    Please find the course syllabus below (updated version with all readings included).

    Best wishes,

    Iiris



  • Tillgänglig om någon av:
    • You are a(n) Teacher (MC)
    • You are a(n) Studerande
  • Articles and other readings assigned for modules

    (scroll through – no need to learn these by heart)

     

    Module 1:

    Delen, D., & Ram, S. (2018). Research challenges and opportunities in business analytics. Journal of Business Analytics, 1(1), 2-12.

    Hansen, H. K., & Mühlen-Schulte, A. (2012). The power of numbers in global governance. Journal of International Relations and Development, 15(4), 455-465.

    Martinez, L. R. (2022). How Much Should We Trust the Dictator’s GDP Growth Estimates?. Journal of Political Economy, 130(10), 000-000. (Read pages 1-5 only)

    Richards, N. M., & King, J. H. (2014). Big data ethics. Wake Forest L. Rev., 49, 393.

    Saltz, J. S., & Dewar, N. (2019). Data science ethical considerations: a systematic literature review and proposed project framework. Ethics and Information Technology, 21(3), 197-208.

     

    Module 2:

    Below is a set of articles that describe how optimisation methods have been applied in real-life situations. Reading these will help understand the potential of the techniques and the practicality and biggest challenges in applying them. Reading one or two will already give most benefits and you can choose the ones that sound more interesting to you. However, the first one is perhaps to most straight forward and is thus recommended as a starting point. When reading for this course, it will be much more useful to focus on the general ideas than trying to remember details.

    Mandatory article:

    "Strategic Planning of BMW’s Global Production Network" (full text inside Aalto network)

    https://pubsonline.informs.org/doi/abs/10.1287/inte.1050.0187

     

    Optional articles:

    "Production and Distribution Planning in Danone Waters China Division " (full text inside Aalto network)

    https://pubsonline.informs.org/doi/10.1287/inte.2018.0973

     

    "Selecting infrastructure maintenance projects with Robust Portfolio Modeling" (full text inside Aalto network)

    http://www.sciencedirect.com/science/article/pii/S0167923615000858

     

    "Kroger Uses Simulation-Optimization to Improve Pharmacy Inventory Management" (full text inside Aalto network)

    http://pubsonline.informs.org/doi/pdf/10.1287/inte.2013.0724

     

    "Optimizing Chevron’s Refineries" (full text inside Aalto network)

    http://pubsonline.informs.org/doi/pdf/10.1287/inte.2013.0727

     

    "Blending OR/MS, Judgment, and GIS: Restructuring P&G's Supply Chain" (full text inside Aalto network)

    Blending OR/MS, Judgment, and GIS: Restructuring P&G's Supply Chain | Interfaces (informs.org)

     

    "Improving Supply-Chain-Reconfiguration Decisions at IBM" (full text inside Aalto network)

    http://pubsonline.informs.org/doi/pdf/10.1287/inte.1050.0166

     

    "Lockheed Martin Space Systems Company Optimizes Infrastructure Project-Portfolio Selection" (full text inside Aalto network)

    http://pubsonline.informs.org/doi/abs/10.1287/inte.1080.0378

     


  • Tillgänglig om någon av:
    • You are a(n) Studerande
    • You are a(n) Lärare
    • You are a(n) Teacher (MC)

    Our case company, Skipperi, is seeking to expand their operations to new cities worldwide. As a result, you are provided with a list of potential cities, and variables to measure the attractiveness of these cities. The excel for the assignment can be found below.

     

    Week 1: Assessing the data at hand; Creating a database

     1.       Getting familiar with the potential cities and variables

     a.       The list of potential cities can be found in the “Data template” sheet of the Excel file. Scroll through it so that you know what the potential options are. Create a pivot table with continent, country and city.

    b.       Practice different hot keys to move around the worksheet.

    c.       Get familiar with the variables given to measure the attractiveness of the cities. Some of these variables should be added in your analysis by default, others are optional for you to decide. The list of variables can be found at “List of potential variables” on the Masterfile. Evaluate the credibility of the sources and suitability of the data for the analysis at hand. Also consider their level of analysis (country, city) and if certain variables are measuring the same thing. Check out the raw data, found on the other excel sheets, to see the which year the data is from, if there are missing values, etc.

    d.       Add your comments, regarding the suitability and relevance of the data for Skipperi’s location selection process, to the “notes” column in the “List of potential variables” sheet.

    e.       Find at least one additional variable that is not included in the given variable list and include it in your database. Justify why you selected this variable. Please do not select a variable that is already listed in the Excel. Note: Sometimes raw data is in cvs. file format. In that case the data needs to be formatted using the “text to column” command before it can be attached to the “Data template” sheet in the next step.

    f.       If you want, you can freeze the panes at the "Data template" sheet in order to make the variable names and city list visible at all time. This would make it more user friendly. To do that, select cell H4, then click "View" and "Freeze panes". The instructions for this are at the end of the "getting started video".

    Note: If you have MAC, the tab is called "Data" / "Tiedot" and then click on "Filter" / "Suodatin"

     

    2. Creating your own database for further analysis

    a.       Attach the mandatory variables to the “Data template” sheet with V-Lookup command for each of the PESTEL categories.

    Note:

    • Some databases use a comma to separate decimals and some use a dot. Most of the variables that we have use comma as a decimal separator but some use a dot (e.g. Environment1 and Technology1). Excel does not recognize it as a value if it does not have the same decimal separator as in your excel settings. In the tutorial excel the decimal separator is a comma but it may change depending on your excel settings. You can change the dots into commas with "replace" function (Shift + H). However, double check that the values stay the same.
    • If V-Lookup does not work with +vlookup command on your computer, try =vlookup. You can also insert the V-Lookup formula by selecting the cell and then clicking the “Formulas” tab and then selecting “Lookup & Reference”. You can find Vlookup under that.
    • V-Lookup uses only one lookup value column based on which it searches information in the range that you have specified. When you use V-Lookup for the country level variables, use first column B as your lookup value. Then for the missing values use the lookup value C.
    • When using V-Lookup, you always need to lock the selected values by typing semicolon ( ; ). However, the command in the MAC version is often comma instead of semicolon ( , ).
    • Social1 variable (Leisure time use): Non-OECD economies (China, India, South Africa) are listed in column B. Thus, the lookup_value needs to be adjusted so that it is column B instead of A when retreiving data for South Africa.
    • F4 refers to "F4" button at the top row of your keyboard. If you have MAC, use COMMAND+T instead.
    • In "Ease of Doing Business" variable (Political 3) there are some hidden columns. Thus, the column number is 61 in the V-Lookup command.


    b.       Attach 5-7 additional variables that you consider relevant and important for your analysis to the “Data template” sheet.


    Please find Excel tutorial videos below! In excel there are often many different ways to get to the same result, e.g. X-Lookup and V-Lookup. You can use other ways than what is instructed, as long as you get to the same outcome.

    Workshop (optional attendance) on Friday Oct 28 is starting at 12.15.

    Submit your excel file to MyCourses by Friday, Oct 28 by 23.00.

    ***

    Week 2/3: Conducting analysis (Excel) Instructions given Monday Oct 31

    Week 3/3: Formulating recommendations and visualizing the findings (Excel and Power Point) Instructions given Monday Nov 7



  • Tillgänglig om någon av:
    • You are a(n) Studerande
    • You are a(n) Teacher (MC)
    • You are a(n) Lärare

    Week 2: Describing and summarizing data; Conducting analysis

    Last week we selected the variables that we want to use to analyze the cities’ attractiveness for Skipperi. We also created our own worksheet (Data template) which we will continue to use for this week’s assignment.

    In this week’s assignment we will first fill in the missing values. A common way to do this is to calculate a sub-sampled mean, for example, we may insert the country mean in missing city-level entries. If the missing values are for country-level variables (or the country level variables are not available for the city level variable), we could then take the mean of the continent. If there is no data for the continent, we could use the average of all cities or countries for the purposes of our analysis. You can also use a neighboring country if you believe that it’s a better estimate than the average of the continent. Do consider though how this will affect your subsequent analysis.

    You can decide which measure you will use but make a note and remember to consider how this may have affected the final recommendations you will suggest in the third assignment. Add these remarks below the variables at the “Data template” sheet. Pivot tables are handy for calculating the descriptive statistics (e.g. mean values) per country and continent. See also variable-specific notes below.

    We will also create scales and weighting for each variable in order to calculate the attractiveness of each city. We are using the following scale to rank each of the variables that you have selected in your analysis:

    very good

    3

    good

    2

    ok

    1

    poor

    0

     

    We will also then assign weights for each variable that you use. You can decide the weight of each variable based on their importance in your analysis. The weights of all variables totaled together should equal 100%.

    The following excel tutorial videos explain the logic behind V-Lookups and Pivot tables nicely. You may want to watch them before moving to the tasks explained below. Even if you have used excel prior to this course, there could be some useful tips (hot keys explained, etc).

     

    V-Lookup tutorial video: https://youtu.be/d3BYVQ6xIE4

    Pivot table tutorial video: https://youtu.be/9NUjHBNWe9M (until 11:30)

    Dollar sign tutorial: https://www.youtube.com/watch?v=b3i6AgMEssI


    Todo:

    1. Filling in missing values

    Change the data source of your pivot table to include the whole database on the “Data template” sheet. You can do this by clicking on your existing Pivot table to activate it. Then select “PivotTable Analyze” tab and click on ”Change data source”. It will take you automatically to the “Data template” sheet because that is the source of our Pivot table data. Then select the whole table with hot keys (Windows: Ctrl+Shift+Arrow key; MAC: Shift+COMMAND+Arrow key).

    See which variables in your database have missing values. Work on one variable at the time. Drag the variable that you are working on to “Values” box of the Pivot table fields. The default value of the variable is typically “sum” but since we are interested in the mean (average), we need to change the value accordingly. To do this, right click on the respective Pivot table column and select “Summarize values by”. Then select “Average”. Alternatively, you can click on the little triangle in the variable and select (Value Field Settings). A separate small window opens from which you can choose the type of calculation that you want to use to summarize data from the selected field. Select “Average” and click “OK”. 

    Then go back to the “Data template” sheet and retrieve the missing values with V-Lookup from the Pivot Table. Once you have filled in all missing values, I recommend copying the output in the columns from the V-Lookup and then pasting them ‘as values’, otherwise they may be prone to change. Repeat this for each variable that you have chosen to be included in your variables list.

    Note:

    ·       Econ3_ Metropolitan GDP per capita: Missing values for the Metropolitan GDP per capita, however, can be replaced with the mean of the GDP per capita of the country (Social2 variable). However, you may want to use the same year’s values for the GDP per capita than what you have used for the Metropolitan GDP per capita.

    ·       Econ4_Purchasing Power Index by City: You can take a look at the original data sheet of the variable and take a mean of the two French cities, Lyon and Paris, that are listed in the database. The easiest way to do this is to go to the Econ4 datasheet, add a filter, and then sort column B (city) so that it contains France. You can then do the same for Mexico.

    ·       Econ5_Employment rate: For Middle East and South America, you can take the average of all countries for the purpose of our analysis.

    ·       Tech4_Smart city index: You may want to change the letters into numbers, e.g. by using the following scale: AAA=1 AA=2 A=3 BBB=4 BB=5 B=6 CCC=6 CC=8 C=9 DDD=10 DD=11 D=12. Thus, the lower the better. The replace function is useful here.

    ·       Social5_Population density: Instead of the sub-sampled mean, it may be justified to use the sub-sampled minimum because the cities that are not metropolitans typically have a lower population density than any of the metropolitan cities. In fact, even the minimum population density is probably a high estimate but it’s ok for our purposes. You can then select “Minimum” instead of “Average” in the “Summarize values by” option in Pivot Table.

    ·       Environment1 and Technology1: The original sourve data had a dot as a decimal separator. If your excel uses comma as a decimal separator then you need to change the dots into commas before your excel recognizes them as values. Replace command (Ctrl + H) is useful here. Make sure the values themselves to not change when you use the replace function!

     

    2. Determining scales for each selected variable

    First, unhide all columns in “Data template” sheet. Then conduct descriptive analysis to identify the following measures for the variables that you select in your analysis:

    ·       Mean, Median, Min, 25th percentile, 50th percentile, 75th percentile, Max,

    ·       Mean Europe, Mean North America, Mean Central America, Mean South America, Mean Middle East, Mean Asia, Mean Oceania (these are listed in column G186:G200)

    Second, make note of the measurement scale, below the descriptive statistics (i.e. “the higher the better” or “the lower the better”). Then, add values 0 (poor) to 3 (very good) for each quarter (lower than 25th percentile, lower than 50th percentile, lower than 75th percentile, 75th percentile or higher) in the grey area. In order to retrieve the correct value of 0-3 to the scale column next to each variable, we can use +IFS command.

    Tip: Command (WINDOWS version) for attaching the scale for “P1_WGI political stability” variable is the following (for cell cell I184):

    =IFS(H184<H$189;I$189;H184<H$190;I$190;H184<H$191;I$191;H184>=H$191;I$192) 

    However, make sure you understand where the command is coming from. Please note that here the last part of the command (>=H$191;I$192) refers that the value is equal of above 75% percentile (H$191), in which case it finds value in I$192.


    Note: There is a small (but important) mistake in the command for the continent means in this video. There should not be a dollar sign in front of the variable column (H) in order to be able to copy paste it to other variables. Thus: =+AVERAGEIF($A$4:$A$184;"EUROPE";H$4:H$184)


     

    3. Determining weights for each selected variable

    On your “List of potential variables” sheet, add two columns next to the notes columns and name them “Include in analysis” and “Weight”. Formulate the “Weight” column to display percentages. Also, add a row called “Total” below your own variable. Calculate the sum of each of these two new columns and display the value in the “Total” row. Indicate 1 in the “Include in analysis” column if you have included it. You can add a filter and hide all other variables to make it easier for you. Then evaluate the importance of each of the variables that you include in your analysis by adding weights for them. The total should be 100%.

    Link all the yellow cells on the “Data template” sheet to the weights that you have identified in the “List of potential variables” sheet. Create a formula (e.g. ='List of potential variables'!$K2) instead of manually typing the value because you can then go back to edit the weights at the “List of potential variables” sheet if you need to. The weights will then automatically update on your “Data template” sheet.


    4. Calculating the attractiveness of the city

    Calculate the weighted score on the “Data template” sheet by multiplying the scale for each city with the weight that you have specified for that specific variable. Copy-paste the formula to all variables in your analysis.

    Add a total column next to your own variable on the “Data template” sheet and calculate a sum of the weighted score of each of the variables that you have selected in your analysis. This indicates the  attractiveness score of the city.

     

     

     


  • Tillgänglig om någon av:
    • You are a(n) Studerande
    • You are a(n) Teacher (MC)

    Week 3: Visualizing the findings and formulating recommendations

    Last week we created an analysis table in which we combined the scales and weights for the variables we included in our analysis. Based on them we calculated the attractiveness of the potential cities.  

    In this week’s assignment we create a slide-based (ppt; 7-10 slides, focus on quality rather than quantity, NOTE: the number of slides is updated after the session 3, Monday Nov 7th) report to the case company in which we illustrate what we have done and why, as well as make our recommendations based on them. This includes reasoning for selecting our (at least the most important) variables, as well as their scales and weights. We also visualize our most important variables to show how the different cities are ranked in them. We then provide the case company with a short list of 2-3 countries (including your recommendation of the most attractive city in each one of them) based on the analysis that we have done. If there is one country that is significantly above the others, you can also focus on that only.

    Please note that Skipperi is currently operating in six countries. They are primarily interested in expanding their operations to cities in countries where they do not yet operate. However, they are also keen to understand how their current cities are ranking in your analysis, and in which cities they should expand their operations in the countries where they already operate. We should thus make their existing cities stand out in our analysis with adding an asterisk to the city name (e.g. Helsinki*). You can find the list of cities in which they currently operate here.

    Create your slide set based on the recommendations and tips provided by Jan Bittner / EY-Parthenon.

    Consider in this week’s tasks: Which visualizations aid the interpretability of our message? What do these selected visualizations show? What do these visualizations not show? Has the geopolitical situation in some cities/ countries changed considerably after the data was collected?

    Structure of the report:

    ·       Executive summary

    ·       Chapters with Level 2 and 3 logics, incl. illustrations

    ·       Conclusion (incl. overall conclusions, decisions that need to be made, further key issues to be investigated, next steps)

    ·       Do not use appendices – include them as part of the report

    Grading criteria:

    ·       Relevance: How well is the presentation content targeted at the audience (i.e. Skipperi)?

    ·       Logic and clarity: How well is the presentation organized and its core messages understood?

    ·       Visuals: How well are the slides visually designed (engaging visuals, headings, appropriate number, suitable amount of information etc.)?

     In addition to the graphs and other illustrations of your analysis, you can also add e.g. pictures of the cities. Only use pictures with “creative commons” license [Ppt: pictures -> online pictures -> select “creative commons only”] or your own.

     

    Please save your files with new versions as you work on them! This way you can always return to the latest version if something happens (e.g. undo does not return a worksheet if you accidentally delete it). It’s a good practice to save your files e.g. yyyy-mm-dd_v2 or dd-mm-yyyy_v2.

    Please submit the MS Excel file, PowerPoint file, and a pdf version of the ppt file in case the formatting changes when being opened on another computer. The final files should have your name and student numbers in the file name.

     

    Todo:

    1. Attach the list of Skipperi’s current cities to the Data template sheet

    Please make sure you have copied all variables as values (instead of having the V-Lookup commands) on your “Data template” before proceeding to the following steps.

    Add a column to the right of the City column (E). Open the list of Skipperi’s current cities. Run a V-Lookup in the new column so that the name of the city appears (column index: 1 in V-Lookup command) if it is listed in the current cities excel. If the city is not found in the current cities excel, it will appear as #N/A. Then save the V-Lookup commands as values and delete all #N/A. Add two other columns to the right of the City column (E). In column F,  add asterisk (*) to the cell if the city was found in the Skipperi’s current cities list. Then in column G, combine text from columns E and F (=E4&F4) so that the there is an asterisk attached to each city in which they operate at the moment (e.g. Helsinki*) but not in cities where they are not present. Save the commands as values and name the column e.g. City*. You can then delete the "working" columns so that you only have the original City column and new City* columns left.

    Add the asterisk also for the countries in which they currently operate.

     

    2. Create visualizations to support your message

    There are many visualization methods available in excel, and we will use them to supplement our recommendations. Here are some that you can use but you are welcome to use also other charts than what are mentioned here. Use at least three different types of charts in your report.

    a. Column chart

    You can illustrate e.g. the max score of the country. Add countries to the “rows” box of the Pivot Table and drag the total score to the “values” box. You can then take the maximum value of the country to illustrate which countries had the most attractive cities. Copy paste the Pivot Table as values to another sheet. Sort your table from largest to smallest and make a clustered column.

    b. Scatterplot

    GDP is a significant variable in the decision-making process when deciding where the case company should expand their business. Hence, the GDP per capita was also a mandatory variable in our analysis. Considering this, we could illustrate the relationship of one key variable with GDP per capita. To do this, firstly create a new sheet in your excel and label it “Scatterplot”, for example. Then go to the Pivot table and drag Country to the “rows” box. Then select the variable that you want to illustrate in the scatterplot and the GDP per capita (averages of both). Copy the pivot table as value to the “Scatterplot” sheet. Add a filter and sort the countries by GDP. Then create a scatterplot including the country names and their selected variable, add country names to the data points, title X axis as ‘Countries sorted by GDP’, then delete the X axis numbers above.

    c. World map

    You can create your own heatmap with the Excel -> Insert Map chart (e.g. WGI: Political Stability and Absence of Violence/Terrorism).

    d. Histogram

    Shows the distribution of the data grouped into bins. Useful to show, e.g. how the scoring of different cities is distributed (e.g. The Cities of the Future Index: citizen adoption).



    3. Create your slide set based on the recommendations and tips provided by Jan Bittner / EY-Parthenon.

     

    Note: I said that there will be no video tutorial for this week's assignment but after the session I was asked to make one to get you started. Please find it below.

     


  • Tillgänglig om någon av:
    • You are a(n) Studerande
    • You are a(n) Teacher (MC)
    Reflection papers (2x10%)

    These assignments helps you to reflect on the knowledge accumulated from the readings and lectures. The purpose of the reflection papers is to summarize the key knowledge and bring your own perspective to it. In the reflection paper, you should summarize YOUR key lessons of the material covered. You may for instance reflect on what you found surprising, particularly interesting, or things that you do not agree with. Excellent papers are ones where you describe your own learning as well as demonstrate critical and analytical thinking.


    1. Different types of data and tools (10 points) DL postponed to 25.11
    Some guiding questions:
    • What are the differences between descriptive, predictive and prescriptive analytics?
    • What do you have to take into account when evaluating the source and context of the data?
    • What kinds of tools would you like to learn after taking this course and why?

    Relevant articles for the reflection paper 1:

    Delen, D., & Ram, S. (2018). Research challenges and opportunities in business analytics. Journal of Business Analytics, 1(1), 2-12.

    Hansen, H. K., & Mühlen-Schulte, A. (2012). The power of numbers in global governance. Journal of International Relations and Development, 15(4), 455-465.

    Martinez, L. R. (2022). How Much Should We Trust the Dictator’s GDP Growth Estimates?. Journal of Political Economy, 130(10), 000-000. (Read pages 1-5 only)



    2. Legal and ethical issues associated with collecting, storing and handling data (10 points) DL 18.11
    Some guiding questions:
    • What types of legal AND ethical issues do you have to consider when collecting, interpreting, and making conclusions based on data?
    • What do we need to know to be able to “know” based on big data?
    • What kind of skills are necessary?

    Relevant articles for the reflection paper 2:

    Richards, N. M., & King, J. H. (2014). Big data ethics. Wake Forest L. Rev., 49, 393.

    Saltz, J. S., & Dewar, N. (2019). Data science ethical considerations: a systematic literature review and proposed project framework. Ethics and Information Technology, 21(3), 197-208.



    Format: Length of maximum 2 pages, font 12, line spacing: 1,5.

    Deadline: The assignment must be uploaded to the assignment submission box in MyCourses by 23.00 on Friday of the submission week.

    Evaluation: Maximum number of points for each reflection paper is 10.