  • General information

    The course has three assignments, and grading is based entirely on them. There is no final exam.

    • First assignment: 25 points
    • Second assignment: 35 points
    • Third assignment: 40 points

    In total, students can earn at most 100 points across the three assignments; the points are then mapped to grades 0-5.

    Assignments are individual work and are in principle independent of each other, so you do not need to finish all three assignments to pass the course (minimum 50 points).
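
    The point arithmetic above can be sketched as follows. This is only an illustration: the function and variable names are assumptions, and the mapping from points to grades 0-5 is left out because it is not specified here.

```python
# Maximum points per assignment, as listed above.
MAX_POINTS = {"assignment_1": 25, "assignment_2": 35, "assignment_3": 40}
PASS_THRESHOLD = 50  # minimum total needed to pass the course


def total_points(earned):
    """Sum earned points, treating a skipped assignment as 0 points."""
    total = 0.0
    for name, maximum in MAX_POINTS.items():
        points = earned.get(name, 0.0)
        if not 0 <= points <= maximum:
            raise ValueError(f"{name}: {points} is outside 0..{maximum}")
        total += points
    return total


def passes(earned):
    """Return True if the total reaches the pass threshold."""
    return total_points(earned) >= PASS_THRESHOLD


# Hypothetical student who skips the first assignment entirely.
example = {"assignment_2": 28, "assignment_3": 30}
print(total_points(example), passes(example))
```

    In this made-up case the student passes with two assignments only, which is what the independence of the assignments allows.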

    Each assignment delivery must follow this template:

    * Assignment delivery template

    We suggest that you use Git to manage and develop your assignment deliveries. You must submit each delivery to MyCourses, but you can host your own Git repositories on GitHub, GitLab, Bitbucket or Aalto Version.

    Academic violations

    Read the important information about academic violations.

    Tentative Release and Due Dates (May be changed)

    • 1st Assignment: Release on 03.02 and Due on 19.02
    • 2nd Assignment: Release on 03.03 and Due on 19.03
    • 3rd Assignment: Release on 24.03 and Due on 09.04

    Deadline and workload

    • All deadlines are hard: no extensions, except in special cases (see Course Management)

    Reference Answers

    Due to the flexibility of the content, we cannot produce a reference answer (e.g., a reference design of an architecture, schema or service discovery). For that reason, the questions are broken down to a primitive level, each worth 0-1 mark:
    • If one cannot answer the question: 0
    • If the answer is correct and clear: 1
    • If the answer is reasonable but unclear or incomplete: 0.5
    The questions in each assignment differ in difficulty and in the time they require. Some might take only 10 minutes if you have studied the lecture and tutorial materials; others might take 1-2 hours because you need to implement code. This lets students who understand the concepts well still get a reasonable grade even if their implementation is not strong. The course includes elements of design, implementation and testing, so the questions cover different aspects and do not all need the same amount of time.

    Are the assignments hard, and how can you get higher grades?


    Designed as a continuous pipeline, the assignments combine several things:
    • Design: you need to architect the system, its components and how they interact with each other. You also need to be aware of the different roles in the system being designed (e.g., customer versus provider versus third party).
    • Implementation: develop certain components in a reasonable way and test them with real-world software and datasets.
    • Extension - future perspective: think of possible future features based on the current design and experiments.
    • Documentation: write up your work so that other people can understand it.
    The assignments also include questions about theories and concepts, applied to your own situation. Even without strong programming skills, you can produce a good design, test existing software and document your story clearly. Some aspects to consider in order to improve your grades:
    • Clarify your assumptions and requirements: we give you a lot of flexibility in selecting technologies and datasets. We are not looking for a perfect solution, but if you present your design without clarifying your assumptions (e.g., your customers need to store a lot of time series data, or you have only one customer but it is very large and has a lot of data), we will ask why you selected technology A rather than B (e.g., why Cassandra and not MongoDB). Always clarify your assumptions!
    • Explain the design, especially the reasons! This also shows that you can make decisions (they do not need to be perfect). For example, if you just say "I will use VMs", we will ask "why VMs?". Remember that the reasons should be based on the design (requirements). Avoid saying "because I have used MongoDB before"!
    • Make decisions: we do not ask for purely theoretical answers; you need to show that you can apply the theory by making a decision. For example, we know that the choice of a cloud provider depends on pricing, resource needs, etc., but if you answer like that, you do not show that you have made a decision. When a question asks for a decision, make your choice and explain the reason (it may not be perfect for other people, but it is your choice).
    • Application of concepts/techniques: show that you can apply techniques or concepts. We are not looking for theoretical answers! For example, if you say that data can be partitioned and your assignment involves customer data, explain how and give examples. Saying "data can be partitioned into different nodes" is not enough.
    • Big data: remember that we work on big data. Even if you use a simple, small dataset, you should be able to replicate it and test with big data! Performance may be slow, but it is not good if you run everything with only a small dataset (e.g., a few MBs).
    • Show "real numbers" for your case: real numbers come from testing, logs, monitoring UIs, etc. They show that you are really working in a real environment. We know that big data platforms are hard, so in many cases bugs, errors, etc. will prevent you from building a nice solution. But we are not looking for a perfect run of your code. We want to see that you learn, work on and understand aspects related to "big data platforms".
    • Writing documents: we do not need long documents, but we need to understand your story. Imagine that you are writing something to publish on GitHub: people should be able to understand your design and follow the code. Some assignments are not well written. In fact, if you have a good implementation and documentation, you can also make your code available!
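
    As an illustration of the "application of concepts" point above, here is a minimal sketch of what a concrete partitioning example could look like. It assumes hash partitioning on a customer ID; the node count, field names and records are all made up, and other schemes (e.g., range partitioning by time) may suit your own assumptions better.

```python
import hashlib
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size


def node_for(customer_id, num_nodes=NUM_NODES):
    """Map a customer ID to a node index with a stable hash."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition(records, num_nodes=NUM_NODES):
    """Group records by the node that owns their customer ID."""
    nodes = defaultdict(list)
    for record in records:
        nodes[node_for(record["customer_id"], num_nodes)].append(record)
    return dict(nodes)


# Made-up records for illustration only.
records = [
    {"customer_id": "customer-a", "sensor": "temp", "value": 21.5},
    {"customer_id": "customer-b", "sensor": "temp", "value": 19.0},
    {"customer_id": "customer-a", "sensor": "humidity", "value": 40.2},
]
for node, rows in sorted(partition(records).items()):
    print(node, [r["customer_id"] for r in rows])
```

    This choice keeps all data of one customer on the same node, which suits per-customer queries. With a single very large customer, however, it would concentrate the load on one node, which is exactly the kind of assumption that should be stated and justified.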


    • Assignment 1
      Not available unless: You belong to L01 (Oodi)

      Description

      • Here is the current version of the description

      FAQ

      • Check the FAQ in the course Git repository

      Log of changes/notes

      Here is a list of issues/corrections/updates after the release of the assignment


      Assignment Processing

      The following issue won't affect the grading; it is simply an example of an issue in handling big data ingestion (check whether your assignment is in the list):
      • **Failing to comply with the file name convention leads to extra PROCESSING TIME AND COST!** Think about what this means if you have to handle **BIG DATA**.
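
      To illustrate why naming conventions matter at ingestion time, here is a minimal sketch of an automated check. The pattern below is purely hypothetical and is not the course's real convention (which is defined in the assignment description); the file names are made up as well.

```python
import re

# HYPOTHETICAL pattern for illustration; not the course's real convention.
NAME_PATTERN = re.compile(
    r"assignment-(?P<number>[1-3])-(?P<student_id>\d{6})\.zip"
)


def split_by_convention(file_names):
    """Separate names that match the pattern from those that do not."""
    accepted, rejected = [], []
    for name in file_names:
        if NAME_PATTERN.fullmatch(name):
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected


# Made-up delivery names.
deliveries = [
    "assignment-1-123456.zip",
    "Assignment1_final(2).zip",
    "assignment-1-654321.zip",
]
accepted, rejected = split_by_convention(deliveries)
print("process automatically:", accepted)
print("needs manual handling:", rejected)
```

      Files that match can flow through an automated pipeline, while every rejected file needs a person to look at it. With a handful of files that is a nuisance; with big data it becomes a real cost.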

    • Assignment 2
      Not available unless: You belong to L01 (Oodi)

      Description

      • Here is the current version of the description

      Log of changes/notes

      Here is a list of issues/corrections/updates after the release of the assignment



    • Assignment 3
      Not available unless: You belong to L01 (Oodi)

      Description

      • Here is the current version of the description