Topic outline

  • The lectures cover only concepts, designs, and possible example technologies. We therefore offer hands-on tutorials and discussions on practical systems and choices. In the tutorial sessions, we run examples and discuss real-world implementations. Each tutorial is short and aims to help students work with real systems.

    In total, we will have 7 hours of tutorials.

    Click here to see tutorial videos.

    Note that the detailed content of the tutorials will be updated.

    • Walkthrough of key industrial and open source big data platforms for real-world applications that you can use in your studies (e.g., from Google, Microsoft, Amazon, and Apache open source projects).

    • Hands-on tutorial on understanding performance and consistency, using Cassandra as an example. You will practice with a production-level deployment of Cassandra and a real-world dataset.

    • Data Ingestion with Apache NiFi

      Hands-on tutorial with Apache NiFi for moving data among different services. You will also practice with RabbitMQ and cloud storage.

    • Hands-on with Hadoop

      You will work hands-on with a Hadoop system, focusing on basic Hadoop file systems and Hive.

    • In this tutorial, you will practice writing Spark code and testing it on a production-level Spark cluster, using a real dataset, e.g., New York taxi data.

    • This tutorial covers setting up Apache Flink and developing stream data processing applications with it.

    • In this tutorial, you will practice with Apache Airflow and develop workflow-based data processing examples.