MDA621 - Software Practice for Big Data Analytics

Credit Points: 20 credit points

Workload: 60 hours

Prerequisite: MDA512 Data Science

Co-requisite: N/A

Aims & Objectives

This is a core unit, one of 12 units in the Master of Data Analytics (MDA) with a major in Software Engineering, and is part of an AQF level 9 course. It addresses the MDA course learning outcomes and complements other courses in related fields by developing students’ specialised knowledge of the design and engineering of big data analytics software platforms, applying the agile data science methodology and an agile software stack for development. For further course information, refer to: http://www.mit.edu.au/study-withus/programs/master-data-analytics.
 
This unit introduces the agile data science approach and the associated methodologies for constructing modern big data analytics software platforms, and provides essential hands-on practice through a web application project. The emphasis is on equipping students with practical knowledge of agile tools for scalable data processing and of the requirements of an agile stack. Through actual programming, students acquire skills in using the tools of a software stack, including Spark, MongoDB, Elasticsearch, Apache Kafka, PySpark, scikit-learn, and Spark MLlib.
 
This unit will cover the following topics: 

  • Engineering Big Data Analytics (BDA) Platform
  • Agile Data Science Methodology 
  • Agile Tools for Scalable Data Processing 
  • Agile Stack Requirements 
  • SQL, NoSQL, and Dataflow Programming with Spark
  • Putting it All Together: Design of a Web-based BDA Platform
  • Collecting and Displaying Data
  • Visualizing Data with Charts and Tables
  • Exploring Data with Reports
  • Making Predictions
  • Deploying Predictive Systems
     

Learning Outcomes

4.1 Course Learning Outcomes 
The course learning outcomes applicable to this unit are listed on the Melbourne Institute of Technology’s website: www.mit.edu.au
 
4.2 Unit Learning Outcomes 
At the completion of this unit students should be able to: 
a. Analyse the software engineering requirements in constructing modern big data analytics platforms; 
b. Apply the agile data science methodology and the agile tools to support scalable data processing; 
c. Select the tools in the chosen software stack to design and program the big data analytics platform;
d. Relate the concept and use of visualisation to big data analytics;
e. Develop and appraise big data platforms for predictive systems. 
 

Weekly Topics

This unit will cover the content below:

Week Topics
1 Engineering Big Data Analytics (BDA) Platform
2 Agile Data Science Methodology
3 Agile Tools for Scalable Data Processing
4 Agile Stack Requirements
5 SQL, NoSQL, and Dataflow Programming with Spark
6 Putting it All Together: Design of a Web-based BDA Platform
7 Collecting and Displaying Data
8 Visualizing Data with Charts and Tables
9 Exploring Data with Reports
10 Making Predictions
11 Deploying Predictive Systems
12 Review

Assessment

| Assessment Task | Due Date | Release Date | Type A | Type B | Learning Outcomes Assessed |
|---|---|---|---|---|---|
| Assignment 1 | Week 3 | Week 1 | 5% | | a |
| In-class test | Week 6 | Week 1 | | 10% | a-b |
| Assignment 2 | Week 11 | Week 7 | 25% | | c-d |
| Laboratory and Problem Based Learning participation & submission | Week 2-11 | Week 2-11 | 10% | | a-e |
| Final Examination (3 hours) | | | | 50% | a-e |
| TOTALS | | | 40% | 60% | |

Task Type: Type A tasks are unsupervised; Type B tasks are supervised.

Contribution and participation (in class) (10%)
Students are expected to attend each scheduled session, arrive on time, and remain for the entire session. Adherence to this requirement will be reflected in the marks awarded for this assessment. Students are also strongly encouraged to participate actively in class discussions and tutorial activities by answering questions and sharing their opinions, insights, and learnings from the course.

Presentations (if applicable)
For presentations 

Textbook and Reference Materials

Textbook:

  • Jurney, R. (2017). Agile Data Science 2.0: Building Full-Stack Data Analytics Applications with Spark. United States: O'Reilly Media. 

References: 

  • Dasgupta, N. (2018). Practical Big Data Analytics. Packt Publishing.
  • Dirschl, C., Welch, J., Koller, A. (2018). Engineering Agile Big-Data Systems. Denmark: River Publishers. 
  • Krishnan, K. (2019). Building Big Data Applications. United Kingdom: Elsevier Science. 

 
Adopted Reference Style: IEEE 
Students are required to purchase the prescribed text and have it available in class each week.
 

Graduate Attributes

MIT is committed to ensuring the course is current, practical, and relevant so that graduates are “work ready” and equipped for lifelong learning. To accomplish this, the MIT Graduate Attributes identify the required knowledge, skills, and attributes that prepare students for the industry.
The levels to which the Graduate Attributes are covered in this unit are as follows:

  • Ability to communicate
  • Independent and Lifelong Learning
  • Ethics
  • Analytical and Problem Solving
  • Cultural and Global Awareness
  • Team work
  • Specialist knowledge of a field of study

Legend

Levels of attainment Extent covered
The attribute is covered by theory and practice, and addressed by assessed activities in which the students always play an active role, e.g. workshops, lab submissions, assignments, demonstrations, tests, examinations.
The attribute is covered by theory or practice, and addressed by assessed activities in which the students mostly play an active role, e.g. discussions, reading, interpreting documents, tests, examinations.
The attribute is discussed in theory or practice; it is addressed by assessed activities in which the students may play an active role, e.g. lectures and discussions, reading, interpretation, workshops, presentations.
The attribute is presented as a side issue in theory or practice; it is not specifically assessed, but it is addressed by activities such as lectures or tutorials.
The attribute is not considered; there is no theory, practice, or activity associated with this attribute.