The Big Data lesson module introduces an efficient way of handling, processing, and analyzing large amounts of data using pandas, Matplotlib, and Seaborn. pandas is the de facto standard tool for data analysis and manipulation in the Python programming language. Matplotlib and Seaborn are visualization packages for data analysis in Python. The data handling skills introduced in this lesson form the foundation for the subsequent two lessons on machine learning and neural networks.
Prerequisites
Learners should have basic Python programming skills in order to follow this lesson effectively. Learners who are new to Python are encouraged to take a brief tutorial on Python, such as the Plotting and Programming in Python lesson by Software Carpentry. The DeapSECURE project also maintains a list of Python crash courses.
This lesson module requires Python, pandas, Matplotlib, and Seaborn. For the best teaching and learning experience, please use Jupyter Notebook or JupyterLab.
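Before starting, it can help to confirm that the required packages are available in your Python environment. The following is a minimal sketch of such a check, not an official part of the lesson: the helper name `installed_versions` is hypothetical, and the package names are assumed to be the standard PyPI distribution names. It relies only on the standard library (`importlib.metadata`, available in Python 3.8 and later).

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Packages named in the requirements above
# (assumed to match their PyPI distribution names).
REQUIRED = ["pandas", "matplotlib", "seaborn"]


def installed_versions(packages):
    """Return a dict mapping each package name to its installed
    version string, or to None if the package is not installed.

    Hypothetical helper for illustration only.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(REQUIRED)
print("Python", sys.version.split()[0])
for name, ver in versions.items():
    print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```

You can run this in a Jupyter Notebook or JupyterLab cell. Any package reported as NOT INSTALLED needs to be installed in the same environment that the notebook kernel uses.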