Big Data Analysis with Python: A Complete Course

Practice and refine your big data analytical skills with Python to distill complicated data into digestible and meaningful insights.

(BIG-DATA-PYTHON.AJ1) / ISBN : 978-1-64459-315-8

This course includes:

Free pre-assessment and first 2 lessons

9+ Interactive Lessons | 20+ Exercises

Accessible on mobile and tablet too

Certificate of completion

About This Course

This online course is a hands-on guide to handling and analyzing large datasets with Python. You’ll work with Python libraries such as Pandas, Seaborn, and Spark, and the course modules cover data visualization, missing-value handling, and in-depth statistical analysis. By the end, you’ll have the technical skills to tackle real-world challenges and make data-driven decisions.

Skills You’ll Get

Use Pandas and Spark for effective data handling
Create insightful statistical visualizations using Seaborn and Matplotlib to communicate findings clearly
Work with frameworks like Hadoop and Spark to manage large datasets
Handle missing values and prepare data for accurate analysis
Translate business problems into measurable metrics and actionable insights
Maintain data analysis reproducibility with best practices using Jupyter Notebooks
Dive deep into Spark DataFrames for advanced data manipulation and analysis
Compile full analysis reports to present data findings professionally
Execute SQL operations on Spark DataFrames for efficient data querying

Interactive Lessons

9+ Interactive Lessons | 20+ Exercises | 50+ Quizzes | 65+ Flashcards | 65+ Glossary Terms

Gamified TestPrep

30+ Pre Assessment Questions | 30+ Post Assessment Questions

Hands-On Labs

48+ LiveLab | 12+ Video tutorials | 20+ Minutes

Lesson Plan

Preface

About

The Python Data Science Stack

Introduction
Python Libraries and Packages
Using Pandas
Data Type Conversion
Aggregation and Grouping
Exporting Data from Pandas
Visualization with Pandas
Summary

Statistical Visualizations

Introduction
Types of Graphs and When to Use Them
Components of a Graph
Seaborn
Which Tool Should Be Used?
Types of Graphs
Pandas DataFrames and Grouped Data
Changing Plot Design: Modifying Graph Components
Exporting Graphs
Summary

Working with Big Data Frameworks

Introduction
Hadoop
Spark
Writing Parquet Files
Handling Unstructured Data
Summary

Diving Deeper with Spark

Introduction
Getting Started with Spark DataFrames
Writing Output from Spark DataFrames
Exploring Spark DataFrames
Data Manipulation with Spark DataFrames
Graphs in Spark
Summary

Handling Missing Values and Correlation Analysis

Introduction
Setting up the Jupyter Notebook
Missing Values
Handling Missing Values in Spark DataFrames
Correlation
Summary

Exploratory Data Analysis

Introduction
Defining a Business Problem
Translating a Business Problem into Measurable Metrics and Exploratory Data Analysis (EDA)
Structured Approach to the Data Science Project Life Cycle
Summary

Reproducibility in Big Data Analysis

Introduction
Reproducibility with Jupyter Notebooks
Gathering Data in a Reproducible Way
Code Practices and Standards
Avoiding Repetition
Summary

Creating a Full Analysis Report

Introduction
Reading Data in Spark from Different Data Sources
SQL Operations on a Spark DataFrame
Generating Statistical Measurements
Summary

Hands-on LAB Activities

The Python Data Science Stack

Interacting with the Python Shell
Calculating the Square
Grouping a DataFrame
Applying a Function to a Column
Subsetting a DataFrame
Slicing and Subsetting
Reading Data from a CSV File
Viewing the Standard Deviation
Calculating the Median Value
Calculating the Mean Value

Statistical Visualizations

Plotting an Analytical Graph
Creating a Graph
Creating a Graph for a Mathematical Function
Creating a Line Graph Using Seaborn
Creating a Line Graph Using pandas
Creating a Line Graph Using matplotlib
Detecting Outliers
Displaying Histograms
Using a Box Plot
Constructing a Scatterplot
Plotting a Line Graph with Styles and Color
Configuring a Title and Labels for Axis Objects
Designing a Complete Plot
Exporting a Graph to a File on a Disk

Working with Big Data Frameworks

Performing DataFrame Operations in Spark
Accessing Data with Spark
Parsing Text in Spark

Diving Deeper with Spark

Creating a DataFrame Using a CSV File
Creating a DataFrame from an Existing RDD
Specifying the Schema of a DataFrame
Removing a Column from a DataFrame
Renaming a Column in a DataFrame
Adding a Column to a DataFrame
Creating a KDE Plot
Creating a Linear Model Plot
Creating a Bar Chart

Handling Missing Values and Correlation Analysis

Filtering Data
Counting Missing Values
Handling NaN Values
Using the Backward and Forward Filling Methods
Calculating Correlation Coefficient
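
The filling and correlation activities listed above can be illustrated with a minimal sketch in plain Python. This is not the course's lab code: the function names and the sample readings are hypothetical, and the Pearson coefficient is assumed as the correlation measure. In practice, pandas offers `ffill`, `bfill`, and `corr` methods that cover the same ground.

```python
from math import sqrt

def forward_fill(values):
    """Replace each None with the most recent non-missing value before it."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None until a first real value is seen
    return filled

def backward_fill(values):
    """Replace each None with the next non-missing value after it."""
    return forward_fill(values[::-1])[::-1]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical sample data with gaps marked as None.
readings = [None, 2.0, None, None, 5.0, None]
ffilled = forward_fill(readings)   # leading gap remains None
bfilled = backward_fill(readings)  # trailing gap remains None
r = pearson([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear relationship
```

Note that neither method can fill every gap on its own: forward filling leaves leading gaps and backward filling leaves trailing ones, so the two are often combined.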

Exploratory Data Analysis

Generating the Feature Importance of the Target Variable
Identifying the Target Variable
Plotting a Heatmap
Generating a Normal Distribution Plot

Reproducibility in Big Data Analysis

Performing Data Reproducibility
Preprocessing Missing Values with High Reproducibility
Normalizing the Data
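
As a rough illustration of the normalization activity, here is a minimal sketch in plain Python. The listing does not say which technique the lab uses, so min-max scaling to the range [0, 1] is an assumption, and the function name and sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale numeric values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical sample column.
scaled = min_max_normalize([10, 20, 30, 40, 50])
```

Because the function is deterministic and takes its inputs explicitly, rerunning it on the same data gives the same result, which fits the reproducibility theme of this lesson.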
