Big Data - Data Science

CATEGORY: Data Science

SUB-CATEGORY: Big Data

PROVIDER: NobleProg


Course Format

Online

Accreditation Type

Certificate

Skill Level

Intermediate

Course Cost

R40150

COURSE OVERVIEW

Delegates will work through computer-based examples and case study exercises using relevant big data tools

    • Big data fundamentals
      • Big Data and its role in the corporate world
      • The phases of development of a Big Data strategy within a corporation
      • Explain the rationale underlying a holistic approach to Big Data
      • Components needed in a Big Data Platform
      • Big data storage solution
      • Limits of Traditional Technologies
      • Overview of database types
      • The four dimensions of Big Data
    • Big data impact on business
      • Business importance of Big Data
      • Challenges of extracting useful data
      • Integrating Big data with traditional data
    • Big data storage technologies
      • Overview of big data technologies
        • Data storage models
        • Hadoop
        • Hive
        • Cassandra
        • MongoDB
      • Choosing the right big data technology
    • Processing big data
      • Connecting to databases and extracting data
      • Transforming and preparing data for processing
      • Using Hadoop MapReduce for processing distributed data
      • Monitoring and executing Hadoop MapReduce jobs
      • Hadoop distributed file system building blocks
      • MapReduce and YARN
      • Handling streaming data with Spark
    • Big data analysis tools and technologies
      • Programming Hadoop with Pig Latin language
      • Querying big data with Hive
      • Mining data with Mahout
      • Visualizing and reporting tools
    • Big data in business
      • Managing and establishing Big Data needs
      • Business importance of Big Data
      • Selecting the right big data tools for the problem
  • What is a data warehouse?
  • Difference between OLTP and data warehousing
  • Data Acquisition
  • Data Extraction
  • Data Transformation
  • Data Loading
  • Data Marts
  • Dependent vs independent data marts
  • Database design
  • Introduction
  • Software development life cycle
  • Testing methodologies
  • ETL testing workflow process
  • ETL testing responsibilities in DataStage

NoSQL Databases

Hadoop

Map Reduce

Apache Spark

  • Delegates should have an awareness and some experience of storage tools
  • An awareness of handling large data sets

14 hours (usually 2 days including breaks)


COURSE COMPLETION

Better understanding of Big Data

CREDIT BEARING

This course is NOT credit bearing

COURSE LICENCE

This course is available under Attribution-ShareAlike 2.0 South Africa