Data Mining and Management Strategies

Course Info

Length: 2 Weeks

Type: In Classroom

Available Dates

Venue

  • Dec-30-2024

    Amsterdam

  • Dec-30-2024

    Kuala Lumpur

  • Jan-13-2025

    Dubai

  • Jan-13-2025

    London

  • Feb-17-2025

    Kuala Lumpur

  • Feb-17-2025

    Singapore

  • Feb-17-2025

    Amsterdam

  • Feb-17-2025

    Istanbul

  • Feb-17-2025

    Barcelona

  • Feb-17-2025

    Paris

  • Mar-24-2025

    London

  • Mar-24-2025

    Dubai

  • May-19-2025

    Istanbul

  • May-19-2025

    Barcelona

  • May-19-2025

    Amsterdam

  • May-19-2025

    Kuala Lumpur

  • May-19-2025

    Paris

  • May-19-2025

    Dubai

  • May-19-2025

    Singapore

  • May-19-2025

    London

  • Jul-14-2025

    Dubai

  • Jul-14-2025

    London

  • Aug-18-2025

    Barcelona

  • Aug-18-2025

    Paris

  • Aug-18-2025

    Istanbul

  • Aug-18-2025

    Amsterdam

  • Aug-18-2025

    Kuala Lumpur

  • Aug-18-2025

    Singapore

  • Sep-01-2025

    Dubai

  • Sep-01-2025

    London

  • Nov-17-2025

    Barcelona

  • Nov-17-2025

    Paris

  • Nov-17-2025

    Singapore

  • Nov-17-2025

    Istanbul

  • Nov-17-2025

    Kuala Lumpur

  • Nov-17-2025

    Amsterdam

  • Dec-29-2025

    London

  • Dec-29-2025

    Dubai

Course Details

Course Outline

10-day course

 
Enterprise Database and Data Models
 
  • Key differences between data and information.
  • An understanding of enterprise database environments.
  • Specific challenges of data cleansing.
  • The elements that make up a data model.

 

Extracting Data from a Database
 
  • The role of queries in extracting data from a database.
  • How to implement advanced queries in Microsoft® Access (or another database environment) using a visual querying language.
  • How to write queries using Structured Query Language (SQL).
  • How SQL supports extract, transform and load (ETL) operations that prepare data for analytics model development.
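
As a rough illustration of the SQL topics in this module, the sketch below loads a few rows into an in-memory SQLite database and aggregates them with a query. It uses Python's standard sqlite3 module rather than Microsoft Access, and the orders table, its columns and the sample rows are hypothetical, chosen only for the example.

```python
import sqlite3

# Hypothetical sample data: (customer, amount) pairs invented for this sketch.
SAMPLE_ORDERS = [("alice", 120.0), ("bob", 35.5), ("alice", 80.0)]

# Load: create an in-memory database and insert the rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", SAMPLE_ORDERS)

# Extract and transform: total spend per customer, largest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY total DESC"
).fetchall()
conn.close()

for customer, total in rows:
    print(customer, total)
```

The same SELECT ... GROUP BY pattern should carry over to other database environments, although connection details and data types will differ.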

 

Large Scale Implementation of Hadoop® MR
 
  • The differences between brute-force and parallel approaches.
  • Core concepts, advantages and supporting programs of Apache™ Hadoop®.
  • The components of MapReduce.
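
To make the MapReduce components concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases applied to word counting. It does not use Hadoop itself; the function names and the sample documents are assumptions made for illustration, and a real cluster would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(documents):
    pairs = []
    for doc in documents:  # on a cluster, each document could go to its own mapper
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(pairs))

# Hypothetical input documents.
counts = word_count(["big data big ideas", "data mining"])
print(counts)
```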

 

Getting Data: Social Networks and Geolocalization
 
  • Structure of a web page and how to obtain HTML files.
  • The advantages of web crawlers and how to get data page by page.
  • How to conduct text analysis: identifying human text, common issues, and resource libraries.
  • The ethical implications of using publicly available data.

 

Unstructured Data, Graphs and Networks
 
  • How to apply the right data structure for a problem.
  • The differences between graph, node and edge properties.
  • What degree means and how to analyze and interpret the degree distribution.
  • The concept of the clustering coefficient and what it can mean for your data.
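
The degree, degree distribution and clustering coefficient named above can be sketched in a few lines. The example assumes an undirected graph stored as a dictionary of neighbor sets; the four-node graph and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical undirected graph: each node maps to the set of its neighbors.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

def degree(graph, node):
    """Degree: the number of edges attached to a node."""
    return len(graph[node])

def degree_distribution(graph):
    """Count how many nodes have each degree value."""
    return Counter(len(neighbors) for neighbors in graph.values())

def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to compare
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))

print(degree(graph, "A"))
print(degree_distribution(graph))
print(clustering_coefficient(graph, "A"))
```

In this small graph, node A has three neighbors but only one linked pair among them (B and C), so its local clustering coefficient is one third.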

 

Clustering: Understanding the Relationship of Things
 
  • The Idea Behind Clustering.
  • Types of Clusters.
  • Distances Between Points.
  • K-Means Clustering.
  • Not Every Cluster Is a Good Cluster.
  • How Good Are My Clusters?
  • Hierarchical Clustering.
  • Min, Max, and Mean.
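
As a sketch of the K-Means topic listed above, the following code alternates between assigning points to their nearest centroid (using Euclidean distance) and moving each centroid to the mean of its cluster. The starting centroids are passed in explicitly to keep the example deterministic; the sample points, the function name and the iteration limit are assumptions for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Basic K-Means: repeat assignment and update until centroids stop moving."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2D points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

Results depend on the starting centroids, which is one reason not every cluster found this way is a good cluster.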

 

Classifications: Putting Things Where They Belong
 
  • The Idea Behind Classification.
  • Reading and Interpreting a Classification Tree.
  • Making a Decision Tree.

 

Alternative Impurity Measures
 
  • Expansion to 2D.
  • How Good Is My Classifier?
  • But I Only Have Training Data.
  • A Brief Look at Association Rule Mining.
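
Two impurity measures commonly used when building decision trees are Gini impurity and entropy. The sketch below computes both for a list of class labels; the function names and the sample labels are hypothetical, and the outline does not state which measures the course itself covers.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits of the class proportions."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())

# Hypothetical labels at a tree node: an even split and a pure node.
mixed = ["yes", "yes", "no", "no"]
pure = ["yes", "yes", "yes"]

print(gini(mixed), entropy(mixed))
print(gini(pure), entropy(pure))
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, so a split that lowers them separates the classes better.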

 

Classifications: Advanced Methods
 
  • Rule-Based Classifier.
  • Extracting Rules.
  • Nearest Neighbors.
  • Classifiers – Defined Boundaries.

 

Artificial Neural Networks
 
  • Limits, Boundary Conditions and Choosing the Right Classifier.
  • Clustering vs. Classification.
  • Outlier and Anomaly Detection.

 
