Big Data Fundamentals: Concepts, Drivers & Techniques

by Paul Buhler, PhD
Thomas Erl
Wajid Khattak

PART I: The Fundamentals of Big Data

  • Chapter 1: Understanding Big Data
    • Concepts and Terminology
      • Datasets
      • Data Analysis
      • Data Analytics
        • Descriptive Analytics
        • Diagnostic Analytics
        • Predictive Analytics
        • Prescriptive Analytics
      • Business Intelligence (BI)
      • Key Performance Indicators (KPI)
    • Big Data Characteristics
      • Volume
      • Velocity
      • Variety
      • Veracity
      • Value
    • Different Types of Data
      • Structured Data
      • Unstructured Data
      • Semi-structured Data
      • Metadata
    • Case Study Background
      • History
      • Technical Infrastructure and Automation Environment
      • Business Goals and Obstacles
    • Case Study Example
      • Identifying Data Characteristics
        • Volume
        • Velocity
        • Variety
        • Veracity
        • Value
      • Identifying Types of Data
  • Chapter 2: Business Motivations and Drivers for Big Data Adoption
    • Marketplace Dynamics
    • Business Architecture
    • Business Process Management
    • Information and Communications Technology
      • Data Analytics and Data Science
      • Digitization
      • Affordable Technology and Commodity Hardware
      • Social Media
      • Hyper-Connected Communities and Devices
      • Cloud Computing
    • Internet of Everything (IoE)
    • Case Study Example
  • Chapter 3: Big Data Adoption and Planning
    • Considerations
    • Organization Prerequisites
    • Data Procurement
    • Privacy
    • Security
    • Provenance
    • Limited Realtime Support
    • Distinct Performance Challenges
    • Distinct Governance Requirements
    • Distinct Methodology
    • Clouds
    • Big Data Analytics Lifecycle
      • Business Case Evaluation
      • Data Identification
      • Data Acquisition and Filtering
      • Data Extraction
      • Data Validation and Cleansing
      • Data Aggregation and Representation
      • Data Analysis
      • Data Visualization
      • Utilization of Analysis Results
    • Case Study Example
      • Big Data Analytics Lifecycle
      • Business Case Evaluation
      • Data Identification
      • Data Acquisition and Filtering
      • Data Extraction
      • Data Validation and Cleansing
      • Data Aggregation and Representation
      • Data Analysis
      • Data Visualization
      • Utilization of Analysis Results
  • Chapter 4: Enterprise Technologies and Big Data Business Intelligence
    • Online Transaction Processing (OLTP)
    • Online Analytical Processing (OLAP)
    • Extract Transform Load (ETL)
    • Data Warehouses
    • Data Marts
    • Traditional BI
      • Ad-hoc Reports
      • Dashboards
    • Big Data BI
      • Traditional Data Visualization
      • Data Visualization for Big Data
    • Case Study Example
      • Enterprise Technology
      • Big Data Business Intelligence

PART II: Storing and Analyzing Big Data

  • Chapter 5: Big Data Storage Concepts
    • Clusters
    • File Systems and Distributed File Systems
    • NoSQL
    • Sharding
    • Replication
      • Master-Slave
      • Peer-to-Peer
    • Sharding and Replication
      • Combining Sharding and Master-Slave Replication
      • Combining Sharding and Peer-to-Peer Replication
    • CAP Theorem
    • ACID
    • BASE
    • Case Study Example
  • Chapter 6: Big Data Processing Concepts
    • Parallel Data Processing
    • Distributed Data Processing
    • Hadoop
    • Processing Workloads
      • Batch
      • Transactional
    • Cluster
    • Processing in Batch Mode
      • Batch Processing with MapReduce
      • Map and Reduce Tasks
        • Map
        • Combine
        • Partition
        • Shuffle and Sort
        • Reduce
      • A Simple MapReduce Example
      • Understanding MapReduce Algorithms
    • Processing in Realtime Mode
      • Speed Consistency Volume (SCV)
      • Event Stream Processing
      • Complex Event Processing
      • Realtime Big Data Processing and SCV
      • Realtime Big Data Processing and MapReduce
    • Case Study Example
      • Processing Workloads
      • Processing in Batch Mode
      • Processing in Realtime
  • Chapter 7: Big Data Storage Technology
    • On-Disk Storage Devices
      • Distributed File Systems
      • RDBMS Databases
      • NoSQL Databases
        • Characteristics
        • Rationale
        • Types
        • Key-Value
        • Document
        • Column-Family
        • Graph
      • NewSQL Databases
    • In-Memory Storage Devices
      • In-Memory Data Grids
        • Read-through
        • Write-through
        • Write-behind
        • Refresh-ahead
      • In-Memory Databases
    • Case Study Example
  • Chapter 8: Big Data Analysis Techniques
    • Quantitative Analysis
    • Qualitative Analysis
    • Data Mining
    • Statistical Analysis
      • A/B Testing
      • Correlation
      • Regression
    • Machine Learning
      • Classification (Supervised Machine Learning)
      • Clustering (Unsupervised Machine Learning)
      • Outlier Detection
      • Filtering
    • Semantic Analysis
      • Natural Language Processing
      • Text Analytics
      • Sentiment Analysis
    • Visual Analysis Techniques
      • Heat Maps
      • Time Series Plots
      • Network Graphs
      • Spatial Data Mapping
    • Case Study Example
      • Correlation
      • Regression
      • Time Series Plot
      • Clustering
      • Classification
  • Appendix A: Case Study Conclusion
  • About the Authors
  • Index