About this course
Big Data Analytics using IBM InfoSphere BigInsights
Enterprise Grade Hadoop
IBM® BigInsights for Apache™ Hadoop® collects and economically stores very large sets of highly variable data. IBM BigInsights enhances open source Hadoop with a complete set of capabilities to query, visualize and explore data and to conduct distributed machine learning at scale, resulting in deeper insight and better actions.
IBM InfoSphere BigInsights brings the power of Hadoop to the enterprise. Apache™ Hadoop® is an open source software framework used to reliably manage large volumes of structured and unstructured data.
InfoSphere BigInsights Enterprise Grade Hadoop empowers enterprises of all sizes to cost-effectively manage and analyze big data: the massive volume, variety and velocity of data that consumers and businesses create every day. InfoSphere BigInsights helps increase operational efficiency by modernizing your data warehouse environment as a queryable archive, allowing you to store and analyze large volumes of multi-structured data without straining the data warehouse.
What’s inside IBM BigInsights?
IBM BigInsights extends the core components of Hadoop for improved usability. Enterprise-scale features from IBM are added to deliver massive scale-out data processing and analysis with built-in resiliency and fault tolerance. Simplified administration and management capabilities, rich developer tools and powerful analytic functions reduce the complexity of Hadoop.
What’s new
IBM® BigInsights™ for Apache™ Hadoop® supports data science teams and business analysts with:
- Deeper insight via advanced analytics including text and geospatial
- Automated prediction via machine learning algorithms in R
- Enhanced text analytics that can infer context and relationships from text
- Visualization of data with a spreadsheet-like interface that now includes web tooling for business users (IBM BigSheets)
- Access to all data with distributed SQL-on-Hadoop that now includes HBase, high availability, greater performance and even richer SQL (IBM Big SQL)
IBM Open Platform with Apache Hadoop builds the platform for big data projects and provides the most current Apache Hadoop open source content.
- Native support for rolling upgrades for Hadoop services
- Support for long-running applications within YARN for enhanced reliability & security
- Heterogeneous storage in HDFS for in-memory and SSD in addition to HDD
- Spark in-memory distributed compute engine, which delivers dramatic performance increases over MapReduce and simplifies the developer experience with support for the Java, Python & Scala languages
- Ambari operational framework for provisioning, managing & monitoring Apache Hadoop clusters
The Open Data Platform
IBM is a platinum founding member of this shared industry initiative focused on accelerating enterprise Apache Hadoop® innovation.
It’s based on open standards
Includes a foundation of standard open-source Hadoop and the rich open-source components that Hadoop users expect.
It’s enterprise-ready
Includes several optional enterprise-grade features that users can choose to implement to accelerate the value of Hadoop.
Hadoop for the data scientist
Machine learning and mathematical algorithms have emerged as the missing piece needed to complement Hadoop and achieve faster time to value. IBM BigInsights for Hadoop puts the full range of analytics for Hadoop into the hands of data science teams.
Course Overview
This course is designed to aid business analysts who are working with IBM's InfoSphere BigInsights. Writing programs that extract data from unstructured text can be a daunting task. The student will learn how to create annotators using IBM's Annotation Query Language (AQL). Analyzing data with Apache Hadoop requires writing MapReduce programs, and people familiar with Hadoop will know that other open source products are used alongside it. This course gives the student an overview of Apache Pig, ZooKeeper, MapReduce and other big data components.
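To give a sense of what the map and reduce steps mentioned above involve, here is a minimal sketch of the pattern in plain Python, run on one machine rather than on a Hadoop cluster. It counts words across lines of text. The function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions and are not part of Hadoop or BigInsights; a real job would be written against the Hadoop APIs and distributed across many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

For example, word_count(["big data", "Big insights"]) returns {"big": 2, "data": 1, "insights": 1}. Keeping the map and reduce steps as separate functions mirrors how the framework can run each step in parallel on different parts of the data.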
Prerequisites: A programming background is advantageous, especially knowledge of SQL.
Course Curriculum:
Part 1: InfoSphere BigInsights Basics
1. Customer Video on Business Transformation
2. Introduction to InfoSphere BigInsights: Classroom Session
3. IBM Case Study and Customer Video
4. Introduction to BigInsights Analytics for Business Analysts: Classroom Session
5. Importing Data to InfoSphere BigInsights: Classroom Session
6. BigSheets Workflow: Classroom Session
7. BigSheets Collections: Classroom Session
8. Lab Exercise 1: Creating a BigSheets Collection by uploading a file
9. Lab Exercise 2: Creating Collections by using Applications to gather Data
10. Customer Video on Business Transformation
Part 2: Working with InfoSphere BigInsights
11. BigSheets Navigation: Classroom Session
12. Working with BigSheets Collections: Classroom Session
13. BigSheets Readers & Extensions: Classroom Session
14. Lab Exercise 3: Analyzing a BigSheets Collection
15. Lab Exercise 4: Combining Data to create a new Collection
16. Lab Exercise 5: Visualizing Data in Graphical Form & Exporting Data from a BigSheets Collection
17. Lab Exercise 6: Installing a BigSheets Plug-in
For Admissions:
Please send your resume to admission@aegis.edu.in
Get in touch with Ritin Joshi at +91 9022137010 or Sachin Khare at +91 9819008153

