Tagged: big data

Analytics Translator / Artificial Intelligence / Big Data / Data Science

September 12, 2018

Analytics Translator

What is an Analytics Translator? The Analytics Translator is an important member of the new analytical team. As organizations encourage data democratization and implement Self-Serve Business Intelligence and Advanced Analytics, business users can leverage...

Big Data / Data Science / Hadoop

June 25, 2018

The Graph Database Model

For too long, relational database management has been the de facto standard for most of us looking to store data and retrieve the different values as and when desired. However,...

AWS / Big Data

May 10, 2018

AWS Glue

AWS Glue Platform and Components: AWS Glue is built on top of Apache Spark, which provides the underlying engine to process data records and scale to provide high throughput, all of which is transparent...

Big Data / Hadoop

April 14, 2018

History of Hadoop and its comparison to other systems

Hadoop isn’t the first distributed system for data storage and analysis, but it has some unique properties that set it apart from other systems and...

Big Data / Hadoop

April 10, 2018

Big Data

As the name implies, “Big Data” is a huge amount of data with a complex structure that grows exponentially with time. Such data is so large and complex that none of...

Big Data / Data Science / Python

March 30, 2018

Installing and running Jupyter Notebook, Spark and Python on Amazon EC2

A step-by-step guide to getting PySpark working with Jupyter Notebook on an Amazon EC2 instance. This article assumes some basic familiarity with the command line and the AWS console. Step 1: Create an...

Big Data / spark

March 29, 2018

Features of RDD & its Operations

Let's look at some of the more appealing features of Apache Spark and RDDs. Apache Spark performs in-memory computation and evaluates RDDs lazily, i.e. they do not compute their results right away. Instead,...

Big Data / spark

March 27, 2018

Resilient Distributed Dataset (RDD)

Before we discuss the Resilient Distributed Dataset, let's see how to launch Spark. A Spark shell executable file is usually present in the Spark version folder, which in turn is present under the “opt”...

Big Data / spark

March 26, 2018

Apache Spark Architecture

From the image shown above, one can easily understand the huge dynamics of Spark. The section on the left-hand side of the image depicts all the different sources that provide the input data...

Big Data / spark

March 24, 2018

What is Apache Spark?

Apache Spark is a powerful open-source processing engine with a cluster computing framework. Spark is designed to ensure lightning-fast data processing of large datasets. This includes batch processing...

