Introduction To Big Data
Big data is defined as data that is too big, too fast, and too hard for existing tools to process. Here, “too big” means that organizations nowadays have to deal with petabyte-scale collections of data...
How Google Transformed Big Data To A Life Saver Technology?
In 2009 a new virus was discovered. Combining elements of bird flu and seasonal flu, the strain was dubbed H1N1 and spread quickly, much like the Spanish flu of 1918 that infected half a billion people and...
Applications Of Big Data
Medical Records: The extreme cost of healthcare in the U.S. can be reduced with the adoption of electronic patient medical records. Many companies are looking for ways to explore large...
Characteristics Of Big Data
Volume – The quantity of data generated is very important: it is the size of the data that determines the value and potential of the data under consideration and whether it can actually be...
Big Data Analysis
Big data analytics refers to the process of collecting, organizing and analyzing large sets of data ("big data") to discover patterns and other useful information. Big data analytics helps in...
Relational Database Management System (RDBMS)
Traditional relational database management systems (RDBMS) have been the conventional standard for database management throughout the age of the internet. They are also known as traditional row-column...
NoSQL
NoSQL (also known as "Not Only SQL") represents a completely different framework of databases that allows high-performance, agile processing of information at massive scale, i.e., it is a database...
Hadoop
Apache Hadoop is an open source framework for writing and running distributed applications that process large amounts of data. There are some key distinctions of Hadoop that give it an edge over...
Hadoop Distributed File System
The Hadoop Distributed File System (HDFS) is the primary storage system used by Hadoop. HDFS provides high-performance access to data across Hadoop clusters. HDFS has become a key tool for managing...
Hadoop Cluster
A Hadoop cluster is a special type of computational cluster designed specifically for storing and analyzing huge amounts of unstructured data in a distributed computing environment. These clusters run...
Map Reduce
MapReduce is a software framework that allows developers to write programs to process massive amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone...
Simple MapReduce Approach For Word Count
The word count operation takes place in two stages: a mapper phase and a reducer phase. In the mapper phase, the sentence is first tokenized into words; we then form a key-value pair with these words, where...
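The two stages described above can be sketched in plain Python, without a Hadoop cluster. This is only an illustrative sketch: the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are hypothetical, and emitting the value 1 for each word is the conventional choice for word count, assumed here because the teaser is cut off before it states the value.

```python
from collections import defaultdict


def mapper(sentence):
    """Mapper phase: tokenize the sentence and emit one (word, 1) pair per word.

    The value 1 is an assumption (the usual word-count convention).
    """
    for word in sentence.lower().split():
        yield (word, 1)


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word, counts):
    """Reducer phase: sum the values collected for a single word."""
    return (word, sum(counts))


def word_count(sentences):
    """Run mapper, shuffle, and reducer in sequence on a list of sentences."""
    pairs = [pair for sentence in sentences for pair in mapper(sentence)]
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog saw the fox"]))
```

In a real cluster the mapper calls would run in parallel on different input splits and the grouping step would be handled by the framework; here everything runs in one process just to show the data flow.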
How Big Data Helped UPS To Save Millions?
United Parcel Service Inc. (UPS), the world’s biggest package shipping company, is using Big Data from customers, drivers and vehicles in a new route guidance system that will save time and money and...
Challenges in Big Data
The challenges in Big Data are the implementation hurdles which require immediate attention. If these challenges are not handled, they may lead to technology failure and other unpleasant results....
New Technological Advancements In Big Data
There are two new technological advancements in Big Data, as mentioned below: Spark by Apache, and Quantum Computing.
Spark
Spark is a new technology that runs on top of the Hadoop Distributed File System (HDFS) and is characterized as “a fast and general engine for large-scale data processing.” Spark has a few key features...
Quantum Computing
Quantum computing may be the future of most high-end data centers. This is because, as the demand to intelligently process a growing volume of online data grows, so the limits of silicon chip...
IBM starting up with Big Data in India
In the American crime drama series Person of Interest, a machine predicts whether a person can be a victim or a perpetrator of a crime. Then it's up to a data scientist to find that person and prevent...
Architecture Of Apache Hadoop
Apache Hadoop has two pillars: 1. YARN - Yet Another Resource Negotiator (assigns CPU, memory, and storage to applications running on a Hadoop cluster. The first generation of Hadoop could only run...
Scheduling In Apache Hadoop
Apache Hadoop by default uses FIFO scheduling (which I will explain in my coming post) and 5 scheduling priorities to schedule jobs from the job queue (I think we should sometime arrange operating system...
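As a rough illustration of FIFO scheduling combined with five priority levels, here is a small Python sketch. It is not Hadoop's scheduler code: the class `FifoJobQueue` and its methods are hypothetical, and the level names are an assumption modeled on the job priorities of classic Hadoop. The idea shown is simply that jobs are taken by priority first, and in submission order (first in, first out) within the same priority.

```python
import heapq
from itertools import count

# Five priority levels, highest first. The names are an assumption
# for this sketch, modeled on classic Hadoop job priorities.
PRIORITIES = ["VERY_HIGH", "HIGH", "NORMAL", "LOW", "VERY_LOW"]
_RANK = {name: rank for rank, name in enumerate(PRIORITIES)}


class FifoJobQueue:
    """Hypothetical job queue: priority first, then submission order."""

    def __init__(self):
        self._heap = []
        self._sequence = count()  # increasing counter records arrival order

    def submit(self, job_name, priority="NORMAL"):
        """Add a job; raises KeyError for an unknown priority name."""
        entry = (_RANK[priority], next(self._sequence), job_name)
        heapq.heappush(self._heap, entry)

    def next_job(self):
        """Return the next job name to run, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    queue = FifoJobQueue()
    queue.submit("job-a")
    queue.submit("job-b")
    queue.submit("job-c", priority="HIGH")
    print(queue.next_job(), queue.next_job(), queue.next_job())
```

With every job left at the default priority, the queue behaves as a plain FIFO; the priority only matters when a job is explicitly raised or lowered.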