As we close out the year, here is a look back at our 10 most popular blog posts of 2014. Our top posts cover machine learning and time series data, new milestones for the Apache projects Drill and Spark, and hands-on technical explanations that save you time and headaches.
This video clarifies the differences. It’s About Time: Time Series Databases By Ellen Friedman. Recording the time at which a measurement was made or an event occurred can make data much more useful for revealing valuable insights.
Let Spark Fly: Advantages and Use Cases for Spark on Hadoop - Webinar Follow Up By Michele Nemschoff. Apache Spark is currently one of the most active projects in the Hadoop ecosystem, and there’s been plenty of hype about it in the past several months. In the latest webinar from the Data Science Central webinar series, titled “Let Spark Fly: Advantages and Use Cases for Spark on Hadoop,” we cut through the noise to uncover the practical advantages of having the full set of Spark technologies at your disposal.
Loading a Time Series Database at 100 Million Points Per Second By Jim Scott. There are many use cases for time series data, and they usually require handling a substantial data ingest rate. Rates of more than 10,000 points per second are common, and rates of 1 million points per second are less common but not outrageously high either.
Comparing MapR XD and HDFS NFS and Snapshots By Bruce Penn. In my 2.5 years at MapR, a common question I have gotten from customers is, “Isn’t HDFS going to eventually catch up to MapR XD?” The simple answer is a resounding “NO”, and the reasons lie in the foundations of the two architectures. I will first describe these differences and then outline how the implementations vastly differ in their value to customers.