DataStax Enterprise 3.0: A synonym for Highly Secure Real-Time Analytics


A few days ago, I had the pleasure of talking with two Apache Cassandra experts. The first was Edward Capriolo, a Hadoop System Administrator at Media6Degrees, organizer of the NYC Cassandra User Group and NYC NoSQL Meetups, author of the incredible “Cassandra High Performance Cookbook”, and one of DataStax’s MVPs.

The second was Jonathan Ellis himself, DataStax’s Chief Technology Officer and co-founder, who also leads the Apache Cassandra project.



Choosing an MPP database is incredibly hard

As the title says, choosing an enterprise-level Massively Parallel Processing (MPP) database is a big headache for every Data Science Manager, basically because there are so many very good choices around the tech world.


Data Scientists: the world needs us

Data Science

Some months ago, I wrote a post dedicated to new Data Scientists, giving my personal recommendations about several books that are pure gold, and about great tools like Python, R, and Apache Hadoop. Today is a new day for this kind of professional, because the Harvard Business Review (HBR) has published a great article about the Data Scientist, written by Thomas H. Davenport and D.J. Patil. I think both did an incredible job with it; believe me, you should read it, and you will not regret it. So, I want to dedicate these lines to the rising number of jobs with a shining title: “Data Scientist”. If you go today to any job board like LinkedIn, AOL Careers, Indeed, SimplyHired, Technology Ladders or Dice and do a quick search for this title, you will find more than 250 new open positions every day, searching only in the U.S. If you expand the search to more countries like the UK, Germany, Ireland, India, China, and the Netherlands, the numbers grow like crazy.

Some upcoming features in HBase 0.96


HBase 0.96 is synonymous with speed, better compression and high performance

The HBase development team is doing a great job these days, adding some rock-solid features to this amazing data store. The next release will be 0.96, and it brings great things that I will discuss with you right now. I will present here the best features in my own opinion; I’m open to discussion, so leave a comment to enrich the blog post if you want. OK, let’s start the engine.

MapR and Joyent: you should talk


Yes, I have to say this: I admire MapR’s management team for the great job they have done in the few years since the company was founded. In two years, MapR has become a respected leader in the Big Data rush by following a simple mantra: “Build a great product and create a business around it”.

So, what is the product behind MapR’s business? This amazing team has built one of the best Hadoop distributions to date, available in two editions: M3, the free edition, and M5, the commercial edition, with a lot of great features that I will discuss later. These two editions are the key reason why MapR has been selected as an important partner in the Cloud market: first by the Amazon Web Services team, to offer M3 and M5 in the Amazon Marketplace, and then by Google’s new Cloud service, Google Compute Engine, proving that MapR has much more to offer to the world. But what about other Cloud service providers? What about Rackspace, HP Cloud Services or Microsoft Azure? OK, first, let’s talk about MapR.


Reshaping the Ads Game with Data Science

New players in the Ads business are using Data Science to create smarter and better solutions every day

Big Data in Ads industry

Data Science is becoming more and more important every day, in almost every business on the planet. It doesn’t matter if the industry is biotech, green technologies, finance, wealth management, security analysis, retail, or e-commerce: we are in the era of information, so data is everywhere. To keep a business running, it is now increasingly common to analyze statistics and trends, to build graphics that show insightful results, and to construct models based on data mining techniques like logistic regression and predictive analytics. Big data tools like Hadoop, Storm, R, and many others are playing a key role in this.

But I don’t want to talk about a lot of industries; I will focus on one industry that I love to research and study: “Digital Advertising”.

We are in the era of Real-Time Analytics



Data can come to us in so many forms that we sometimes feel afraid of this constant growth. But what matters more is what we can do with this data. It doesn’t matter if we have a vast quantity of data if we don’t know how to turn it into revenue for the company, and this is where Analytics plays a key role. And not just Analytics, but Real-Time Analytics.


The Rise of Column-based data stores Part 1

Column-based data stores are becoming an important trend today

If you read my post about Real-Time Analytics, you should be as excited as I am about this trend. Do you remember the phrase “Time is money”? Time is the main driver behind all innovations in the Database world: we want faster solutions, quicker ways to gather huge quantities of data, faster ways to query billions of records, faster ways to adapt our infrastructure, and so on; and many have tried to offer clean and useful solutions to this problem.


Data Science paradigms

Hilary Mason

Don’t know any Data Scientists? Here are my paradigms

There are a lot of professionals who want to become Data Scientists (like me), but many times they don’t know what current Data Scientists actually do in their work. I want to share with you some of the most well-known Data Scientists, who love to share their knowledge with the world.

2013: A great time to be a Data-Driven person

Data Science

Searching the web, I found on KDnuggets, the great Data Mining portal, that 2013 is the Year of Statistics. There is even a video about the topic on the site.

More and more companies are interested in the services that this kind of professional can provide. It doesn’t matter if it’s a startup, a mid-size company or a large corporation; everybody is looking for the “Holy Grail” of Data Science.