I’ve been a proud Linux user since 2006, and in my geek life as a Linux user and advocate I’ve used more than 20 different Linux distros: from the days of compiling from stage 1 with Gentoo and cracking a new Windows-based machine with an amazing Knoppix 3.8 LiveCD, to compiling the new version of the kernel to extract the maximum performance from a PC with 256 MB of RAM and a light, minimalist desktop environment. Then I had the pleasure of being in charge of a complex platform where the main OS was Red Hat Enterprise Linux, and after two months working with it, I said: “Wow, this is another kind of Linux, ready for the enterprise.” Then I heard some great news: “Red Hat becomes the first billion-dollar Open Source company”, and I wrote a post about it. Then I found Fedora Linux, and I’m still happy with it. Then I wrote about why Jim (Red Hat’s CEO) and his team should create some critical partnerships to drive the Hadoop and Big Data market, focused on the security of the platform. But right now, I think there’s an inflection point with the new release of Red Hat Enterprise Linux 7. Keep reading to see why I think RHEL 7 could change the path of the Big Data and Cloud Computing markets.
When I used Dropbox for the first time from my Linux box, it was a shining moment for me. At that time, I was looking for precisely this kind of solution for the files that I always used to leave behind on my USB stick. For every Linux user, many of whom love Open Source software, collaboration is an important issue, and Dropbox has saved my work many times, because the platform itself is a synonym for collaboration, and this is one of the reasons why I love the platform.
The other reason why I love the platform is that they use my favorite programming language, Python, for the core development of the proprietary synchronization daemon, and in 2012 Guido, the creator of the language, joined Dropbox’s payroll: just awesome!!!
So I want to make my little contribution to the platform by writing down some ideas on how to improve it and the business itself. I will divide this into some key points:
- Increase the blogging frequency on the Tech blog about Data Science at Dropbox
- Improve user engagement on mobile devices using Localytics services
- Hire Greg Nudelman as a consultant to improve Dropbox for Android, and work with the Mailbox team on an Android-based version
- Build a high-class Data Science team to get more useful and better insights from Dropbox’s massive data sets
- Improve Marketing efforts using Inbound Marketing techniques focused on Facebook, Google Plus, LinkedIn, Twitter, Blogging, ebooks, etc
Some days ago, I had the pleasure of talking with two Apache Cassandra experts. The first was Edward Capriolo, a Hadoop System Administrator at Media6Degrees, organizer of the NYC Cassandra User Group and NYC NoSQL Meetups, author of the incredible “Cassandra High Performance Cookbook” and one of the DataStax MVPs.
The second was Jonathan Ellis himself, DataStax’s Chief Technology Officer and co-founder, who also leads the Apache Cassandra project.
As the title says, choosing an enterprise-level Massively Parallel Processing (MPP) database is actually a big headache for every Data Science Manager, basically because there are very good choices around the tech world.
Many industries are in total explosion: Real Estate, Marketing Analytics, Retail, Recruiting Services, Big Data Analytics; but these are the good guys. There are other guys who are using their deep knowledge of Security, Hacking, Cracking and Phishing to take advantage of the popularity of these industries, cut a big slice of the pie and make money from it. A new kind of business has been born: Crime as a Service (CaaS).
Yes, I have to say this: I admire MapR’s management team for the great job they have done in the few years since the company was founded. In two years, MapR has become a respected leader in the Big Data rush by following a simple mantra: “Build a great product and create a business around it”.
Well, and what is the product behind MapR’s business? This amazing team has built one of the best Hadoop distributions to date, divided into two versions: M3, which is a free edition of the distribution, and M5, which is the commercial version, with a lot of great features that I will talk about later. These two distributions are the key reason why MapR has been selected as an important partner in the Cloud market; first by the Amazon Web Services team, to offer M3 and M5 in the Amazon Marketplace, and then by the new Cloud service offered by Google, Google Compute Engine, proving that MapR has much more to offer to the world. But what about other Cloud service providers? What about Rackspace, HP Cloud Services or Microsoft Azure? OK, first, let’s talk about MapR.
PostgreSQL 9.2 is out
I opened my mailbox today and, to my surprise, I received great news from Selena Deckelmann, one of the main contributors to this excellent database management system (did I tell you that this is my favorite? Proud user since version 8.0), announcing that PostgreSQL 9.2 was released today, with a lot of new features, major improvements, and bug fixes. Continue reading “PostgreSQL 9.2: A rock-solid component for your Cloud infrastructure”
You have a terrible headache: how do you design a scalable architecture using Amazon Web Services quickly and effectively? How do you test an architectural approach to a certain problem and share it easily with your colleagues? I have an answer for you: MadeiraCloud.
But what is MadeiraCloud, and how can it solve these problems?
I had the pleasure of talking with its CEO, Dany O’ Prey, and after a few lines, I said: “Wow, this deserves a blog post”.
This blog post describes the improvements that Amazon Web Services (AWS) could make to deliver a better service to customers every day, solve more of their problems and, in the process, generate more revenue for the company.
Great event, amazing talks
Today, HPCwire sent all its subscribers amazing news. The city of Boston will be the outstanding host of the AWS Big Data and HPC in the Cloud event on April 27th. Here is the complete message from HPCwire:
Join us April 27th for the Big Data and HPC in the Cloud Event
Big Data and High Performance Computing (HPC) are more than just buzz words. Thousands of companies today are using data to differentiate their products and disrupt a growing number of industries, including media/advertising, retail, gaming, health care, and financial services. Learn how the AWS Cloud can cost effectively provide the scalable computing resources, storage services, and analytics tools required for anyone to participate in these technologies. Join us on April 27 for customer presentations, how-to sessions, and presentations specifically designed for both those new to Big Data and HPC and experienced users looking to learn about what’s new.
Register now here, and enjoy the journey to Big Data and HPC in the Cloud.
Two examples of great speakers are:
Dr. Matt Wood: As the Product Manager for Big Data and HPC for Amazon Web Services, Matt Wood discusses the technical and business aspects of cloud computing throughout the world. With a background in the life sciences and a PhD in Bioinformatics, Matt is interested in helping teams of all sizes bring their ideas to life through technology.
John Rauser: An Amazon Data Scientist & Principal Quantitative Engineer on Amazon Web Services’ Infrastructure team, where he works to optimize all aspects of Amazon’s data center infrastructure.
What are you waiting for? If you use Amazon Web Services as a core part of your business, don’t wait any longer to register for this event and hear what Matt and John can show you about optimizing your Cloud infrastructure using AWS. The event takes place at the Back Bay Event Center, John Hancock Hall, 200 Berkeley Street, Boston, MA 02116.
Happy Hacking!!!