Mark Needham, 11/19/13
0 replies

Neo4j: Modeling Hyper Edges in a Property Graph

At the Graph Database meetup in Antwerp, we discussed how you would model a hyper edge in a property graph like Neo4j, and it occurred to me that I’d done this in my football graph without realizing it. In this article, you'll find two versions of a relationship model illustrating the use of hyper edges.

Mitch Pronschinske, 11/19/13
1 reply

The Future of Cloud Application Development

This talk will provide a unique opportunity to hear from the chief technology officer of the leading PaaS, a cloud expert who has worked with hundreds of companies that are leveraging cloud application development platforms and adopting next-generation technologies.

Brian Gracely, 11/19/13
0 replies

The Cloudcast #118 - OpenStack VMware Interop and Nicira SDN

Aaron talks with Kenneth Hui (@hui_kenneth) and Scott Lowe (@scott_lowe) about their OpenStack Summit session on OpenStack/VMware integration and gets the latest on Nicira NSX from Scott.

Nikita Salnikov..., 11/18/13
2 replies

What garbage collector are you using?

We conducted a study on how often each GC algorithm is used, and the results are somewhat surprising. 13% of the environments had explicitly specified a GC algorithm; the rest left the decision to the JVM. Among the 11,062 sessions with an explicit GC algorithm, we were able to distinguish six different GC algorithms.

Vlad Mihalcea, 11/18/13
0 replies

Optimistic locking auto retry with JPA

This article shows how you can implement an automatic retry mechanism for optimistic locking JPA batch processors.

Mike Cottmeyer, 11/18/13
0 replies

Barriers to Agile Adoption

Though the status quo is killing their organizations, some barriers to further Agile adoption come up far too often among the organizations that need it most. I was asked to list some common barriers that others have dealt with.

John Berryman, 11/18/13
0 replies

Cassandra: How to Build a Naive Bayes Classifier of Users Based on Behavior

In our last post, we found out how simple it is to use Cassandra to estimate ad conversion. This post will take the online ad company example just a bit further by creating a Cassandra-backed Naive Bayes Classifier. Again, we see that the “secret sauce” is simply keeping track of the appropriate counts.

Vlad Mihalcea, 11/18/13
0 replies

Optimistic Locking Auto Retry with MongoDB

The author previously wrote about the benefits of employing optimistic locking for MongoDB batch processors. The optimistic locking exception is recoverable, as long as you fetch the latest Entity, update it, and save it. Spring makes it easy to implement an automatic retry mechanism, and this is how he did it.

Ricky Ho, 11/18/13
0 replies

Recommendation Engine, Part 2: Diverse Recommender

The author's previous post on recommendation systems suffers from a lack of diversity. For example, a list may contain the same book as a soft cover, hard cover, and Kindle version. Because interests are diverse, a better recommendation list should contain items that cover a broad spectrum of the user's interests.

Arthur Charpentier, 11/18/13
0 replies

Data News: "The Hidden Technology That Makes Twitter Huge," and More

This installment of Arthur Charpentier's regular collection of data science-related links includes analyzing baseball data with R, a profile of a sword-swallowing statistician, and the technology behind Twitter that creates a massive network of data.

Ayende Rahien, 11/17/13
0 replies

How do they DO this?

We are doing some more performance work in Voron, and we got some really surprising results there. Voron is writing at a really good rate (better than anything else we tested against), just not a good enough rate.

Alec Noller, 11/17/13
0 replies

The Best of the Week (Nov. 8): Big Data Zone

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone! This week's best include the open source announcement of Facebook's Presto, an analysis of the multi-armed bandit algorithm, and practical uses for Big Data in terms of social and economic efficiency.

Alec Noller, 11/17/13
0 replies

The Best of the Week (Nov. 8): NoSQL Zone

Make sure you didn't miss anything with this list of the Best of the Week in the NoSQL Zone. This week's best include a look at what happened to all the buzz around NoSQL, a story about the perils of using MongoDB, and an objective measure of the most popular SQL and NoSQL database engines.

Ariya Hidayat, 11/16/13
0 replies

Using Packer to Create Vagrant Boxes

Using Packer to create CentOS and Ubuntu boxes is not difficult. If you want to follow along, I have prepared a Git repository ariya/packer-vagrant-linux which contains all the necessary bits to create CentOS 5.4 and/or Ubuntu 12.04 LTS 64-bit boxes.

Trevor Parsons, 11/16/13
0 replies

22 Billion Heroku Log Entries: Forget Big Data, It’s the Little Data That Matters!

Logentries processes over 10 billion log events every day. That’s quite a lot of data. So, the Logentries research team decided to take advantage of their unique position and set out to examine a sample of their overall user base for insights: 22 billion log events from over 6,000 Heroku applications.