Vlad Mihalcea, 11/18/13
0 replies

Optimistic Locking Auto Retry with JPA

This article shows how to implement an automatic retry mechanism for JPA batch processors that use optimistic locking.

Mike Cottmeyer, 11/18/13
0 replies

Barriers to Agile Adoption

Though the status quo is killing their organizations, some barriers to further Agile adoption come up far too often among the organizations that need it most. I was asked to list some common barriers others have dealt with.

John Berryman, 11/18/13
0 replies

Cassandra: How to Build a Naive Bayes Classifier of Users Based on Behavior

In our last post, we found out how simple it is to use Cassandra to estimate ad conversion. This post will take the online ad company example just a bit further by creating a Cassandra-backed Naive Bayes Classifier. Again, we see that the “secret sauce” is simply keeping track of the appropriate counts.
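As a rough illustration of the "keep track of the appropriate counts" idea, here is a minimal Python sketch of a count-based Naive Bayes classifier. It is not the author's code: the class name `CountingNaiveBayes`, the method names, and the behavior labels are all hypothetical, and in-memory dictionaries stand in for the counters that the post keeps in Cassandra.

```python
import math
from collections import defaultdict


class CountingNaiveBayes:
    """Naive Bayes driven only by counts (hypothetical sketch).

    The in-memory dicts below stand in for counters that would live
    in a datastore; nothing here talks to Cassandra.
    """

    def __init__(self):
        self.class_counts = defaultdict(int)  # label -> number of users seen
        self.feature_counts = defaultdict(lambda: defaultdict(int))  # label -> feature -> count
        self.vocabulary = set()  # every feature seen so far

    def observe(self, label, features):
        """Record one labeled user by incrementing the relevant counts."""
        self.class_counts[label] += 1
        for feature in features:
            self.feature_counts[label][feature] += 1
            self.vocabulary.add(feature)

    def classify(self, features):
        """Return the label with the highest log-probability score."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            # Add-one (Laplace) smoothing keeps unseen features from zeroing a class.
            denominator = sum(self.feature_counts[label].values()) + len(self.vocabulary)
            for feature in features:
                seen = self.feature_counts[label].get(feature, 0)
                score += math.log((seen + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = CountingNaiveBayes()
model.observe("buyer", ["viewed_cart", "clicked_ad"])
model.observe("buyer", ["viewed_cart", "clicked_ad"])
model.observe("browser", ["viewed_home"])
model.observe("browser", ["viewed_home"])
print(model.classify(["viewed_cart"]))
```

Training only ever increments counters, which is why a store with cheap counter updates is a plausible fit for this kind of classifier.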

Vlad Mihalcea, 11/18/13
0 replies

Optimistic Locking Auto Retry with MongoDB

The author wrote before about the benefit of employing optimistic locking for MongoDB batch processors. The optimistic locking exception is a recoverable one, as long as you fetch the latest Entity, update and save it. Spring makes it easy to implement an automatic retry mechanism, and this is how he did it.
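The fetch, update, save, and retry cycle described above can be sketched in a few lines. This is an illustrative Python sketch only, not the Spring and MongoDB implementation from the article: `VersionedStore`, `OptimisticLockError`, `retry_on_optimistic_lock`, and `add_points` are hypothetical names, and the store is a plain in-memory dictionary.

```python
import functools


class OptimisticLockError(Exception):
    """Raised when a save is attempted against a stale version."""


class VersionedStore:
    """In-memory stand-in for a datastore with a version field (hypothetical)."""

    def __init__(self):
        self._docs = {}  # key -> (value, version)

    def load(self, key):
        # Missing keys are treated as value 0 at version 0.
        return self._docs.get(key, (0, 0))

    def save(self, key, value, expected_version):
        _, current_version = self._docs.get(key, (0, 0))
        if current_version != expected_version:
            raise OptimisticLockError(f"{key}: stale version {expected_version}")
        self._docs[key] = (value, current_version + 1)


def retry_on_optimistic_lock(times):
    """Re-run the wrapped function when it hits an optimistic lock failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except OptimisticLockError:
                    if attempt == times:
                        raise  # out of attempts, surface the failure
        return wrapper
    return decorator


store = VersionedStore()


@retry_on_optimistic_lock(times=3)
def add_points(key, points):
    # Each attempt re-fetches the latest value and version before saving.
    value, version = store.load(key)
    store.save(key, value + points, version)


add_points("user-1", 5)
```

The important detail is that the whole load-update-save sequence sits inside the retried function, so every retry works from the latest stored version rather than the stale one.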

Ricky Ho, 11/18/13
0 replies

Recommendation Engine, Part 2: Diverse Recommender

The recommendations produced in the author's previous post on recommendation systems suffer from a lack of diversity. For example, a list may contain the same book as a soft cover, hard cover, and Kindle version. Because interests are diverse, a better recommendation list should contain items that cover a broad spectrum of the user's interests.

Arthur Charpentier, 11/18/13
0 replies

Data News: "The Hidden Technology That Makes Twitter Huge," and More

This installment of Arthur Charpentier's regular collection of data science-related links includes analyzing baseball data with R, a profile of a sword-swallowing statistician, and the technology behind Twitter that creates a massive network of data.

Ayende Rahien, 11/17/13
0 replies

How do they DO this?

We are doing some more performance work in Voron, and we got some really surprising results. Voron is writing at a really good rate (better than anything else we tested against), just not a good enough rate.

Alec Noller, 11/17/13
0 replies

The Best of the Week (Nov. 8): Big Data Zone

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone! This week's best include the open source announcement of Facebook's Presto, an analysis of the multi-armed bandit algorithm, and practical uses for Big Data in terms of social and economic efficiency.

Alec Noller, 11/17/13
0 replies

The Best of the Week (Nov. 8): NoSQL Zone

Make sure you didn't miss anything with this list of the Best of the Week in the NoSQL Zone. This week's best include a look at what happened to all the buzz around NoSQL, a story about the perils of using MongoDB, and an objective measure of the most popular SQL and NoSQL database engines.

Ariya Hidayat, 11/16/13
0 replies

Using Packer to Create Vagrant Boxes

Using Packer to create CentOS and Ubuntu boxes is not difficult. If you want to follow along, I have prepared a Git repository, ariya/packer-vagrant-linux, which contains all the necessary bits to create CentOS 6.4 and/or Ubuntu 12.04 LTS 64-bit boxes.

Trevor Parsons, 11/16/13
0 replies

22 Billion Heroku Log Entries: Forget Big Data, It’s the Little Data That Matters!

Logentries processes over 10 billion log events every day. That’s quite a lot of data. So, the Logentries research team decided to take advantage of their unique position and set out to examine a sample of their overall user base for insights: 22 billion log events from over 6,000 Heroku applications.

Kristina Chodorow, 11/16/13
0 replies

MongoDB and User Support

The author was asked "how the whole 'Hacker News MongoDB random bashing' situation was dealt with from the inside." In this article, she explains her reaction and her strategies for handling such issues during her time at MongoDB.

Jim Bird, 11/15/13
2 replies

Applying the 80:20 Rule in Software Development

Managers don’t want to think harder than they have to. They like simple rules of thumb. One of the most useful rules of thumb is the 80:20 rule. You can see obvious cases where the 80:20 rule applies in software without looking too hard. For example, 80% of performance improvements are found by optimizing 20% of the code.

Mikio Braun, 11/15/13
0 replies

Hipster Scala Features

Yesterday someone was looking at my code and said “uh, you’re using +T in a generic”, and I said “that’s a hipster feature of Scala, you don’t need to understand it, you just need to get it right so your code compiles.”

Jim Hirschauer, 11/15/13
2 replies

An Example of How Node.js is Faster Than PHP: Part 2

In my previous post, I installed and configured Ghost and WordPress. The purpose of that post was to test the relative performance of the two platforms and see which one could handle more load. Many readers requested another test with an opcode cache in place for WordPress, so that is exactly what this post is about.