Matthew Dubins 11/20/13
0 replies

Estimating Age from First Name, Part 2

In the author's last post, he wrote about how he compiled a US Social Security Administration dataset into something usable in R, and mentioned some issues with scaling it up for bigger datasets. This time, he’ll show you how he prepped the dataset to become more scalable.

Lukas Eder 11/20/13
0 replies

A History of Databases in “No-Tation”

We’re heading towards very exciting times in the field of databases! In this article, the author discusses a number of talks from Topconf in Tallinn, Estonia, and the changing landscape of the world of databases.

Arthur Charpentier 11/20/13
0 replies

Data News: Time Series Analyses in R, and More

This installment of Arthur Charpentier's collection of data science-related links includes time series analyses in R, an article about "Probability, Gambling and the Origins of Risk Management," an analysis of The Economist's explanation of statistical significance, and more.

Alec Noller 11/20/13
0 replies

A New Tool for Analyzing Hadoop Data in Excel

This new tool seems to be centered on creating an easy-to-use interface for analyzing Hadoop data - probably aiming to be more accessible to employees who aren't quite as much in the loop, among other things - and allows users to manipulate data in Excel, which is then scaled to Hadoop's dataset.

Alec Noller 11/20/13
0 replies

NoSQL Databases: How to Compare for Performance and Reliability

This talk from Ben Engber at Surge 2013 discusses how to compare NoSQL databases for true performance and reliability. Databases featured in his comparisons include Cassandra, Couchbase, FoundationDB, MongoDB, and others, and Engber tackles more general issues as well.

Ian Mitchell 11/19/13
0 replies

Agile Transitioning and Transparency

The first step in any agile transformation is transparency. A Kanban board, which exposes work and its actioning in response to certain signals, is a good tool for encouraging this practice. In this post we look at how transparency is attained and at the controversy that is often involved.

Lukas Eder 11/19/13
0 replies

Faster SQL Pagination with Keysets, Continued

A while ago, I blogged about how to perform keyset pagination (some also call this the “seek method”). Keyset pagination is a very powerful technique for performing constant-time pagination even on very large result sets, where “classic” OFFSET pagination will inevitably get slow at large page numbers.
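
As a rough illustration of the seek method described in this teaser, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not taken from the linked article: the `posts` table, its columns, and the helper names `build_demo_db`, `fetch_page`, and `iterate_pages` are all hypothetical. The idea shown is that each page is fetched with a `WHERE id > ?` predicate on the last key seen, rather than with `OFFSET`, so the database can seek through the primary-key index instead of scanning and discarding the skipped rows.

```python
import sqlite3


def build_demo_db(n=10):
    """Create an in-memory table of n rows (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO posts (id, title) VALUES (?, ?)",
        [(i, f"post {i}") for i in range(1, n + 1)],
    )
    return conn


def fetch_page(conn, last_id=None, page_size=3):
    """Return the next page ordered by id, seeking past last_id instead of using OFFSET."""
    if last_id is None:
        # First page: no key to seek past yet.
        sql = "SELECT id, title FROM posts ORDER BY id LIMIT ?"
        params = (page_size,)
    else:
        # Later pages: start strictly after the last key already seen.
        sql = "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?"
        params = (last_id, page_size)
    return conn.execute(sql, params).fetchall()


def iterate_pages(conn, page_size=3):
    """Collect all pages by repeatedly seeking past the last id of the previous page."""
    pages = []
    last_id = None
    while True:
        page = fetch_page(conn, last_id, page_size)
        if not page:
            break
        pages.append(page)
        last_id = page[-1][0]  # the key of the last row becomes the next seek point
    return pages


if __name__ == "__main__":
    conn = build_demo_db(10)
    for page in iterate_pages(conn, page_size=3):
        print(page)
```

This sketch assumes a single, unique sort key. Paginating on a non-unique column would need a tie-breaker column in both the ORDER BY clause and the seek predicate.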

Peter Zaitsev 11/19/13
0 replies

MySQL Encryption Performance Revisited

I wanted to compare performance differences between MySQL’s built-in SSL encryption facilities and external encryption technologies, such as SSH tunneling. I’ll also be using this post to address a couple of questions posed in the comments on my original article. So, without further ado….

John Sonmez 11/19/13
0 replies

How to Become a More Valuable Software Developer

To reach the ultimate level of success and truly increase your value, you have to have both style (the ability to market yourself and make a name for yourself) and substance (the skills that pay the bills).

Tom Howlett 11/19/13
0 replies

Remote Practices

Our practices revolve around keeping the team engaged. My goal is to give everyone enough space to think and opportunities to collaborate. The practices are similar to what you might see on an Agile team; the difference with remote work is that they need to be more deliberate.

Alec Noller 11/19/13
0 replies

Android vs. iOS: The Development Process Compared

This recent article makes a detailed comparison of Android and iOS development and asks a question: Which one should you pursue first? The author compares IDEs, configuration, UX design, languages, APIs, internet connectivity, social media sharing, fragmentation, and the publication process.

Michal Bachman 11/19/13
0 replies

Modeling Data in Neo4j: Qualifying Relationships

Let's say we want to model movie ratings in Neo4j. People have the option to rate a movie with 1 to 5 stars. One way of modelling this - perhaps the first one that springs to mind - is creating a RATED relationship with a rating property that takes on 5 different values. There are more ways than that, though.

Matthew Dubins 11/19/13
0 replies

Estimating Age from First Name, Part 1

After reading a post with lists of the trendiest names in US history, the author decided to compile the lists using R. In this post, the author discusses building a dataframe, as well as a function to query the dataframe.

Mark Needham 11/19/13
0 replies

Neo4j: Modeling Hyper Edges in a Property Graph

At the Graph Database meetup in Antwerp, we discussed how you would model a hyper edge in a property graph like Neo4j, and it occurred to me that I’d done this in my football graph without realizing it. In this article, you'll find two versions of a relationship model illustrating the use of hyper edges.

Mitch Pronschinske 11/19/13
1 reply

The Future of Cloud Application Development

This talk will provide a unique opportunity to hear from the chief technology officer at the leading PaaS and cloud expert who has worked with hundreds of companies that are leveraging cloud application development platforms and adopting next generation technologies.