Mark Needham 01/02/14
0 replies

Neo4j and Cypher: Using MERGE with Schema Indexes/Constraints

Neo4j’s powerful graph database can be used for analytics, recommendation engines, social graphs and many more applications. In the following example we demonstrate in a few steps how you can load Neo4j from your legacy relational SQL source.

John Cook 01/01/14
1 reply

Know Whether to Delegate

Managing energy is more important than managing time. Energy is what gets things done, and time is only a crude surrogate for energy. Instead of only looking at what you could earn per hour versus what you could hire someone else for per hour, consider the energy it would take you to do something versus the energy it would free to delegate it.

Alec Noller 01/01/14
0 replies

DRM and W3C Standards: Will the Web Stay Open?

A recent article from Danny O'Brien at the Electronic Frontier Foundation reported that the proposed Encrypted Media Extension (EME), which focuses on the protection of video content, could potentially be incorporated into W3C's HTML5.1 standard.

John Cook 01/01/14
0 replies

Sensitive Dependence on Initial Conditions

The following problem illustrates how the smallest changes to a problem can have large consequences. As explained at the end of the post, this problem is a little artificial, but it illustrates difficulties that come up in realistic problems.

Rob Galanakis 12/31/13
3 replies

TDD via Tic-Tac-Toe

I’ve tried out lots of different subject matter for teaching TDD, but my favorite has been Tic-Tac-Toe (or whatever your regional variation of it is). It has these benefits:

Chase Seibert 12/31/13
10 replies

Development on a Mac versus Linux

I love the Mac computing experience. Even though I use a Mac as my home laptop, I prefer a Linux machine for work. Here are the key differences between developing on a Mac and on Linux.

Vlad Mihalcea 12/31/13
2 replies

NoSQL is Not Just About Big Data

After publishing a small experiment with MongoDB, the author was challenged by the jOOQ team to match his results against Oracle. He will explore the specifics of that challenge in a later post, and in this one, he discusses a number of Small Data use-cases in which MongoDB was the right tool for the job.

Alec Noller 12/31/13
1 reply

Are You Really a Data Scientist?

According to this recent post, you're not a data scientist just because you work with Hadoop a bit, and know some Python, and have some chops when it comes to databases. According to the author, it takes more than that, and in this article, he provides some resources to help you get there.

Gareth Rushgrove 12/30/13
0 replies

Making the Web Secure, One Unit Test at a Time

Writing automated tests for your code is one of those things that, once you have gotten into it, you never want to see code without tests ever again. Why write pages and pages of documentation about how something should work when you can write tests to show exactly how something does work?

Joshua Gross 12/30/13
23 replies

Top Posts of 2013: Please stop using Twitter Bootstrap

Let’s be honest: a great many of us are tired of seeing the same old Twitter Bootstrap theme again and again. Twitter Bootstrap’s success has turned it into the Times New Roman of design.

John Sonmez 12/30/13
18 replies

Top Posts of 2013: There Are Only 2 Roles of Code

All code can be classified into two distinct roles: code that does work (algorithms) and code that coordinates work (coordinators). I would say that 90% of the code I have written does not nicely divide my classes into algorithms and coordinators.

Lukas Eder 12/30/13
0 replies

MongoDB “Lightning Fast Aggregation” Challenged with Oracle

What does “Scale” even mean in the context of databases? When talking about scaling, people have jumped to the vendor-induced conclusion that SQL doesn’t scale, while NoSQL scales. In this article, the author takes a look at database scalability by comparing Oracle benchmarks to MongoDB.

Arthur Charpentier 12/30/13
0 replies

100 Blogs Worth Reading: R, Probability, Data Analysis and Visualization, and More

For the 100th installment of Arthur Charpentier's collections of data science-related links, he has decided to instead provide a list of 100 blogs worth reading. Topics covered include statistics, probability, R, data analysis, graphs, maps, visualization, sciences, economics, and more.

Ayende Rahien 12/30/13
0 replies

Reducing the Cost of Writing to Disk

So, we found out that the major cost of random writes in our tests was actually writing to disk. Writing 500K sequential items resulted in about 300 MB being written. Writing 500K random items resulted in over 2.3 GB being written. So the obvious thing to do would be to use compression.

Adam Fowler 12/30/13
0 replies

MarkLogic Range Index Scoring in V7

A new feature of MarkLogic 7's search API is range index scoring: affecting relevancy based on a value within a document. In this article, the author details a couple of use cases: one involving ratings, and one involving distance from the center point of a geospatial query.