Ling Liu's SC13 paper "Large Graph Processing Without the Overhead" featured by HPCwire.
Another list highlighting Open Source Software Releases.
Second GraphLab workshop should be even bigger than the first! GraphLab is a new programming framework for graph-style data analytics.
Analyzing Log Analysis: An Empirical Study of User Log Mining
Proceedings of the Large Installation System Administration Conference (LISA’14), November 2014. Best Student Paper Award.
Sara Alspaugh, Betty Beidi Chen, Jessica Lin, Archana Ganapathi*,
Marti A. Hearst, Randy Katz
* Splunk Inc.
We present an in-depth study of over 200K log analysis queries from Splunk, a platform for data analytics. Using these queries, we quantitatively describe log analysis behavior to inform the design of analysis tools. This study includes state-machine-based descriptions of typical log analysis pipelines, cluster analysis of the most common transformation types, and survey data about Splunk user roles, use cases, and skill sets. We find that log analysis primarily involves filtering, reformatting, and summarizing data and that non-technical users increasingly need data from logs to drive their decision making. We conclude with a number of suggestions for future research.
FULL PAPER: pdf