Distributed Analytics with JuliaDB and OnlineStats


As datasets grow, so do the demands on data science tools. JuliaDB is an analytical database designed for distributed and out-of-core processing of many large data files. It integrates with OnlineStats, a package for computing statistics and fitting models both on-line and in parallel, which makes it easier to scale analyses to production. OnlineStats is unusual in that everything from summary statistics to advanced statistical learning techniques can be updated one observation at a time. The underlying data structures use constant memory, so estimates can be computed on unbounded data streams.
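
To make the idea of constant-memory, one-observation-at-a-time updating concrete, here is a minimal sketch of a running mean and variance using Welford's algorithm, plus a merge step for combining results computed on separate chunks in parallel. This is an illustration in plain Python, not the OnlineStats or JuliaDB API; the names RunningStats, fit, merge, and summarize are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Constant-memory running mean and variance (Welford's algorithm)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def fit(self, x: float) -> None:
        """Update the estimates with a single observation."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> None:
        """Fold in statistics computed independently on another chunk."""
        if other.n == 0:
            return
        total = self.n + other.n
        delta = other.mean - self.mean
        self.m2 += other.m2 + delta * delta * self.n * other.n / total
        self.mean += delta * other.n / total
        self.n = total

    @property
    def variance(self) -> float:
        """Unbiased sample variance; 0.0 until two observations are seen."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def summarize(chunk):
    """Stream over one chunk, keeping only three numbers in memory."""
    stats = RunningStats()
    for x in chunk:
        stats.fit(x)
    return stats
```

Each worker could call summarize on its own file or partition, and the partial results can then be combined with merge. The stored state stays the same size no matter how many observations have been seen, which is the property that makes this style of estimator workable on out-of-core and streaming data.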


Josh is a recent PhD graduate of NC State, where his research focused on on-line algorithms for statistics. He enjoys working on difficult optimization problems and translating paper solutions into efficient computer programs, particularly in the Julia language. Besides math and computers, he enjoys music, playing guitar, hiking, biking, and ultimate frisbee.

Open Data Science
One Broadway
Cambridge, MA 02142