The Era of Brain Observatories: Open-Source Tools for Data-Driven Neuroscience


Brain science is at an inflection point: while previous generations of researchers relied mostly on small datasets collected in individual labs, the field is now shifting toward the analysis of large datasets that are collected and openly released by large research consortia, such as the Allen Institute for Brain Science or the Human Connectome Project. These large datasets promise to provide new and important information about the brain and, through data-driven approaches, to help us understand and cure brain diseases. However, because of the scale and dimensionality of these datasets, researchers are struggling to store, manage, analyze, and understand the data. In this talk, I will present some of the challenges that neuroscience faces in tackling these large open datasets and discuss the ecosystem of open-source software tools that aims to address these challenges. I will demonstrate this with a set of tools that I have developed to analyze human MRI data that sheds light on networks of brain connections.


Ariel Rokem is a Research Assistant Professor at the University of Washington Department of Psychology. He received a PhD in neuroscience from UC Berkeley (2010) and additional postdoctoral training in computational neuroimaging at Stanford (2011–2015). He was previously a Senior Data Scientist at the University of Washington eScience Institute (2015–2020).

He leads a research program in neuroinformatics: the development of data science tools, techniques, and methods, and their application to the analysis of neural data. One thrust of this research focuses on applying methods from statistical learning to the analysis of diffusion MRI data acquired in the human brain. This type of data sheds light on the role that human brain connections play in cognitive abilities, in diverse behaviors, and in neurological and psychiatric disorders.

Another thrust of the research focuses on the development of systems for the analysis (e.g., Mehta et al. 2017, Richford and Rokem 2018) and sharing (e.g., Yeatman et al. 2018) of large-scale open datasets, to enable research with datasets that are increasingly becoming available through data-sharing initiatives, and to facilitate the reproducibility of that research.

He is a member of the Software and Data Carpentry communities, where he has been an instructor since 2013 and an instructor trainer since 2015. He also directs the annual Summer Institute for Neuroimaging and Data Science (NeuroHackademy). A contributor to multiple open-source software projects in the scientific Python ecosystem, he is a member of the editorial board of the Journal of Open Source Software (JOSS).