Machine Learned Ranking for LegalTech

Abstract: The amount of data in digital investigations is ever increasing, and new approaches are needed to find the relevant items amid the noise. For too long, the focus of digital investigation software has been on parsing and extracting every possible piece of data and displaying it to the user. With the growing volume of data, the focus needs to shift to showing only the most relevant items.

Machine learning techniques can help identify which items the user should see first and therefore save them time. This talk will outline how these techniques can be used to rank documents, executables, and other files found during a digital investigation.

Bio: Carl founded Basis Technology in 1995 to help American companies enter Asian markets. In 1999, the company shipped its first products for website internationalization, enabling Lycos and Google to become the first search engines capable of cataloging the web in both Asian and European languages. In 2003, the company shipped its first Arabic analyzer and began development of a comprehensive text analytics platform.
Today, Basis Technology is recognized as the leading provider of components for information retrieval, entity extraction, and entity resolution in many languages. Carl has been directly involved with the company’s activities in support of national security missions, and works closely with analysts in the U.S. intelligence community. Prior to starting Basis Technology, Carl worked in Boston, New York and Tokyo as an independent consultant to international clients in finance and knowledge management. Carl spent eight years on the research staff of the MIT Laboratory for Computer Science. He is an active contributor to several non-profit organizations, including the Free Software Foundation, the MIT Alumni Fund, and the Unicode Consortium.