
Abstract: AI has emerged as a viable approach for sifting through terabytes of heterogeneous cybersecurity data to execute fundamental cybersecurity tasks, such as asset prioritization, control allocation, vulnerability management, and threat detection, with unprecedented efficiency and effectiveness. Despite this promise, AI and cybersecurity have traditionally been siloed disciplines that rely on disparate knowledge and methodologies. In this talk, I aim to advance the AI for Cybersecurity discipline by summarizing the state of the field and identifying promising future directions. I will offer a multi-disciplinary AI for Cybersecurity roadmap that centers on major themes such as cybersecurity applications and data, advanced AI methodologies for cybersecurity, and AI-enabled decision-making.
As part of this workshop, I will also present an open-source virtual machine (VM) that integrates traditionally disparate tools and resources from AI and cybersecurity. The workshop will give an overview of the VM's operations and demonstrate sample applications of AI for cybersecurity, including detecting vulnerable code in GitHub repositories and identifying emerging threats on the Dark Web to support proactive cyber threat intelligence.
Session Outline
1. Module 1: Introduction: Background and Overview of AI and Cybersecurity
a. Common tasks (e.g., asset identification, control allocation, vulnerability management, threat detection)
b. Increased focus from the national academies and federal government
c. Summarize the role of AI for Cybersecurity
2. Module 1: A General Approach to AI for Cybersecurity
a. Lifecycle, major components of AI for cybersecurity (fundamental tasks)
3. Module 1: Summary of prevailing data sources for Cybersecurity
a. Internal data sources, including log files, netflow data, vulnerability assessments, and others; summary of example platforms or tools that generate these data, sample metadata and data, and some of the characteristics of this type of data
b. External data sources, including Dark Web data, social media data, GitHub data, and others; summary of example platforms or tools that generate these data, sample metadata and data, and some of the characteristics of this type of data
4. Module 2: Existing AI for Cybersecurity Initiatives
a. Cyber Threat Intelligence, Disinformation and Computational Propaganda, Security Operations Centers, Adversarial Machine Learning
b. Selected common tasks, datasets, selected software, selected conferences for each initiative
c. Sample research at the intersection of AI and cybersecurity: detecting emerging threats from the Dark Web and analyzing publicly accessible GitHub repositories to detect vulnerabilities in code.
5. Module 2: Summary of Existing Limitations and Emerging Research Opportunities
a. Limitations: Often siloed resources and initiatives
b. Emerging research opportunities from a multi-disciplinary perspective, including:
i. Cybersecurity applications and data: application areas, data sources, data representations
ii. Advanced AI methods: multi-view and multi-modal analytics, explainable and interpretable AI approaches, augmented intelligence and human-AI interfaces
iii. AI-enabled decision-making: cybersecurity visualizations and dashboards, automated cybersecurity predictions, and AI-enabled cyber defense and resiliency
6. Module 2: Illustration of AI for Cybersecurity Research and Education Activities
7. Module 3: AI4Cyber Virtual Machine
a. Summary of prevailing infrastructure for AI4Cyber
b. Background and review of VM components
c. Hands-on illustrations of VM operations: dataset loading, model running, visualization
8. Module 3: Mechanisms to cultivate the AI for Cybersecurity Discipline
a. Existing academic and practitioner conferences and journals that can be used to archive AI for cybersecurity research
b. Prevailing funding mechanisms to help fund AI for cyber and cyber for AI research and education activities
9. Question and Answer, with audience interaction
a. Seek feedback, if possible, on the ideas and concepts presented
b. Any additional questions from the audience
Background Knowledge
This session is rated intermediate: some knowledge of machine learning, deep learning, data analytics, Python, and general systems administration will help participants get the most out of it.
Bio: Dr. Sagar Samtani is an Assistant Professor and Grant Thornton Scholar in the Department of Operations and Decision Technologies at Indiana University. He earned his Ph.D. from the AI Lab at the University of Arizona. His research focuses on AI for Cybersecurity, developing deep learning approaches for cyber threat intelligence, vulnerability assessment, open-source software, AI risk management, and Dark Web analytics. He has received funding from NSF’s SaTC, CICI, and SFS programs and has published over 40 peer-reviewed articles in leading information systems, machine learning, and cybersecurity venues. He is deeply involved with industry, serving on the Board of Directors for the DEFCON AI Village and on the Executive Advisory Council for the CompTIA ISAO.

Sagar Samtani, PhD
Assistant Professor and Grant Thornton Scholar | Indiana University
