FINS Faculty are currently conducting sponsored research on the following Artificial Intelligence (AI) topics:
A note on AI and National Defense: Intelligence implies the ability to learn and utilize knowledge to perform a task. Human intelligence, considered the epitome of intelligence, has enabled mankind to thrive. Artificial intelligence (AI) attempts to impart this ability to computer algorithms so they can perform a chosen task effectively. AI can acquire information from a wide variety of sources simultaneously and retain or recall knowledge indefinitely. AI can also be sensitive to minute changes in the data imperceptible to humans while at the same time understanding latent relationships in the data. AI is so effective that insights obtained from it have been used to expand human understanding in several domains. Combined with its immunity to boredom and fatigue in repetitive tasks, these characteristics allow AI to surpass human capabilities at a fraction of the effort. Hence, AI is not only an ideal candidate for national security applications but a critical, timely need for ensuring robust and reliable mission-critical systems.
AI for Security-Aware Electronics:
Over the last 35 years, the EDA industry has delivered a 10-million-fold improvement in chip design productivity. With Moore’s Law ending, most experts agree that chips designed by artificial intelligence (AI) and machine learning (ML) are the future of semiconductors. In fact, academic and commercial tools are now starting to incorporate AI/ML for optimization of digital and analog integrated circuits (ICs) at various levels of abstraction and in very large solution spaces. However, the goals thus far have been to enhance traditional metrics such as power, performance, and area. Security concerns and issues related to IC design have been largely overlooked.
A current thrust at FINS is investigating state-of-the-art methods in AI/ML to optimize and port hardware security primitives such as physical unclonable functions (PUFs), true random number generators (TRNGs), and silicon odometers across technology nodes, to predict functionality-security-performance tradeoffs and guide users/tools, to seamlessly integrate best-in-class protections (e.g., masking, hiding, fault tolerance, etc.) under traditional constraints, and to harden designs/layouts against hardware Trojans and IP theft. Further, we are exploring the benefits of federated learning to overcome data bottlenecks while effectively utilizing AI/ML without compromising the confidentiality of semiconductor intellectual property (IP).
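As a toy illustration of the federated learning idea mentioned above, the sketch below shows weight aggregation in the style of federated averaging: each participant trains locally and shares only model weights, never its proprietary design data. All names and numbers here are hypothetical, and a real deployment would iterate this over many training rounds.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Aggregate client model weights by data-size-weighted averaging.

    Clients share only weight vectors; the raw (confidential) data that
    produced them never leaves each client.
    """
    total = sum(client_sizes)
    stacked = np.stack(client_weights)                 # (clients, params)
    coeffs = np.array(client_sizes, dtype=float) / total
    return coeffs @ stacked                            # per-parameter average

# Three hypothetical design houses with different amounts of local data.
w = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
n = [10, 10, 20]
global_w = federated_average(w, n)                     # -> array([3.5, 4.5])
```

The larger client (20 samples) pulls the global model proportionally closer to its local weights, which is the core of the federated averaging trade-off.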
Artificial Scientific Intelligence for Automating Scientific Modeling:
The explosive growth of diversified scientific data has enabled unprecedented advancements in scientific discovery and engineering. Unfortunately, it is difficult to relate data from one scientific problem to another without a unified approach to the mathematics of scientific modeling. FINS researcher James Fairbanks takes a radical approach to scientific computing that applies category theory to mathematically model mathematics itself. This novel modeling and simulation approach is the foundation for software for diverse scientific applications, including space weather forecasting to protect satellite communications and the hierarchical analysis and control of immunology and epidemics. Both areas are of vital interest to the science portfolio of DARPA, the research and development agency of the Department of Defense (DoD). We use category theory, a mathematical language developed to unify the diverse fields and subfields of mathematics, to create a software ecosystem that unifies the various fields and subfields of computational science and engineering. Dr. Fairbanks’ team specializes in taking this lofty abstract language and implementing concrete software packages that solve real-world problems for stakeholders.
Automated Physical Inspection for Counterfeit Electronics Detection and Avoidance:
Counterfeit electronics in the supply chain are a longstanding problem with nontrivial impacts on government, industry, and society as a whole: (i) security and reliability risks for critical systems and infrastructures that incorporate them; (ii) substantial economic losses for intellectual property (IP) owners; (iii) a source of revenue for terrorist groups and organized crime; and (iv) reduced incentives to develop new products and ideas, thereby impacting worldwide innovation, economic growth, and employment. The counterfeit chip market has an estimated worldwide value of $75B, and such chips are integrated into electronic devices reportedly worth more than $169. The ongoing chip shortage due to the COVID-19 pandemic is only aggravating the situation by creating huge gaps in the supply chain.
FINS is currently engaged in research that focuses on using artificial intelligence (AI), image processing, and computer vision to address the challenges associated with non-invasive physical inspection for counterfeit integrated circuit (IC) and printed circuit board (PCB) detection. Namely, by automating identification of the defects associated with counterfeits, we can reduce the time, costs, and need for subject matter experts. This technology is envisioned for use by non-technical, minimally trained operators such as border agents at U.S. Ports of Entry.
Authorship Attribution:
Every literary piece has a unique style composition. The addition of this style element transforms raw information into a captivating artwork. In authorship attribution, this style is considered the author’s virtual fingerprint and is present in every text the author composes. There are several situations in everyday life where the author of a document needs to be identified: historic documents, such as the Federalist Papers, where the authors are unknown or the authorship is in dispute, or forensic situations where the authenticity of a suicide note needs to be verified. Even in present-day cyberspace, the inherent anonymity is exploited to publish text articles of a sensitive nature ranging from hate speech to fake news. Stylistic analysis can effectively assist in these situations by identifying the author of the text.
A current thrust at FINS focuses on leveraging natural language processing, machine learning, and artificial intelligence to process, identify, and extract stylistic information from text. This information is later utilized to match an unknown document to its author and, by extension, even obtain heuristics on the author’s physical and psychological state.
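As a minimal illustration of stylometric matching, the sketch below builds a crude style fingerprint from function-word frequencies and attributes an unknown text to the closest known author. The word list, texts, and distance measure are hypothetical simplifications of the NLP/ML pipeline described above.

```python
from collections import Counter

# Hypothetical stylometric fingerprint: relative frequencies of common
# function words, which authors tend to use largely unconsciously.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "a", "that", "it"]

def style_vector(text):
    words = text.lower().split()
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def similarity(v1, v2):
    # Negative L1 distance: higher means more similar styles.
    return -sum(abs(a - b) for a, b in zip(v1, v2))

known = {"author_a": "the cat sat on the mat and it was the best of days",
         "author_b": "to be or not to be that is a question to ponder"}
unknown = "the dog ran in the park and it was the joy of the morning"

# Attribute the unknown text to the stylistically closest known author.
best = max(known, key=lambda a: similarity(style_vector(known[a]),
                                           style_vector(unknown)))
```

Real systems replace the hand-picked word list with learned features and the L1 distance with a trained classifier, but the attribution logic is the same.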
Computational Behavioral Analytics:
Psychology suggests that an individual’s language usage is conditioned by their life experiences and, consequently, reflects their personality. Inferring an individual’s personality traits from their language usage can then be used to predict behavior. Since individuals modulate their language based on all accumulated life experiences, markers for the desired traits must be isolated from the text for analysis. Well-established personality models, such as Big5, MBTI, and Dark Triad, measured using carefully curated psychological questionnaires, help probe personality traits. Personality’s inherent relationship with language then provides an avenue for passive assessment of these traits. The ability to predict behavior from traits can be used for psychological profiling in both medical and forensic settings, or in commercial settings such as assessing the fit of a prospective employee into a corporate culture.
Grounded in psychology and linguistics, research on this topic by FINS aims to enhance the current state of behavior analysis from uncontrolled raw text samples to enable human interpretation and leverage recent advances in natural language processing/inferencing along with machine learning and artificial intelligence.
FakeNews: Modeling the neurocognitive mechanisms underlying fake news detection using AI:
Fabricated information mimicking news media, referred to as ‘fake news’, is an epidemic deception technique used to manipulate public opinion. Older adults, and especially those with lower cognitive functioning, are particularly vulnerable to deception via fake news. Currently only technical solutions exist (e.g., fact checking), but fake news continues to break through, leaving human decision making as the last line of defense. A FINS research team, which includes psychology professors, is working on developing a neural network model to identify deceptive features in news headlines and cognitive characteristics in the decision maker toward enhancing fake news detection.
Cyber Deception for Proactive Defense Against Physical Attacks:
Modern system-on-chip circuits (SoCs) handle sensitive assets like keys, proprietary firmware, top secret data, etc. Attacks against SoCs may arise from malicious or vulnerable software, the hardware itself (e.g., hardware Trojans), and physical attacks against hardware (side channel analysis, fault injection, optical probing, microprobing, and circuit edit). Recently, cyber-attacks that exploit physical vulnerabilities have been successfully performed against commercial chips, e.g., to remotely extract keys from TrustZone in ARM/Android devices, to breach confidentiality and integrity of Intel SGX and AMD SEV, and more. These exploits demonstrate that existing solutions are not enough. Further, given the static and long-lived nature of hardware, it can be argued that compromise by physical attacks is inevitable.
One thrust of FINS’s research on this topic aims to address hardware vulnerabilities to physical attacks using cyber deception. Cyber deception is an emerging proactive methodology that tries to reverse the typical asymmetry in cybersecurity where the attacker changes at will while the defender is a static “sitting duck”. Specifically, we are utilizing deception to enable chip designers to gather intelligence on attacks/attackers, assess their exploitive capabilities, and perform self-aware manipulation that forces them to waste valuable time and resources during physical attacks. In the long term, artificial intelligence (AI) and game theory will be used to craft optimal hardware deception policies.
Phishing: Identifying vulnerabilities to cyber deception by simulating phishing attacks:
Another form of cyber-attack currently being researched by FINS faculty is phishing, which aims at stealing sensitive information and targeting hardware systems via viruses in deceptive messages (e.g., emails, phone texts). One of FINS’ research teams is working on a project in which simulated phishing emails and text messages are sent to study participants to determine their susceptibility to this form of deception. Results have shown that older (compared to younger) adults, particularly those with low positive affect and low cognitive function, are at particular risk of falling for phishing attacks. New knowledge generated in this project will inform defense solutions for reducing vulnerability to cyber deception in adults of different ages, and will enhance the design of real-world experimental phishing interventions for use in future research.
Data Augmentation for Microelectronics Security:
A significant barrier in data-driven analysis, especially deep learning, is the lack of data. In microelectronics security, the development of pre-silicon assessment tools and post-silicon assurance tests requires many real-world test articles, benchmarks, measurements, and datasets. The obvious advantage of real examples is that they have security vulnerabilities already identified, e.g., CVEs in the National Vulnerability Database. However, such designs are typically confidential, proprietary, or difficult to share. Further, collecting images and/or measurements from real-world systems can be time-consuming and expensive. This has led most researchers to rely on open-source data, which is also limited.
In one of FINS’s thrusts on this topic, we focus on generating arbitrarily large amounts of synthetic test articles and benchmarks using data augmentation. Data augmentation is a technique used to increase the amount of data by adding slightly modified copies of already existing data. For example, in the image domain, we are employing generative adversarial networks (GANs) and semantic maps to create realistic optical and SEM images for counterfeit detection, hardware Trojan detection, etc. At the circuit level, we are creating diverse test articles and benchmarks using a mixture of parameter variation, traditional optimization, and AI-based optimization.
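The augmentation idea can be illustrated without a GAN: the toy sketch below produces slightly modified copies of a single image tile via flips, rotations, and noise. The tile here is random stand-in data; the GAN- and semantic-map-based pipelines described above generate far more realistic samples.

```python
import numpy as np

def augment(image, rng):
    """Produce slightly modified copies of one image tile.

    Flips, rotations, and additive noise each yield a plausible new
    training sample derived from existing data.
    """
    copies = [np.fliplr(image),            # horizontal flip
              np.flipud(image),            # vertical flip
              np.rot90(image),             # 90-degree rotation
              np.rot90(image, 2)]          # 180-degree rotation
    noisy = image + rng.normal(0.0, 0.05, size=image.shape)
    copies.append(np.clip(noisy, 0.0, 1.0))
    return copies

rng = np.random.default_rng(0)
img = rng.random((8, 8))                   # stand-in for an optical/SEM tile
dataset = [img] + augment(img, rng)        # 1 original -> 6 training samples
```

Even this naive scheme multiplies the dataset size by six per tile, which is why augmentation is a standard first response to data scarcity.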
Deepfakes:
A deepfake is data generated by a deep learning model based on a fictitious scenario. The data can take any form, ranging from images and audio to video. With the current state-of-the-art deepfake models, it is possible to synthesize highly deceitful content that can be indistinguishable from real-life content. Such content raises serious ethical concerns because it is widely used for blackmail, fake news, and fake pornography. Generative models are hazardous in the wrong hands, but they also have highly sought-after positive applications. One such application is synthetic data generation: generative models can synthesize different types of data that provide insights into experiments that may never happen in real life.
A FINS thrust on this topic focuses on imbuing deep learning models with the ability to better understand and interpret exemplary data and on gaining finer control over the data generation process.
Delineating psychological correlates of deepfake detection:
One FINS research team is currently examining the human factors involved in deepfake detection. The main areas of investigation include, but are not limited to, determining how well humans can detect faces in deepfake images and videos, as well as examining the role of various psychological and cognitive processes in the detection of deepfakes. In this context, the team is also interested in understanding how human performance compares to machine (classification algorithm) performance in deepfake face detection.
Explainable Artificial Intelligence:
As Artificial Intelligence (AI) solutions become ever more ubiquitous, there is still a serious lack of understanding of how these systems make decisions. The brittle nature of the statistical correlations learned by these models is often overlooked. To address this gap, Explainable and Interpretable Artificial Intelligence (XAI) research seeks to explain how and why models make their decisions. Without this understanding, users of AI systems are left blindly trusting the output of their AI models. Especially in high-stakes decision making, such as with self-driving cars and criminal sentencing, XAI methods provide trust anchors for how these systems work, allowing users to verify the validity of decisions and to debug the systems when decisions are not valid.
One FINS faculty thrust on this topic focuses on interpretable methods for natural language processing. We distinguish between traditional XAI and inherently interpretable AI architectures. XAI has primarily focused on post-hoc, opaque-box approaches, which can be misleading because they are themselves approximations of the model they attempt to explain. This work focuses on the design of powerful inherently interpretable methods that combine the clarity of decision trees with the power of deep neural architectures.
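To make the post-hoc idea concrete, the toy sketch below computes leave-one-out word importances for a trivially simple lexicon "model". It also hints at why post-hoc explanations can mislead: the importances explain this surrogate scorer, not necessarily the reasoning of a deep model. The lexicon and sentence are hypothetical.

```python
# Tiny hypothetical sentiment lexicon standing in for a trained model.
LEXICON = {"great": 2.0, "good": 1.0, "bad": -1.0, "terrible": -2.0}

def score(words):
    """The 'model': sum the lexicon weights of the words present."""
    return sum(LEXICON.get(w, 0.0) for w in words)

def word_importance(words):
    """Post-hoc, perturbation-based explanation: importance of each word
    is the drop in model score when that word is removed from the input."""
    base = score(words)
    return {w: base - score([x for x in words if x != w]) for w in words}

sentence = "the movie was great but the ending was bad".split()
imp = word_importance(sentence)    # e.g. imp["great"] == 2.0, imp["bad"] == -1.0
```

Perturbation methods such as this underlie many opaque-box explainers; inherently interpretable architectures instead build the explanation into the model itself.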
Integrated Formal Methods & Game Theory for Security:
For mission-critical cyber-physical systems (CPSs), it is crucial to ensure these systems behave correctly while interacting with dynamic and potentially adversarial environments. Synthesizing CPSs with assurance is a daunting task: on the one hand, interconnected networks, sensors, and (semi-)autonomous systems introduce unprecedented vulnerabilities in both cyber and physical spaces; on the other hand, purposeful and deceptive attacks may aim to compromise more complex system properties beyond traditional stability and safety. We aim to develop integrated formal methods and game theory for constructing provably correct and secure cyber-physical systems.
Inverse Reinforcement Learning:
Reinforcement learning (RL) is a machine learning paradigm inspired by how humans learn. In RL, algorithms called “agents” interact with a simulated environment. In turn, the simulated environment provides feedback to the agent based on what actions it takes. The RL agent continues taking actions and the environment continues providing feedback until the RL agent optimizes some task. Hence, RL has achieved high performance in applications where one’s actions have consequences, and those consequences may be delayed in time. Examples of applications where RL has been successfully deployed include self-driving vehicles, automated stock trading, customized healthcare, robot manipulation, and natural language processing.
Current research on this topic at FINS concerns Inverse Reinforcement Learning (IRL), which is the opposite of RL. While RL involves designing agents that optimize some task, IRL involves designing algorithms that understand said agents. IRL helps humans infer how RL agents operate, what they are doing, and what they will do next. IRL has many applications within the domain of human-computer interaction, such as human interpretability of agents and efficient agent training from human experts.
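The RL loop described above can be sketched in a few lines: the tabular Q-learning agent below learns, from reward feedback alone, to walk right along a five-state corridor toward a goal. The environment, hyperparameters, and episode count are arbitrary illustrative choices (an IRL algorithm would instead try to recover the reward from the learned behavior).

```python
import random

# Hypothetical toy environment: a 5-state corridor. The agent starts at
# state 0 and receives a reward of 1 upon reaching state 4.
N_STATES = 5
ACTIONS = [-1, +1]                          # step left / step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1           # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for episode in range(200):
    s = 0
    while s != 4:
        # Epsilon-greedy action selection: mostly exploit, sometimes explore.
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2 = min(max(s + a, 0), N_STATES - 1)      # environment transition
        r = 1.0 if s2 == 4 else 0.0                # environment feedback
        best_next = 0.0 if s2 == 4 else max(Q[(s2, x)] for x in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# Greedy policy from the learned Q-values (+1 means "go right").
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
```

Given only the resulting policy (always step right), IRL asks the inverse question: what reward function would make that behavior optimal?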
Natural Language Processing (NLP):
Discovery of hidden mental states using explanatory language representations:
Taking the pulse of a given population in a health crisis (such as a pandemic) may predict how the public will handle restrictive situations and what actions need to be taken/promoted in response to emerging attitudes. Our research focuses on the use of explainable representations to gauge potential reactions to non-pharmaceutical interventions (such as masks) through NLP techniques to discover hidden mental states. A stance, which is a belief-driven sentiment, is extracted via propositional analysis (i.e., I believe masks do not help [and if that belief were true, I would be antimask]), instead of a bag-of-words lexical matching or an embedding approach that produces a basic pro/anti label. For example, the sentence I believe masks do not protect me is rendered as ~PROTECT(mask,me). We pivot off this explanatory representation to answer questions such as What is John’s underlying belief and stance towards mask wearing? Because a health crisis can lead to drastic global effects, it has become increasingly important to derive a sense of how people feel regarding critical interventions, especially as trends in online activity may be viewed as proxies for the sociological impact of such crises.
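A heavily simplified, rule-based rendering of the propositional analysis above might look like the following; a single hypothetical regular expression stands in for the full parsing pipeline, but it shows how a negated proposition such as ~PROTECT(mask,me) yields an anti-mask stance.

```python
import re

def extract_stance(sentence):
    """Toy propositional stance extraction (hypothetical rule, not the
    actual research pipeline): map a belief statement to a (possibly
    negated) predicate form and read the stance off the negation."""
    m = re.search(r"masks (do not |don't )?(protect|help) (me|us|people)",
                  sentence.lower())
    if m is None:
        return None
    negation, verb, obj = m.group(1), m.group(2), m.group(3)
    # Render the proposition, e.g. ~PROTECT(mask,me).
    proposition = ("~" if negation else "") + verb.upper() + f"(mask,{obj})"
    stance = "anti-mask" if negation else "pro-mask"
    return proposition, stance

result = extract_stance("I believe masks do not protect me")
# -> ("~PROTECT(mask,me)", "anti-mask")
```

Unlike bag-of-words matching, the intermediate proposition is explanatory: it records exactly which belief drove the pro/anti label.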
Natural Language Processing (NLP) across languages and cultures:
Speakers of different languages express content differently, both due to language divergences (e.g., Chinese word order indicates grammatical meaning, whereas Korean relies heavily on suffixes for grammatical meaning) and cultural distinctions (e.g., discussions about a given topic may employ conformist or polite terms in one culture, but may employ emotive or antagonistic terms in another). Our multilingual and multicultural NLP takes into account such distinctions for both analysis and synthesis of human language for a wide range of applications of interest to national security, including the detection of beliefs, stances, or concerns associated with heavily debated topics that might lead to harmful polarization or violence. Mapping NLP algorithms across languages and cultures and understanding both equivalences and distinctions among syntactic, lexical, and semantic levels of understanding is a key component toward supporting civil discourse across languages and cultures.
NLP in Social Media vis a vis Minority & Gender Representation:
Extensive use of online media comes with its own set of problems. One of the significant problems plaguing online social media is rampant mis/disinformation and the use of toxic language to silence minorities. Dr. Oliveira’s team uses AI and NLP to understand, and sometimes predict, human behavior. The techniques employed aim to identify which personality features are more likely to be associated with higher user engagement with deceptive Facebook posts and to understand misinformation on social media platforms, such as Twitter, as a function of tweet engagement, content, and veracity. They also study methods to identify subtle toxicity (e.g., benevolent sexism, sarcasm, etc.) in online conversations. The ability to identify these markers of engagement and harmful behavior can be used to build better systems that shield people online. Besides studying social media, Dr. Oliveira’s team also looks into online news media. They use NLP to explore factors associated with gender bias and to identify influence cues of disinformation in news media texts. Further, AI is used to identify temporal features of users’ behaviors that can distinguish them online. A particular research thrust showed that online users have unique computer usage behaviors that can be used to distinguish them easily. These results have a significant impact on research related to continuous authentication.
Sociolinguistic computing for detection of foreign influence:
Foreign influence campaigns may attempt to inflict harm, often appealing to moral dimensions and identities, as a strategy to induce polarization in other societies. Our research explores potential indicators of influence attempts in language, for example, a sudden introduction of highly controversial and/or polarizing topics in online posts/messages. Language that reflects (and speaks to) the moral values of the target audience can increase in-group cohesion but can also further contribute to polarization. Thus, our social computing techniques leverage moral values expressed in language to enhance the detection of stances and concerns, as a step toward the detection of influence and potentially harmful polarization.
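A minimal sketch of the lexicon-based end of this approach: count matches against per-dimension moral word lists to profile the moral framing of a post. The word lists below are tiny hypothetical stand-ins for curated moral foundations lexicons.

```python
# Hypothetical, heavily simplified moral word lists (real lexicons contain
# hundreds of entries per dimension).
MORAL_LEXICON = {
    "loyalty":  {"betray", "traitor", "loyal", "patriot"},
    "purity":   {"disgusting", "pure", "filth", "sacred"},
    "fairness": {"unfair", "justice", "equal", "cheat"},
}

def moral_profile(text):
    """Count how many words of each moral dimension appear in the text."""
    words = set(text.lower().split())
    return {dim: len(words & vocab) for dim, vocab in MORAL_LEXICON.items()}

post = "they betray our people and cheat the justice system"
profile = moral_profile(post)               # loyalty: 1, purity: 0, fairness: 2
dominant = max(profile, key=profile.get)    # "fairness"
```

Shifts in such profiles over time, e.g., a sudden spike in one moral dimension within a community's posts, are the kind of signal this research examines as a potential indicator of influence attempts.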
Preference-aware Decision Making:
Humans surpass autonomous systems in the cognitive flexibility with which they make satisficing decisions in uncertain and dynamic environments. However, when designing autonomous systems, existing formal methods built on the Boolean truth of logical “correctness” fundamentally limit machines from achieving human-like intelligent planning that trades off task completion, correctness, and preferences over alternative outcomes. Our research focuses on developing new formal specifications and methods for preference-based planning in uncertain environments. We investigate three key questions: How can we rigorously specify a human’s preferences and temporal goals in formal logic? Given a task specified in this language, how can we synthesize policies that achieve preferred satisfaction of the mission in an uncertain environment? And how can we enable an agent to adapt its preference-based policy while learning the human’s preferences and the environment it interacts with?
Reverse Engineering for Integrated Circuits:
Integrated Circuit (IC) manufacturing leverages a global supply chain to maintain economic competitiveness. Modularization of the manufacturing workflow, however, leaves it vulnerable to malicious attacks. For instance, untrusted foundries may deliberately install backdoors, i.e., hardware Trojans, into cyber systems for an adversary to exploit at will, or the circuit design, the intellectual property (IP) of the designer, may be stolen and duplicated. Reverse Engineering (RE) is the only approach that allows security experts to verify an IC design, detect stolen IP, and, potentially, guarantee trust in the device. However, the existing RE process is ad hoc, unscalable, error-prone, and requires manual intervention by subject matter experts, thereby limiting its potential as the go-to tool for hardware assurance.
This thrust currently focuses on resolving these limitations by developing critical algorithmic infrastructure to advance automation of the IC RE process. This project uses concepts from image processing, computer vision, machine learning, and artificial intelligence (AI) to acquire, process, and gain insights from electron microscopy images of the IC, and to further develop AI-driven security policies for generating RE-compliant IC designs for seamless, cost/time-efficient design verification. The knowledge gained through this project will also be disseminated through educational programs and collaborations with the semiconductor industry.
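As a toy illustration of the image-processing end of this pipeline, the sketch below thresholds a hypothetical SEM-like tile into a foreground mask and counts connected regions, a crude stand-in for extracting circuit structures; production RE tooling uses far more robust segmentation and ML models.

```python
import numpy as np

def segment_layer(image, threshold=0.5):
    """Toy threshold segmentation: bright pixels treated as metal,
    dark pixels as substrate (hypothetical SEM-like intensities)."""
    return (image > threshold).astype(np.uint8)

def connected_regions(mask):
    """Count 4-connected foreground regions with a simple flood fill."""
    seen = np.zeros(mask.shape, dtype=bool)
    h, w = mask.shape
    regions = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                regions += 1
                stack = [(i, j)]            # flood-fill this region
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and mask[y, x] and not seen[y, x]:
                        seen[y, x] = True
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return regions

tile = np.array([[0.9, 0.9, 0.1, 0.8],
                 [0.9, 0.1, 0.1, 0.8],
                 [0.1, 0.1, 0.1, 0.1],
                 [0.7, 0.7, 0.1, 0.1]])
mask = segment_layer(tile)
n = connected_regions(mask)                 # three separate bright structures
```

Each connected region is a candidate circuit structure; downstream AI/ML stages would then classify these structures and recover the netlist.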
The Why of AI:
Artificial intelligence systems are continuing to grow in popularity for pulling useful predictions from complex data. However, as the decisions of these AI systems increase in their impact, we must decide just how much to trust those decisions. We study why models make their decisions and develop research that enables user trust in those decisions. The complexity of AI models makes it non-trivial to understand exactly why a model may make mistakes, or, even more subtly, why a model might make the right decision for the wrong reasons. Reliable and trustworthy AI research aims to build the necessary understanding and tooling to provide solid answers to the rationale behind model predictions: when they are right and, more importantly, when they are wrong.