New Tools Target Shadow AI Risks

Data security platform BigID has launched four new AI governance capabilities designed to help organisations discover unauthorised AI models and control sensitive data use in artificial intelligence systems.

The New York-based company says the suite targets what it describes as growing risks from "shadow AI" – unauthorised AI models operating without security oversight.

The capabilities include Shadow AI Discovery to uncover rogue models, Data Labeling for AI to classify appropriate datasets, Data Cleansing for AI to remove sensitive information, and Prompt-Based Classification, a natural language system BigID calls an industry first.

BigID's Shadow AI Discovery automatically identifies unmanaged AI models across cloud and collaboration platforms, while flagging personal or regulated training data. The system aims to provide visibility into AI usage that traditional security tools miss, integrating across model repositories, developer tools, cloud platforms and collaboration environments.

The capability goes beyond discovery to enable direct enforcement. Security teams can trigger policy enforcement, restrict risky access and launch remediation workflows within the BigID platform. The system correlates models to their underlying datasets and maps user activity to show who is using which AI tools, where and how, across the enterprise.
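BigID has not published how the discovery engine works under the hood, but the general pattern is easy to sketch: scan storage locations for serialised model artifacts and flag anything absent from an approved inventory. The Python sketch below is purely illustrative – the file extensions, paths and inventory are assumptions, not BigID's implementation.

```python
from pathlib import Path

# Common extensions for serialised ML model artifacts (illustrative, not exhaustive).
MODEL_EXTENSIONS = {".pt", ".pth", ".onnx", ".safetensors", ".gguf", ".pkl"}

# Hypothetical inventory of models the security team has already approved.
APPROVED_MODELS = {"models/churn-classifier-v3.onnx"}

def find_shadow_models(root: str) -> list[str]:
    """Walk a directory tree and return model artifacts not in the approved inventory."""
    shadow = []
    for path in Path(root).rglob("*"):
        if path.suffix.lower() in MODEL_EXTENSIONS:
            relative = path.relative_to(root).as_posix()
            if relative not in APPROVED_MODELS:
                shadow.append(relative)
    return shadow

for model in find_shadow_models("/mnt/shared-drive"):
    print(f"Unapproved model artifact: {model}")
```

A production discovery tool would also inspect model registries, notebooks, developer tools and API traffic rather than relying on file extensions alone.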

Data Labeling for AI

The Data Labeling for AI feature helps organisations classify and tag data for appropriate AI use through usage-based labels. BigID provides out-of-the-box classifications including "AI-approved," "restricted," and "prohibited," while allowing organisations to create custom labels aligned with internal risk frameworks and regulatory requirements.

The system supports both structured and unstructured data across cloud, software-as-a-service and collaboration environments. It aims to enforce usage policies early in data pipelines before information reaches AI models, combining classification with policy enforcement and remediation workflows to prevent sensitive or high-risk data from entering large language models, copilots and retrieval-augmented generation systems.
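The pattern described here – checking usage labels at the pipeline boundary before data reaches a model – can be illustrated with a minimal sketch. Only the three label names come from BigID's announcement; the Document class, the gate_for_ai function and the sample corpus are hypothetical.

```python
from dataclasses import dataclass

# The out-of-the-box usage labels quoted in BigID's announcement.
AI_APPROVED, RESTRICTED, PROHIBITED = "AI-approved", "restricted", "prohibited"

@dataclass
class Document:
    name: str
    label: str
    text: str

def gate_for_ai(docs: list[Document]) -> list[Document]:
    """Allow only AI-approved documents into the AI pipeline; report everything else."""
    allowed = []
    for doc in docs:
        if doc.label == AI_APPROVED:
            allowed.append(doc)
        else:
            print(f"Blocked '{doc.name}' (label: {doc.label}) before model ingestion")
    return allowed

corpus = [
    Document("handbook.pdf", AI_APPROVED, "Public HR policies..."),
    Document("payroll.xlsx", PROHIBITED, "Employee salaries..."),
]
rag_ready = gate_for_ai(corpus)  # only handbook.pdf passes through
```

The point of enforcing at this stage is that a document blocked before ingestion can never surface in a model's output, whereas filtering at query time has to catch every leak path.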

Data Cleansing for AI

Data Cleansing for AI focuses on removing or tokenising sensitive information at scale before it enters generative AI tools and large language models. The capability works across both structured and unstructured data formats, including emails, PDFs, collaboration files and databases, to prevent confidential data from being embedded into model outputs or leaked in prompts.
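BigID has not detailed its cleansing mechanics, but tokenisation of this kind commonly works by swapping detected values for opaque placeholders and keeping a reversible lookup table so values can be restored later if policy allows. The sketch below shows that pattern with two illustrative detectors; a real system would use far broader detection than email addresses and US Social Security numbers.

```python
import re

# Illustrative PII patterns only; production systems detect many more types.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def tokenise_sensitive(text: str) -> tuple[str, dict[str, str]]:
    """Replace detected values with opaque tokens, returning the text and a lookup table."""
    vault: dict[str, str] = {}
    for kind, pattern in PATTERNS.items():
        for value in pattern.findall(text):
            token = f"<{kind}_{len(vault)}>"
            vault[token] = value
            text = text.replace(value, token)
    return text, vault

prompt = "Contact jane.doe@example.com, SSN 123-45-6789, about the renewal."
clean, vault = tokenise_sensitive(prompt)
print(clean)  # Contact <EMAIL_0>, SSN <SSN_1>, about the renewal.
```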

The system forms part of BigID's broader Secure Data Pipeline solution, working alongside the company's GenAI Catalog, Search and Safe-for-AI Labeling features. Security teams can apply policy-based controls and continuously reduce exposure across AI initiatives, with the goal of strengthening generative AI pipeline security through pre-cleansed, policy-compliant datasets.

Prompt-Based Classification

BigID's Prompt-Based Classification system replaces traditional rule-based data classification with what the company claims is an industry-first natural language interface. Users can describe sensitive data in plain English, paste regulatory language, or articulate AI policy requirements, with BigID's AI-powered engine automatically generating classification logic.

The system aims to address limitations of traditional classification tools that rely on technical rules, pattern libraries and complex configurations often inaccessible to non-technical teams. The capability scales discovery across cloud, software-as-a-service, data lakes and file systems without manual rule creation, while promising reduced false positives through context-aware intelligence that understands use case and risk factors.
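As a rough illustration of the approach – not BigID's proprietary engine – a prompt-based classifier can be built by asking an LLM to translate a plain-English policy into structured matching logic, then applying that logic to records. Everything below, including the call_llm stub and the deliberately simplified National Insurance number pattern, is hypothetical.

```python
import json
import re

def call_llm(prompt: str) -> str:
    """Stand-in for any hosted LLM client; returns a canned response so the sketch runs."""
    return json.dumps({
        "label": "restricted",
        "keywords": ["litigation"],
        # Deliberately simplified UK National Insurance number pattern.
        "regex_patterns": [r"\b[A-Z]{2}\d{6}[A-D]\b"],
    })

def build_classifier(policy_description: str) -> dict:
    """Ask the LLM to turn a plain-English policy into structured classification logic."""
    prompt = ("Convert this data policy into JSON with fields 'label', "
              "'keywords' and 'regex_patterns':\n\n" + policy_description)
    return json.loads(call_llm(prompt))

def classify(text: str, rule: dict) -> str | None:
    """Apply the generated logic: return the label when a keyword or pattern matches."""
    if any(k.lower() in text.lower() for k in rule["keywords"]):
        return rule["label"]
    if any(re.search(p, text) for p in rule["regex_patterns"]):
        return rule["label"]
    return None

rule = build_classifier("Flag records containing UK National Insurance numbers "
                        "or references to ongoing litigation as 'restricted'.")
print(classify("Employee QQ123456C raised a grievance.", rule))  # restricted
```

The appeal of this design is that the generated rule lives as inspectable JSON: a compliance officer can read and adjust what the engine produced without writing regex syntax directly.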

https://www.bigid.com