Search engines
Search engines provide broad access to indexed web content and are often the first entry point in OSINT investigations. When combined with advanced operators, they enable analysts to efficiently filter noise, uncover hidden resources, and identify relevant documents, metadata, and references across the open web.
Google |
|
| Description |
General-purpose search engine widely used for OSINT investigations through advanced search operators. |
| Access |
Web: https://www.google.com |
| Key Features |
- Web pages and documents
- Cached content (Google retired its cache links in 2024; rely on web archives for older copies)
- Metadata via advanced operators (site:, filetype:, intitle:)
|
| License |
Free |
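The operators listed above (site:, filetype:, intitle:) can be combined into a single query string. The following is a minimal sketch of how such a query might be assembled and URL-encoded; the helper names `build_dork` and `search_url` are hypothetical, and `example.com` is a placeholder target.

```python
from urllib.parse import urlencode


def build_dork(terms, site=None, filetype=None, intitle=None):
    """Combine free-text terms with the site:, filetype:, and intitle: operators."""
    parts = [terms] if terms else []
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intitle:
        # Quote the title phrase so multi-word values stay attached to the operator.
        parts.append(f'intitle:"{intitle}"')
    return " ".join(parts)


def search_url(query):
    """Return a search URL with the query URL-encoded in the q parameter."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Placeholder values for illustration only.
query = build_dork("annual report", site="example.com", filetype="pdf")
print(query)
print(search_url(query))
```

Building the query as a plain string keeps it reusable: the same text can be pasted into the search box or adapted for other engines that support similar operators.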
Bing |
|
| Description |
General search engine useful as an alternative data source and for cross-validation of results. |
| Access |
Web: https://www.bing.com |
| Key Features |
- Web pages
- Indexed documents
- Image and video search results
|
| License |
Free |
DuckDuckGo |
|
| Description |
Privacy-focused search engine with reduced personalization and tracking. |
| Access |
Web: https://duckduckgo.com |
| Key Features |
- Web search results
- Instant answers
- Reduced user-based bias
|
| License |
Free |
Tor Browser |
|
| Description |
Privacy-focused web browser designed to access the Tor network, enabling anonymous communication and access to .onion services while reducing tracking and fingerprinting risks. |
| Access |
Web: https://www.torproject.org |
| Key Features |
- Routes traffic through the Tor network
- Built-in protections against tracking and fingerprinting
- Supports access to .onion services
- Includes security levels and HTTPS-first defaults
|
| License |
Open Source (BSD 3-Clause License) |
Vertical search engines
Vertical search engines focus on a specific type of data or domain, offering deeper and more specialized results than general-purpose search engines. In OSINT, they are essential for reducing information overload and retrieving high-signal data such as exposed services, images, social media content, or historical web snapshots.
Google Images |
|
| Description |
Image search engine with built-in reverse image search. |
| Access |
Web: https://images.google.com |
| Key Features |
- Reverse image search (upload image or paste URL)
- Detection of visually similar images
- Source webpages and image context
|
| License |
Free |
Yandex Images |
|
| Description |
Image search engine particularly effective for reverse image searches. |
| Access |
Web: https://yandex.com/images |
| Key Features |
- Visually similar images
- Image sources
- Alternate resolutions
|
| License |
Free |
crt.sh |
|
| Description |
Search engine for public TLS/SSL certificates. |
| Access |
Web: https://crt.sh |
| Key Features |
- Issued certificates
- Subdomain discovery
- Certificate metadata
|
| License |
Free |
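Subdomain discovery from certificate data usually means collecting the hostnames listed in each certificate. The sketch below assumes records shaped like the JSON that crt.sh returns for a query such as `https://crt.sh/?q=%25.example.com&output=json`, where the `name_value` field may hold several newline-separated names. The function name and the sample records are hypothetical.

```python
import json


def subdomains_from_crtsh(records, domain):
    """Collect unique hostnames under `domain` from crt.sh-style JSON records."""
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for record in records:
        # name_value may hold several names separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


# Illustrative sample shaped like crt.sh output; not real certificate data.
sample = json.loads("""
[
  {"common_name": "example.com", "name_value": "example.com\\nwww.example.com"},
  {"common_name": "*.dev.example.com", "name_value": "*.dev.example.com"},
  {"common_name": "mail.example.com", "name_value": "MAIL.example.com"},
  {"common_name": "evil-example.com", "name_value": "evil-example.com"}
]
""")
print(subdomains_from_crtsh(sample, "example.com"))
```

Matching on the dotted suffix rather than a plain substring avoids counting look-alike domains such as `evil-example.com` as subdomains.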
Shodan |
|
| Description |
Internet-wide scanning service for inventorying exposed services and fingerprinting IoT and networked devices. |
| Access |
Web: https://www.shodan.io |
| Key Features |
- Search and filter exposed services by banner, port, or vulnerability
- Command-line interface and API for automation and integration
- Historical data and monitoring of infrastructure changes
|
| License |
Freemium |
Ahmia |
|
| Description |
Search engine designed to index publicly accessible Tor (.onion) services, with a focus on transparency and abuse reduction. Commonly used for discovery and initial reconnaissance of dark web content. |
| Access |
Web: https://ahmia.fi |
| Key Features |
- Indexes reachable .onion services
- Filters known abusive or illegal content
- Provides both clearnet and Tor-accessible interfaces
- Useful for initial discovery and mapping of Tor sites
|
| License |
Open Source (BSD 3-Clause License) |
Web archives
Web archives preserve historical versions of websites and online content, allowing analysts to examine how information has changed over time. They are particularly valuable for attribution, timeline reconstruction, and recovering deleted or altered content that is no longer available on the live web.
Wayback Machine |
|
| Description |
Web archiving service providing historical snapshots of websites. |
| Access |
Web: https://web.archive.org/ |
| Key Features |
- Archived versions of websites
- Historical page content
- Deleted or modified pages
|
| License |
Free |
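Snapshots can also be located programmatically. The sketch below assumes the Wayback Machine availability endpoint at `https://archive.org/wayback/available` and its JSON layout (`archived_snapshots` containing a `closest` entry); the helper names and the sample response are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def availability_url(url, timestamp=None):
    """Build a query for the Wayback Machine availability endpoint."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp  # e.g. 20200101 (YYYYMMDD)
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response):
    """Return (timestamp, snapshot_url) from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


# Illustrative response shaped like the endpoint's JSON; values are made up.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/http://example.com/",
            "timestamp": "20200101000000",
            "status": "200",
        }
    }
}
print(availability_url("example.com", "20200101"))
print(closest_snapshot(sample))
print(closest_snapshot({"archived_snapshots": {}}))
```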
Waybackpack |
|
| Category |
Web Archiving, Attribution, Historical Analysis |
| Access |
Repository: https://github.com/jsvine/waybackpack |
| Description |
Tool for downloading all archived versions of a website from the Wayback Machine, useful for attribution and historical analysis. |
| Key Features |
- Bulk download of archived snapshots of websites
- Support for filtering by date and file type
- Useful for tracking content changes over time
|
| License |
Open source (MIT) |
Common Crawl |
|
| Description |
Open repository of large-scale web crawl data for bulk analysis and research. |
| Access |
Web: https://commoncrawl.org |
| Key Features |
- HTML content
- Web metadata
- Large-scale historical datasets
|
| License |
Open source |
archive.ph |
|
| Description |
Web archiving service that captures on-demand snapshots of web pages, including social media posts, preserving content even if the original is removed. |
| Access |
Web: https://archive.ph |
| Key Features |
- On-demand page snapshots
- Permanent links to archived pages
- Content preservation for deleted/modified social media
|
| License |
Free |
Sentiment analysis
BERT / RoBERTa |
|
| Description |
Advanced transformer-based NLP models used for high-accuracy sentiment analysis and contextual understanding of social media content. |
| Access |
Web: https://pypi.org/project/fast-bert/ |
| Key Features |
- Deep contextual understanding
- High sentiment classification accuracy
- Handles negation and complex phrasing better than lexicon-based tools; sarcasm remains difficult
- Can be fine-tuned for SOCMINT-specific domains
|
| License |
Open Source |
Flair |
|
| Description |
NLP framework that provides efficient sentiment analysis using contextual string embeddings, suitable for real-time social media monitoring. |
| Access |
Repository: https://github.com/flairNLP/flair |
| Key Features |
- Good balance between accuracy and performance
- Fast inference compared to transformers
- Easy integration into SOCMINT pipelines
- Supports multiple languages
|
| License |
Open Source (MIT License) |
VADER |
|
| Description |
Lexicon and rule-based sentiment analysis tool optimized for social media language and short informal texts. |
| Access |
Repository: https://github.com/cjhutto/vaderSentiment |
| Key Features |
- Optimized for social media text
- Handles emojis, slang, and punctuation
- Very fast processing speed
- No training required
|
| License |
Open Source (MIT License) |
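To illustrate what "lexicon and rule-based" means in practice, here is a toy scorer written from scratch. It is not VADER's implementation and does not use its lexicon: the word scores, the negation rule, and the exclamation boost are all simplified assumptions chosen for the example.

```python
import re

# Tiny hypothetical lexicon; real tools ship thousands of scored entries.
LEXICON = {"good": 2.0, "great": 3.0, "love": 3.0,
           "bad": -2.0, "terrible": -3.0, "hate": -3.0}
NEGATIONS = {"not", "no", "never"}


def score(text):
    """Sum word valences, flipping the sign of a word that follows a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, token in enumerate(tokens):
        valence = LEXICON.get(token, 0.0)
        if valence and i > 0 and tokens[i - 1] in NEGATIONS:
            valence = -valence
        total += valence
    # Rule: each exclamation mark (up to three) amplifies the score by 10%.
    boost = 1 + 0.1 * min(text.count("!"), 3)
    return total * boost


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    s = score(text)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"


for post in ["I love this, great work!", "This is not good", "The meeting is at noon"]:
    print(post, "->", label(post))
```

Because nothing is learned from data, this style of scorer needs no training and runs very quickly, which is the trade-off the features above describe.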
TextBlob |
|
| Description |
Simple NLP library providing basic sentiment analysis, mainly used for rapid prototyping and exploratory analysis. |
| Access |
Web: https://textblob.readthedocs.io/en/dev/ Repository: https://github.com/sloria/TextBlob |
| Key Features |
- Very easy to use
- Lightweight and fast
- Good for educational or prototype SOCMINT projects
|
| License |
Open Source (MIT License) |
Government records
Government records include publicly accessible databases and official publications released by state institutions. These sources provide authoritative data such as corporate registrations, legal notices, financial disclosures, and open datasets, making them critical for verification, attribution, and contextual analysis.
data.gov |
|
| Description |
US government open data portal providing access to public datasets. |
| Access |
Web: https://www.data.gov |
| Key Features |
- Government datasets
- Statistics and reports
- Geospatial data
|
| License |
Free |
data.europa.eu |
|
| Description |
European Union open data portal aggregating datasets from EU institutions and member states. |
| Access |
Web: https://data.europa.eu |
| Key Features |
- EU datasets
- Policy-related data
- Economic and social indicators
|
| License |
Free |
datos.gob.es |
|
| Description |
Spanish national open data portal for public sector information. |
| Access |
Web: https://datos.gob.es |
| Key Features |
- Administrative datasets
- Geographical information
- Public sector statistics
|
| License |
Free |
SEC EDGAR |
|
| Description |
US SEC database of corporate financial filings. |
| Access |
Web: https://www.sec.gov/edgar |
| Key Features |
- Annual and quarterly reports
- Ownership disclosures
- Regulatory filings
|
| License |
Free |
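Filings can be retrieved in bulk through EDGAR's JSON interface. The sketch below assumes the submissions endpoint pattern `https://data.sec.gov/submissions/CIK##########.json` (CIK zero-padded to ten digits) and the column-oriented `filings.recent` layout of its response. The helper names are hypothetical and the sample filings are made up; automated requests are also expected to send a descriptive User-Agent header.

```python
def submissions_url(cik):
    """Return the EDGAR submissions JSON URL for a company's CIK number."""
    cik_int = int(cik)  # accepts 320193 or "0000320193"
    if not 0 < cik_int < 10**10:
        raise ValueError("CIK must be a positive number of at most ten digits")
    return f"https://data.sec.gov/submissions/CIK{cik_int:010d}.json"


def filings_of_type(submissions, form_type):
    """List (filing_date, accession_number) pairs for one form type."""
    recent = submissions["filings"]["recent"]
    # The response stores each field as a parallel array, one entry per filing.
    rows = zip(recent["form"], recent["filingDate"], recent["accessionNumber"])
    return [(date, acc) for form, date, acc in rows if form == form_type]


# Made-up sample mirroring the assumed response layout.
sample = {"filings": {"recent": {
    "form": ["10-K", "8-K", "10-Q", "10-K"],
    "filingDate": ["2024-02-01", "2024-01-15", "2023-11-01", "2023-02-01"],
    "accessionNumber": ["0000000000-24-000003", "0000000000-24-000002",
                        "0000000000-23-000009", "0000000000-23-000001"],
}}}
print(submissions_url(320193))
print(filings_of_type(sample, "10-K"))
```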
Domain and network
Domain and network intelligence tools provide visibility into internet infrastructure, including domain ownership, IP address allocation, DNS records, and TLS certificates. They are fundamental for mapping digital assets, identifying relationships between entities, and supporting technical attribution in OSINT and cyber intelligence investigations.
WHOIS |
|
| Description |
Classic protocol to query information about domains and IPs, including registrant, creation/expiration dates, and DNS servers. |
| Access |
CLI: whois example.com Web: https://whois.domaintools.com |
| Key Features |
- Domain owner / registrant
- Creation and expiration dates
- DNS servers and administrative contacts
|
| License |
Open source |
RDAP |
|
| Description |
Modern protocol that replaces WHOIS, providing standardized JSON responses; ideal for automation and structured analysis of domains and IPs. |
| Access |
CLI: curl https://rdap.org/domain/example.com Web: https://rdap.org |
| Key Features |
- Domain owner / registrant
- Creation, update, and expiration dates
- DNS servers, contacts, and legal notes
|
| License |
Open source |
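Because RDAP responses are structured JSON, key fields can be extracted without the text parsing that WHOIS output requires. The sketch below assumes a domain object with the standard `events`, `nameservers`, and `status` members. The function name is hypothetical and the sample record is illustrative, not a live lookup.

```python
def summarize_rdap(record):
    """Pull registration dates, status, and name servers from an RDAP domain object."""
    events = {e["eventAction"]: e["eventDate"] for e in record.get("events", [])}
    nameservers = sorted(
        ns["ldhName"].lower() for ns in record.get("nameservers", []) if "ldhName" in ns
    )
    return {
        "domain": record.get("ldhName", "").lower(),
        "registered": events.get("registration"),
        "expires": events.get("expiration"),
        "status": record.get("status", []),
        "nameservers": nameservers,
    }


# Illustrative record shaped like an RDAP domain response; values are examples.
sample = {
    "objectClassName": "domain",
    "ldhName": "EXAMPLE.COM",
    "status": ["client transfer prohibited"],
    "events": [
        {"eventAction": "registration", "eventDate": "1995-08-14T04:00:00Z"},
        {"eventAction": "expiration", "eventDate": "2026-08-13T04:00:00Z"},
    ],
    "nameservers": [
        {"objectClassName": "nameserver", "ldhName": "B.NS.EXAMPLE.NET"},
        {"objectClassName": "nameserver", "ldhName": "A.NS.EXAMPLE.NET"},
    ],
}
print(summarize_rdap(sample))
```

Registrant contact details, when present, live in the nested `entities` member and are often redacted, so this sketch leaves them out.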
Nmap |
|
| Description |
Network scanning and reconnaissance tool used to discover hosts, open ports, services, and operating systems. |
| Access |
CLI: nmap target Web: https://nmap.org Repository: https://svn.nmap.org/ |
| Key Features |
- Host discovery and port scanning
- Service and version detection
- OS fingerprinting and scripting engine (NSE)
|
| License |
Open source |
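Scan results are easier to correlate with other sources when saved in Nmap's XML format (`nmap -oX scan.xml target`) and parsed afterwards. The sketch below reads a small hand-written sample that follows that format; the function name is hypothetical, and `192.0.2.10` is a documentation address rather than a real host.

```python
import xml.etree.ElementTree as ET

# Hand-written sample following Nmap's XML output structure.
SAMPLE_XML = """<nmaprun>
  <host>
    <status state="up"/>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"><state state="open"/><service name="ssh"/></port>
      <port protocol="tcp" portid="80"><state state="closed"/><service name="http"/></port>
      <port protocol="tcp" portid="443"><state state="open"/><service name="https"/></port>
    </ports>
  </host>
</nmaprun>"""


def open_ports(xml_text):
    """Return (address, port, protocol, service) tuples for every open port."""
    root = ET.fromstring(xml_text)
    results = []
    for host in root.iter("host"):
        address_el = host.find("address")
        address = address_el.get("addr") if address_el is not None else None
        for port in host.iter("port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue  # skip closed and filtered ports
            service = port.find("service")
            name = service.get("name") if service is not None else ""
            results.append((address, int(port.get("portid")), port.get("protocol"), name))
    return results


print(open_ports(SAMPLE_XML))
```

Only scan systems you are authorized to test; active scanning goes beyond passive OSINT collection.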
Media monitoring
Media monitoring tools aggregate and analyze digital news sources at local, national, and global levels. They enable continuous tracking of events, narratives, and trends, and support large-scale analysis of media coverage, making them valuable for situational awareness and strategic intelligence.
Google News |
|
| Description |
News aggregation platform with advanced search and filtering options. |
| Access |
Web: https://news.google.com |
| Key Features |
- News articles
- Source attribution
- Timeline-based results
|
| License |
Free |
GDELT 2.0 |
|
| Description |
Global database for monitoring worldwide news media and events. |
| Access |
Web: https://www.gdeltproject.org |
| Key Features |
- Global news coverage
- Event metadata
- Geopolitical indicators
|
| License |
Free |
BRAND24 |
|
| Description |
Social listening and online brand reputation management tool that tracks mentions across platforms. |
| Access |
Web: https://brand24.com |
| Key Features |
- Real-time social mentions tracking
- Sentiment analysis
- Influencer identification
|
| License |
Commercial (SaaS) |
GEOINT
GEOINT tools leverage geospatial data such as satellite imagery, sensor feeds, and location-based tracking systems. These sources allow analysts to verify locations, monitor physical movements, and correlate events in the real world with digital information.
Sentinel-2 |
|
| Description |
Copernicus Earth observation mission providing free multispectral satellite imagery. |
| Access |
Web: https://sentinel.esa.int |
| Key Features |
- Satellite imagery
- Environmental monitoring data
- Temporal change detection
|
| License |
Free |
MarineTraffic |
|
| Description |
AIS-based vessel tracking platform. |
| Access |
Web: https://www.marinetraffic.com |
| Key Features |
- Ship positions
- Vessel metadata
- Historical movement tracks
|
| License |
Freemium |
Data leaks
Data leak sources include platforms where raw text, databases, or credentials are publicly shared, intentionally or unintentionally. In OSINT and CTI, they are commonly used to identify exposed information, monitor breach activity, and detect early indicators of compromise or criminal activity.
Pastebin |
|
| Description |
Website used to publish text content, often associated with leaks or dumps. |
| Access |
Web: https://pastebin.com |
| Key Features |
- Leaked credentials
- Configuration files
- Source code snippets
|
| License |
Free / Freemium |
People search
People search tools focus on identifying and correlating information related to individuals across public sources. They support the discovery of digital footprints, social connections, and publicly available personal data, and are often used in background research and attribution workflows.
Recon-ng |
|
| Description |
Modular Metasploit-style framework for OSINT; ideal for large-scale scripting. |
| Access |
Repository: https://github.com/lanmaster53/recon-ng |
| Key Features |
- Modular architecture with multiple reconnaissance modules
- Automated data collection from public sources and APIs
- Workspace-based data management and export capabilities
|
| License |
Open source (GPL 3.0) |
Maltego |
|
| Description |
Graph-based visualization and pivoting between entities using hundreds of built-in transforms, widely used in SOCMINT and CTI investigations. |
| Access |
Web: https://www.maltego.com |
| Key Features |
- Interactive graph visualization for complex relationship analysis
- Extensive library of transforms for domains, people, organizations, and infrastructure
- Pivoting across multiple data sources from a single entity
|
| License |
Free for non-commercial use (Community Edition) |
SpiderFoot |
|
| Description |
Automated OSINT collection framework with 200+ modules for domains, IPs, and digital identities; integrates with services such as Shodan and Have I Been Pwned. |
| Access |
Repository: https://github.com/smicallef/spiderfoot |
| Key Features |
- Large modular scanning engine with extensive data source coverage
- Automated correlation of results across multiple sources
- Integration with external threat intelligence platforms and APIs
|
| License |
Open source (MIT) |
theHarvester |
|
| Description |
Fast enumeration tool for emails, subdomains, hosts, and open ports using search engines and public APIs. |
| Access |
Repository: https://github.com/laramies/theHarvester |
| Key Features |
- Rapid discovery of email addresses and subdomains
- Support for multiple search engines and data sources
- Lightweight and easy integration into reconnaissance workflows
|
| License |
Open source |
Metadata analysis
ExifTool |
|
| Description |
De facto standard for reading, writing, and normalizing EXIF, XMP, and IPTC metadata across a wide range of file formats. |
| Access |
Website: https://exiftool.org/ Repository: https://github.com/exiftool/exiftool |
| Key Features |
- Supports hundreds of file formats
- Read, write, and delete metadata
- Highly scriptable and automation-friendly
- Widely used in digital forensics and OSINT
|
| License |
Open Source (GPL 3.0) |
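ExifTool's scriptability comes largely from its JSON output. The sketch below assumes output produced with `exiftool -j -n file...`, where `-j` emits a JSON array with one object per file and `-n` prints GPS coordinates as signed decimal numbers. The function name and the sample values are hypothetical.

```python
import json


def summarize_metadata(exiftool_json):
    """Extract camera, capture time, and GPS position from exiftool -j -n output."""
    summary = []
    for entry in json.loads(exiftool_json):
        lat, lon = entry.get("GPSLatitude"), entry.get("GPSLongitude")
        has_gps = isinstance(lat, (int, float)) and isinstance(lon, (int, float))
        camera = " ".join(filter(None, [entry.get("Make"), entry.get("Model")]))
        summary.append({
            "file": entry.get("SourceFile"),
            "camera": camera or None,
            "taken": entry.get("DateTimeOriginal"),
            "gps": (lat, lon) if has_gps else None,
        })
    return summary


# Made-up sample shaped like exiftool's JSON output.
sample = """[
  {"SourceFile": "photo.jpg", "Make": "ExampleCam", "Model": "X100",
   "DateTimeOriginal": "2023:05:01 10:30:00",
   "GPSLatitude": 40.4168, "GPSLongitude": -3.7038},
  {"SourceFile": "scan.png"}
]"""
for item in summarize_metadata(sample):
    print(item)
```

Files without embedded metadata simply yield `None` fields, which makes missing data easy to spot when triaging large image sets.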
MediaInfo |
|
| Description |
Technical analysis tool for audio and video files, providing detailed information about codecs, bitrates, containers, and timestamps. |
| Access |
Website: https://mediaarea.net/en/MediaInfo Repository: https://github.com/MediaArea/MediaInfo |
| Key Features |
- Detailed codec and container inspection
- Supports audio, video, and subtitle streams
- CLI and GUI versions available
- Commonly used in media forensics
|
| License |
Open Source (BSD 2-Clause License) |
FFmpeg / FFprobe |
|
| Description |
Comprehensive multimedia framework for inspecting, processing, and extracting metadata, keyframes, and media streams. |
| Access |
Website: https://ffmpeg.org/ Repository: https://git.ffmpeg.org/ffmpeg.git |
| Key Features |
- Extract and inspect detailed media metadata
- Keyframe and stream analysis with FFprobe
- Supports virtually all audio/video formats
- Powerful CLI for forensic workflows
|
| License |
Open Source (LGPL and GPL) |
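FFprobe can emit its findings as JSON, which suits automated triage. The sketch below assumes output from `ffprobe -v quiet -print_format json -show_format -show_streams input.mp4` and reads only a few commonly present fields. The function name and the sample values are hypothetical.

```python
import json


def describe_media(probe):
    """Summarize container, duration, creation time, and streams from ffprobe JSON."""
    fmt = probe.get("format", {})
    streams = [(s.get("codec_type"), s.get("codec_name")) for s in probe.get("streams", [])]
    return {
        "container": fmt.get("format_name"),
        # ffprobe reports duration as a string of seconds.
        "duration_s": float(fmt["duration"]) if "duration" in fmt else None,
        "created": fmt.get("tags", {}).get("creation_time"),
        "streams": streams,
    }


# Made-up sample shaped like ffprobe's JSON output.
sample = json.loads("""
{
  "streams": [
    {"index": 0, "codec_type": "video", "codec_name": "h264"},
    {"index": 1, "codec_type": "audio", "codec_name": "aac"}
  ],
  "format": {
    "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
    "duration": "12.5",
    "tags": {"creation_time": "2023-05-01T10:30:00.000000Z"}
  }
}
""")
print(describe_media(sample))
```

The `creation_time` tag is written by the recording or encoding software and can be altered, so treat it as a lead to corroborate rather than proof of when a clip was made.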
MAT2 |
|
| Description |
Tool designed to detect and remove metadata from files before sharing, helping to prevent unintentional information disclosure. |
| Access |
Repository: https://github.com/tpet/mat2 |
| Key Features |
- Removes metadata from images, documents, audio, and video
- Focus on privacy and operational security
- CLI and GUI versions available
- Used by journalists and activists
|
| License |
Open Source (LGPL 3.0) |
XnView MP |
|
| Description |
Cross-platform media viewer with basic EXIF/IPTC metadata viewing and editing capabilities. |
| Access |
Website: https://www.xnview.com/en/xnviewmp/ |
| Key Features |
- View and edit basic EXIF/IPTC metadata
- Supports a large number of image formats
- Batch processing capabilities
- User-friendly graphical interface
|
| License |
Freemium |
General purpose
General-purpose OSINT tools provide flexible functionality that can be applied across multiple investigative domains. Rather than targeting a single data type, they act as frameworks or aggregation platforms that help structure, automate, or correlate information from diverse open sources.
Lampyre OSINT Studio |
|
| Description |
OSINT analysis platform for mixed datasets (financial, telecommunications, and social networks) using predefined query templates. |
| Access |
Web: https://lampyre.io |
| Key Features |
- Correlation of heterogeneous data sources in a single workspace
- Predefined analytical templates for investigations
- Advanced visualization and link analysis capabilities
|
| License |
Freemium |
OSINT Combine |
|
| Description |
AI-powered SaaS platform for prioritizing alerts and deduplicating findings from multiple OSINT and CTI sources. |
| Access |
Web: https://www.osintcombine.com |
| Key Features |
- Automated correlation and deduplication of OSINT data
- AI-assisted prioritization of alerts and findings
- Centralized dashboard for multi-source intelligence analysis
|
| License |
Commercial (SaaS) |
OSINT Framework |
|
| Description |
Web-based framework that organizes a large collection of OSINT tools and resources by category, helping analysts find relevant tools for different investigative tasks. It presents links in a structured, interactive layout. |
| Access |
Web: https://osintframework.com Repository: https://github.com/lockfale/osint-framework |
| Key Features |
- Comprehensive directory of free and paid OSINT tools
- Categorized by data type and investigative use
- Interactive map/tree structure for navigation
- Links to third-party tools for deep dives
|
| License |
Open source (MIT license) |