Amazon Rekognition (Objects)
Integrations
- Amazon Bedrock (Nova)
- Amazon Kinesis Video Streams
- AWS Step Functions
- AWS Agentic Foundry
- Amazon S3 (Vector-Spatial Index)
Pricing Details
- Standard analysis billed per 1,000 images.
- Video streams billed per minute.
- 2026 updates include 'Agentic Workflow' credits for automated AWS Step Functions orchestration.
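As a rough illustration of how the two billing units combine, the sketch below estimates a monthly bill. The `Rates` fields and the prorated (not rounded-up) image billing are assumptions made for the example; actual rates and rounding rules should be taken from the current AWS price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical rates supplied by the caller; not actual AWS prices."""
    per_1000_images: float   # currency units per 1,000 analyzed images
    per_video_minute: float  # currency units per minute of analyzed video


def estimate_monthly_cost(images: int, video_minutes: float, rates: Rates) -> float:
    """Combine per-1,000-image and per-minute charges into one estimate.

    Assumption: image usage is prorated rather than rounded up to whole
    blocks of 1,000 images.
    """
    if images < 0 or video_minutes < 0:
        raise ValueError("usage figures must be non-negative")
    image_cost = images / 1000 * rates.per_1000_images
    video_cost = video_minutes * rates.per_video_minute
    return round(image_cost + video_cost, 2)
```

For example, with placeholder rates of 1.00 per 1,000 images and 0.10 per video minute, 250,000 images plus 600 minutes of video would come to 310.00 under these assumptions.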
Features
- Object and Scene Detection (v4)
- 3D Spatial Vertex & Depth Estimation
- Agentic Vision Logic Triggers
- Real-time Kinesis Video Integration
- Generative Scene Interpretation (Bedrock)
- Custom Labels Transfer Learning (GA)
Description
Amazon Rekognition 2026: Spatial-Agentic Vision & AI Foundry Audit
As of January 13, 2026, Amazon Rekognition has completed its transition to Spatial Intelligence. The architecture uses AWS Inferentia 3 clusters to provide high-fidelity 3D bounding box estimation and generative scene interpretation, and serves as the primary visual sensing layer for autonomous agents 📑.
Spatial Intelligence & 3D Orchestration
The core engine combines monocular depth estimation with multi-view geometry to return normalized 3D vertices for detected objects, enabling volumetric analysis in warehouse and security environments 📑.
- Logistics Efficiency Scenario: Input: 4K camera stream from automated sorters → Process: 3D object localization + volume calculation via Inferentia 3 → Output: Real-time shelf-space optimization commands in AWS Step Functions 📑.
- Hazardous Zone Scenario: Input: Static drone imagery of industrial site → Process: DetectProtectiveEquipment API with spatial depth validation → Output: High-confidence safety alerts with 3D coordinate mapping 📑.
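The volume calculation in the logistics scenario can be sketched as follows. The response format is not specified in this listing, so the vertex layout (a list of dicts with normalized `X`, `Y`, `Z` keys) and the `scene_size_m` parameter are hypothetical; the function only shows how normalized vertices could be turned into an axis-aligned volume estimate.

```python
from typing import Mapping, Sequence, Tuple


def box_volume_m3(
    vertices: Sequence[Mapping[str, float]],
    scene_size_m: Tuple[float, float, float],
) -> float:
    """Estimate the volume of an axis-aligned 3D box from normalized vertices.

    vertices: hypothetical format, one dict per corner with X, Y, Z in [0, 1].
    scene_size_m: real-world extent of the scene (width, height, depth) in
        metres, used to scale the normalized coordinates.
    """
    if not vertices:
        raise ValueError("at least one vertex is required")
    extents = []
    for axis, scale in zip(("X", "Y", "Z"), scene_size_m):
        values = [v[axis] for v in vertices]
        # Side length along this axis, converted from normalized units to metres.
        extents.append((max(values) - min(values)) * scale)
    width, height, depth = extents
    return width * height * depth
```

A rotated object would be overestimated by this axis-aligned approach; an oriented bounding box would need the edge vectors between corners instead.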
Persistence & Inferentia 3 Infrastructure
The system uses a Vector-Spatial Persistence Layer optimized for sub-second retrieval of visual patterns across multi-petabyte S3 data lakes. While inference weights are proprietary, the deployment architecture supports VPC isolation and in-region processing for data sovereignty 🧠.
- Generative Grounding: Visual metadata is routed to Amazon Bedrock, where Nova models transform raw labels into structured natural language reports with audit-trail citations 📑.
- Model Transparency: Internal neural topologies and specific training datasets for 'Custom Labels' remain undisclosed to prevent competitive reverse-engineering 🌑.
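The Generative Grounding step can be approximated by turning detected labels into a text prompt before handing it to a language model. The sketch below assumes input shaped like a Rekognition DetectLabels response (`Labels` entries with `Name`, `Confidence`, and `Instances`); the `labels_to_prompt` helper, the confidence threshold, and the prompt wording are illustrative and not part of any AWS API.

```python
def labels_to_prompt(response: dict, min_confidence: float = 80.0) -> str:
    """Build a text prompt from a DetectLabels-style response.

    Labels below min_confidence are dropped so the model is not asked to
    reason about weak detections.
    """
    lines = []
    for label in response.get("Labels", []):
        if label["Confidence"] < min_confidence:
            continue
        # Instances holds one entry per located object for this label.
        count = len(label.get("Instances", []))
        lines.append(
            f"- {label['Name']} "
            f"(confidence {label['Confidence']:.1f}%, {count} instance(s))"
        )
    header = (
        "Summarize the scene described by these detected labels. "
        "Cite each label you rely on.\n"
    )
    return header + "\n".join(lines)
```

The resulting string could then be sent to a Bedrock-hosted model; the model choice and invocation are left out here because the listing does not specify them.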
Evaluation Guidance
Technical evaluators should verify the following architectural characteristics:
- Depth Estimation Accuracy: Benchmark the precision of Z-axis coordinates in variable lighting conditions, as monocular depth remains sensitive to high-contrast occlusion [Documented].
- Agentic Trigger Latency: Measure the end-to-end latency from a Kinesis visual event to the start of a Step Functions workflow execution to confirm it meets mission-critical SLAs [Unknown].
- Sovereign Hosting Parity: Verify that the 3D Estimation APIs are fully operational in non-US regions, specifically honoring Data Residency flags in the EU and Japan [Inference].
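The first two checks above reduce to simple statistics once measurements are collected. The helpers below are a minimal sketch: `depth_mae` assumes paired predicted and ground-truth Z values in the same units, and `latency_percentile` uses the nearest-rank method on latency samples in milliseconds. Both names are hypothetical and the measurement collection itself is left to the evaluator.

```python
import math
from typing import Sequence


def depth_mae(predicted_z: Sequence[float], reference_z: Sequence[float]) -> float:
    """Mean absolute error between predicted and reference Z coordinates."""
    if not predicted_z or len(predicted_z) != len(reference_z):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted_z, reference_z))
    return total / len(predicted_z)


def latency_percentile(samples_ms: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of event-to-workflow latency samples."""
    if not samples_ms or not 0 < pct <= 100:
        raise ValueError("need samples and a percentile in (0, 100]")
    ordered = sorted(samples_ms)
    # Nearest-rank: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

Running `depth_mae` separately per lighting condition, rather than over a pooled dataset, makes it easier to see where monocular depth degrades.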
Release History
Year-end update: Release of Agentic Vision. Rekognition can now autonomously trigger workflows in AWS Step Functions based on complex visual events.
General availability of Spatial features. 3D bounding boxes and distance estimation between objects using standard 2D camera feeds.
Integration with Amazon Bedrock. Enables natural language search across image/video libraries and generative summaries of visual data.
Added Face Liveness detection to prevent spoofing. Enhanced object properties detection (color, texture, material).
Significant update to Content Moderation. Improved accuracy for detecting unsafe content and introduction of hierarchical moderation labels.
Launch of Rekognition Custom Labels. Allows users to train models to identify specific objects (e.g., machine parts, brand logos) using minimal data.
Expansion to video. Real-time and batch video analysis for tracking people and detecting objects in motion.
Initial launch. Cloud-based image analysis for object and scene detection, facial recognition, and celebrity identification.
Tool Pros and Cons
Pros
- High detection accuracy
- Scalable & reliable
- Precise localization
- Easy API integration
- Broad category support
Cons
- Potentially costly
- Image quality sensitive
- AWS expertise needed