About Reality Defender
Reality Defender provides accurate, multi-modal AI-generated media detection solutions to enable enterprises and governments to identify and prevent fraud, disinformation, and harmful deepfakes in real time. A Y Combinator graduate, Comcast NBCUniversal LIFT Labs alumnus, and backed by DCVC, Reality Defender is the first company to pioneer multi-modal and multi-model detection of AI-generated media. Our web app and platform-agnostic API, built by our research-forward team, ensure that our customers can swiftly and securely mitigate fraud and cybersecurity risks in real time with a frictionless, robust solution.
YouTube: Reality Defender Wins RSA Most Innovative Startup
Why we stand out:
Our best-in-class accuracy comes from our singular, research-backed mission and our use of multiple models per modality.
We can detect AI-generated fraud and disinformation in real time or near real time across all modalities, including audio, video, image, and text.
Our platform is designed for ease of use, featuring a versatile API that integrates seamlessly with any system, an intuitive drag-and-drop web application for quick ad hoc analysis, and platform-agnostic real-time audio detection tailored for call center deployments.
We’re privacy-first, upholding the strongest standards of compliance and keeping customer data out of the training of our detection models.
Role Overview
We are looking for a Data Engineer (Video Data) to enhance video ingestion, processing, and augmentation workflows to support our deepfake detection models and benchmarking efforts. The ideal candidate will have expertise in video processing pipelines, data engineering at scale, and experience working with large, diverse video datasets. You’ll work on building scalable, high-performance workflows for acquiring, storing, and transforming large-scale video datasets. You will collaborate with machine learning engineers and researchers to optimize data workflows, ensuring high-quality input for AI-driven fraud detection and disinformation mitigation.
Key Responsibilities
Video Ingestion & Processing: Build scalable pipelines for batch and streaming video data, automating preprocessing, transcoding, and augmentation.
Infrastructure Optimization: Optimize video storage, retrieval, and transformation workflows, including compression, metadata extraction, and format conversions.
Data Sourcing & Augmentation: Expand video dataset coverage through API integrations, web scraping, and social media ingestion.
Collaboration & Research Support: Partner with ML teams to enhance training datasets, support benchmarking efforts, and contribute to synthetic media generation.
Basic Skills & Experience
A Bachelor’s or Master’s degree in Computer Science, Data Engineering, Physics, Mathematics, Electrical Engineering, or a related STEM field.
3-6+ years of experience in data engineering or another developer role with a focus on video data.
Strong software development experience in Python or a systems programming language, with experience using video processing libraries (e.g. libav, OpenCV).
Experience with video dataset augmentation and processing.
Experience building systems on cloud infrastructure.
Experience with SQL and NoSQL databases.
Preferred Skills
Exposure to deepfake detection techniques and synthetic media processing.
Familiarity with machine learning concepts and how video data is used in AI/ML pipelines.
Experience building distributed systems for data streaming and processing at scale.
Experience with GPU acceleration for video processing.
Hands-on experience with social media scraping and API-based video collection.
Familiarity with the statistical foundations of bias and balanced dataset construction.
Experience with data lake or lakehouse technology, such as Databricks.
Knowledge of multi-modal AI models.