Computer Vision Engineer
Location: Remote
Type: Full-Time
Reports to: CPO
About Astera Holdings
Astera is building AsteraOS — an intelligence substrate for event-driven markets. The system is organized around three tightly integrated layers:
Ontology: a canonical graph of entities, states, and events
Intelligence: models and agents that reason over state transitions and uncertainty
Surfaces: products and interfaces that expose intelligence in real time
All layers read from and write to the same real-time pipelines and data primitives.
Astera Vision is our perception layer: it converts raw video into structured, machine-readable representations that power downstream intelligence and decision systems.
Role Overview
You will build the perception stack that transforms raw video into canonical state and derived events. These outputs attach directly to Astera’s entity graph and feed live intelligence agents, analytics, and pricing systems.
This role sits at the boundary between computer vision research, data systems, and applied intelligence. The goal is not demos or benchmarks; it is to build a reliable perception engine and data factory whose outputs can be trusted, evaluated, and acted upon.
What You’ll Do
1) Build perception systems that extract structured state from video
Design models for detection, tracking, and temporal understanding.
Convert pixel-space observations into normalized, context-aware state representations (a minimal sketch follows this list).
Reason about identity, roles, trajectories, interactions, and temporal structure.
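To make the pixel-to-state conversion concrete, here is a minimal sketch assuming hypothetical Detection and EntityState types; none of these names are Astera APIs, and the real schemas are internal.

    from dataclasses import dataclass

    # Illustrative types only; Astera's actual schemas are internal.
    @dataclass
    class Detection:
        frame_idx: int
        class_name: str
        box_xyxy: tuple  # (x1, y1, x2, y2) in pixel coordinates
        score: float

    @dataclass
    class EntityState:
        entity_id: str       # stable identity assigned by the tracker
        frame_idx: int
        class_name: str
        center_norm: tuple   # box center normalized to [0, 1] by frame size
        confidence: float

    def to_state(det: Detection, track_id: str, width: int, height: int) -> EntityState:
        # Convert a pixel-space detection on a known track into normalized state.
        x1, y1, x2, y2 = det.box_xyxy
        return EntityState(
            entity_id=track_id,
            frame_idx=det.frame_idx,
            class_name=det.class_name,
            center_norm=((x1 + x2) / 2 / width, (y1 + y2) / 2 / height),
            confidence=det.score,
        )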
2) Convert perception into ontology-native events
Encode actions, transitions, and regime changes as timestamped, structured events (a schema sketch follows this list).
Ensure outputs integrate cleanly into Astera’s canonical graph.
Produce signals suitable for real-time consumption by downstream agents and analytics.
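One plausible shape for such an event, as a sketch only; the field names and the "state_transition" type are assumptions, not Astera's actual ontology schema.

    from dataclasses import dataclass, field
    import uuid

    # Hypothetical event record: timestamped, confidence-scored, and keyed to
    # graph entities so it can attach to the canonical graph and stream out.
    @dataclass(frozen=True)
    class OntologyEvent:
        event_id: str
        entity_ids: tuple    # graph nodes this event attaches to
        event_type: str      # e.g. "state_transition", "interaction"
        t_start: float       # epoch seconds
        t_end: float
        confidence: float
        payload: dict = field(default_factory=dict)

    def transition_event(entity_id: str, prev: str, new: str,
                         t: float, confidence: float) -> OntologyEvent:
        # Encode one state transition as a timestamped, graph-attachable event.
        return OntologyEvent(
            event_id=str(uuid.uuid4()),
            entity_ids=(entity_id,),
            event_type="state_transition",
            t_start=t,
            t_end=t,
            confidence=confidence,
            payload={"from": prev, "to": new},
        )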
3) Own the data engine: labeling, evaluation, iteration
Design dataset strategies: weak supervision, bootstrapping, active learning, human-in-the-loop.
Define metrics that matter: stability, latency, calibration, drift, failure modes.
Build replayable evaluation harnesses for offline and live systems (a bare-bones harness is sketched below).
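A bare-bones version of such a harness, assuming a perceive callable that maps a frame to a set of (entity, state) pairs; the metrics shown (median latency, frame-to-frame churn) are illustrative, not a prescribed set.

    import statistics
    import time

    def replay_eval(frames, perceive):
        # Replay recorded frames through any perception callable and collect
        # per-frame latency plus a crude output-stability (churn) measure.
        latencies, churn, prev = [], 0, None
        for frame in frames:
            t0 = time.perf_counter()
            out = perceive(frame)            # e.g. a set of (entity_id, state) pairs
            latencies.append(time.perf_counter() - t0)
            if prev is not None:
                churn += len(out ^ prev)     # symmetric difference between frames
            prev = out
        return {
            "p50_latency_s": statistics.median(latencies),
            "max_latency_s": max(latencies),
            "mean_churn": churn / max(len(latencies) - 1, 1),
        }

Because the harness takes frames and a callable rather than a live feed, the same run can be replayed byte-for-byte against a new model version to compare quality and latency.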
4) Do production-minded research
Ship models that operate under real-time constraints with graceful degradation (one degradation pattern is sketched after this list).
Collaborate closely with backend and infrastructure teams to integrate perception into pipelines and storage.
Balance research ambition with reliability, observability, and iteration speed.
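As one example of graceful degradation (an assumed pattern, not Astera's design): run the full model while the pipeline is on schedule, fall back to a cheaper path when it lags, and label the mode so downstream consumers can discount degraded outputs.

    import time

    def perceive_with_budget(frame, full_model, lite_model, lag_s, budget_s):
        # If the pipeline has fallen behind its latency budget, take the cheap
        # path and flag the output; otherwise run the full model and time it.
        if lag_s > budget_s:
            return {"mode": "degraded", "result": lite_model(frame)}
        t0 = time.perf_counter()
        result = full_model(frame)
        return {"mode": "full", "result": result,
                "elapsed_s": time.perf_counter() - t0}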
90-Day Success Criteria
A working end-to-end perception pipeline: video → state → events → graph/streams (skeleton sketched below).
Clear, repeatable evaluation showing you can quantify quality, latency, and robustness.
At least one derived signal in production that demonstrably improves downstream reasoning or decision quality.
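Put together, the 90-day pipeline is roughly the composition below; every stage name is a hypothetical stand-in for the components sketched in the sections above.

    def run_pipeline(video_frames, detect, track, derive_events, publish):
        # video -> observations -> canonical state -> events -> graph/streams.
        for frame_idx, frame in enumerate(video_frames):
            detections = detect(frame, frame_idx)
            states = track(detections)           # assign stable entity identities
            for event in derive_events(states):  # timestamped, structured events
                publish(event)                   # write to the graph / live streams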
Requirements (Must-Have)
Strong computer vision fundamentals: detection, tracking, temporal modeling, representation learning.
Proven ability to ship applied research: models that run, are measured, and are iterated.
Python proficiency; PyTorch preferred.
Comfort with messy real-world video (compression artifacts, cuts, overlays, occlusion).
High ownership and bias toward action.
Preferred (Nice-to-Have)
Experience with multimodal modeling (video + audio + text).
Experience designing labeling pipelines or large-scale datasets.
Familiarity mapping model outputs into structured schemas or knowledge graphs.
Prior work on real-time or streaming systems.
How to Apply
Email careers@astera.holdings