Astra Data API
The Astra Data API is an HTTP/JSON interface to Astra DB. Instead of connecting to Cassandra with a driver, you send HTTP requests with JSON bodies. KillrVideo uses the Data API exclusively — there is no CQL driver dependency anywhere in the codebase.
Why Use the Data API
No driver installation or version management. Traditional Cassandra applications require a language-specific driver (Java driver, Python driver, etc.) that must be kept in sync with the Cassandra version. The Data API is just HTTP — any language that can make HTTP requests can use it.
Consistent interface across languages. The same JSON operations work from Python, JavaScript, Java, or curl. This is why KillrVideo's API contract translates cleanly to other implementation languages.
Built-in vector support. The Data API natively supports vector storage and similarity search, including automatic embedding generation with $vectorize. No separate vector database required.
AstraPy for Python. The AstraPy SDK wraps the Data API with a Python-native async interface. KillrVideo uses AstraPy throughout the backend, but everything AstraPy does is standard HTTP under the hood.
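To make the "just HTTP" point concrete, here is a minimal sketch that builds (but does not send) a Data API request using only the Python standard library. The URL layout (`/api/json/v1/<keyspace>/<table>`), the `Token` header, the `killrvideo` keyspace name, and the placeholder endpoint and token are assumptions for illustration; check them against your own database settings before relying on them.

```python
import json
import urllib.request


def build_request(endpoint: str, keyspace: str, table: str,
                  token: str, command: dict) -> urllib.request.Request:
    """Build a POST request carrying one Data API command as JSON."""
    # Assumed path layout: <endpoint>/api/json/v1/<keyspace>/<table>
    url = f"{endpoint.rstrip('/')}/api/json/v1/{keyspace}/{table}"
    body = json.dumps(command).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumed auth header name: "Token"
        headers={"Token": token, "Content-Type": "application/json"},
    )


request = build_request(
    "https://example-db.apps.astra.datastax.com",  # placeholder endpoint
    "killrvideo",                                   # hypothetical keyspace
    "videos",
    "AstraCS:placeholder",                          # placeholder token
    {"findOne": {"filter": {"video_id": "a8098c1a-f86e-11da-bd1a-00112444be1e"}}},
)
```

Passing `request` to `urllib.request.urlopen` would send it; any other HTTP client could send the same URL, headers, and body.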
Tables vs Collections
The Data API supports two storage modes:
- Collections: schemaless JSON documents. Flexible structure, automatic _id generation, MongoDB-like feel. Good for prototyping.
- Tables: structured schema with explicit column types, matching how Cassandra tables work with CQL. Better performance, explicit schema contracts.
KillrVideo uses Tables throughout. This is the recommended approach for applications with a known schema, which is true of most production applications.
Core Operations
All operations are performed on a specific table. The basic CRUD operations:
Find one document (equivalent to SELECT ... WHERE ... LIMIT 1):
{
"findOne": {
"filter": { "user_id": "550e8400-e29b-41d4-a716-446655440000" }
}
}
Find multiple documents (with pagination):
{
"find": {
"filter": { "video_id": "a8098c1a-f86e-11da-bd1a-00112444be1e" },
"options": { "limit": 20 }
}
}
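The find command above returns a single page of results. A pagination loop might look like the sketch below. It assumes the response carries rows under `data.documents` and a continuation token under `data.nextPageState`, which is passed back as `options.pageState`; treat those field names as assumptions to verify. The `send` callable and the `fake_send` stand-in are hypothetical helpers, used here so the example runs without a database.

```python
from typing import Callable, Iterator


def find_all(send: Callable[[dict], dict], filter_: dict) -> Iterator[dict]:
    """Yield every matching row, following page-state tokens until exhausted."""
    page_state = None
    while True:
        command = {"find": {"filter": filter_}}
        if page_state is not None:
            # Assumed option name for resuming from a previous page
            command["find"]["options"] = {"pageState": page_state}
        data = send(command)["data"]
        yield from data["documents"]
        page_state = data.get("nextPageState")
        if not page_state:
            break


# Hypothetical stand-in for the HTTP call: two canned pages keyed by page state.
_PAGES = {
    None: {"data": {"documents": [{"comment": "first"}, {"comment": "second"}],
                    "nextPageState": "page-2"}},
    "page-2": {"data": {"documents": [{"comment": "third"}],
                        "nextPageState": None}},
}


def fake_send(command: dict) -> dict:
    state = command["find"].get("options", {}).get("pageState")
    return _PAGES[state]


rows = list(find_all(fake_send, {"video_id": "a8098c1a-f86e-11da-bd1a-00112444be1e"}))
```

In a real client, `send` would POST the command to the table URL and return the decoded JSON response.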
Insert one document:
{
"insertOne": {
"document": {
"user_id": "550e8400-e29b-41d4-a716-446655440000",
"email": "dev@example.com",
"first_name": "Dev",
"created_at": { "$date": 1742400000000 }
}
}
}
Update one document (partial update):
{
"updateOne": {
"filter": { "video_id": "a8098c1a-f86e-11da-bd1a-00112444be1e" },
"update": {
"$set": { "title": "Updated Title" }
}
}
}
Delete one document:
{
"deleteOne": {
"filter": { "video_id": "a8098c1a-f86e-11da-bd1a-00112444be1e" }
}
}
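All five commands share the same envelope: a JSON object with a single top-level key naming the command, whose value holds the command's parts. A small helper along these lines could build them; `build_command` and `SUPPORTED_COMMANDS` are hypothetical names, not part of any SDK.

```python
import json

# The commands covered in this section
SUPPORTED_COMMANDS = {"findOne", "find", "insertOne", "updateOne", "deleteOne"}


def build_command(name: str, **parts) -> str:
    """Serialize one command as {"<name>": {...parts...}}."""
    if name not in SUPPORTED_COMMANDS:
        raise ValueError(f"unsupported command: {name}")
    return json.dumps({name: parts})


update_body = build_command(
    "updateOne",
    filter={"video_id": "a8098c1a-f86e-11da-bd1a-00112444be1e"},
    update={"$set": {"title": "Updated Title"}},
)
```

The resulting `update_body` string has the same structure as the updateOne example above and can be used directly as an HTTP request body.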
Operators
The Data API supports MongoDB-style operators for filtering and updates:
| Operator | Use | Example |
|---|---|---|
| $set | Partial update (set specific fields) | { "$set": { "title": "New Title" } } |
| $inc | Atomic increment | { "$inc": { "view_count": 1 } } |
| $in | Match against a list of values | { "status": { "$in": ["active", "pending"] } } |
| $regex | Text pattern match | { "title": { "$regex": "cassandra" } } |
| $gt, $lt, $gte, $lte | Numeric and date comparisons | { "created_at": { "$gt": { "$date": 1742000000000 } } } |
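The $set operator is important for Cassandra-style partial updates: instead of reading a document, modifying it, and writing it back, you specify only the fields to change. This saves a round trip and avoids the lost-update race in which two clients read the same document and the second write silently overwrites the first client's changes to other fields.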
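The difference between a whole-document write-back and a partial update can be illustrated with a toy in-memory model. This is only a sketch of the reasoning: `apply_set` is a hypothetical helper that mimics the effect of `$set` on a plain dict, not how the Data API is implemented.

```python
def apply_set(row: dict, fields: dict) -> None:
    """Mimic $set: change only the named fields, leave the rest untouched."""
    row.update(fields)


original = {"title": "Old Title", "description": "Old description"}

# Read-modify-write: both clients read the full row before either writes.
copy_a = dict(original)
copy_b = dict(original)
copy_a["title"] = "New Title"
copy_b["description"] = "New description"
stored = copy_a          # client A writes the whole row back
stored = copy_b          # client B overwrites it with its stale copy
lost_update = stored     # A's title change has been lost

# Partial updates: each client sends only the field it changed.
stored = dict(original)
apply_set(stored, {"title": "New Title"})
apply_set(stored, {"description": "New description"})
partial_update = stored  # both changes survive
```

In the first case the stored title reverts to "Old Title"; in the second, both the new title and the new description are kept.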
Vector Operations
The Data API's vector support is used for KillrVideo's semantic search and recommendations.
Store a vector (generated externally, e.g. by NVIDIA NV-Embed-QA):
{
"insertOne": {
"document": {
"video_id": "...",
"title": "Introduction to Cassandra",
"$vector": [0.021, -0.034, 0.156, ...]
}
}
}
Automatic vectorization (let Astra generate the embedding):
{
"insertOne": {
"document": {
"video_id": "...",
"title": "Introduction to Cassandra",
"$vectorize": "Introduction to Cassandra data modeling"
}
}
}
Vector similarity search:
{
"find": {
"sort": { "$vector": [0.021, -0.034, 0.156, ...] },
"options": {
"limit": 10,
"includeSimilarity": true
}
}
}
This returns the 10 most similar documents to the query vector, ordered by cosine similarity. The includeSimilarity option adds a $similarity field (0.0 to 1.0) to each result.
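For intuition about what the ranking means, here is a brute-force sketch of cosine-similarity search in plain Python. It is not how Astra executes the query (the service uses a vector index). The mapping of cosine values into the 0.0 to 1.0 range as `(1 + cos) / 2` is an assumption, and `top_k` and the sample rows are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def similarity(a: list[float], b: list[float]) -> float:
    """Map cosine from [-1, 1] into [0, 1] (assumed scoring convention)."""
    return (1 + cosine(a, b)) / 2


def top_k(query: list[float], rows: list[dict], k: int) -> list[dict]:
    """Score every row against the query and return the k most similar."""
    scored = [{**row, "$similarity": similarity(query, row["$vector"])} for row in rows]
    return sorted(scored, key=lambda r: r["$similarity"], reverse=True)[:k]


rows = [
    {"title": "opposite", "$vector": [-1.0, 0.0]},
    {"title": "same direction", "$vector": [1.0, 0.0]},
    {"title": "orthogonal", "$vector": [0.0, 1.0]},
]
results = top_k([1.0, 0.0], rows, k=2)
```

Here `results` holds the "same direction" row first and the "orthogonal" row second; the "opposite" row scores lowest and is cut off by `k=2`.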
Using AstraPy
In KillrVideo's Python backend, the Data API is accessed through AstraPy's async client:
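from datetime import datetime, timezone
from uuid import uuid4

from astrapy import DataAPIClient

# ASTRA_DB_APPLICATION_TOKEN and ASTRA_DB_API_ENDPOINT come from configuration.
# The calls below use await, so they must run inside an async function.
client = DataAPIClient(token=ASTRA_DB_APPLICATION_TOKEN)
db = client.get_async_database(ASTRA_DB_API_ENDPOINT)
videos_table = db.get_table("videos")

# Find a video (video_id is supplied by the caller)
video = await videos_table.find_one({"video_id": video_id})

# Insert a video
await videos_table.insert_one({
    "video_id": str(uuid4()),
    "title": "My Video",
    "created_at": datetime.now(timezone.utc),  # timezone-aware; utcnow() is deprecated
})

# Vector search: find() returns a cursor rather than an awaitable,
# so build the cursor first and then collect its rows.
cursor = videos_table.find(
    {},
    sort={"$vector": query_vector},  # query_vector: list of floats from the caller
    limit=10,
    include_similarity=True,
)
results = await cursor.to_list()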
AstraPy's interface mirrors the Data API JSON operations directly, making it easy to understand what HTTP request each call produces.