POST /api/v1/videos/{video_id}/comments - Add a Comment
Overview
This endpoint lets an authenticated viewer post a comment on a video. It writes the comment into two separate Cassandra tables simultaneously—one organized by video and one organized by user—so that both "comments on a video" and "comments by a user" queries are fast. An optional sentiment analysis step scores the comment text before it is persisted.
Why it exists: Comments drive community engagement. Storing them in two tables is a deliberate Cassandra design decision: because Cassandra cannot efficiently filter across partition boundaries, we keep two physical copies of every comment so each query pattern has its own perfectly-shaped partition to hit.
HTTP Details
- Method: POST
- Path: /api/v1/videos/{video_id}/comments
- Auth Required: Yes (viewer role, JWT bearer token)
- Success Status: 201 Created
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| video_id | UUID | The video receiving the comment |
Request Body
{
"text": "This tutorial finally made clustering keys click for me!"
}
| Field | Type | Constraints |
|---|---|---|
| text | string | 1–1000 characters, required |
Response Body
{
"commentid": "a3b4c5d6-0000-11ee-be56-0242ac120002",
"videoid": "550e8400-e29b-41d4-a716-446655440000",
"userid": "7f3e1a2b-dead-beef-cafe-123456789abc",
"comment": "This tutorial finally made clustering keys click for me!",
"sentiment_score": 0.87,
"firstName": "Jane",
"lastName": "Developer"
}
Cassandra Concepts Explained
What is a TimeUUID?
A TimeUUID (UUID version 1) encodes a timestamp directly inside the UUID value. Sixty of its 128 bits hold a timestamp counted in 100-nanosecond intervals since 15 October 1582, and the remaining bits (version, variant, clock sequence, and node) keep IDs unique.
Why this matters for comments:
- Cassandra can sort TimeUUIDs chronologically using clustering key order
- You get a globally unique comment ID and a built-in timestamp in a single value
- No separate created_at column is needed for ordering (though you may still store one for human-readable display)
Compare UUID versions:
| Version | Source of uniqueness | Sortable by time? |
|---|---|---|
| v1 (TimeUUID) | MAC address + timestamp | Yes |
| v4 (random) | Random bits | No |
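To make the embedded timestamp concrete, the sketch below decodes the creation time from a version 1 UUID using only Python's standard library. The helper name `timeuuid_to_datetime` is an illustrative assumption, not part of the service.

```python
import uuid
from datetime import datetime, timedelta, timezone

# UUID v1 timestamps count 100-nanosecond intervals from the start of the
# Gregorian calendar (15 October 1582, 00:00 UTC).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def timeuuid_to_datetime(value: uuid.UUID) -> datetime:
    """Return the creation time encoded in a version 1 UUID (hypothetical helper)."""
    if value.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    # value.time is the 60-bit tick count; integer-divide by 10 for microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=value.time // 10)


comment_id = uuid.uuid1()
created_at = timeuuid_to_datetime(comment_id)
print(comment_id, "was created at", created_at.isoformat())
```

A version 4 UUID passed to the same helper raises `ValueError`, which mirrors the "Sortable by time? No" row in the table.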
Clustering Keys and Sort Order
In Cassandra, the clustering key defines the order of rows within a partition. For comments:
PRIMARY KEY (videoid, commentid)
- videoid is the partition key: all comments for a video live on the same node
- commentid is the clustering key: rows within that partition are sorted by this value
- Adding WITH CLUSTERING ORDER BY (commentid DESC) means newest comments come first
This is powerful because Cassandra's on-disk storage already keeps these rows in order. Fetching "the latest 20 comments" is a sequential read of the first 20 rows in the partition — no sort step needed.
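The effect of a descending clustering order can be imitated in plain Python. This is an in-memory stand-in, not driver code: `make_timeuuid` and `latest_comments` are hypothetical helpers, and sorting on `.time` mimics how Cassandra orders `timeuuid` values by their embedded timestamp rather than by raw bytes.

```python
import uuid


def make_timeuuid(ticks: int, node: int = 0x0242AC120002) -> uuid.UUID:
    """Build a version 1 UUID from a 60-bit tick count (hypothetical helper)."""
    return uuid.UUID(fields=(
        ticks & 0xFFFFFFFF,                 # time_low
        (ticks >> 32) & 0xFFFF,             # time_mid
        ((ticks >> 48) & 0x0FFF) | 0x1000,  # time_hi plus version 1
        0x80,                               # variant bits + clock_seq high
        0x00,                               # clock_seq low
        node,
    ))


def latest_comments(partition: list[uuid.UUID], limit: int) -> list[uuid.UUID]:
    """Return the newest `limit` IDs, newest first."""
    # Sort on .time, not on the UUID itself: plain UUID comparison uses the
    # raw 128-bit integer, whose leading bits are the LOW part of the timestamp.
    return sorted(partition, key=lambda u: u.time, reverse=True)[:limit]


# Tick values chosen so that raw-integer order disagrees with time order.
older = make_timeuuid(0x0000_0001_FFFF_FFFF)
newer = make_timeuuid(0x0000_0002_0000_0000)

print(latest_comments([older, newer], limit=1) == [newer])  # True
print(sorted([older, newer], reverse=True)[0] == newer)     # False
```

The second line is the reason client code should not rely on Python's default UUID ordering when it needs chronological order.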
Dual-Table Writes (Denormalization)
Cassandra's golden rule: model your tables around your queries. This endpoint has two distinct access patterns:
- "Show me comments on video X" → partition by videoid
- "Show me comments by user Y" → partition by userid
Since Cassandra cannot efficiently join tables or filter across partitions, the solution is to write the same comment to two tables at insert time. This is denormalization: intentionally storing duplicate data to make reads fast.
Trade-off: Each comment costs two inserts instead of one and is stored twice, but either read pattern is served from a single partition with no cross-partition filtering.
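A toy model of the idea, using two dictionaries as stand-ins for the two tables. Each dictionary key plays the role of a partition key; `add_comment` and the sample IDs are hypothetical and only illustrate the shape of the data.

```python
import uuid
from collections import defaultdict

# In-memory stand-ins for the two tables: each key is a partition key and
# each list holds the rows stored in that partition.
comments_by_video = defaultdict(list)
comments_by_user = defaultdict(list)


def add_comment(video_id: str, user_id: str, text: str) -> dict:
    """Store one comment under both partition keys (hypothetical helper)."""
    row = {
        "commentid": str(uuid.uuid1()),
        "videoid": video_id,
        "userid": user_id,
        "comment": text,
    }
    comments_by_video[video_id].append(row)  # serves "comments on video X"
    comments_by_user[user_id].append(row)    # serves "comments by user Y"
    return row


add_comment("video-1", "user-a", "First!")
add_comment("video-1", "user-b", "Nice explanation")
add_comment("video-2", "user-a", "Also good")

print(len(comments_by_video["video-1"]))  # 2
print(len(comments_by_user["user-a"]))    # 2
```

Each lookup touches exactly one key, which is the property the two Cassandra tables are designed to provide.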
Sentiment Analysis Integration
Before the comment is persisted, the service optionally scores the comment text using a sentiment analysis model. The resulting sentiment_score (a float between 0.0 and 1.0, where higher = more positive) is stored alongside the comment. This enriches the data model without requiring a separate enrichment pipeline after the fact.
Data Model
Table: comments_by_video
CREATE TABLE killrvideo.comments_by_video (
videoid uuid,
commentid timeuuid,
userid uuid,
comment text,
sentiment_score float,
PRIMARY KEY (videoid, commentid)
) WITH CLUSTERING ORDER BY (commentid DESC);
Key Characteristics:
- Partition Key: videoid (all comments for one video in one partition)
- Clustering Key: commentid DESC (newest comments at the top)
- TimeUUID: commentid encodes creation time, enabling ordered retrieval without a separate timestamp
Table: comments_by_user
CREATE TABLE killrvideo.comments_by_user (
userid uuid,
commentid timeuuid,
videoid uuid,
comment text,
sentiment_score float,
PRIMARY KEY (userid, commentid)
) WITH CLUSTERING ORDER BY (commentid DESC);
Key Characteristics:
- Partition Key: userid (all comments by one user in one partition)
- Clustering Key: Same commentid DESC ordering
- Mirror of comments_by_video: Same data, different partition key
Database Queries
1. Generate a TimeUUID
from uuid import uuid1
comment_id = uuid1() # Timestamp-based UUID
Why uuid1(): The timestamp encoded inside v1 UUIDs lets Cassandra keep comment rows in chronological order automatically. A uuid4() value would be unique but carries no timestamp, so it cannot provide that ordering, and the commentid column is declared as timeuuid, a type intended for version 1 values.
2. Score Sentiment (Optional)
async def analyze_sentiment(text: str) -> float:
    # Calls an internal or external NLP service
    score = await sentiment_service.score(text)
    return score  # 0.0 (negative) to 1.0 (positive)
This step runs before the database writes. If the sentiment service is unavailable, the score can default to null so the comment write still succeeds.
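One way to get that fallback behaviour is a small wrapper that turns any scoring failure or timeout into `None`. This is a sketch under assumptions: `score_or_none`, the `scorer` callable, and the 0.5-second default timeout are all hypothetical, and the two fake scorers exist only to exercise both paths.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def score_or_none(
    text: str,
    scorer: Callable[[str], Awaitable[float]],
    timeout_seconds: float = 0.5,
) -> Optional[float]:
    """Return the sentiment score, or None if scoring fails or times out."""
    try:
        return await asyncio.wait_for(scorer(text), timeout=timeout_seconds)
    except Exception:
        # Assumption: a scoring failure must never block the comment write.
        return None


async def fake_scorer(text: str) -> float:
    return 0.87


async def broken_scorer(text: str) -> float:
    raise ConnectionError("sentiment service unavailable")


print(asyncio.run(score_or_none("Great video!", fake_scorer)))    # 0.87
print(asyncio.run(score_or_none("Great video!", broken_scorer)))  # None
```

Catching `Exception` rather than `BaseException` leaves task cancellation untouched, so a cancelled request still unwinds normally.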
3. Insert into comments_by_video
await comments_by_video_table.insert_one({
    "videoid": str(video_id),
    "commentid": str(comment_id),
    "userid": str(current_user.userid),
    "comment": body.text,
    "sentiment_score": sentiment_score
})
Equivalent CQL:
INSERT INTO killrvideo.comments_by_video
(videoid, commentid, userid, comment, sentiment_score)
VALUES (
550e8400-e29b-41d4-a716-446655440000,
a3b4c5d6-0000-11ee-be56-0242ac120002,
7f3e1a2b-dead-beef-cafe-123456789abc,
'This tutorial finally made clustering keys click for me!',
0.87
);
Performance: O(1) — single partition write, no index lookup required.
4. Insert into comments_by_user
await comments_by_user_table.insert_one({
    "userid": str(current_user.userid),
    "commentid": str(comment_id),
    "videoid": str(video_id),
    "comment": body.text,
    "sentiment_score": sentiment_score
})
Equivalent CQL:
INSERT INTO killrvideo.comments_by_user
(userid, commentid, videoid, comment, sentiment_score)
VALUES (
7f3e1a2b-dead-beef-cafe-123456789abc,
a3b4c5d6-0000-11ee-be56-0242ac120002,
550e8400-e29b-41d4-a716-446655440000,
'This tutorial finally made clustering keys click for me!',
0.87
);
Performance: O(1) — single partition write.
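Because the two inserts are independent, they can be issued concurrently. The sketch below uses `asyncio.gather` with a hypothetical `InMemoryTable` class standing in for the real table objects; only the `insert_one` method name is taken from the snippets above, everything else is assumed for illustration.

```python
import asyncio
import uuid


class InMemoryTable:
    """Hypothetical stand-in that mimics the insert_one call used above."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    async def insert_one(self, row: dict) -> None:
        await asyncio.sleep(0)  # yield control, as a network call would
        self.rows.append(row)


async def write_comment(by_video, by_user, row: dict) -> list[Exception]:
    """Run both inserts concurrently and return any errors that occurred."""
    # return_exceptions=True lets one insert finish even if the other fails,
    # so the caller can see exactly which write needs a retry.
    results = await asyncio.gather(
        by_video.insert_one(row),
        by_user.insert_one(row),
        return_exceptions=True,
    )
    return [r for r in results if isinstance(r, Exception)]


by_video_table = InMemoryTable()
by_user_table = InMemoryTable()
row = {
    "commentid": str(uuid.uuid1()),
    "videoid": "550e8400-e29b-41d4-a716-446655440000",
    "userid": "7f3e1a2b-dead-beef-cafe-123456789abc",
    "comment": "Great video!",
    "sentiment_score": 0.87,
}
errors = asyncio.run(write_comment(by_video_table, by_user_table, row))
print(errors)  # [] when both writes succeed
```

Returning the list of errors, rather than raising on the first one, keeps the partial-failure case visible to the caller.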
Implementation Flow
┌─────────────────────────────────────────────────────────┐
│ 1. Client sends POST /api/v1/videos/{video_id}/comments │
│    Header: Authorization: Bearer <jwt>                  │
│    Body: { "text": "Great video!" }                     │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│ 2. JWT middleware validates token                       │
│   └─ Extracts current user (userid, firstName, lastName)│
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│ 3. Validate request body                                │
│   └─ text: 1–1000 chars, required                       │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│ 4. Generate commentid = uuid1() (TimeUUID)              │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│ 5. Run sentiment analysis on comment text               │
│   └─ Returns score 0.0–1.0 (or null on failure)         │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│ 6. Write to TWO tables (can run in parallel)            │
│   ├─ INSERT INTO comments_by_video                      │
│   └─ INSERT INTO comments_by_user                       │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│ 7. Return 201 Created                                   │
│    { commentid, videoid, userid, comment,               │
│      sentiment_score, firstName, lastName }             │
└─────────────────────────────────────────────────────────┘
Special Notes
1. No Cassandra-Level Transaction Across Tables
The two INSERT statements are independent. Cassandra does not support multi-table ACID transactions. If the second insert fails after the first succeeds, the data will be inconsistent: the comment will appear in one view but not the other.
Mitigation strategies:
- Retry the failed insert (idempotent by nature, because the retry reuses the same commentid value)
- Accept that rare inconsistencies exist and handle them in read paths
- Use a background reconciliation job for production systems
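The retry strategy could look roughly like the following. `insert_with_retry`, its default attempt count, and its backoff delays are assumptions chosen for the example; `flaky_insert` is a fake write that fails twice so the retry path actually runs.

```python
import asyncio


async def insert_with_retry(insert, row: dict, attempts: int = 3,
                            base_delay: float = 0.01) -> int:
    """Call `insert(row)` until it succeeds; return the number of attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            await insert(row)
            return attempt
        except Exception:
            if attempt == attempts:
                raise  # give up; a reconciliation job could pick this up later
            # Exponential backoff: base_delay, then 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


store: dict[str, dict] = {}
state = {"calls": 0}


async def flaky_insert(row: dict) -> None:
    """Fake table write that fails twice, then succeeds."""
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("simulated write timeout")
    # Keyed by commentid, so repeating the write overwrites the same row.
    store[row["commentid"]] = row


row = {"commentid": "a3b4c5d6-0000-11ee-be56-0242ac120002", "comment": "Great video!"}
used = asyncio.run(insert_with_retry(flaky_insert, row))
print(used, len(store))  # 3 1
```

Retrying is safe here only because every attempt carries the same commentid; generating a fresh ID per attempt would create duplicates instead.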
2. TimeUUID Precision and Ordering
Because TimeUUIDs encode time at 100-nanosecond resolution, comments generated on different application instances at almost the same instant can carry identical timestamp bits. The clock sequence and node (MAC address) fields keep the IDs unique in that case, but the relative order of those rows is arbitrary. In practice, user-submitted comments on the same video rarely arrive close enough together for this to matter.
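A small sketch of the tie case: two version 1 UUIDs that carry the same timestamp but different node values. The helper `with_node` and the second node value are made up for the example; the first ID is the sample commentid used elsewhere in this document.

```python
import uuid


def with_node(template: uuid.UUID, node: int) -> uuid.UUID:
    """Copy a UUID, replacing only its 48-bit node field (hypothetical helper)."""
    # fields = (time_low, time_mid, time_hi_version,
    #           clock_seq_hi_variant, clock_seq_low, node)
    return uuid.UUID(fields=template.fields[:5] + (node,))


first = uuid.UUID("a3b4c5d6-0000-11ee-be56-0242ac120002")
second = with_node(first, 0x0242AC120003)  # pretend this came from another host

print(first.time == second.time)  # True: identical timestamp bits
print(first != second)            # True: still distinct IDs
```

With equal timestamps, the order of the two rows depends only on the non-time bits, which says nothing about which comment was really written first.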
3. Sentiment Score Availability
The sentiment_score field may be null if:
- The sentiment service is unavailable
- The comment text is too short or ambiguous for reliable scoring
- Sentiment analysis is disabled for the deployment
Callers should treat this field as optional in their UI.
4. Text Length Limit
The 1,000-character limit is enforced at the API layer by Pydantic validation. Cassandra's text type has no inherent length restriction — the limit is a product decision to keep partitions manageable and prevent abuse.
Developer Tips
Common Pitfalls
- Using uuid4() for commentid: This works for uniqueness but loses chronological sort order. Use uuid1() for any clustering key that should be time-ordered.
- Writing to only one table: Both tables must be populated. A comment visible on the video page but invisible on the user's profile (or vice versa) is a data consistency bug.
- Blocking on sentiment analysis: If sentiment scoring is slow, run it concurrently with table preparation rather than sequentially.
- Large comment partitions: A very popular video with millions of comments will have a very large partition. Consider time-bucketing (e.g., adding a bucket column derived from the month) if partition size becomes a concern.
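For the time-bucketing idea, the bucket can be derived from the TimeUUID itself, so no extra input is needed at write time. This is only a sketch: `month_bucket` and the `YYYYMM` format are assumptions, and the tables defined in this document do not include a bucket column.

```python
import uuid
from datetime import datetime, timedelta, timezone

# UUID v1 timestamps count 100-nanosecond intervals from 15 October 1582 (UTC).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def month_bucket(comment_id: uuid.UUID) -> str:
    """Return a 'YYYYMM' bucket from a version 1 UUID (hypothetical helper)."""
    created = GREGORIAN_EPOCH + timedelta(microseconds=comment_id.time // 10)
    return created.strftime("%Y%m")


video_id = "550e8400-e29b-41d4-a716-446655440000"
comment_id = uuid.uuid1()

# With bucketing, the partition key would become (videoid, bucket), so one
# video's comments are spread across one partition per month.
partition_key = (video_id, month_bucket(comment_id))
print(partition_key)
```

The cost of this design is on the read side: fetching the latest comments may require querying the current bucket first and then stepping back to earlier ones.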
Query Performance Expectations
| Operation | Performance | Why |
|---|---|---|
| Insert into comments_by_video | < 10ms | Single partition write |
| Insert into comments_by_user | < 10ms | Single partition write |
| Sentiment scoring | < 50ms | Depends on model/service |
| Total (writes in parallel) | < 60ms | Network + sentiment dominates |
Testing Tips
When testing this endpoint, verify both table writes occurred:
async def test_comment_writes_to_both_tables():
    response = await client.post(
        f"/api/v1/videos/{video_id}/comments",
        json={"text": "Test comment"},
        headers={"Authorization": f"Bearer {viewer_token}"}
    )
    assert response.status_code == 201
    data = response.json()
    assert "commentid" in data
    assert data["comment"] == "Test comment"

    # Verify the comment appears in video feed
    video_comments = await client.get(f"/api/v1/videos/{video_id}/comments")
    assert any(c["commentid"] == data["commentid"]
               for c in video_comments.json()["items"])

    # Verify the comment appears in user feed
    user_comments = await client.get(f"/api/v1/users/{user_id}/comments")
    assert any(c["commentid"] == data["commentid"]
               for c in user_comments.json()["items"])
Related Endpoints
- GET /api/v1/videos/{video_id}/comments - Retrieve comments for a video
- GET /api/v1/users/{user_id}/comments - Retrieve all comments by a user
- POST /api/v1/videos/{video_id}/ratings - Rate a video