I recently learned about the W3C spec for Webmentions!
The W3C describes the spec as:
[…] a simple way to notify any URL when you mention it on your site. From the receiver’s perspective, it’s a way to request notifications when other sites mention it.
I’ve always wanted to write an implementation for a spec and this one did not seem extremely complex, plus I had a long weekend coming up, so I thought it’d make for a fun challenge!
The Plan
- Write a serverless implementation of webmentions.
- It should use AWS CloudFormation to stand everything up automatically
- It should be cheap to run indefinitely
- It should be testable locally
Breaking a receiver down, there are really only 3 endpoints required:
- Submit a webmention
- Get the status of a webmention
- Query webmentions
The basic flow of our function to receive webmentions is as follows:
```mermaid
sequenceDiagram
    participant User
    participant Lambda
    participant DynamoDB
    User ->> Lambda: Send webmention request
    Lambda ->> Lambda: Confirm valid request
    Lambda ->> DynamoDB: Create status object
    DynamoDB ->> Lambda: Return status object identifier
    Lambda ->> User: Return 201 & correct headers
    Lambda ->> Lambda: Confirm mention exists
    Lambda ->> DynamoDB: Create webmention object
    Lambda ->> DynamoDB: Update status object
    User ->> Lambda: Query webmention
    Lambda ->> DynamoDB: Request webmention
    DynamoDB ->> Lambda: Return webmention
    Lambda ->> User: Return webmention
```
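To make the two "confirm" steps in the flow concrete, here is a minimal sketch of what they might check. The function names are my own and not taken from the project. The rules follow the spec's request verification section (both URLs must use a supported scheme, usually http or https, and source must differ from target). The mention check is deliberately naive; a stricter receiver would parse the HTML and compare `href` and `src` attributes.

```typescript
// Hypothetical helper names, used for illustration only.
export type ValidationResult =
  | { ok: true }
  | { ok: false; reason: string };

// Parse a URL and accept it only if it uses the http or https scheme.
const parseHttpUrl = (value: string): URL | undefined => {
  try {
    const url = new URL(value);
    return url.protocol === 'http:' || url.protocol === 'https:'
      ? url
      : undefined;
  } catch {
    return undefined; // not a parseable URL at all
  }
};

// "Confirm valid request": both parameters must be http(s) URLs and differ.
export const validateRequest = (
  source: string,
  target: string,
): ValidationResult => {
  const sourceUrl = parseHttpUrl(source);
  const targetUrl = parseHttpUrl(target);

  if (!sourceUrl || !targetUrl) {
    return { ok: false, reason: 'source and target must be http(s) URLs' };
  }
  if (sourceUrl.href === targetUrl.href) {
    return { ok: false, reason: 'source and target must not be the same' };
  }
  return { ok: true };
};

// "Confirm mention exists": naive substring check on the fetched source body.
export const sourceMentionsTarget = (
  sourceBody: string,
  target: string,
): boolean => sourceBody.includes(target);
```

The first check can run before the 201 is returned, while the second needs the source document to be fetched, which is why it sits after the response in the flow above.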
Cloud?! Lambda?! What’s the cost?!
I chose DynamoDB & Lambda on AWS because, when used correctly, they’re extremely cheap and efficient. Both are charged based on usage, so if you’re running a small site like mine you can expect the bill to come out at basically nothing. Let’s do some quick math.
DynamoDB
DynamoDB charges based on read & write units: a query against an index costs 1 read unit, and a write costs 1 unit per index on the table. The downside of Dynamo is that if you expect your data structure to be complex or have intricate relationships, it can end up costing much, much more than a standard database.
Amazon currently has the cost of read and write units priced at the following:
Unit | Cost |
---|---|
Write Unit | $1.4232 / Million |
Read Unit | $0.2846 / Million |
Then if we break down how many units we use per function in our application:
Function | Read Units Used | Write Units Used |
---|---|---|
Initial Request | 0 | 1 |
Query Request Status | 1 | 0 |
Create Webmention | 0 | 2 |
Query URL for webmention | 1 | 0 |
We can calculate the cost as the following:
Function | Read units used | Write units used | Cost |
---|---|---|---|
1000 Mentions | 1000 | 3000 | $0.0045542 |
1000 Page-views | 1000 | 0 | $0.0002846 |
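The arithmetic behind that table can be reproduced in a few lines. This is only a sketch using the prices quoted above; the function and constant names are mine, and the unit counts (1000 reads and 3000 writes per 1000 mentions) are taken straight from the table.

```typescript
// Prices from the table above, in USD per million units.
const WRITE_PRICE_PER_MILLION = 1.4232;
const READ_PRICE_PER_MILLION = 0.2846;

// Cost in USD for a given number of read and write units.
export const dynamoCost = (readUnits: number, writeUnits: number): number =>
  (readUnits * READ_PRICE_PER_MILLION + writeUnits * WRITE_PRICE_PER_MILLION) /
  1e6;

// 1000 mentions: 3 writes per mention (1 status object + 2 for the
// webmention itself) and 1000 reads, as counted in the table.
const perThousandMentions = dynamoCost(1000, 3000);

// 1000 page views: one index query each, no writes.
const perThousandPageViews = dynamoCost(1000, 0);

console.log(perThousandMentions.toFixed(7)); // 0.0045542
console.log(perThousandPageViews.toFixed(7)); // 0.0002846
```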
Lambda
AWS Lambda is a serverless compute platform where you can just run code. Amazon provisions the servers for you in the region you deploy to, and you are only billed while your function is actually running.
We run our Lambda with 512 MB of memory and a timeout of 15 seconds. AWS charges per million requests, plus a duration charge for the time the function runs, which scales with the memory configured. Note that the duration price is quoted per millisecond, not per second.
Item | Cost |
---|---|
Request | $0.20 / Million |
Duration (512 MB) | $0.0000000083 / millisecond |
Assuming the worst case, where our function runs for the full 15 seconds (15,000 ms), we’re paying $0.0000002 + 15,000 × $0.0000000083 ≈ $0.0001247 per execution. In reality the cost is less than this, especially when querying for webmentions & status objects, since those run in under a second.
Operation | Requests made | Worst case exec | Cost |
---|---|---|---|
Creating 1000 Mentions | 1000 | 15 seconds | $0.1247 |
Querying 1000 Mentions | 1000 | 1 second | $0.0085 |
Total stack cost
Combining both our DynamoDB & Lambda costs we can figure out at what point our bill will become “excessive” (more than a dollar).
Operation | Dynamo | Lambda | Total |
---|---|---|---|
Creating 1000 Mentions | $0.0045542 | $0.1247 | $0.1292542 |
Querying 1000 Mentions | $0.0002846 | $0.0085 | $0.0087846 |
This means that, even if every request ran for the full 15-second timeout, it would take around 7,700 webmention creation requests before our bill exceeds a dollar, and far more in practice since most executions finish much sooner. We can use AWS Budgets & API Gateway rate limiting to stop execution before we hit this, which I’d highly recommend. Nothing is worse than an unexpected bill at the end of the month.
Implementation
Our final stack should consist of the following:
- 1 Lambda: Running express with 3 entry points.
- 2 Tables: One with a single index and the other with two indexes.
Tables
The status table is extremely simple, since it only really needs to store basic information temporarily. It would be possible to add a TTL to records, however for now I plan to manually clean this out over time.
The table consists of a single index, `id`, which will be used by other services to query the status of webmentions.
```yaml
statusTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: webmentions-status-table
    AttributeDefinitions:
      - AttributeName: id
        AttributeType: S
    KeySchema:
      - AttributeName: id
        KeyType: HASH
    BillingMode: PAY_PER_REQUEST
```
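As a rough illustration of how this table might be used, the sketch below builds the request objects for writing and reading a status record. Only the table name and the `id` key come from the template above; the other attribute names (`source`, `target`, `status`, `createdAt`) and the helper names are assumptions of mine, not the project's actual schema.

```typescript
// Hypothetical record shape: only `id` is defined by the table above.
interface StatusRecord {
  id: string;
  source: string;
  target: string;
  status: 'pending' | 'verified' | 'failed';
  createdAt: string; // ISO 8601 timestamp
}

const STATUS_TABLE = 'webmentions-status-table';

// Input for a DocumentClient `put` call (aws-sdk v2).
export const buildStatusPut = (record: StatusRecord) => ({
  TableName: STATUS_TABLE,
  Item: record,
  // Refuse to overwrite an existing record that has the same id.
  ConditionExpression: 'attribute_not_exists(id)',
});

// Input for a DocumentClient `get` call, looking a record up by its id.
export const buildStatusGet = (id: string) => ({
  TableName: STATUS_TABLE,
  Key: { id },
});
```

With the v2 SDK these objects could be passed to `new DynamoDB.DocumentClient().put(params).promise()` and `.get(params).promise()` respectively. Keeping the builders pure makes them easy to test locally without touching AWS.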
The mention table has a few extra definitions, most importantly its `target` global secondary index (GSI).
We use `id` as the main unique index, however we do not query it. Instead, our functions query based on the webmention `target`. By indexing this field, we can fetch all the webmentions for a page with a single index query instead of scanning the whole table.
```yaml
mentionTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: webmentions-mention-table
    AttributeDefinitions:
      - AttributeName: id
        AttributeType: S
      - AttributeName: target
        AttributeType: S
    KeySchema:
      - AttributeName: id
        KeyType: HASH
    GlobalSecondaryIndexes:
      - IndexName: target-index
        KeySchema:
          - AttributeName: target
            KeyType: HASH
        Projection:
          ProjectionType: ALL
    BillingMode: PAY_PER_REQUEST
```
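Querying by target means naming the GSI explicitly in the request. The following is a minimal sketch of the query input, using the table and index names from the template above; the helper name `buildTargetQuery` is hypothetical.

```typescript
const MENTION_TABLE = 'webmentions-mention-table';
const TARGET_INDEX = 'target-index';

// Input for a DocumentClient `query` call (aws-sdk v2) that returns every
// webmention whose `target` attribute equals the given URL.
export const buildTargetQuery = (target: string) => ({
  TableName: MENTION_TABLE,
  IndexName: TARGET_INDEX,
  // The #target placeholder sidesteps any clash with DynamoDB reserved words.
  KeyConditionExpression: '#target = :target',
  ExpressionAttributeNames: { '#target': 'target' },
  ExpressionAttributeValues: { ':target': target },
});

console.log(buildTargetQuery('https://example.com/blog/my-post'));
```

Because the index projects `ALL` attributes, the query result already contains the full webmention objects and no second lookup by `id` is needed.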
Lambda
We then define our Lambda & its corresponding API Gateway. This is pretty chunky since we also need to give our function permission to access the tables we’ve just defined.
```yaml
ApiGatewayApi:
  Type: AWS::Serverless::Api
  Properties:
    StageName: !Ref Stage
    Cors: "'*'"
    # Throttle every route so a flood of requests cannot run up the bill
    MethodSettings:
      - ResourcePath: "/*"
        HttpMethod: "*"
        ThrottlingRateLimit: 100
        ThrottlingBurstLimit: 500

Webmention:
  Type: AWS::Serverless::Function
  Properties:
    MemorySize: 512
    Timeout: 15
    CodeUri: .build
    Handler: handler.handler
    Runtime: nodejs14.x
    Architectures:
      - x86_64
    # Grant the function access to both tables and their indexes
    Policies:
      - Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Action:
              - dynamodb:Query
              - dynamodb:Scan
              - dynamodb:GetItem
              - dynamodb:PutItem
              - dynamodb:UpdateItem
              - dynamodb:DeleteItem
              - dynamodb:BatchGetItem
            Resource:
              - !Sub ${statusTable.Arn}
              - !Sub ${statusTable.Arn}/index/*
              - !Sub ${mentionTable.Arn}
              - !Sub ${mentionTable.Arn}/index/*
    Events:
      Root:
        Type: Api
        Properties:
          Path: /web-mentions
          Method: any
          RestApiId:
            Ref: ApiGatewayApi
      Sub:
        Type: Api
        Properties:
          Path: /web-mentions/{any+}
          Method: any
          RestApiId:
            Ref: ApiGatewayApi
```
A minor complaint with AWS SAM is that its built-in `esbuild` bundler for functions does not behave as expected, so to get around this we bundle the function ourselves.
Luckily `esbuild` works pretty much on its own and doesn’t need much external config, unlike webpack or other bundlers.
```shell
esbuild src/handler.ts \
  --target=es2020 \
  --platform=node \
  --external:aws-sdk \
  --sourcemap=linked \
  --outfile=.build/handler.js \
  --bundle
```
Express
Getting Express to work on a Lambda is super easy once you know what you’re doing.
We’ve set up our API Gateway to forward requests at `/web-mentions` or `/web-mentions/{any+}` through to our Lambda.
```typescript
import express, { Express, NextFunction, Request, Response } from 'express';
import cors from 'cors';
import {
  APIGatewayEventRequestContext,
  APIGatewayProxyEvent,
} from 'aws-lambda';
import serverless from 'serverless-http';

// Express request extended with the API Gateway request context
export type RequestContext = Request & {
  context: APIGatewayEventRequestContext;
};

// Assumption: corsOptions is not shown in the original snippet;
// allowing every origin here is only a placeholder.
const corsOptions: cors.CorsOptions = { origin: '*' };

export const createApp = (): Express => {
  const app = express();

  app.use(express.urlencoded({ extended: true }));
  app.use(express.json());
  app.use(cors(corsOptions));
  app.options('*', cors(corsOptions));

  // Log every incoming request
  app.use((req: RequestContext, res: Response, next: NextFunction) => {
    console.log(`Request: ${req.method} ${req.originalUrl}`);
    next();
  });

  return app;
};

// Wrap the Express app so Lambda can invoke it, copying the API Gateway
// request context onto each request
export const createHandler = (app: Express) =>
  serverless(app, {
    request(request: RequestContext, event: APIGatewayProxyEvent) {
      request.context = event.requestContext;
    },
  });

const app = createApp();
export const handler = createHandler(app);
```
Once we’ve configured the app we can now use express as we normally would!
```typescript
// Submit a webmention
app.post(
  '/web-mentions',
  async (request: RequestContext, response: Response) => {},
);

// Get the status of a webmention
app.get(
  '/web-mentions/status/:id',
  async (request: RequestContext, response: Response) => {},
);

// Query webmentions
app.post(
  '/web-mentions/query',
  async (request: RequestContext, response: Response) => {},
);
```
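The handler bodies are left empty here. As a rough, framework-free sketch of what the submit handler might decide, the function below returns a 400 when a parameter is missing and otherwise a 201 with a `Location` header pointing at the status endpoint, which is what the spec asks for when a status page is created. The names `handleSubmit`, `createStatus` and `baseUrl` are hypothetical, and `createStatus` stands in for whatever writes the status object to DynamoDB.

```typescript
interface SubmitBody {
  source?: string;
  target?: string;
}

interface SubmitResult {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// `createStatus` is a hypothetical dependency: it stores a status object
// and resolves with the new object's id.
export const handleSubmit = async (
  body: SubmitBody,
  createStatus: (source: string, target: string) => Promise<string>,
  baseUrl: string,
): Promise<SubmitResult> => {
  // Both form parameters are required by the spec.
  if (!body.source || !body.target) {
    return {
      statusCode: 400,
      headers: {},
      body: 'source and target are required',
    };
  }

  const id = await createStatus(body.source, body.target);

  // 201 Created, with Location pointing at the status endpoint.
  return {
    statusCode: 201,
    headers: { Location: `${baseUrl}/web-mentions/status/${id}` },
    body: '',
  };
};
```

Inside the Express route, the result could be applied with `response.status(result.statusCode).set(result.headers).send(result.body)`. Passing `createStatus` in as an argument keeps the logic testable locally with a fake.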
Deploying
AWS SAM provides a single-command deploy. I’ve wrapped both build and deploy up into scripts in the project’s `package.json`.
```shell
export REGION='ap-southeast-2' # Our target AWS region
export PROFILE='pfych-aws' # Our target AWS profile as configured in AWS-CLI
npm run build
npm run deploy:dev
```
This process takes a few minutes on the first deploy but should take under a minute on subsequent deploys, as long as you don’t change the SAM template.
In closing
It was a fun challenge to implement Webmentions in a serverless manner. It wasn’t extremely difficult since the spec is well-defined. I did skip over webmention updates & deletions, but I’ll implement them at another time. Serverless definitely has warts & almost all of them are in its tooling & documentation. I hope that with time this can improve.
My next step is to implement sending Webmentions from my CMS when I make new posts, but I haven’t had too much time to work on personal projects recently.
The project is available on GitHub. It is currently deployed with the Serverless Framework but there is a pending PR to use AWS SAM instead since it has fewer third-party dependencies.