What you’ll learn
  • performance of write operations on Webiny Headless CMS
  • optimization suggestions

Results

| Test | Records inserted | Avg. response time (ms) | p95 response time (ms) | Throughput (req/sec) |
|---|---|---|---|---|
| Test A | 77,197 | 408.05 | 619.95 | 116.88 |
| Test B | 759,679 | 414.64 | 662.00 | 1,149.83 |
| Test C | 1,506,112 | 418.21 | 656.00 | 2,279.87 |
| Test D | 3,319,812 | 432.45 | 652.00 | 3,949.16 |
| Test E | 3,343,666 | 429.27 | 661.00 | 3,977.34 |

What does this mean?

Requests per second is a number that helps you calculate how many users you can actually serve. The other part of that calculation is knowing how your users behave, that is, how many calls to the write API they make in a given time period.

As an example, say your average visitor stays on your site for 5 minutes and makes around 10 calls to the write API. Based on the throughput (req/sec) and this user behavior, you can derive the following estimates for how many concurrent users you can serve within that period:

| Test | Throughput (req/sec) | Concurrency |
|---|---|---|
| Test A | 116.88 | 3,506 |
| Test B | 1,149.83 | 34,495 |
| Test C | 2,279.87 | 68,396 |
| Test D | 3,949.16 | 118,474 |
| Test E | 3,977.34 | 119,320 |

Formula: (Throughput*60sec*5min)/10 calls per user = total number of concurrent users in a 5 minute period.

Note: This formula assumes ideal conditions, where user requests are spread evenly over time.
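The formula can be sketched in a few lines of TypeScript. This is only an illustration of the arithmetic: the function name and the rounding choice are assumptions, and the 5-minute session and 10 calls per user are the example values from the text, so results can differ from the table by one user depending on how the value is rounded.

```typescript
// Estimate how many concurrent users a given throughput can serve.
// sessionMinutes and callsPerUser default to the example values in the text.
function estimateConcurrentUsers(
  throughputPerSec: number,
  sessionMinutes: number = 5,
  callsPerUser: number = 10
): number {
  // Total requests the system can serve during one user session window.
  const requestsInWindow = throughputPerSec * 60 * sessionMinutes;
  // Each user consumes callsPerUser of those requests.
  return Math.round(requestsInWindow / callsPerUser);
}

// Throughput values copied from the results table.
const measuredThroughput = [
  { test: "Test A", reqPerSec: 116.88 },
  { test: "Test C", reqPerSec: 2279.87 },
  { test: "Test E", reqPerSec: 3977.34 },
];

for (const row of measuredThroughput) {
  console.log(row.test + ": " + estimateConcurrentUsers(row.reqPerSec) + " users");
}
```

Changing `sessionMinutes` or `callsPerUser` to match your own traffic profile gives an estimate for your use case.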


Benchmark overview

In this benchmark we send a GraphQL mutation request to the Headless CMS manage API. The mutation inserts a new “Order” record and, upon a successful save, returns the id of the new record.

Here is the full mutation that is being issued:

mutation {
    createOrder(data:
      {
        orderId: ${OrderID}
        orderDate: "${OrderDate}"
        shippingDate: "${ShipDate}"
        unitsSold: ${UnitsSold}
        unitPrice: ${UnitPrice}
        totalPrice: ${TotalRevenue}
        country: {
          modelId: "country"
          entryId: "${Country}"
        },
        itemType: {
          modelId: "itemType"
          entryId: "${ItemType}"
        },
        salesChannel: {
          modelId: "salesChannel"
          entryId: "${SalesChannel}"
        },
        orderPriority: {
          modelId: "orderPriority"
          entryId: "${OrderPriority}"
        },
      }) {
      data {
        id
      }
      error {
        message
      }
    }
}

The variables are replaced with real values during the test by Apache JMeter.
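To illustrate what that substitution amounts to, here is a minimal TypeScript sketch that fills `${...}` placeholders from one data row and wraps the result in a GraphQL request body. It is not how JMeter is implemented; `fillTemplate`, the shortened template, and the sample row are hypothetical and only meant to show the idea.

```typescript
// Replace ${Name} placeholders in a template with values from one data row.
// Throws when a placeholder has no value, so a bad row fails loudly.
function fillTemplate(template: string, row: { [key: string]: string }): string {
  return template.replace(/\$\{(\w+)\}/g, (_match: string, name: string) => {
    if (!(name in row)) {
      throw new Error("Missing value for placeholder: " + name);
    }
    return row[name];
  });
}

// Shortened version of the mutation; single quotes keep ${...} as literal text.
const mutationTemplate =
  'mutation { createOrder(data: { orderId: ${OrderID}, unitsSold: ${UnitsSold} }) ' +
  '{ data { id } error { message } } }';

// Hypothetical sample row, not taken from the benchmark data set.
const sampleRow: { [key: string]: string } = { OrderID: "1001", UnitsSold: "25" };

const filledQuery = fillTemplate(mutationTemplate, sampleRow);

// JSON body in the shape commonly used for GraphQL-over-HTTP POST requests.
const requestBody = JSON.stringify({ query: filledQuery });
console.log(requestBody);
```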

Content model structure

You can view the full content model structure and the relations between the different models on this link here.

Test plan

We performed 5 variations of this test. The reason is that Webiny uses Elasticsearch, which is the only non-serverless piece of infrastructure in the architecture. Being non-serverless means we need to scale it manually. In each variation we changed the load and the instance type.

| Test | Number of users | Ramp up time | Instance | Price per hour |
|---|---|---|---|---|
| Test A | 50 | 60 sec | t3.small.elasticsearch (single AZ) | 1x $0.038 |
| Test B | 500 | 60 sec | m5.large.elasticsearch (2 zones) | 2x $0.164 |
| Test C | 1000 | 60 sec | m5.2xlarge.elasticsearch (2 zones) | 2x $0.655 |
| Test D | 2000 | 5 min | r4.4xlarge.elasticsearch (2 zones) | 2x $1.841 |
| Test E | 2000 | 5 min | c5.18xlarge.elasticsearch (2 zones) | 2x $5.363 |

Load structure

The test ramps up to the defined number of user threads within the defined period. After the ramp-up we hold a steady state for 10 minutes, during which the maximum number of user threads send requests as fast as the system can handle. The ramp-up time differs between the first 3 tests and the last 2 because the eu-west-2 region has a burst limit of 500 concurrent Lambda executions per minute. This means that reaching a concurrency of 2500 in the steady state requires a ramp-up of 5 minutes. For more details on how the burst limit works, visit this AWS resource page.

Also, for tests D and E, we had to increase the Lambda concurrency limit to 2500 by filing an AWS support ticket. Webiny can go above that number, but for that the AWS team needs to raise the concurrency limit further. Finally, for the write API, the Elasticsearch instances in tests D and E can take a higher load than our Lambda concurrency limit allowed for.
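The ramp-up reasoning is simple arithmetic and can be sketched as follows. This assumes the simplified reading used in the paragraph above, where concurrency can grow by a fixed number of executions per minute; the real AWS scaling behavior has more nuances, and the function name is hypothetical.

```typescript
// Minutes of ramp-up needed to reach a target Lambda concurrency, assuming
// concurrency can grow by burstPerMinute executions each minute (simplified model).
function rampUpMinutes(targetConcurrency: number, burstPerMinute: number): number {
  if (burstPerMinute <= 0) {
    throw new Error("burstPerMinute must be positive");
  }
  if (targetConcurrency <= 0) {
    return 0;
  }
  return Math.ceil(targetConcurrency / burstPerMinute);
}

// Values from the text: a burst limit of 500 per minute and a target of 2500.
console.log(rampUpMinutes(2500, 500)); // 5
```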

Request flow

Each write request has the following flow:

Client -> CloudFront -> ApiGateway -> Lambda -> DynamoDB -> (async stream) -> Lambda -> Elasticsearch

Webiny writes to Elasticsearch via a DynamoDB stream, which is an asynchronous process. We do this because Elasticsearch can be a performance bottleneck, and this approach prevents it from lowering the overall throughput.

Although Elasticsearch is not part of the main request, the service can still be overwhelmed if there is a lot of data that needs to be synced from DynamoDB into Elasticsearch in a short period of time.

About DynamoDB stream

The stream settings, such as batchSize and maximumBatchingWindowInSeconds, can be adjusted inside api/pulumi/elasticSearch.ts.

For additional info, check out the official AWS guide.
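To get a feel for how these two settings interact, here is a rough TypeScript model. It assumes a batch is handed to the stream Lambda either when it is full or when the batching window closes, whichever comes first, and it ignores sharding, payload size limits, and records that pile up while a previous batch is processing. All names and numbers below are hypothetical and not taken from the benchmark.

```typescript
interface StreamBatchSettings {
  batchSize: number;
  maximumBatchingWindowInSeconds: number;
}

// Rough estimate of how many records end up in one batch.
function estimateRecordsPerBatch(recordsPerSecond: number, settings: StreamBatchSettings): number {
  const arrivedInWindow = Math.floor(recordsPerSecond * settings.maximumBatchingWindowInSeconds);
  // At least one record per batch, at most batchSize.
  return Math.max(1, Math.min(settings.batchSize, arrivedInWindow));
}

// Rough estimate of how many stream Lambda invocations are needed in total.
function estimateInvocations(
  totalRecords: number,
  recordsPerSecond: number,
  settings: StreamBatchSettings
): number {
  return Math.ceil(totalRecords / estimateRecordsPerBatch(recordsPerSecond, settings));
}

// Hypothetical load: 100,000 records arriving at 200 records per second.
const shortWindow = { batchSize: 1000, maximumBatchingWindowInSeconds: 1 };
const longWindow = { batchSize: 1000, maximumBatchingWindowInSeconds: 5 };
console.log(estimateInvocations(100000, 200, shortWindow)); // 500
console.log(estimateInvocations(100000, 200, longWindow)); // 100
```

In this model a longer window lets batches fill up, which means fewer invocations, at the price of records reaching Elasticsearch later.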

One more thing to mention: in general, every write request results in 1 DynamoDB write, as we tend to do bulk writes, but there is also at least 1 read operation. That read is issued by the Webiny security module to verify that the current user, or token, has the right to perform the action.


Report

We only extracted the charts for the last 2 tests, as most charts showed the same behavior. The full report, with all the charts for all the tests, is linked at the end of this page.

Response times

Test D

Headless CMS benchmark - Response time

The response time was mostly consistent, with the exception of the period around the 14:21 mark. At that point the response times spiked for a few seconds, but then returned to the previous level and stayed stable for the next 5 minutes, until the end of the test.

Test E

Headless CMS benchmark - Response time

Throughput

Test D

Headless CMS benchmark - Throughput

Test E

Headless CMS benchmark - Throughput

In line with what we see in the response times, there is a drop in the throughput around the same time.

After investigating the CloudWatch metrics, we couldn’t find anything indicating that the drop in performance was due to the AWS services. The response times of CloudFront, API Gateway and the Lambda functions did not change. The only explanation we have is that we hit a limit on either the network or the CPU of the load test machine.


Optimization suggestions

There are 3 key components that control your throughput and cost.

1. DynamoDB

By default, DynamoDB is configured with on-demand capacity. To lower your cost, once you have a good sense of your access patterns, we recommend switching to provisioned capacity mode, which is cheaper when configured correctly.

2. Lambda concurrency

By default, your AWS account has a soft limit of 1000 concurrent Lambda executions. We recommend you increase this limit by filing a support request with AWS.

It’s also important to be aware of the burst capacity limits. For more info, visit this AWS resource page.

3. Elasticsearch service

This is the only service which is not fully serverless and needs to be scaled manually. The numbers above should help you determine how to size it accordingly.

Additionally, adjusting batchSize and maximumBatchingWindowInSeconds can also help, but the right values depend on your access patterns.


Cost

Total cost

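| Test | CloudFront | ApiGateway | Lambda | DynamoDB | Elasticsearch | Total |
|---|---|---|---|---|---|---|
| Test A | $0.08 | $0.08 | $0.26 | $0.58 | $0.01 | $1.01 |
| Test B | $0.83 | $0.76 | $2.58 | $3.48 | $0.08 | $7.73 |
| Test C | $1.65 | $1.51 | $5.20 | $6.56 | $0.33 | $15.25 |
| Test D | $3.64 | $3.32 | $11.79 | $14.85 | $0.92 | $34.52 |
| Test E | $3.75 | $3.34 | $11.78 | $13.24 | $2.68 | $34.79 |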

The cost of the serverless components has been calculated based on their usage. The cost of Elasticsearch has been calculated for a 15-minute period, based on the hourly rate.
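As a sketch of that Elasticsearch calculation, the snippet below prorates the hourly rate over a 15-minute window. The instance counts and prices are taken from the test plan table; the function and variable names are hypothetical. Rounded to cents, the results can be compared with the Elasticsearch column above.

```typescript
// Elasticsearch cost for a test window, prorated from the hourly rate.
function elasticsearchCost(instanceCount: number, hourlyRate: number, minutes: number): number {
  return instanceCount * hourlyRate * (minutes / 60);
}

// Instance counts and hourly prices copied from the test plan table.
const clusters = [
  { test: "Test A", instanceCount: 1, hourlyRate: 0.038 },
  { test: "Test B", instanceCount: 2, hourlyRate: 0.164 },
  { test: "Test C", instanceCount: 2, hourlyRate: 0.655 },
  { test: "Test D", instanceCount: 2, hourlyRate: 1.841 },
  { test: "Test E", instanceCount: 2, hourlyRate: 5.363 },
];

for (const cluster of clusters) {
  const cost = elasticsearchCost(cluster.instanceCount, cluster.hourlyRate, 15);
  console.log(cluster.test + ": $" + cost.toFixed(4));
}
```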

Cost per 10k requests

| Test | Hits | Total cost | Cost per 10k requests |
|---|---|---|---|
| Test A | 77,197 | $1.01 | $0.13083 |
| Test B | 759,679 | $7.73 | $0.10175 |
| Test C | 1,506,112 | $15.25 | $0.10125 |
| Test D | 3,319,812 | $34.52 | $0.10398 |
| Test E | 3,343,666 | $34.79 | $0.10405 |
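The last column is the total cost divided by the number of hits, scaled to 10,000 requests. A minimal TypeScript sketch of that calculation, with a hypothetical function name and values copied from the table:

```typescript
// Cost per 10,000 requests, from the total cost of a test and its hit count.
function costPer10k(totalCost: number, hits: number): number {
  if (hits <= 0) {
    throw new Error("hits must be positive");
  }
  return (totalCost / hits) * 10000;
}

// Total cost and hits copied from the table.
const testTotals = [
  { test: "Test A", totalCost: 1.01, hits: 77197 },
  { test: "Test B", totalCost: 7.73, hits: 759679 },
  { test: "Test C", totalCost: 15.25, hits: 1506112 },
  { test: "Test D", totalCost: 34.52, hits: 3319812 },
  { test: "Test E", totalCost: 34.79, hits: 3343666 },
];

for (const row of testTotals) {
  console.log(row.test + ": $" + costPer10k(row.totalCost, row.hits).toFixed(5));
}
```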

CloudFront

| Test | Hits | Traffic (GB) | Cost |
|---|---|---|---|
| Test A | 77.3k | 0.09 | $0.08 |
| Test B | 760k | 0.85 | $0.83 |
| Test C | 1.51M | 1.69 | $1.65 |
| Test D | 3.32M | 3.70 | $3.64 |
| Test E | 3.34M | 3.75 | $3.66 |

API Gateway

| Test | Hits | Cost |
|---|---|---|
| Test A | 77.3k | $0.08 |
| Test B | 760k | $0.76 |
| Test C | 1.51M | $1.51 |
| Test D | 3.32M | $3.32 |
| Test E | 3.34M | $3.34 |

Lambda

| Test | Requests | Avg. Duration (ms) | Cost |
|---|---|---|---|
| Test A | 80.2k | 371 | $0.26 |
| Test B | 762k | 382 | $2.58 |
| Test C | 1.51M | 389 | $5.20 |
| Test D | 3.32M | 402 | $11.79 |
| Test E | 3.34M | 399 | $11.78 |

Lambda costs also include the cost incurred by the DynamoDB stream. Webiny uses Lambda functions with 512MB of memory.

Notice how test E is 1 cent cheaper than test D, even though it handled more requests. The only difference is a 3ms shorter average execution time.
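To see why a few milliseconds matter, here is a rough TypeScript estimate of the Lambda cost from the request count, average duration and memory size. The two prices are assumptions that do not come from the benchmark itself ($0.0000166667 per GB-second and $0.20 per million requests), so treat the output as an approximation and check the current AWS pricing for your region.

```typescript
// ASSUMED prices, not stated in the benchmark; verify against current AWS pricing.
const PRICE_PER_GB_SECOND = 0.0000166667;
const PRICE_PER_MILLION_REQUESTS = 0.2;

// Rough Lambda cost: compute time billed in GB-seconds plus a per-request charge.
function estimateLambdaCost(requests: number, avgDurationMs: number, memoryMb: number): number {
  const gbSeconds = requests * (avgDurationMs / 1000) * (memoryMb / 1024);
  const computeCost = gbSeconds * PRICE_PER_GB_SECOND;
  const requestCost = (requests / 1000000) * PRICE_PER_MILLION_REQUESTS;
  return computeCost + requestCost;
}

// Requests and durations from the Lambda table, 512MB functions as stated in the text.
const lambdaTestD = estimateLambdaCost(3320000, 402, 512);
const lambdaTestE = estimateLambdaCost(3340000, 399, 512);
console.log("Test D: $" + lambdaTestD.toFixed(2));
console.log("Test E: $" + lambdaTestE.toFixed(2));
```

Under these assumptions the estimates land within about a cent of the reported Lambda costs, and test E comes out slightly cheaper despite serving more requests.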

DynamoDB

| Test | Read ops | Write ops | Cost |
|---|---|---|---|
| Test A | 163k | 232k | $0.58 |
| Test B | 1.52M | 2.28M | $3.48 |
| Test C | 3.39M | 4.37M | $6.56 |
| Test D | 9.28M | 9.82M | $14.85 |
| Test E | 7.44M | 8.90M | $13.24 |

Notice how test E is cheaper than test D. This is because we were able to fit a few more records inside the DynamoDB batch window, so the stream had fewer operations to execute.


You can download and check the full report here: https://github.com/webiny/benchmark/tree/main/benchmarks/results/hc-write-data