Screenshot system with Node.js and Lambda

Ats
7 min read · Feb 11, 2024

I needed to build a system that takes screenshots to automate the collection of visual data from the Internet. These are my notes on what I did.

Photo by sarandy westfall on Unsplash

Background

I started a new project at work last week. I can’t share the details here, but briefly, I needed to automate taking a screenshot of some materials and storing the image data in the cloud. What I wanted to capture is a table filled with our service data, like the one in the following image. The data changes depending on who needs the table and when, so it’s quite dynamic.

I knew there were tools for this, but I had never used any of them, so I looked for the easiest way to do it. My first idea was to use Figma, but its API is quite limited for now (2024/Feb/10), especially for creating objects. So I changed direction, decided to build the tables in HTML, and started coding.

What I did

First, I quickly wrote some HTML and CSS and tried to take a screenshot of it. I tested two tools, Satori and Puppeteer.

Satori is a library from Vercel for creating Open Graph images, and Puppeteer is a high-level API for controlling Chrome/Chromium, including in headless mode. Playwright could have been an option instead of Puppeteer, but I read an article saying that GitHub uses Puppeteer for its Open Graph images, so I chose Puppeteer over Playwright. I made a simple function for each tool and tested both.

After testing, I decided to go with Puppeteer because Satori supports only a limited subset of HTML and CSS. I couldn’t use table tags or display: table, so I had to build my table with only div and display: flex. Without those limitations, my HTML and CSS were about 50 lines each. With them, each grew to more than 100 lines. So, to keep the code simple, I chose Puppeteer.
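To make the difference concrete, here is a rough sketch of the two markup styles. It is not the actual code from the project: the sample rows and the function names `renderTable` and `renderFlexTable` are made up for illustration, and cell values are not HTML-escaped.

```javascript
// Hypothetical sample rows; the real table data comes from the service.
const rows = [
  ["Plan", "Status"],
  ["Basic", "✅"],
  ["Pro", "🚀"],
];

// Semantic markup: fine for Puppeteer, because a real browser renders it.
function renderTable(rows) {
  const body = rows
    .map((cells) => `<tr>${cells.map((c) => `<td>${c}</td>`).join("")}</tr>`)
    .join("");
  return `<table>${body}</table>`;
}

// Flexbox-only markup: the kind of structure needed when <table> and
// display: table are not available.
function renderFlexTable(rows) {
  const body = rows
    .map((cells) => {
      const cols = cells
        .map((c) => `<div style="flex:1">${c}</div>`)
        .join("");
      return `<div style="display:flex">${cols}</div>`;
    })
    .join("");
  return `<div style="display:flex;flex-direction:column">${body}</div>`;
}

console.log(renderTable(rows));
console.log(renderFlexTable(rows));
```

Even in this tiny case the flexbox version carries more markup per cell, and the real styling (borders, column widths, alignment) adds to that.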

Then I needed to host the HTML and CSS somewhere, because Puppeteer just opens web pages in a headless browser and takes screenshots. So I added a page to our existing service; the server provides the data, and the tables change dynamically. If I had used Satori, I wouldn’t have needed to host anything, because Satori takes the markup directly as JSX. I hope to replace Puppeteer with Satori once its HTML and CSS support improves.
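The basic Puppeteer flow looks roughly like the sketch below. This is a minimal illustration rather than the script from the project: the function name `takeScreenshot` and the example URL are placeholders, and the `puppeteer` module is passed in as an argument so the same function works with either puppeteer or puppeteer-core.

```javascript
// Minimal sketch: open a hosted page and capture it as a PNG.
async function takeScreenshot(puppeteer, url) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so server-provided data has rendered.
    await page.goto(url, { waitUntil: "networkidle0" });
    // Without a `path` option, screenshot() resolves with the image bytes.
    return await page.screenshot({ type: "png", fullPage: true });
  } finally {
    // Always close the browser, even if navigation or capture fails.
    await browser.close();
  }
}

// Usage (the URL is a placeholder):
// import puppeteer from "puppeteer";
// const image = await takeScreenshot(puppeteer, "https://example.com/tables/1");
```

Closing the browser in `finally` matters on Lambda in particular, since a leftover Chromium process keeps using memory in a reused execution environment.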

I chose Lambda to run the Node.js function and used AWS SAM to set up a working environment quickly. This was my first time using SAM, but it was much easier than I expected.

I followed the article to set up a Node.js v20 environment in Lambda.

Then I tried to run the test script I shared above, and I got the following error.

<--- Last few GCs --->
al
[14:0x5602ec806830] 956 ms: Mark-Compact (reduce) 115.6 (118.3) -> 115.6 (118.3) MB, 69.38 / 0.00 ms (+ 2.4 ms in 2 steps since start of marking, biggest step 2.2 ms, walltime since start of marking 75 ms) (average mu = 0.175, current mu = 0.114) al
[14:0x5602ec806830] 1008 ms: Mark-Compact (reduce) 115.6 (118.3) -> 115.6 (118.3) MB, 46.23 / 0.00 ms (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 47 ms) (average mu = 0.149, current mu = 0.113) al

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

1: 0x5602e77c33a3 node::Abort() [/var/lang/bin/node]
2: 0x5602e76669d3 [/var/lang/bin/node]
3: 0x5602e7a0a31d v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/var/lang/bin/node]
4: 0x5602e7a0a6e9 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/var/lang/bin/node]
5: 0x5602e7c82f3a [/var/lang/bin/node]
6: 0x5602e7c83472 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/var/lang/bin/node]
7: 0x5602e7c9caab v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*) [/var/lang/bin/node]
8: 0x5602e7c9d318 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/var/lang/bin/node]
9: 0x5602e7c9ecb0 v8::internal::Heap::CollectAllAvailableGarbage(v8::internal::GarbageCollectionReason) [/var/lang/bin/node]
10: 0x5602e7c6ec81 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/var/lang/bin/node]
11: 0x5602e7c48d1e v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/var/lang/bin/node]
12: 0x5602e811d02d v8::internal::Runtime_AllocateInOldGeneration(int, unsigned long*, v8::internal::Isolate*) [/var/lang/bin/node]
13: 0x5602e85cb2b6 [/var/lang/bin/node]
06 Feb 2024 18:51:54,339 [ERROR] (rapid) Invoke failed error=Runtime exited with error: signal: aborted InvokeID=f9e563a6-0b3b-49de-8b64-231db0d4df70
06 Feb 2024 18:51:54,356 [ERROR] (rapid) Invoke DONE failed: Runtime.ExitError

2024-02-06 18:51:55 127.0.0.1 - - [06/Feb/2024 18:51:55] "GET /screenshot HTTP/1.1" 500 -

So I needed to increase the heap size. Based on the following documents, I set NODE_OPTIONS in the SAM template as shown below.

Resources:
  ScreenshotFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: lambda_screenshot/
      Handler: app.lambdaHandler
      Runtime: nodejs20.x
      Architectures:
        - x86_64
      Events:
        Screenshot:
          Type: Api
          Properties:
            Path: /screenshot
            Method: get
      Environment:
        Variables:
          NODE_OPTIONS: --max-old-space-size=1536
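One way to check whether the option is actually picked up is to log V8's heap limit from inside the function. The sketch below uses Node's built-in `v8` module; the helper name `logHeapLimit` is made up for this example, and a static `import` of `node:v8` at the top of the file would work just as well as the dynamic import used here.

```javascript
// Log the heap size limit that V8 is running with, in megabytes.
async function logHeapLimit() {
  const v8 = await import("node:v8");
  const limitMb = Math.round(
    v8.getHeapStatistics().heap_size_limit / 1024 / 1024
  );
  console.log(`V8 heap limit: ${limitMb} MB`);
  return limitMb;
}

// Usage, for example at the start of the handler:
// await logHeapLimit();
```

If the logged value does not change after editing the template, the environment variable is probably not reaching the runtime.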

Then I ran the Lambda function again and got another error.

{
"timestamp":"2024-02-06T20:02:31.899Z",
"level":"INFO",
"requestId":"b8181b7e-5dec-460d-b20f-71f79e3fb2b5",
"message":{
"errorType":"Error",
"errorMessage":"Could not find Chrome (ver. 121.0.6167.85). This can occur if either\n 1. you did not perform an installation before running the script (e.g. `npx puppeteer browsers install chrome`) or\n 2. your cache path is incorrectly configured (which is: /root/.cache/puppeteer).\nFor (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.",
"stackTrace":[
"Error: Could not find Chrome (ver. 121.0.6167.85). This can occur if either",
" 1. you did not perform an installation before running the script (e.g. `npx puppeteer browsers install chrome`) or",
" 2. your cache path is incorrectly configured (which is: /root/.cache/puppeteer).",
"For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.",
" at rV.resolveExecutablePath (/deps/90387c80-da05-4019-80ff-da8e0a010f0b/node_modules/puppeteer-core/src/node/ProductLauncher.ts:434:17)",
" at rV.executablePath (/deps/90387c80-da05-4019-80ff-da8e0a010f0b/node_modules/puppeteer-core/src/node/ChromeLauncher.ts:283:19)",
" at rV.computeLaunchArguments (/deps/90387c80-da05-4019-80ff-da8e0a010f0b/node_modules/puppeteer-core/src/node/ChromeLauncher.ts:149:31)",
" at rV.launch (/deps/90387c80-da05-4019-80ff-da8e0a010f0b/node_modules/puppeteer-core/src/node/ProductLauncher.ts:99:24)",
" at r (/private/var/folders/fh/zhsb2cb120l_tc8xv1jth9r40000gn/T/tmp03oa41b3/app.ts:6:23)",
" at Runtime.Vfr (/private/var/folders/fh/zhsb2cb120l_tc8xv1jth9r40000gn/T/tmp03oa41b3/app.ts:36:22)"
]
}
}

Long story short, the error says that Puppeteer could not find a Chrome installation, so the headless browser couldn’t start. I think the right way to solve this is to create a custom Dockerfile that installs Chrome properly. But I didn’t want to do that, because the main JS file was quite short, less than 30 lines, and I didn’t want to maintain a Dockerfile for such a small task. So I looked for another way and found a package for running headless Chromium on Lambda, @sparticuz/chromium.

There were some older, similar packages, but this is currently the main package for running a headless browser in Lambda. So I installed it in my project and changed the launch settings as shown below.

const browser = await puppeteer.launch({
  args: chromium.args,
  defaultViewport: chromium.defaultViewport,
  executablePath: await chromium.executablePath(),
  headless: chromium.headless,
});
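For context, a handler built around these launch settings might look like the sketch below. It is an assumption about the overall shape, not the actual app file: `createHandler` and `PAGE_URL` are hypothetical names, the URL is a placeholder, and returning the PNG as a base64 body through API Gateway is only one possible way to deliver the image. `puppeteer` is expected to be puppeteer-core and `chromium` to be @sparticuz/chromium; both are passed in so the wiring is easy to see.

```javascript
// Placeholder for the hosted page that renders the table.
const PAGE_URL = "https://example.com/tables/1";

function createHandler({ puppeteer, chromium, url = PAGE_URL }) {
  return async function lambdaHandler() {
    const browser = await puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath(),
      headless: chromium.headless,
    });
    try {
      const page = await browser.newPage();
      await page.goto(url, { waitUntil: "networkidle0" });
      const image = await page.screenshot({ type: "png" });
      // API Gateway proxy responses carry binary bodies as base64 text.
      return {
        statusCode: 200,
        headers: { "Content-Type": "image/png" },
        isBase64Encoded: true,
        body: Buffer.from(image).toString("base64"),
      };
    } finally {
      await browser.close();
    }
  };
}

// Usage in app.mjs (assumed module names):
// import puppeteer from "puppeteer-core";
// import chromium from "@sparticuz/chromium";
// export const lambdaHandler = createHandler({ puppeteer, chromium });
```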

Then I ran the Lambda function and got yet another error.

{
"timestamp":"2024-02-11T10:41:11.318Z",
"level":"INFO",
"requestId":"2f7f7779-17d3-4cb3-8197-8311316e89e9",
"message":{
"errorType":"Error",
"errorMessage":"The input directory \"/var/bin\" does not exist.",
"stackTrace":[
"Error: The input directory \"/var/bin\" does not exist.",
" at Function.executablePath (/deps/90387c80-da05-4019-80ff-da8e0a010f0b/node_modules/@sparticuz/chromium/build/index.js:241:19)",
" at getEncodedImage (/private/var/folders/fh/zhsb2cb120l_tc8xv1jth9r40000gn/T/tmpnf6fz4i8/app.ts:10:40)",
" at Runtime.swA (/private/var/folders/fh/zhsb2cb120l_tc8xv1jth9r40000gn/T/tmpnf6fz4i8/app.ts:33:28)",
" at Runtime.handleOnceNonStreaming (file:///var/runtime/index.mjs:1173:29)"
]
}
}

To be honest, I didn’t understand what the error meant. However, I looked through the issues on the repository and found comments saying that something was wrong with the combination of Node.js v20.x and TypeScript. So I downgraded Node.js to v18.x and stopped using TypeScript, and the error went away.

Then the Lambda function completed successfully, but the screenshot was missing emojis. In my case, emojis are essential to the tables, so I followed the README and loaded an emoji font as shown below.

await chromium.font(
  "https://raw.githack.com/googlei18n/noto-emoji/master/fonts/NotoColorEmoji.ttf"
);

However, it didn’t work. I did a quick investigation and found an issue similar to mine.

There seems to be a problem with downloading fonts through a CDN. So I downloaded the Noto Color Emoji TTF file and placed it in the Lambda function’s directory as shown below.

|____lambda_screenshot
| |____package.json
| |____app.mjs
| |____fonts
| | |____NotoColorEmoji.ttf

Then I changed the code to load the font from the local path, as shown below.

await chromium.font("/var/task/fonts/NotoColorEmoji.ttf");
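The /var/task prefix is where Lambda places the deployed code. As an optional variation, the path could be built from the `LAMBDA_TASK_ROOT` environment variable, which Lambda sets to that directory, so the same code also runs outside Lambda. The helper name `emojiFontPath` is made up for this sketch.

```javascript
// Build the font path from the directory that holds the deployed code.
// Falls back to the current working directory when not running on Lambda.
function emojiFontPath(env = process.env) {
  const root = env.LAMBDA_TASK_ROOT || process.cwd();
  return `${root}/fonts/NotoColorEmoji.ttf`;
}

// Usage, before launching the browser:
// await chromium.font(emojiFontPath());
```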

Finally, the screenshots displayed emojis.

That’s it!


Written by Ats

I like building something tangible like touch, gesture, and voice. Ruby on Rails / React Native / Yocto / Raspberry Pi / Interaction Design / CIID IDP alumni
