How I broke Gatsby JS conditional page build and learned to debug the tool chain
5/29/2020
I built my AWS CodeBuild Pipeline with a new feature called "Conditional Page Builds" and it did not worked as expected in the build environment.
Starting the build process again with no change in the source code and with identical copies of the .cache
and public
folder generated this output:
1...2info One or more of your plugins have changed since the last time you ran Gatsby. As a precaution, we're deleting your site's cache to ensure there's no stale data.3success initialize cache - 0.011s4...5success Building production JavaScript and CSS bundles - 30.860s6success Rewriting compilation hashes - 0.005s7...8success run queries - 49.184s - 44/44 0.89/s9success Building static HTML for pages - 17.453s - 40/40 2.29/s10success Delete previous page data - 0.000s11success Generating image thumbnails - 166.557s - 316/316 1.90/s12...
a second call to GATSBY_EXPERIMENTAL_PAGE_BUILD_ON_DATA_CHANGES=true gatsby build --log-pages
immediately following the first one shows the expected output:
1...2success onPreInit - 0.020s3success initialize cache - 0.012s4...5success Building production JavaScript and CSS bundles - 4.623s6success run queries - 4.804s - 4/4 0.83/s7success Building static HTML for pages - 0.732s - 0/0 0.00/s8success Delete previous page data - 0.002s9...
This proves the fact that the conditional page build is working and no page contains any data which is different on each creation.
Comparing the two outputs I found these problems:
- Cache invalidation by changed plugins
info One or more of your plugins have changed since the last time you ran Gatsby. As a precaution, we're deleting your site's cache to ensure there's no stale data.
- Rewriting compilation hashes
- Regenerating Images
success Generating image thumbnails
To open an issue on github I had to create a simple setup simulating the problems with the environment.
Issue 1 and 2 (posted on github) where my fault by creating a temporal file with a name starting with gatsby-*
which conflicted with
a part of gatsby creating a hash on the content of all files starting with this name.
creating a simpler setup
- create a gatsby site based on the default starter template
- upgrade gatsby and the tool chain to the latest stable versions
1npx gatsby new gatsby-starter2cd gatsby-starter3yarn upgrade --latest
on the first run of GATSBY_EXPERIMENTAL_PAGE_BUILD_ON_DATA_CHANGES=true yarn build --log-pages
the result was as expected - there was no .cache
and no public
folder.
The build process creates the compilation hashes [24], the static HTML [26] and the image thumbnails [28].
1$ gatsby build --log-pages2success open and validate gatsby-configs - 0.034s3success load plugins - 1.153s4success onPreInit - 0.020s5success initialize cache - 0.012s6success copy gatsby files - 0.122s7success onPreBootstrap - 0.018s8success createSchemaCustomization - 0.008s9success source and transform nodes - 0.140s10success building schema - 0.535s11success createPages - 0.002s12success createPagesStatefully - 0.082s13success onPreExtractQueries - 0.002s14success update schema - 0.051s15success extract queries from components - 0.340s16success write out requires - 0.007s17success write out redirect data - 0.002s18success Build manifest and related icons - 0.211s19success onPostBootstrap - 0.220s20⠀21info bootstrap finished - 8.052s22⠀23success Building production JavaScript and CSS bundles - 6.059s24success Rewriting compilation hashes - 0.061s25success run queries - 6.757s - 7/7 1.04/s26success Building static HTML for pages - 0.904s - 4/4 4.43/s27success Delete previous page data - 0.003s28success Generating image thumbnails - 7.766s - 6/6 0.77/s29success onPostBuild - 0.018s30info Done building in 15.908345021 sec31info Built pages:32Updated page: /404/33Updated page: /34Updated page: /page-2/35Updated page: /404.html36✨ Done in 16.24s.
running the same build command, a second time gives this output:
1$ gatsby build --log-pages2success open and validate gatsby-configs - 0.033s3success load plugins - 0.827s4success onPreInit - 0.018s5success initialize cache - 0.007s6success copy gatsby files - 0.109s7success onPreBootstrap - 0.009s8success createSchemaCustomization - 0.006s9success source and transform nodes - 0.067s10success building schema - 0.375s11success createPages - 0.002s12success createPagesStatefully - 0.139s13success onPreExtractQueries - 0.004s14success update schema - 0.046s15success extract queries from components - 0.286s16success write out requires - 0.016s17success write out redirect data - 0.004s18success Build manifest and related icons - 0.223s19success onPostBootstrap - 0.249s20⠀21info bootstrap finished - 6.451s22⠀23success Building production JavaScript and CSS bundles - 4.623s24success run queries - 4.804s - 4/4 0.83/s25success Building static HTML for pages - 0.732s - 0/0 0.00/s26success Delete previous page data - 0.002s27success onPostBuild - 0.002s28info Done building in 12.096156853 sec29✨ Done in 12.45s.
as expected the following steps have been skipped:
- Rewriting compilation hashes
- Building static HTML 0/0
- Generating Images
simulating the build process of the build Pipeline
- create temporal directory
- clone the git repository to a temporal directory
- restore the build cache
- run the build command
- save the build cache
a simple bash script codepipeline.sh
would look
like this (don't use this for production as many edge cases are missing!):
1#!/bin/bash2set -x34BUILD_CACHE=/tmp/buildcache.tgz5TMPDIR=$(mktemp -d)6git clone https://github.com/aheissenberger/gatsby-cache-problem.git $TMPDIR7if [ -f "$BUILD_CACHE" ]; then8 tar -xf "$BUILD_CACHE" --directory $TMPDIR9fi10pushd $TMPDIR11yarn install12GATSBY_EXPERIMENTAL_PAGE_BUILD_ON_DATA_CHANGES=true yarn build --log-pages || exit 113if [ -f "$BUILD_CACHE" ]; then14 rm "$BUILD_CACHE"15fi16tar czf "$BUILD_CACHE" .cache/ public/17popd
Run this script twice and you will see that on the second run the image generation still happens:
1...2success Building static HTML for pages - 1.621s - 0/0 0.00/s3success Delete previous page data - 0.004s4success Generating image thumbnails - 14.491s - 6/6 0.41/s <<==LOOK HERE5success onPostBuild - 0.012s6info Done building in 21.23928199 sec7✨ Done in 21.56s.8+ '[' -f /tmp/buildcache.tgz ']'9+ rm /tmp/buildcache.tgz10+ tar czf /tmp/buildcache.tgz .cache/ public/11+ popd
I documented my environment and posted an issue on github after not finding any other issue with a similar problem.
Since I am not very patient, I decided to look at the problem myself.
How to debug Gatsby JS in VS Code
The first thing I do when testing out a new framework is setting up the Javascript debugger in VS Code. Gatsby provides a very good documentation which makes this a breeze.
Add this to your existing VS Code launch.json
to debug conditional page building:
1{2 "name": "Gatsby build Conditional",3 "type": "node",4 "request": "launch",5 "protocol": "inspector",6 "program": "${workspaceRoot}/node_modules/gatsby/dist/bin/gatsby",7 "args": ["build", "--write-to-file", "--log-pages"],8 "env": {"GATSBY_EXPERIMENTAL_PAGE_BUILD_ON_DATA_CHANGES":"true","CI":"true"},9 "skipFiles": [],10 "stopOnEntry": false,11 "runtimeArgs": ["--nolazy"],12 "sourceMaps": false13}
activate searching in the node_modules
folder
open the VS Code configuration .vscode/settings.json
and add this:
1{2 "search.exclude": {3 "**/node_modules":false4 },5 "search.useIgnoreFiles":false6}
Dig into the problem
The goal is to find the place in the code where the logic is which compares values from the cache with the existing environment and decides to rerun the image generation.
let's search for
Generating image thumbnails
in thenode_modules
folder the result will listCHANGELOG.md
andnode_modules/gatsby-plugin-sharp/utils.js:38
looking into the second file will show us the code which prints the success messageset a break point in this line [38]
remove the
.cache
andpublic
folder:rm -fr .cache/ ./public/
start the VS Code Debugger and choose
Gatsby build Conditional
from the pulldownthe debugger stops at line 38
[VS Code debugging / utils.js:38] look at the
CALL STACK
in the left column and click on the second line - this is the script which called this function[VS Code debugging / gatsby-node.js:91] we see that this code part of the
gatsby-plugin-sharp
is activated by the eventCREATE_JOB_V2
if this would have been a normal function call, simple looking up the CALL STACK would be easy, but this is an event which could have been raised anywhere in the code based
cancel the debugger and search again for
CREATE_JOB_V2
inside thenode_modules
folder the search result contains 6 files wherenode_modules/gatsby/dist/redux/actions/public.js
looks most promising[VS Code search result / public.js:1198] looking at the 8 lines of this function:
1actions.createJobV2 = (job, plugin) => (dispatch, getState) => {2 const currentState = getState();3 const internalJob = createInternalJob(job, plugin);4 const jobContentDigest = internalJob.contentDigest; // Check if we already ran this job before, if yes we return the result5 // We have an inflight (in progress) queue inside the jobs manager to make sure6 // we don't waste resources twice during the process78 if (currentState.jobsV2 && currentState.jobsV2.complete.has(jobContentDigest)) {9 return Promise.resolve(currentState.jobsV2.complete.get(jobContentDigest).result);10 }11 // removed...12}
- line 2 gets the state from the redux store which was populated from the
.cache
folder - line 3 creates a new Job object
- line 4 gets the content digest hash from the new job
- line 8 does a lookup in the redux store if the job has been done before
set a breakpoint on
const internalJob = createInternalJob(job, plugin);
remove the
.cache
andpublic
folder:rm -fr .cache/ ./public/
run debugger - choose
Gatsby build Conditional
from the pulldownwhen the debugger pauses we can inspect the job attributes
[VS Code debug / public.js:1182] the left column shows us the following attributes with absolute paths
/User/xxxxxxxx/...
:
job.inputPaths, job.outputDir, plugin.pluginFilepath, plugin.resolvestep into the function
createInternalJob
to check on the creation of the diggest[VS Code debug / jobs-manager.js] scrolldown until you see the place where the
contentDigest
is created:1internalJob.contentDigest = createContentDigest({2 name: job.name,3 inputPaths: internalJob.inputPaths.map(inputPath => inputPath.contentDigest),4 outputDir: internalJob.outputDir,5 args: internalJob.args,6 plugin: internalJob.plugin7});
We found the Problem
The hash created by Gatsby Job Queue Manager will always break when compared to values in the cache which have been created in a source folder with a different absolute path as long as the Job and the Plugin object contain absolute paths.
I also create an issue on github where you can follow the next steps.
Quick Fix
Always copy the source folder to the same path which has been used when the .cache
folder was created.
Long term Solutions to the Problem
- convert the object to a JSON string and replace the path to the root folder with a stable string or hash (still only a quick fix)
- do not allow to store any absolute path in any Gatsby data structures:
Job
objects,plugin
objects
I hope I could give some people a first impression on how to track down a problem by yourselves and provide maintainers of a project a better documentation to fix architectural problems.