A site server with live reload

The core of my static site generator is the build command: take some input files, process them —render templates, convert other markup formats into HTML, minify— and output a ready-to-serve website. This is where I started for jorge, not only because it was core functionality but because I needed to see the org-mode output as early as possible, to learn if I could expect this project to ultimately replace my Jekyll setup.

I technically had a static site generator as soon as the build command was working, but for it to be minimally useful I needed to be able to preview a site while working on it: a serve command. It could be as simple as running a local file server of the build target directory, but ideally, it would also watch for changes in the source files and live-reload the browser tabs looking at them.

I was aiming for more than just the basics here because serve was the only non-trivial command of this project: the one with the most Go learning potential —and the most fun. For similar reasons, I wanted to tackle it early on: since it wasn’t immediately obvious how I would implement it, it was here where unknown-unknowns and blockers were most likely to come up. Once build and serve were out of the way, I’d be almost done with the project, with only nice-to-have features and UX improvements remaining.

The beauty of the serve command was that I could start with a naive implementation and iterate towards the ideal one, keeping it functional at every step of the way. This post summarizes that process.

A basic file server

The simplest serve implementation consisted of building the site once and serving the target directory locally. The standard net/http package has a file server for that:

import (
	"fmt"
	"net/http"

	"github.com/facundoolano/jorge/config"
	"github.com/facundoolano/jorge/site"
)

func Serve(config config.Config) error {
	// load and build the project
	if err := site.Build(config); err != nil {
		return err
	}

	// mount the target dir on a local file server
	fs := http.FileServer(http.Dir(config.TargetDir))
	http.Handle("/", fs)

	fmt.Println("server listening at http://localhost:4001/")
	return http.ListenAndServe(":4001", nil)
}

I only had to make a minor change to the code above (based on this StackOverflow answer), to omit the .html extension from URLs such that, for instance, target/blog/hello.html would be served at /blog/hello:

type HTMLFileSystem struct {
	dirFS http.Dir
}

func (htmlFS HTMLFileSystem) Open(name string) (http.File, error) {
	// Try name as supplied
	f, err := htmlFS.dirFS.Open(name)
	if os.IsNotExist(err) {
		// Not found, try with .html
		if f, err := htmlFS.dirFS.Open(name + ".html"); err == nil {
			return f, nil
		}
	}
	return f, err
}

This HTMLFileSystem wrapped the standard http.Dir I was handing to the file server:

-	fs := http.FileServer(http.Dir(config.TargetDir))
+	fs := http.FileServer(HTMLFileSystem{http.Dir(config.TargetDir)})
	http.Handle("/", fs)

	fmt.Println("server listening at http://localhost:4001/")
	return http.ListenAndServe(":4001", nil)

Watching for changes

As a next step, I needed the command to watch the project source directory and trigger new builds whenever a file changed. I found the fsnotify library for this exact purpose; the fact that both Hugo and gojekyll listed it as a dependency suggested that it was a reasonable choice for the job.

Following an example from the fsnotify documentation, I created a watcher and a goroutine that triggered a site.Build call every time a file-change event was received:

func runWatcher(config *config.Config) {
	watcher, _ := fsnotify.NewWatcher()
	defer watchProjectFiles(watcher, config)

	go func() {
		for event := range watcher.Events {
			fmt.Printf("file %s changed\n", event.Name)

			// src directories could have changed
			// so project files need to be re-watched every time
			watchProjectFiles(watcher, config)
			site.Build(*config)
		}
	}()
}

Then I made this watcher look for changes in the project src/ directory:

func watchProjectFiles(watcher *fsnotify.Watcher, config *config.Config) {
	// fsnotify watches all files within a dir, but non-recursively.
	// This walks through the src dir adding watches for each subdir
	filepath.WalkDir(config.SrcDir, func(path string, entry fs.DirEntry, err error) error {
		if entry.IsDir() {
			watcher.Add(path)
		}
		return nil
	})
}

Build optimizations

At this point I had a useful file server, always responding with the most recent version of the site. But the responsiveness of the serve command wasn’t ideal: it processed the entire website for every small edit I made on a source file. I wanted to attempt some performance improvements here, but without introducing much complexity: rather than supporting incremental or conditional builds —which would have required tracking state and dependencies between files—, I wanted to keep building the entire site on every change, only faster.

The first cheap optimization was obvious from looking at the command output: most of the work was copying static assets (e.g. images, static CSS files, etc.). So I changed the site.Build implementation to optionally create links instead of copying the files over to the target.
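As a rough sketch of that idea (the function and flag names here are illustrative, not jorge's actual API), the asset handling boils down to trying a hard link first and falling back to a regular copy when linking isn't possible, e.g. across filesystems; only the standard library os and io packages are needed:

func copyOrLink(sourcePath, targetPath string, linkStatic bool) error {
	if linkStatic {
		// a hard link makes "copying" a static file nearly free
		if err := os.Link(sourcePath, targetPath); err == nil {
			return nil
		}
		// linking failed, fall through to a regular copy
	}

	source, err := os.Open(sourcePath)
	if err != nil {
		return err
	}
	defer source.Close()

	target, err := os.Create(targetPath)
	if err != nil {
		return err
	}
	defer target.Close()

	_, err = io.Copy(target, source)
	return err
}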

The next thing I wanted to try was to process source files concurrently. The bulk of the work was done by an internal site method:

type site struct {
	Config  config.Config
	// ...
}

func (site *site) build() error {
	// clear previous target contents
	os.RemoveAll(site.Config.TargetDir)

	// walk the source directory, creating directories and files at the target dir
	return filepath.WalkDir(site.Config.SrcDir, func(path string, entry fs.DirEntry, err error) error {
		subpath, _ := filepath.Rel(site.Config.SrcDir, path)
		targetPath := filepath.Join(site.Config.TargetDir, subpath)

		// if it's a directory, just create the same at the target
		if entry.IsDir() {
			return os.MkdirAll(targetPath, FILE_RW_MODE)
		}

		// if it's a file render or copy it to the target
		return site.buildFile(path, targetPath)
	})
}

This site.build method walks the source file tree, recreating it at the target. For non-directory files, it calls another method, site.buildFile, to do the actual processing (rendering templates, converting markdown and org-mode syntax to HTML, and writing the results to the target files). I wanted multiple site.buildFile calls to run in parallel; I found the facilities I needed (worker pools and wait groups) in a couple of Go by Example entries:

// Runs a pool of workers to build files.
// Returns a channel to send the paths of files to be built
// and a WaitGroup to wait for them to finish processing.
func spawnBuildWorkers(site *site) (*sync.WaitGroup, chan string) {
	var wg sync.WaitGroup
	files := make(chan string, 20)

	for range runtime.NumCPU() {
		wg.Add(1)
		go func(files <-chan string) {
			defer wg.Done()
			for path := range files {
				site.buildFile(path)
			}
		}(files)
	}
	return &wg, files
}

The function above creates a buffered channel to send source file paths and a worker pool that reads from it. Each worker registers itself on a WaitGroup that can be used by callers to block until all work is done.

Now I just needed to adapt the build function to spawn the workers and send them paths through the channel, instead of processing the files inline:

func (site *site) build() error {
	// clear previous target contents
	os.RemoveAll(site.Config.TargetDir)

+	wg, files := spawnBuildWorkers(site)
+	defer wg.Wait()
+	defer close(files)

	// walk the source directory, creating directories and files at the target dir
	return filepath.WalkDir(site.Config.SrcDir, func(path string, entry fs.DirEntry, err error) error {
		subpath, _ := filepath.Rel(site.Config.SrcDir, path)
		targetPath := filepath.Join(site.Config.TargetDir, subpath)

		// if it's a directory, just create the same at the target
		if entry.IsDir() {
			return os.MkdirAll(targetPath, FILE_RW_MODE)
		}

-		// if it's a file render or copy it to the target
-		return site.buildFile(path, targetPath)
+		// if it's a file send the path to a worker
+		// to render or copy it to the target
+		files <- path
+		return nil
	})
}

The close(files) call informs the workers that no more work will be sent, and wg.Wait() blocks until all of them finish executing. Since deferred calls run in last-in, first-out order, close(files) executes before wg.Wait(): the channel is closed first, and only then does the function block waiting for the workers to drain it and exit.

I was very satisfied to see a sequential piece of code turned into a concurrent one with minimal structural changes, and without affecting the callers of the enclosing function. In other languages, a similar operation would have required me to add async and await statements all over the place [1].

These couple of optimizations resulted in a good enough user experience, so I didn’t need to attempt more complex ones.

Live reload

Without having looked into their code, I presumed that the live-reloading tools I had used in the past (jekyll serve, livedown) worked by running WebSocket servers and injecting some JavaScript in the HTML files they served. I wanted to see if I could get away with implementing live reloading for jorge serve with Server-sent events, a slightly simpler alternative to WebSockets that didn’t require a dedicated server.

Some Googling yielded the boilerplate code to send events from my Go HTTP server:

func ServerEventsHandler (res http.ResponseWriter, req *http.Request) {
	res.Header().Set("Content-Type", "text/event-stream")
	res.Header().Set("Connection", "keep-alive")
	res.Header().Set("Cache-Control", "no-cache")
	res.Header().Set("Access-Control-Allow-Origin", "*")

	for {
		select {
		case <-time.After(5 * time.Second):
			// send an event to the connected client.
			fmt.Fprint(res, "data: rebuild\n\n")
			res.(http.Flusher).Flush()
		case <-req.Context().Done():
			// client connection closed
			return
		}
	}
}

I registered this handler at an /_events/ endpoint of the same file server:

	fs := http.FileServer(HTMLFileSystem{http.Dir(config.TargetDir)})
	http.Handle("/", fs)
+	http.HandleFunc("/_events/", ServerEventsHandler)

In this test setup, clients connected to the /_events/ endpoint would receive a "rebuild" message every 5 seconds. After a few attempts to get error handling right, I arrived at the corresponding JavaScript:

<script type="text/javascript">
var eventSource;

function newSSE() {
  console.log("connecting to server events");
  const url = location.origin + '/_events/';
  eventSource = new EventSource(url);

  // when the server sends an event, refresh the page
  eventSource.onmessage = function () {
    location.reload()
  };

  // close connection before refreshing the page
  window.onbeforeunload = function() {
    eventSource.close();
  }

  // on errors disconnect and attempt reconnection after a delay
  // this handles server restarting, laptop sleeping, etc.
  eventSource.onerror = function (event) {
    console.error('an error occurred:', event);
    eventSource.close();
    setTimeout(newSSE, 5000)
  };
}

newSSE();
</script>

Clients would establish an EventSource connection through the /_events/ endpoint and reload the window whenever a server-sent event arrived. I updated site.buildFile to inject this script tag in the header of every HTML file written to the target directory.
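
As an illustration of that injection step (a simplified sketch, not jorge's exact buildFile code; the variable and function names are made up), the script tag can be spliced in right before the closing </head> tag of each rendered HTML document:

// the <script> tag shown above, kept as a byte slice
var liveReloadScript = []byte(`<script type="text/javascript">/* ... */</script>`)

func injectLiveReload(contents []byte) []byte {
	index := bytes.Index(contents, []byte("</head>"))
	if index == -1 {
		// no head tag found: leave the content untouched
		return contents
	}

	result := make([]byte, 0, len(contents)+len(liveReloadScript))
	result = append(result, contents[:index]...)
	result = append(result, liveReloadScript...)
	result = append(result, contents[index:]...)
	return result
}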

With the code above I had everything in place to send and receive events and reload the browser accordingly. I just needed to update the HTTP handler to only send those events in response to site rebuilds triggered by source file changes. I couldn't just use a channel to connect the handler with the fsnotify watcher: there could be multiple clients connected at a time (multiple tabs browsing the site), and each needed to receive the reload event, whereas a message sent over a single channel would be consumed by just one of them. I needed some way to broadcast rebuild events; I introduced an EventBroker [2] struct for this purpose:

// The event broker mediates between the file watcher
// that publishes site rebuild events
// and the clients listening for them to refresh the browser
type EventBroker struct

func newEventBroker() *EventBroker

// Adds a subscription to this broker's events,
// returning a subscriber id (useful for unsubscribing)
// and a channel where events will be delivered.
func (broker *EventBroker) subscribe() (uint64, <-chan string)

// Remove the subscriber with the given id from the broker,
// closing its associated channel.
func (broker *EventBroker) unsubscribe(id uint64)

// Publish an event to all the broker subscribers.
func (broker *EventBroker) publish(event string)

See here for the full EventBroker implementation.
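
As a rough sketch of one possible implementation (the actual code linked above may differ in the details), the broker can keep a mutex-protected map of subscriber channels, assign incremental ids, and fan each published event out to every channel:

type EventBroker struct {
	sync.Mutex
	nextId      uint64
	subscribers map[uint64]chan string
}

func newEventBroker() *EventBroker {
	return &EventBroker{subscribers: map[uint64]chan string{}}
}

func (broker *EventBroker) subscribe() (uint64, <-chan string) {
	broker.Lock()
	defer broker.Unlock()

	id := broker.nextId
	broker.nextId++
	events := make(chan string, 1)
	broker.subscribers[id] = events
	return id, events
}

func (broker *EventBroker) unsubscribe(id uint64) {
	broker.Lock()
	defer broker.Unlock()

	if events, found := broker.subscribers[id]; found {
		close(events)
		delete(broker.subscribers, id)
	}
}

func (broker *EventBroker) publish(event string) {
	broker.Lock()
	defer broker.Unlock()

	for _, events := range broker.subscribers {
		// non-blocking send so a slow subscriber can't stall the others
		select {
		case events <- event:
		default:
		}
	}
}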

The HTTP handler now needed to subscribe every connected client to the broker:

-func ServerEventsHandler (res http.ResponseWriter, req *http.Request) {
+func makeServerEventsHandler(broker *EventBroker) http.HandlerFunc {
+	return func(res http.ResponseWriter, req *http.Request) {
		res.Header().Set("Content-Type", "text/event-stream")
		res.Header().Set("Connection", "keep-alive")
		res.Header().Set("Cache-Control", "no-cache")
		res.Header().Set("Access-Control-Allow-Origin", "*")

+		id, events := broker.subscribe()
		for {
			select {
-			case <-time.After(5 * time.Second):
+			case <-events:
				// send an event to the connected client.
				fmt.Fprint(res, "data: rebuild\n\n")
				res.(http.Flusher).Flush()
			case <-req.Context().Done():
				// client connection closed
+				broker.unsubscribe(id)
				return
			}
		}
	}
}

The watcher, in turn, had to publish an event after every rebuild:

-func runWatcher(config *config.Config) {
+func runWatcher(config *config.Config, broker *EventBroker) {
	watcher, _ := fsnotify.NewWatcher()
	defer watchProjectFiles(watcher, config)

	go func() {
		for event := range watcher.Events {
			fmt.Printf("file %s changed\n", event.Name)

			// new src directories could be triggering this event
			// so project files need to be re-added every time
			watchProjectFiles(watcher, config)
			site.Build(*config)
+			broker.publish("rebuild")
		}
	}()
}

The command function connected the pieces:

func Serve(config config.Config) error {
	// load and build the project
	if err := site.Build(config); err != nil {
		return err
	}

	broker := newEventBroker()
	runWatcher(&config, broker)

	// mount the target dir on a local file server
	fs := http.FileServer(HTMLFileSystem{http.Dir(config.TargetDir)})
	http.Handle("/", fs)
	// handle client requests to listen to server-sent events
	http.Handle("/_events/", makeServerEventsHandler(broker))

	fmt.Println("server listening at http://localhost:4001/")
	return http.ListenAndServe(":4001", nil)
}

Handling event bursts

The code above worked, but not consistently. A file change would occasionally cause the browser to refresh into a 404 page, as if the new version of the file hadn't been written to the target directory yet. This happened because a single file edit could result in multiple writes, which in turn produced a burst of fsnotify events (as mentioned in the documentation). The solution (also suggested by an example in the fsnotify repository) was to de-duplicate events by introducing a delay between event arrival and response. time.AfterFunc helped here:

func runWatcher(config *config.Config) *EventBroker {
	watcher, _ := fsnotify.NewWatcher()
-	defer watchProjectFiles(watcher, config)
	broker := newEventBroker()

+	rebuildAfter := time.AfterFunc(0, func() {
+		watchProjectFiles(watcher, config)
+		site.Build(*config)
+		broker.publish("rebuild")
+	})

	go func() {
		for event := range watcher.Events {
			fmt.Printf("file %s changed\n", event.Name)

-			watchProjectFiles(watcher, config)
-			site.Build(*config)
-			broker.publish("rebuild")
+			// Schedule a rebuild to trigger after a delay.
+			// If there was another one pending it will be canceled.
+			rebuildAfter.Stop()
+			rebuildAfter.Reset(100 * time.Millisecond)
		}
	}()
	return broker
}

The initial build is triggered immediately on setup (time.AfterFunc(0, ...)) but subsequent rebuilds are delayed 100 milliseconds (rebuildAfter.Reset(100 * time.Millisecond)), canceling previous pending ones.
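
Since runWatcher now creates and returns the broker itself, the wiring in Serve shrinks accordingly (roughly; this is a sketch of the change, not the verbatim code):

-	broker := newEventBroker()
-	runWatcher(&config, broker)
+	broker := runWatcher(&config)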


That’s (approximately) the current implementation of the jorge serve command, which I used to write this post. You can see the full code here.

Notes


[1] Related discussion: What Color is Your Function?

[2] I'm not sure if "broker" is a proper name in this context since there's a single event type and it's sent to all subscribers. "Broadcaster" is probably more accurate, but it also sounds worse.