Rendering Markdown in cgit on OpenBSD

Act 1: Background

OpenBSD.

Much of the work done by the operating system’s developers is focussed on minimising potential security issues: the pledge() syscall for dropping program capabilities and privileges, LibreSSL (developed following the Heartbleed incident), and even disabling SMT by default.

The OpenBSD developers eventually grew tired of patching web servers to drop privileges and chroot to the webroot (having done so with both Apache and Nginx), so they instead opted to write their own simple HTTP server, httpd. This chroot turns out to be a bit of a problem, though.

If you poke at it enough, you might discover that this website is hosted on an OpenBSD server. Given that httpd is part of the base system, I decided to give it a try, since the whole point of this server when I set it up was to explore OpenBSD. A year and a half has gone by, and projects on this server have come and gone, but not the web server. I wanted to try setting up cgit, and while there was some struggle (since you can’t directly use references like the Arch Wiki), I eventually got it going. Enough background, though. What actually was the questionable decision?

Act 2: In Which Markdown Forces My Hand

Markdown is nice for quick writing and not fussing about with other things. This website is implemented with Hugo and Markdown documents, for example. It’s also good for writing READMEs in git repositories. SourceHut and GitHub take care of rendering your README.md files without you even giving it a thought. On the other hand, cgit requires some fiddling. The cgitrc file can have an about-filter option specified, which expects the path to a script. The script should then execute a program or another script which takes the file to convert to HTML on stdin and returns the converted HTML on stdout.
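To make that calling convention concrete, here is a minimal sketch of a stand-in filter. The about_filter function and its output are hypothetical and only illustrate the contract: the file name arrives as the first argument, the file contents arrive on stdin, and whatever is written to stdout ends up in the about page. It does no real Markdown rendering.

```shell
#!/bin/sh
# Hypothetical stand-in for an about-filter; it only demonstrates the
# calling convention and does not convert anything.
#   $1     = name of the file being rendered
#   stdin  = contents of that file
#   stdout = HTML that ends up in the about page
about_filter() {
    case "$1" in
        *.md) printf '<!-- would render %s as Markdown -->\n' "$1"; cat ;;
        *)    printf '<pre>\n'; cat; printf '</pre>\n' ;;
    esac
}

# Simulate roughly what cgit does for a README.md:
printf '# Hello\n' | about_filter README.md
```

A real filter replaces the body of each branch with an exec of the actual converter, which is what the upstream script further down does.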

On other platforms, cgit ships with a set of Python scripts which can convert various file types into rendered HTML, notably Markdown. These scripts do not ship with cgit from OpenBSD’s ports collection – and why should they? Python is not available inside the webroot and thus unavailable in the httpd process' chroot.

I wanted my Markdown-written READMEs to be rendered into HTML, so I needed a solution to this.

Act 3: Basically Building a New System

There is a program called lowdown available in the ports collection. It is a simple binary written in C that can take the place of the Python script from upstream cgit. The only trouble is, it’s dynamically linked to a few things.

/usr/local/bin/lowdown:                                                                         
	Start            End              Type  Open Ref GrpRef Name                            
	000003de4d065000 000003de4d092000 exe   1    0   0      /usr/local/bin/lowdown          
	000003e056a91000 000003e056ac1000 rlib  0    1   0      /usr/lib/libm.so.10.1           
	000003e0da388000 000003e0da47c000 rlib  0    1   0      /usr/lib/libc.so.96.0           
	000003e0b20ce000 000003e0b20ce000 ld.so 0    1   0      /usr/libexec/ld.so              

Now, I could have cloned the ports tree and built lowdown statically, but my server is running low on disk space. I chose to make do with what I had. Copying the binary is not enough, though. We need the rest of the dynamically linked libraries. So we create /var/www/usr/lib and /var/www/usr/libexec and copy the libraries into those directories.
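As a rough sketch of that copying step, the helper below mirrors each file into a chroot directory under the same absolute path. The function name copy_into_chroot is made up for illustration, and the library versions in the usage comment are simply the ones from the ldd output above; yours will likely differ.

```shell
#!/bin/sh
# Hypothetical helper: copy each listed file into a chroot directory,
# keeping its absolute path (so /usr/lib/libc.so.96.0 lands in
# <chroot>/usr/lib/libc.so.96.0).
copy_into_chroot() {
    chroot_dir=$1
    shift
    for src in "$@"; do
        dest="$chroot_dir$src"
        # Create the parent directory, then copy while preserving mode.
        mkdir -p "$(dirname "$dest")" && cp -p "$src" "$dest" || return 1
    done
}

# Usage (as root), with the paths reported by ldd:
# copy_into_chroot /var/www \
#     /usr/lib/libm.so.10.1 /usr/lib/libc.so.96.0 /usr/libexec/ld.so
```

The lowdown binary itself is copied separately, since the filter script further down expects to find it at /bin/lowdown inside the chroot rather than under /usr/local/bin.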

Some issues remain though.

cgit expects a script, and executing it is bound to fail because there is no /bin/sh in the webroot. bash has too many linked libraries and we want to minimise the amount of mess we make. Let’s take a look at one of the shells that forms part of the base system, ksh:

/bin/ksh:                                                                       
	Start            End              Type  Open Ref GrpRef Name            
	00000a96af156000 00000a96af1fc000 dlib  1    0   0      /bin/ksh        

Perfect! No other linked libraries. We’ll copy this into the webroot and use this as our shell. Now let’s take a look at the default about-formatting.sh script from upstream cgit:

#!/bin/sh

# This may be used with the about-filter or repo.about-filter setting in cgitrc.
# It passes formatting of about pages to differing programs, depending on the usage.

# Markdown support requires python and markdown-python.
# RestructuredText support requires python and docutils.
# Man page support requires groff.

# The following environment variables can be used to retrieve the configuration
# of the repository for which this script is called:
# CGIT_REPO_URL        ( = repo.url       setting )
# CGIT_REPO_NAME       ( = repo.name      setting )
# CGIT_REPO_PATH       ( = repo.path      setting )
# CGIT_REPO_OWNER      ( = repo.owner     setting )
# CGIT_REPO_DEFBRANCH  ( = repo.defbranch setting )
# CGIT_REPO_SECTION    ( = section        setting )
# CGIT_REPO_CLONE_URL  ( = repo.clone-url setting )

cd "$(dirname $0)/html-converters/"
case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
	*.markdown|*.mdown|*.md|*.mkd) exec /bin/lowdown; ;;
	*.rst) exec ./rst2html; ;;
	*.[1-9]) exec ./man2html; ;;
	*.htm|*.html) exec cat; ;;
	*.txt|*) exec ./txt2html; ;;
esac

If we try to let cgit call this script as is, it’s going to fail every time, because commands like dirname, printf, and tr are still unavailable in the webroot. If we drop the dirname line, we just need to copy in printf and tr.
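Before pointing cgit at the script, it can help to check that everything it calls actually exists under the webroot. The sketch below is one way to do that; missing_in_chroot is a hypothetical helper, and the locations of printf and tr inside the webroot are assumptions (the same paths they have on the host), so adjust them to wherever you copied the binaries.

```shell
#!/bin/sh
# Hypothetical helper: print every path that is not an executable file
# inside the given chroot directory.
missing_in_chroot() {
    chroot_dir=$1
    shift
    for path in "$@"; do
        [ -x "$chroot_dir$path" ] || printf '%s\n' "$path"
    done
}

# Usage; /usr/bin/printf and /usr/bin/tr are assumed locations:
# missing_in_chroot /var/www /bin/sh /bin/lowdown /usr/bin/printf /usr/bin/tr
```

An empty result means every listed command is present. It does not prove that each one can load its libraries, so running ldd on anything newly copied is still worthwhile.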

From here, we set our cgitrc file to point to this script, and let the final script be:

#!/bin/sh

# This may be used with the about-filter or repo.about-filter setting in cgitrc.
# It passes formatting of about pages to differing programs, depending on the usage.

# Markdown support requires python and markdown-python.
# RestructuredText support requires python and docutils.
# Man page support requires groff.

# The following environment variables can be used to retrieve the configuration
# of the repository for which this script is called:
# CGIT_REPO_URL        ( = repo.url       setting )
# CGIT_REPO_NAME       ( = repo.name      setting )
# CGIT_REPO_PATH       ( = repo.path      setting )
# CGIT_REPO_OWNER      ( = repo.owner     setting )
# CGIT_REPO_DEFBRANCH  ( = repo.defbranch setting )
# CGIT_REPO_SECTION    ( = section        setting )
# CGIT_REPO_CLONE_URL  ( = repo.clone-url setting )

case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
	*.markdown|*.mdown|*.md|*.mkd) exec /bin/lowdown; ;;
	*.rst) exec ./rst2html; ;;
	*.[1-9]) exec ./man2html; ;;
	*.htm|*.html) exec cat; ;;
	*.txt|*) exec ./txt2html; ;;
esac

After all of this work, cgit should render our README.md file into HTML. Sadly, it does not do syntax highlighting. Another problem for a different day, I suppose.
