ezyang’s blog

the arc of software bends towards understanding

Setting up Cabal, the FFI and c2hs

This is part two of a six-part introduction to c2hs. Today, we discuss getting the damn thing to compile in the first place.

Reader prerequisites. You should know how to write, configure and build a vanilla Cabal file for pure Haskell. Fortunately, with cabal init, this is easier than ever. I'll talk about how to set up a Cabal file for linking in C files, which is applicable to any sort of FFI writing (as it turns out, enabling c2hs is the trivial bit).

Enabling c2hs. Trick question; Cabal will automatically detect files with the extension chs and run c2hs with appropriate flags on them. However, since this operation might fail if the user hasn't installed c2hs, you should add the following line to your Cabal file:

Build-tools: c2hs

You should be able to compile an empty Haskell module that has chs as its file extension now.

(There is some Cabal hook code for adding c2hs preprocessor support, but it is completely unnecessary.)

Looking at the resulting hs. Once a chs file has been preprocessed, Cabal will not look at it any more. You should not be afraid of looking at the preprocessor output; in many cases, it will be far more elucidating when you're trying to fix a type error. In general, the hs file will be located in the dist/build directory, as this build message (generated by GHC, not c2hs) shows:

Building abcBridge-0.1...
[ 5 of 11] Compiling Data.ABC.Internal.VecPtr (
  dist/build/Data/ABC/Internal/VecPtr.o )

The code you see will look something like this:

-- GENERATED by C->Haskell Compiler, version 0.16.2 Crystal Seed, 24 Jan 2009 (Haskell)
-- Edit the ORIGNAL .chs file instead!

{-# LINE 1 "src/Data/ABC/Internal/VecPtr.chs" #-}{-# LANGUAGE ForeignFunctionInterface #-}

module Data.ABC.Internal.VecPtr where

The LINE pragmas ensure that when a type error is generated by the resulting Haskell code, you will get back line numbers that refer to the original chs file. This is not error-proof; sometimes errors will show up one or two lines away from where c2hs claims the error is.

Imports and language features. c2hs generates Haskell code that needs some language features and imports. You should explicitly add the ForeignFunctionInterface language pragma to the top of your program; while it is possible to enable this via the Cabal file, it's good form to make your hs files as standalone as possible.

In the current version of c2hs, module imports are a little more subtle. c2hs has a legacy module named C2HS that bundles the imports, re-exports and extra marshalling functions (necessary only if you're using fun) that c2hs-generated code may use by default. However, it is on its way to the dustbin, and the c2hs Cabal package doesn't actually supply this module: you need to copy it into your source directory with c2hs -l. This module depends on haskell98. You should not re-export this module, so it should go in your Other-modules Cabal field.

The modern approach is to do the imports and definitions explicitly yourself. The modules to import are Foreign and Foreign.C, and there is a small assortment of marshalling functions that GHC will complain are not defined when you try to use fun with that marshaller. Future versions of c2hs will further reduce the necessary functions. gtk2hs takes this approach (although they also forgo most of c2hs's automated marshalling support).
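As a rough sketch of what "explicitly yourself" can look like, here are the two imports plus a handful of conversion helpers. The helper names are an assumption modelled on the legacy C2HS module; define only the ones GHC actually reports as missing for your bindings.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
-- Sketch only: helper names are assumed to match what c2hs-generated
-- code references; adjust to whatever GHC says is undefined.
import Foreign
import Foreign.C

-- Convert between Haskell and C integral types.
cIntConv :: (Integral a, Integral b) => a -> b
cIntConv = fromIntegral

-- Convert between Haskell and C floating-point types.
cFloatConv :: (Real a, Fractional b) => a -> b
cFloatConv = realToFrac

-- C truth values: zero is False, anything else is True.
cToBool :: (Eq a, Num a) => a -> Bool
cToBool = toBool

-- Haskell Bool to a C-style 0/1 value.
cFromBool :: Num a => Bool -> a
cFromBool = fromBool
```

In a real chs file these definitions would sit below your module header, next to the fun hooks that use them.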

Loading the library. If you are lucky, your package manager has the library you'd like to create bindings for available. In this case, you only need to add the name of the library to Extra-libraries in the Library section of your Cabal file. For example, if you want to use readline, add readline to that field, and GHC will know to find the headers in /usr/include/readline and dynamically link in /usr/lib/libreadline.so. In some cases, a library will install itself in a standard location that is not searched by default (for example, Oracle on Linux systems, and basically any library on Windows); in this case, you can tell Cabal where this "non-standard standard" location is with Extra-lib-dirs.
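For the readline case, the relevant part of the Library section might look like the fragment below. The /opt/example/lib path is purely hypothetical and only there to show the shape of Extra-lib-dirs; leave that field out if the library lives somewhere the linker already searches.

```cabal
Library
  -- Link against libreadline
  Extra-libraries: readline
  -- Only needed for a "non-standard standard" install location
  -- (hypothetical path; substitute your own)
  Extra-lib-dirs: /opt/example/lib
```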

If your C library is not a good citizen (which is the case with many niche libraries), some extra steps need to be taken. Here are some common situations, and suggestions for how to deal with them:

  1. The library is small and has a simple build process. In this case, it is feasible to bundle the library's source with your package and manage its compilation entirely with Cabal. If your library offers no make install, this may be your only option, besides asking your users to manually supply the necessary linker options to hook up the two installs (not a very user-friendly option, in particular, it makes running cabal install complicated). You should only do this with small amounts of source code, since the GHC-directed compilation is much slower than a usual build. See Compiling the library with Cabal and Managing includes.
  2. I want to bundle the library for X reason, but its build process is complicated. In such a case, it is possible to set up Cabal to call the library's build process, and then use the resulting files for the Haskell build process. There are numerous disadvantages to this, including a messy Cabal file and a messy install process, so if you're able to do (3), I recommend that instead. See Compiling the library with hooks for details. You should also read Managing includes.
  3. I don't want to bundle the library. In this case, you will need to give instructions for end-users to download, compile and install the external library. It will be a lot easier for users if you, the package author, go and package the library for various distributions, so that it becomes a well-behaved, albeit seldom installed, library. If a user is unwilling to install the library in the canonical paths, they will need to pass cabal the appropriate options. See Manual linking.

Compiling the library with Cabal. Cabal has the ability to compile C code in a very simple fashion: it takes a list of files from the Cabal field C-sources and compiles them in that order. In particular, it doesn't do any dependency tracking, so when you feed it the list of files, make sure they're in the right order! This makes this mechanism appropriate only for small amounts of C, including C that you may write yourself to aid the binding process. There is a growing convention to place c files in cbits, and h files in include. You can then tell Cabal about these directories with the following lines:

-- This ensures that Cabal places these files in the release tarball,
-- which is important if you plan to release
Extra-source-files: cbits/*.c, include/*.h
-- ...
Library foobar
  -- ...
  -- The C source files to compile, in that order
  C-sources: cbits/foobar.c, cbits/foobaz.c
  -- The location of the header files
  Include-dirs: include
  -- The header files to be included
  Includes: foobar.h, foobaz.h
  -- Header files to install
  Install-includes: foobar.h, foobaz.h

A few words about the "includes" fields:

  • The Includes field will probably not make a user-visible difference when the compilation goes well. However, it is good form to specify because Cabal will then go and check that those include files exist and are usable prior to compilation, giving the user a better error message if there are problems. Usage. Specify any standard headers and any bundled headers that your package uses.
  • The Install-includes field will cause Cabal to place those header files in a public location upon installation. This is necessary for older versions of GHC to compile your code or if modules that use your module need to perform C includes of your library or cbits; it's generally good form to install your headers. Usage. Specify just the bundled headers that your package uses and exports.

Compiling the library with hooks. If there are over a dozen C files to be compiled, you may want to let the traditional configure && make process handle things for you. In this case, it may be appropriate to set up a small hook in Cabal's Setup.hs using the experimental hooks interface to invoke the compilation. Here is a simple sample build script:

import Distribution.Simple
import Distribution.Simple.Setup
import Distribution.Simple.Utils (rawSystemExit)

main = defaultMainWithHooks simpleUserHooks
    { preBuild = \a b -> makeLib a b >> preBuild simpleUserHooks a b }

makeLib :: Args -> BuildFlags -> IO ()
makeLib _ flags =
    rawSystemExit (fromFlag $ buildVerbosity flags) "env"
        ["CFLAGS=-D_LIB", "make", "--directory=abc", "libabc.a"]

We've added our own makeLib build step to the preBuild hook (while still running the original simpleUserHooks version), and use the Cabal utility function rawSystemExit to do most of the heavy lifting for us. Notice that --directory=abc needed to be passed to make; Cabal runs in the same directory as the cabal file, and so you'll probably need to adjust your working directory to the library directory. setCurrentDirectory may come in handy.
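If your build tool has no equivalent of --directory, one way to use setCurrentDirectory safely is to wrap it so the old working directory is always restored. This is a minimal sketch; inDirectory is a hypothetical helper name, not something Cabal provides (newer versions of the directory package offer withCurrentDirectory, which does the same job).

```haskell
import Control.Exception (bracket)
import System.Directory (getCurrentDirectory, setCurrentDirectory)

-- Run an action inside the given directory, restoring the previous
-- working directory afterwards, even if the action throws.
inDirectory :: FilePath -> IO a -> IO a
inDirectory dir action =
    bracket getCurrentDirectory setCurrentDirectory $ \_ -> do
        setCurrentDirectory dir
        action
```

Inside a hook you could then write something like inDirectory "abc" (rawSystemExit verbosity "make" ["libabc.a"]) and the rest of the Cabal build would still run from the package root.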

Your build process will probably place the resulting libfoo.a file somewhere other than dist/build. You can tell Cabal to look in that directory using the Extra-lib-dirs field.

The above steps are enough to get a clean source checkout of your software working, but to ensure that users will be able to install the result of cabal sdist, you will need to go a little further.

First, any source file that the build process uses needs to be explicitly listed in Extra-source-files. Cabal only affords a limited form of globbing, which must be in the filename and contain a file extension, so this list can get quite long (and we recommend you generate it with a script).
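Such a script can be very small. Here is one possible sketch in Haskell that walks a directory and prints every file as an Extra-source-files field; the function names are made up for illustration, and it makes no attempt to skip build products or follow any ignore rules, so you'd want to filter the result.

```haskell
import Data.List (intercalate, sort)
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath ((</>))

-- Recursively collect every file below a directory, in sorted order.
listFilesRecursive :: FilePath -> IO [FilePath]
listFilesRecursive dir = do
    entries <- listDirectory dir
    fmap concat . mapM visit . sort $ entries
  where
    visit name = do
        let path = dir </> name
        isDir <- doesDirectoryExist path
        if isDir then listFilesRecursive path else return [path]

-- Render the paths as a Cabal field, one path per indented line.
renderField :: [FilePath] -> String
renderField paths =
    "Extra-source-files:\n" ++ intercalate ",\n" (map ("    " ++) paths)

-- Print the field for a bundled library directory, e.g. "abc".
printExtraSourceFiles :: FilePath -> IO ()
printExtraSourceFiles dir = listFilesRecursive dir >>= putStrLn . renderField
```

Run printExtraSourceFiles "abc" from the package root and paste the output into your Cabal file.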

Second, the static/dynamic libraries that the build process creates probably will not end up anywhere that GHC will look when compiling, resulting in this error:

Linking dist/build/abc-test/abc-test ...
/usr/bin/ld: cannot find -labc
collect2: ld returned 1 exit status

We can place our library in the same place where Cabal places the static libraries of Haskell modules during installation with another hook:

import Distribution.Simple
import Distribution.Simple.Setup
import Distribution.Simple.Utils (rawSystemExit)
import Distribution.PackageDescription (PackageDescription(..))
import Distribution.Simple.LocalBuildInfo (
        LocalBuildInfo(..), InstallDirs(..), absoluteInstallDirs)

main = defaultMainWithHooks simpleUserHooks
    { preConf = \a f -> makeAbcLib a f >> preConf simpleUserHooks a f
    , copyHook = copyAbcLib
    }

-- ...

copyAbcLib :: PackageDescription -> LocalBuildInfo -> UserHooks -> CopyFlags -> IO ()
copyAbcLib pkg_descr lbi _ flags = do
    let libPref = libdir . absoluteInstallDirs pkg_descr lbi
                . fromFlag . copyDest
                $ flags
    rawSystemExit (fromFlag $ copyVerbosity flags) "cp"
        ["abc/libabc.a", libPref]

The incantation to the right of libPref determines where Cabal is going to install the library files, and then we simply copy our libraries to that location.

(Nota bene. You should really only use this trick if you're sure no one is going to install this library globally, because having non-binary compatible libraries floating around with the same name is no fun at all.)

Managing includes. Any non-standard directories that need to be in the include path should be added to Include-dirs. If there are a lot of such directories in the library, consider an alternate solution: create symlinks to all of the relevant header files in include and then just add that directory to Include-dirs.

Manual linking. If you need to manually tell Cabal where the relevant headers and libraries are, you can use the --extra-include-dirs and --extra-lib-dirs flags with cabal configure or cabal install. They function just like Include-dirs and Extra-lib-dirs.

Cohabiting Library and Executable sections. You may find it convenient to define a number of Executable sections in your Cabal file for testing, in which case you'll notice that you seem to need to duplicate all of the C-related Cabal fields to each of your executable sections. Well, in Cabal, you can now set Build-depends to point to your same package ("self-reference"); so you declare a Build-depends on your own package for each executable and the C-related Cabal fields are unnecessary.

You will need to tell Cabal that it's OK to use this feature with this field:

Cabal-version:      >=
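To make the self-reference concrete, here is a sketch of a Library and a test Executable living in one Cabal file. The package name foobar, the module name and the tests directory are all placeholders; the point is only that the executable lists the package itself in Build-depends and carries none of the C-related fields.

```cabal
Library
  Exposed-modules: Data.Foobar
  Build-depends: base
  C-sources: cbits/foobar.c
  Include-dirs: include

Executable foobar-test
  Main-is: Test.hs
  -- Keep the executable's sources apart from the library's
  Hs-source-dirs: tests
  -- Depend on the library from this same package instead of
  -- repeating C-sources, Include-dirs and friends
  Build-depends: base, foobar
```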

Postscript. Thanks to Duncan Coutts for helping clarify and suggesting improvements to sections of this tutorial.

Next time. Principles of FFI API design.

2 Responses to “Setting up Cabal, the FFI and c2hs”

  1. Ha says:

    Are you the same EZY that wrote an O’Caml implementation of Count-Min? If so, how fascinating, because I was led to this page in attempt to learn how to statically link a c implementation of that algorithm to Haskell using cabal.
