Overview

emacs-module-rs provides high-level Rust bindings and tools for writing Emacs dynamic modules. It is easy to use if you know either Rust or Emacs.

It currently supports:

  • Stable Rust 1.56+.
  • Emacs 25 or above, built with module support.
  • macOS, Linux, Windows.

Setting up

  • Install the Rust toolchain with rustup.
  • Make sure that your Emacs was compiled with module support. Check that module-file-suffix is not nil, and the function module-load is defined.
    • On macOS, the recommended installation method is MacPorts (emacs-app and emacs-mac-app).
    • On Windows, the recommended installation method for older Emacs versions (before 27.2) is pacman -S mingw-w64-x86_64-emacs (msys2), as the archives on GNU FTP's server were built without module support.

Notes

  • On Windows, only Rust's msvc toolchain was confirmed to work, not the gnu toolchain.
  • When the optional feature bindgen is enabled, the raw binding will be generated from emacs-module.h at build time. Therefore you will also need to install clang. (This is recommended only for troubleshooting, though.) For example, on Windows:
    # In Powershell
    scoop install llvm
    
    $env:LIBCLANG_PATH = "$(scoop prefix llvm)\bin"
    cargo build --all
    

Known Issues

There is a bug (see issue #1) with Emacs 26 on Linux that prevents it from loading any dynamic modules (even those written in C), if:

  • Emacs is built without thread support.
  • The OS is Ubuntu 16.04 (Xenial).

Hello, Emacs!

Create a new project:

cargo new greeting
cd greeting

Modify Cargo.toml:

[package]
edition = "2018"

[lib]
crate-type = ["cdylib"]

[dependencies]
emacs = "0.19"

Write code in src/lib.rs:

use emacs::{defun, Env, Result, Value};

// Emacs won't load the module without this.
emacs::plugin_is_GPL_compatible!();

// Register the initialization hook that Emacs will call when it loads the module.
#[emacs::module]
fn init(env: &Env) -> Result<()> {
    env.message("Done loading!")?;
    Ok(())
}

// Define a function callable by Lisp code.
#[defun]
fn say_hello(env: &Env, name: String) -> Result<Value<'_>> {
    env.message(&format!("Hello, {}!", name))
}

Build the module and create a symlink with .so extension so that Emacs can recognize it:

cargo build
cd target/debug

# If you are on Linux
ln -s libgreeting.so greeting.so

# If you are on macOS
ln -s libgreeting.dylib greeting.so

Add target/debug to your Emacs's load-path, then load the module:

(add-to-list 'load-path "/path/to/target/debug")
(require 'greeting)
(greeting-say-hello "Emacs")

The minibuffer should display the message Hello, Emacs!.

Declaring a Module

Each dynamic module must have an initialization function, marked by the attribute macro #[emacs::module]. The function's type must be fn(&Env) -> Result<()>.

In addition, in order to be loadable by Emacs, the module must be declared GPL-compatible.

emacs::plugin_is_GPL_compatible!();

#[emacs::module]
fn init(env: &Env) -> Result<()> {
    // This is run when Emacs loads the module.
    // More concretely, it is run after all the functions it defines are exported,
    // but before `(provide 'feature-name)` is (automatically) called.
    Ok(())
}

Options

  • name: By default, the name of the feature provided by the module is the crate's name (with _ replaced by -). There is no need to explicitly call provide inside the initialization function. This option allows the function's name, or a string, to be used instead.

    // Putting `rs` in crate's name is discouraged so we use the function's name
    // instead. The feature will be `rs-module-helper`.
    #[emacs::module(name(fn))]
    fn rs_module_helper(_: &Env) -> Result<()> { Ok(()) }
  • defun_prefix and separator: Function names in Emacs are conventionally prefixed with the feature name followed by -. These 2 options allow a different prefix and separator to be used.

    // Use `/` as the separator that goes after feature name, like some other packages.
    #[emacs::module(separator = "/")]
    fn init(_: &Env) -> Result<()> { Ok(()) }

    // The whole package contains other Lisp files, so the module is named
    // `tree-sitter-dyn`. But we want functions to be `tree-sitter-something`,
    // not `tree-sitter-dyn-something`.
    #[emacs::module(name = "tree-sitter-dyn", defun_prefix = "tree-sitter")]
    fn init(_: &Env) -> Result<()> { Ok(()) }
  • mod_in_name: Whether to use Rust's mod path to construct function names. Defaults to true. For example, suppose that the crate is named parser: a #[defun] named next_child inside mod cursor will have the Lisp name parser-cursor-next-child. This can also be overridden for each individual function, by an option of the same name on #[defun].

Note: Often, no initialization logic is needed. A future version of this crate will support putting #![emacs::module] on the crate, without having to define a no-op function. See Rust's issue #54726.

Writing Functions

You can use the attribute macro #[defun] to export Rust functions to the Lisp runtime, so that Lisp code can call them. The exporting process happens when the module is loaded, even if the definitions are inside another function that is never called, or inside a private mod.

Input Parameters

Each parameter must be one of the following:

  • An owned value of a type that implements FromLisp. This is for simple data types that have an equivalent in Lisp.
    /// This docstring will appear in Lisp too!
    #[defun]
    fn inc(x: i64) -> Result<i64> {
        Ok(x + 1)
    }
  • A shared/mutable reference. This gives access to data structures that other module functions have created and embedded in the Lisp runtime (through user-ptr objects).
    #[defun]
    fn stash_pop(repo: &mut git2::Repository) -> Result<()> {
        repo.stash_pop(0, None)?;
        Ok(())
    }
  • A Lisp Value, or one of its "sub-types" (e.g. Vector). This allows holding off the conversion to Rust data structures until necessary, or working with values that don't have a meaningful representation in Rust, like Lisp lambdas.
    #[defun]
    fn maybe_call(lambda: Value) -> Result<()> {
        if some_hidden_native_logic() {
            lambda.call([])?;
        }
        Ok(())
    }

    #[defun(user_ptr)]
    fn to_rust_vec_string(input: Vector) -> Result<Vec<String>> {
        let mut vec = vec![];
        for e in input {
            vec.push(e.into_rust()?);
        }
        Ok(vec)
    }
  • An &Env. This enables interaction with the Lisp runtime. It does not appear in the function's Lisp signature. This is unnecessary if there is already another parameter with type Value, which allows accessing the runtime through Value.env.
    // Note that the function takes an owned `String`, not a reference, which would
    // have been understood as a `user-ptr` object containing a Rust string.
    #[defun]
    fn hello(env: &Env, name: String) -> Result<Value<'_>> {
        env.message(format!("Hello, {}!", name))
    }

Return Value

The return type must be Result<T>, where T is one of the following:

  • A type that implements IntoLisp. This is for simple data types that have an equivalent in Lisp.
    /// Return the path to the .git dir.
    /// Return `nil' if the given path is not in a repo,
    /// or if the .git path is not valid utf-8.
    #[defun]
    fn dot_git_path(path: String) -> Result<Option<String>> {
        Ok(git2::Repository::discover(&path).ok().and_then(|repo| {
            repo.path().to_str().map(|s| s.to_owned())
        }))
    }
  • An arbitrary type. This allows embedding a native data structure in a user-ptr object, for read-write use cases. It requires the user_ptr option to be specified. If the data is to be shared with background Rust threads, user_ptr(rwlock) or user_ptr(mutex) must be used instead.
    #[defun(user_ptr)]
    fn repo(path: String) -> Result<git2::Repository> {
        Ok(git2::Repository::discover(&path)?)
    }
  • A type that implements Transfer. This allows embedding a native data structure in a user-ptr object, for read-only use cases. It requires the user_ptr(direct) option to be specified.
  • Value, or one of its "sub-types" (e.g. Vector). This is mostly useful for returning an input parameter unchanged.

See Custom Types for more details on embedding Rust data structures in Lisp's user-ptr objects.

Naming

By default, the function's Lisp name has the form <feature-prefix>[mod-prefix]<base-name>.

  • feature-prefix is the feature name followed by -. This can be customized by the name, defun_prefix, and separator options on #[emacs::module].
  • mod-prefix is constructed from the function's Rust mod path (with _ and :: replaced by -). This can be turned off crate-wide, or for individual functions, using the option mod_in_name.
  • base-name is the function's Rust name (with _ replaced by -). This can be overridden with the option name.

Examples:

// Assuming crate's name is `native_parallelism`.

#[emacs::module(separator = "/")]
fn init(_: &Env) -> Result<()> { Ok(()) }

mod shared_state {
    mod thread {
        // Ignore the nested mod's.
        // (native-parallelism/make-thread "name")
        #[defun(mod_in_name = false)]
        fn make_thread(name: String) -> Result<Value<'_>> {
            ..
        }
    }

    mod process {
        // (native-parallelism/shared-state-process-launch "bckgrnd")
        #[defun]
        fn launch(name: String) -> Result<Value<'_>> {
            ..
        }

        // Specify a name explicitly, since Rust identifier cannot contain `:`.
        // (native-parallelism/process:pool "http-client" 2 8)
        #[defun(mod_in_name = false, name = "process:pool")]
        fn pool(name: String, min: i64, max: i64) -> Result<Value<'_>> {
            ..
        }
    }
}
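
The naming rule can be modeled in a few lines of plain Rust. This is only a sketch of the rule as described in this section, not the macro's actual implementation; `lisp_name`, `kebab`, and their parameters are hypothetical names introduced for illustration.

```rust
/// Rust identifiers use `_` where Lisp names conventionally use `-`.
fn kebab(s: &str) -> String {
    s.replace('_', "-")
}

/// Hypothetical helper that models the naming rule described above.
/// It is not part of the `emacs` crate.
fn lisp_name(
    feature: &str,     // feature name, or `defun_prefix` if one is set
    separator: &str,   // "-" unless the `separator` option is used
    mod_path: &[&str], // Rust mods enclosing the function, outermost first
    base_name: &str,   // the function's Rust name, or the `name` override
    mod_in_name: bool,
) -> String {
    // <feature-prefix>
    let mut name = kebab(feature);
    name.push_str(separator);
    // [mod-prefix], skipped when `mod_in_name` is false
    if mod_in_name {
        for m in mod_path {
            name.push_str(&kebab(m));
            name.push('-');
        }
    }
    // <base-name>
    name.push_str(&kebab(base_name));
    name
}
```

With the inputs from the examples above, the sketch yields the same Lisp names as the ones shown in the comments.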

Documentation

#[defun] converts Rust's docstring into Lisp's docstring. It also automatically constructs and appends the function's signature to the end of the docstring, so that help modes can correctly display it.

// `(fn X Y)` is automatically appended, so you don't have to manually do so.
// In help modes, the signature will be (add X Y).

/// Add 2 numbers.
#[defun]
fn add(x: usize, y: usize) -> Result<usize> {
    Ok(x + y)
}
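
To make the appended signature concrete, here is a minimal sketch of how such a docstring could be assembled. It assumes the usual Emacs convention of a final `(fn ARGS)` line separated from the text by a blank line; `lisp_docstring` is a hypothetical helper, not the code that #[defun] actually generates.

```rust
/// Hypothetical helper modeling how a `(fn ...)` signature line can be
/// appended to a docstring. Not part of the `emacs` crate.
fn lisp_docstring(doc: &str, params: &[&str]) -> String {
    // Help modes conventionally show parameter names in upper case.
    let args: Vec<String> = params
        .iter()
        .map(|p| p.replace('_', "-").to_uppercase())
        .collect();
    let signature = if args.is_empty() {
        "(fn)".to_owned()
    } else {
        format!("(fn {})", args.join(" "))
    };
    // The signature goes on the last line, after a blank line.
    format!("{}\n\n{}", doc.trim_end(), signature)
}
```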

Calling Lisp Functions

Frequently-used Lisp functions are exposed as methods on env:

env.intern("defun")?;

env.message("Hello")?;

env.type_of(5.into_lisp(env)?)?;

env.provide("my-module")?;

env.list((1, "str", true))?;

To call arbitrary Lisp functions, use env.call(func, args).

  • func can be:
    • A string identifying a named function in Lisp.
    • Any Lisp-callable Value (a symbol with a function assigned, a lambda, a subr). This can also be written as func.call(args).
  • args can be:
    • An array, or a slice of Value.
    • A tuple of different types, each satisfying the IntoLisp trait.

// (list "str" 2)
env.call("list", ("str", 2))?;

let list = env.intern("list")?;
// (symbol-function 'list)
let subr = env.call("symbol-function", [list])?;
// (funcall 'list "str" 2)
env.call(list, ("str", 2))?;
// (funcall (symbol-function 'list) "str" 2)
env.call(subr, ("str", 2))?;
subr.call(("str", 2))?; // Like the above, but shorter.

// (add-hook 'text-mode-hook 'variable-pitch-mode)
env.call("add-hook", [
    env.intern("text-mode-hook")?,
    env.intern("variable-pitch-mode")?,
])?;

#[defun]
fn listify_vec(vector: Vector) -> Result<Value> {
    let mut args = vec![];
    for e in vector {
        args.push(e);
    }
    vector.0.env.call("list", &args)
}

Type Conversions

The type Value represents Lisp values:

  • They can be copied around, but cannot outlive the Env they come from.
  • They are "proxy values": only useful when converted to Rust values, or used as arguments when calling Lisp functions.

Converting a Lisp Value to Rust

This is enabled for types that implement FromLisp. Most built-in types are supported. Note that conversion may fail, so the return type is Result<T>.

let i: i64 = value.into_rust()?; // error if Lisp value is not an integer
let f: f64 = value.into_rust()?; // error if Lisp value is nil

let s = value.into_rust::<String>()?;
let s: Option<&str> = value.into_rust()?; // None if Lisp value is nil

It's better to declare input types for #[defun] than to call .into_rust(), unless delayed conversion is needed.

Converting a Rust Value to Lisp

This is enabled for types that implement IntoLisp. Most built-in types are supported. Note that conversion may fail, so the return type is Result<Value<'_>>.

"abc".into_lisp(env)?;
"a\0bc".into_lisp(env)?; // NulError (Lisp string cannot contain null byte)

5.into_lisp(env)?;
65.3.into_lisp(env)?;

().into_lisp(env)?; // nil
true.into_lisp(env)?; // t
false.into_lisp(env)?; // nil

It's better to declare the return type for #[defun] than to call .into_lisp(env), whenever possible.
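
The NulError mentioned in the example above is the standard library's error for C strings with an interior null byte. The sketch below reproduces it with std::ffi::CString alone. That the crate's string conversion goes through a C string is an assumption inferred from the error name, and `to_c_string` is a hypothetical helper.

```rust
use std::ffi::{CString, NulError};

/// Hypothetical helper: converts a Rust string to a C string, which fails
/// if the input contains a null byte anywhere before its end.
fn to_c_string(s: &str) -> Result<CString, NulError> {
    CString::new(s)
}
```

The returned NulError records the position of the offending byte, which can be useful when reporting the failure.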

Integers

Integer conversion is lossless by default, which means that a module will signal an "out of range" rust-error in cases such as:

  • A #[defun] expecting u8 gets passed -1.
  • A #[defun] returning u64 returns a value larger than i64::MAX.

To disable this behavior, use the lossy-integer-conversion feature:

[dependencies.emacs]
features = ["lossy-integer-conversion"]
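
The difference between the two modes can be illustrated with the standard library alone. The sketch below assumes that "lossy" means `as`-style wrapping or truncation, which is an assumption about the feature rather than a description of the crate's code; the function names are hypothetical.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Lossless: fails instead of silently changing the value.
fn lossless_to_u8(x: i64) -> Result<u8, TryFromIntError> {
    u8::try_from(x)
}

/// Lossy (assumed semantics): `as` wraps or truncates out-of-range values.
fn lossy_to_u8(x: i64) -> u8 {
    x as u8
}

/// Lossless in the other direction: `u64` values above `i64::MAX` are rejected.
fn lossless_to_i64(x: u64) -> Result<i64, TryFromIntError> {
    i64::try_from(x)
}
```

Passing -1 to `lossless_to_u8` gives an error, which corresponds to the "out of range" case listed above, while `lossy_to_u8` quietly produces a different number.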

Support for Rust's NonZero integer types is disabled by default. To enable it, use the nonzero-integer-conversion feature:

[dependencies.emacs]
features = ["nonzero-integer-conversion"]

Strings

By default, no utf-8 validation is done when converting Lisp strings into Rust strings, because the string data returned by Emacs is guaranteed to be a valid utf-8 sequence. If you suspect an Emacs bug that breaks this guarantee, utf-8 validation can be enabled through a feature:

[dependencies.emacs]
features = ["utf-8-validation"]
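
The trade-off behind this feature is the same one the standard library exposes for byte-to-string conversion. The following sketch shows both variants using only std; it illustrates the idea and is not the crate's conversion code. The function names are hypothetical.

```rust
use std::string::FromUtf8Error;

/// Validating conversion: checks every byte and reports invalid utf-8.
fn validated(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}

/// Non-validating conversion: skips the check entirely.
///
/// # Safety
/// The caller must guarantee that `bytes` is valid utf-8.
unsafe fn unvalidated(bytes: Vec<u8>) -> String {
    // SAFETY: upheld by the caller, as documented above.
    unsafe { String::from_utf8_unchecked(bytes) }
}
```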

Vectors

Lisp vectors are represented by the type Vector, which can be considered a "sub-type" of Value.

To construct Lisp vectors, use env.make_vector and env.vector, which are efficient wrappers of Emacs's built-in subroutines make-vector and vector.

env.make_vector(5, ())?;

env.vector([1, 2, 3])?;

env.vector((1, "x", true))?;

Embedding Rust Values in Lisp

Speeding up Emacs is one of the goals of dynamic modules. Too many back-and-forth conversions between Rust's data structures and Lisp's can defeat the purpose. The solution to this is embedding Rust data structures in opaque user-ptr Lisp objects.

If a type implements Transfer, its heap-allocated (Box-wrapped) values can be moved into the Lisp runtime, where the GC will become their owner.

Lisp code sees these as opaque "embedded user pointers", whose printed representation is something like #<user-ptr ptr=0x102e10b60 finalizer=0x103c9c390>. For these values to be useful, a Rust module needs to export additional functions to manipulate them.

Since these values are owned by the GC, Rust code can only safely access them through immutable references. Therefore, interior mutability is usually needed. As a result, Transfer is implemented for the smart pointer types RefCell, Mutex, RwLock, Rc, and Arc.
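
The ownership hand-off described above can be sketched with raw pointers from the standard library. This is a simplified model under the assumption that the foreign runtime stores an opaque pointer plus a finalizer; `into_raw`, `finalize`, and `borrow` are hypothetical names, and the crate's real code is not shown here.

```rust
use std::os::raw::c_void;

/// Moves a value to the heap and gives up ownership, returning a raw pointer
/// that a foreign runtime (a stand-in for the Lisp GC) can hold.
fn into_raw<T>(value: T) -> *mut c_void {
    Box::into_raw(Box::new(value)) as *mut c_void
}

/// Finalizer the foreign runtime would call once the object is unreachable.
///
/// # Safety
/// `ptr` must come from `into_raw::<T>` and must not be used afterwards.
unsafe extern "C" fn finalize<T>(ptr: *mut c_void) {
    // Rebuild the Box so that the value is dropped and its memory freed.
    drop(unsafe { Box::from_raw(ptr as *mut T) });
}

/// Borrows the embedded value back through a shared reference only.
///
/// # Safety
/// `ptr` must come from `into_raw::<T>` and must not have been finalized.
unsafe fn borrow<'a, T>(ptr: *mut c_void) -> &'a T {
    unsafe { &*(ptr as *const T) }
}
```

Because `borrow` can only hand out shared references, any mutation has to go through a type with interior mutability, which matches the reasoning in the paragraph above.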

To return an embedded value, a function needs to be exported with a user_ptr option:

  • user_ptr: Embedding through a RefCell. This is suitable for common use cases, where module functions can borrow the underlying data back for read/write. It is safe because Lisp threads are subject to the GIL. BorrowError/BorrowMutError may be signaled at runtime, depending on how module functions call back into the Lisp runtime.
  • user_ptr(rwlock), user_ptr(mutex): Embedding through a RwLock/Mutex. This is suitable for sharing data between module functions (on Lisp threads, with Env access) and pure Rust code (on background threads, without access to an Env).
  • user_ptr(direct): Embedding a Transfer value directly. This is suitable for immutable data that will only be read back (not written to) by module functions (writing requires unsafe access, and is discouraged).
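
The runtime borrow errors mentioned for the plain user_ptr option come from RefCell's dynamic borrow checking. The sketch below reproduces the situation with std only: one borrow is still alive (as it would be while a module function is calling back into Lisp) when a second, mutable borrow is requested. The function name is hypothetical.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

type Map = HashMap<String, String>;

/// Returns true if a write is refused while a read borrow is still alive.
fn write_refused_during_read(cell: &RefCell<Map>) -> bool {
    // Stands in for an outer function that holds `&Map`.
    let _reader = cell.borrow();
    // Stands in for a re-entrant function that asks for `&mut Map`.
    let refused = cell.try_borrow_mut().is_err();
    refused
}
```

Once the outer borrow is released, a mutable borrow succeeds again, so the error depends on call nesting rather than on the data itself.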

As an example, a module that allows Emacs to use Rust's HashMap may look like this:

use std::collections::HashMap;
use emacs::{defun, Env, Result, Value};

#[emacs::module(name = "rs-hash-map", separator = "/")]
fn init(env: &Env) -> Result<()> {
    type Map = HashMap<String, String>;

    #[defun(user_ptr)]
    fn make() -> Result<Map> {
        Ok(Map::new())
    }

    #[defun]
    fn get(map: &Map, key: String) -> Result<Option<&String>> {
        Ok(map.get(&key))
    }

    #[defun]
    fn set(map: &mut Map, key: String, value: String) -> Result<Option<String>> {
        Ok(map.insert(key, value))
    }

    Ok(())
}

(let ((m (rs-hash-map/make)))
  (rs-hash-map/get m "a")     ; -> nil

  (rs-hash-map/set m "a" "1") ; -> nil
  (rs-hash-map/get m "a")     ; -> "1"

  (rs-hash-map/set m "a" "2") ; -> "1"
  (rs-hash-map/get m "a"))    ; -> "2"

Notes:

  • Value.into_rust() has a runtime type check, which fails with the error 'rust-wrong-type-user-ptr if the value is a user-ptr object of a different type.
  • Input parameters with reference types are interpreted as RefCell-embedded user-ptr objects. For other kinds of embedding, you will have to use a Value parameter, and acquire the reference manually, since locking strategy (including deadlock avoidance/detection) should be module-specific.
    use std::sync::RwLock;

    #[defun(user_ptr(rwlock))]
    fn make() -> Result<Map> {
        Ok(Map::new())
    }

    #[defun]
    fn get(v: Value<'_>, key: String) -> Result<Value<'_>> {
        let lock: &RwLock<Map> = v.into_rust()?;
        let map = lock.try_read().map_err(|_| Error::msg("map is busy"))?;
        map.get(&key).into_lisp(v.env)
    }
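
The locking pattern itself does not depend on Emacs. The following std-only sketch shows a map behind an RwLock being written by a background thread and then read back with a non-blocking try_read, as in the example above. Sharing the lock through an Arc is an assumption made so that the sketch can run on its own, and `write_in_background` is a hypothetical function.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

type Map = HashMap<String, String>;

/// Spawns a background thread that writes to the shared map, waits for it,
/// then reads the value back without blocking.
fn write_in_background(shared: Arc<RwLock<Map>>, key: &str, value: &str) -> Option<String> {
    let (k, v) = (key.to_owned(), value.to_owned());
    let writer = Arc::clone(&shared);
    thread::spawn(move || {
        // Pure Rust code: no access to an `Env` is needed here.
        writer.write().unwrap().insert(k, v);
    })
    .join()
    .unwrap();

    // `try_read` returns an error instead of blocking if a writer holds the lock.
    let map = shared.try_read().ok()?;
    let found = map.get(key).cloned();
    found
}
```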

Lifetime-constrained Types

When a type is constrained by a (non-static) lifetime, its value cannot be embedded unchanged. Before embedding, the lifetime must be soundly elided. In other words, static ownership must be correctly given up.

The typical example is a struct holding a reference to another struct:

pub struct Tree;

pub struct Node<'t> {
    pub tree: &'t Tree,
}

impl Tree {
    pub fn root_node(&self) -> Node<'_> {
        ...
    }
}

impl<'t> Node<'t> {
    pub fn child(&self) -> Node<'t> {
        ...
    }
}

In this case, the lifetime can be elided by turning the static reference into a dynamic ref-counted pointer. The rental crate provides a convenient way to do this:

#[macro_use]
extern crate rental;

use std::{rc::Rc, marker::PhantomData};
use emacs::{defun, Result};

// PhantomData is needed because map_suffix requires a type parameter.
// See https://github.com/jpernst/rental/issues/35.
pub struct PhantomNode<'t, T>(Node<'t>, PhantomData<T>);

impl<'t> PhantomNode<'t, ()> {
    fn child(&self) -> Self {
        PhantomNode(self.0.child(), PhantomData)
    }
}

rental! {
    pub mod inner {
        use std::rc::Rc;

        // Self-referential struct that holds both
        // the actual Node and the ref-counted Tree.
        #[rental(map_suffix = "T")]
        pub struct RentingNode<T: 'static> {
            tree: Rc<super::Tree>,
            node: super::PhantomNode<'tree, T>
        }
    }
}

type RentingNode = inner::RentingNode<()>;

#[defun(user_ptr)]
fn root_node(tree: Value) -> Result<RentingNode> {
    let rc: &Rc<Tree> = tree.into_rust()?;
    Ok(RentingNode::new(rc.clone(), |tree| tree.root_node()))
}

#[defun(user_ptr)]
fn child(node: &RentingNode) -> Result<RentingNode> {
    node.map(|n| n.child())
}

Note that there's no unsafe involved directly, as the soundness proofs are already encapsulated in rental macros.
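
For comparison, the same idea of trading a static borrow for a ref-counted pointer can be written by hand when a node can be identified without borrowing from the tree. This is a different technique from the rental-based one above, shown as a std-only sketch: the `Tree` layout (a vector of first-child indices), `OwnedNode`, and all method names are hypothetical.

```rust
use std::rc::Rc;

pub struct Tree {
    // Hypothetical storage: entry `i` is the index of node `i`'s first child.
    first_child: Vec<Option<usize>>,
}

impl Tree {
    pub fn new(first_child: Vec<Option<usize>>) -> Self {
        Tree { first_child }
    }
}

/// Owned counterpart of `Node<'t>`: keeps the tree alive through `Rc`
/// and refers to a node by index rather than by reference.
pub struct OwnedNode {
    tree: Rc<Tree>,
    index: usize,
}

impl OwnedNode {
    pub fn root(tree: &Rc<Tree>) -> Option<OwnedNode> {
        if tree.first_child.is_empty() {
            None
        } else {
            Some(OwnedNode { tree: Rc::clone(tree), index: 0 })
        }
    }

    pub fn child(&self) -> Option<OwnedNode> {
        let index = self.tree.first_child[self.index]?;
        Some(OwnedNode { tree: Rc::clone(&self.tree), index })
    }

    pub fn index(&self) -> usize {
        self.index
    }
}
```

An `OwnedNode` has no lifetime parameter, so it stays valid even after every other handle to the tree has been dropped.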

Error Handling and Signaling

Emacs Lisp's error handling mechanism uses non-local exits, while Rust uses the Result enum. emacs-module-rs converts between the two at the Rust-Lisp boundary (more precisely, the Rust-C boundary).

The chosen error type is the Error struct from the anyhow crate:

pub type Result<T> = std::result::Result<T, anyhow::Error>;

Handling Lisp Errors in Rust

When calling a Lisp function, it's usually a good idea to propagate signaled errors with the ? operator, letting higher level (Lisp) code handle them. If you want to handle a specific error, you can use error.downcast_ref:

use emacs::ErrorKind::{self, Signal};

match env.call("insert", &[some_text]) {
    Err(error) => {
        // Handle `buffer-read-only` error.
        if let Some(Signal { symbol, .. }) = error.downcast_ref::<ErrorKind>() {
            let buffer_read_only = env.intern("buffer-read-only")?;
            // `symbol` is a `TempValue` that must be converted to `Value`.
            let symbol = unsafe { symbol.value(env) };
            if env.eq(symbol, buffer_read_only) {
                env.message("This buffer is not writable!")?;
                return Ok(())
            }
        }
        // Propagate other errors.
        Err(error)
    },
    v => v,
}

Note the use of unsafe to extract the error symbol as a Value. The reason is that ErrorKind::Signal is marked Send+Sync, for compatibility with anyhow, while Value is lifetime-bound by env. The unsafe contract here requires the error being handled (and its TempValue) to come from this env, not from another thread, or from a global/thread-local storage.
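
The "handle one kind, propagate the rest" pattern can be tried out without Emacs by using the standard library's boxed errors, which offer the same downcast_ref operation. In the sketch below, `Kind` is a simplified stand-in for an error enum such as ErrorKind (its fields are plain strings, not Lisp values), and `ignore_read_only` is a hypothetical function.

```rust
use std::error::Error;
use std::fmt;

/// Simplified stand-in for an error enum such as `ErrorKind`.
#[derive(Debug)]
enum Kind {
    Signal { symbol: String },
    Throw { tag: String },
}

impl fmt::Display for Kind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Kind::Signal { symbol } => write!(f, "signal: {}", symbol),
            Kind::Throw { tag } => write!(f, "throw: {}", tag),
        }
    }
}

impl Error for Kind {}

type BoxError = Box<dyn Error + Send + Sync + 'static>;

/// Handles one specific signal and propagates every other error unchanged.
fn ignore_read_only(result: Result<(), BoxError>) -> Result<(), BoxError> {
    match result {
        Err(error) => {
            if let Some(Kind::Signal { symbol }) = error.downcast_ref::<Kind>() {
                if symbol == "buffer-read-only" {
                    return Ok(());
                }
            }
            // Propagate other errors.
            Err(error)
        }
        ok => ok,
    }
}
```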

Catching Values Thrown by Lisp

This is similar to handling Lisp errors. The only difference is ErrorKind::Throw being used instead of ErrorKind::Signal.

Signaling Lisp Errors from Rust

The function env.signal allows signaling a Lisp error from Rust code. The error symbol must have been defined, e.g. by the macro define_errors!:

// The parentheses denote parent error signals.
// If unspecified, the parent error signal is `error`.
emacs::define_errors! {
    my_custom_error "This number should not be negative" (arith_error range_error)
}

#[defun]
fn signal_if_negative(env: &Env, x: i16) -> Result<()> {
    if x < 0 {
        return env.signal(my_custom_error, ("associated", "DATA", 7));
    }
    Ok(())
}

Handling Rust Errors in Lisp

In addition to standard errors, Rust module functions can signal Rust-specific errors, which can also be handled by condition-case:

  • rust-error: The message is Rust error. This covers all generic Rust-originated errors.
  • rust-wrong-type-user-ptr: The message is Wrong type user-ptr. This happens when Rust code is passed a user-ptr of a type it's not expecting. It is a sub-type of rust-error.
    // May signal if `value` holds a different type of hash map,
    // or is a `user-ptr` defined in a non-Rust module.
    let r: &RefCell<HashMap<String, String>> = value.into_rust()?;

Panics

Unwinding from Rust into C is undefined behavior. emacs-module-rs prevents that by using catch_unwind at the Rust-to-C boundary to convert a panic into a Lisp signal/throw of the appropriate type:

  • Normally the panic is converted into a Lisp error signal of the type rust-panic. Note that it is not a sub-type of rust-error.
  • If the panic value is an ErrorKind, it is converted to the corresponding signal/throw, as if a Result was returned. This allows propagating Lisp's non-local exits through contexts where Result is not appropriate, e.g. callbacks whose types are dictated by 3rd-party libraries, such as tree-sitter.
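
The boundary guard can be sketched with std::panic::catch_unwind alone. The version below only turns a panic into an Err carrying the panic message; mapping that to a rust-panic signal or to an ErrorKind, as described above, is left out. `call_guarded` is a hypothetical name.

```rust
use std::panic::{self, UnwindSafe};

/// Runs `f`, converting a panic into an `Err` holding the panic message,
/// so that unwinding never continues past this function.
fn call_guarded<T>(f: impl FnOnce() -> T + UnwindSafe) -> Result<T, String> {
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually `&'static str` or `String`.
        if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_owned()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "non-string panic payload".to_owned()
        }
    })
}
```

Note that catch_unwind only works when panics unwind; with `panic = "abort"` the process terminates instead.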

Testing

You can define tests using ert, then use a bash script to load the module and run the tests. Examples:

For continuous testing during development, run this (requires cargo-watch):

bin/test watch

A future version will have tighter integration with either cargo or Cask.

Live Reloading

Live code reloading is very useful during development. However, Emacs does not support unloading modules. Live reloading thus requires a custom module loader, e.g. emacs-rs-module, which is itself a dynamic module.

To use it, load it in Emacs:

(require 'rs-module)

Then use it to load other modules instead of require or module-load:

;; Will unload the old version of the module first.
(rs-module/load "full/path/to/module.so")

cargo doesn't support installing dynamic libs yet, so you have to include emacs-rs-module as a dev dependency to compile it on your own:

[dev-dependencies]
emacs-rs-module = { version = "0.13.0" }

magit-libgit2 is an example of how to set this all up, to have live-reloading on-save.

A future version will have tighter integration with cargo.

Notes:

  • It mainly works on Linux, possibly only because Linux's dynamic loading system is unsafe (i.e. ridden with UB traps).
  • It doesn't work on macOS 10.13+ (High Sierra and up), because macOS doesn't unload dynamic libraries that use TLS (thread-local storage), for safety reasons. See Rust's issue #28794.
  • It doesn't work on Windows, since loading the dll prevents writing to its file.