Overview
emacs-module-rs provides high-level Rust bindings and tools for writing Emacs dynamic modules. It is easy to use if you know either Rust or Emacs.
It currently supports:
- Stable Rust 1.56+.
- Emacs 25 or above, built with module support.
- macOS, Linux, Windows.
Setting up
- Install the Rust toolchain with rustup.
- Make sure that your Emacs was compiled with module support. Check that module-file-suffix is not nil, and the function module-load is defined.
  - On macOS, the recommended installation method is MacPorts (emacs-app and emacs-mac-app).
  - On Windows, the recommended installation method for older Emacs versions (before 27.2) is pacman -S mingw-w64-x86_64-emacs (msys2), as the archives on GNU FTP's server were built without module support.
Notes
- On Windows, only Rust's msvc toolchain was confirmed to work, not the gnu toolchain.
- When the optional feature bindgen is enabled, the raw binding will be generated from emacs-module.h at build time. Therefore you will also need to install clang. (This is recommended only for troubleshooting, though.) For example, on Windows:
  # In Powershell
  scoop install llvm
  $env:LIBCLANG_PATH = "$(scoop prefix llvm)\bin"
  cargo build --all
Known Issues
There is a bug (see issue #1) with Emacs 26 on Linux that prevents it from loading any dynamic modules (even those written in C), if:
- Emacs is built without thread support.
- The OS is Ubuntu 16.04 (Xenial).
Hello, Emacs!
Create a new project:
cargo new greeting
cd greeting
Modify Cargo.toml:
[package]
edition = "2018"
[lib]
crate-type = ["cdylib"]
[dependencies]
emacs = "0.19"
Write code in src/lib.rs:
use emacs::{defun, Env, Result, Value};

// Emacs won't load the module without this.
emacs::plugin_is_GPL_compatible!();

// Register the initialization hook that Emacs will call when it loads the module.
#[emacs::module]
fn init(env: &Env) -> Result<Value<'_>> {
    env.message("Done loading!")
}

// Define a function callable by Lisp code.
#[defun]
fn say_hello(env: &Env, name: String) -> Result<Value<'_>> {
    env.message(&format!("Hello, {}!", name))
}
Build the module and create a symlink with the .so extension so that Emacs can recognize it:
cargo build
cd target/debug
# If you are on Linux
ln -s libgreeting.so greeting.so
# If you are on macOS
ln -s libgreeting.dylib greeting.so
Add target/debug to your Emacs's load-path, then load the module:
(add-to-list 'load-path "/path/to/target/debug")
(require 'greeting)
(greeting-say-hello "Emacs")
The minibuffer should display the message "Hello, Emacs!".
Declaring a Module
Each dynamic module must have an initialization function, marked by the attribute macro #[emacs::module]. The function's type must be fn(&Env) -> Result<()>.
In addition, in order to be loadable by Emacs, the module must be declared GPL-compatible.
emacs::plugin_is_GPL_compatible!();

#[emacs::module]
fn init(env: &Env) -> Result<()> {
    // This is run when Emacs loads the module.
    // More concretely, it is run after all the functions it defines are exported,
    // but before `(provide 'feature-name)` is (automatically) called.
    Ok(())
}
Options
- name: By default, the name of the feature provided by the module is the crate's name (with _ replaced by -). There is no need to explicitly call provide inside the initialization function. This option allows the function's name, or a string, to be used instead.
  // Putting `rs` in crate's name is discouraged so we use the function's name
  // instead. The feature will be `rs-module-helper`.
  #[emacs::module(name(fn))]
  fn rs_module_helper(_: &Env) -> Result<()> {
      Ok(())
  }
- defun_prefix and separator: Function names in Emacs are conventionally prefixed with the feature name followed by -. These 2 options allow a different prefix and separator to be used.
  // Use `/` as the separator that goes after feature name, like some other packages.
  #[emacs::module(separator = "/")]
  fn init(_: &Env) -> Result<()> {
      Ok(())
  }

  // The whole package contains other Lisp files, so the module is named
  // `tree-sitter-dyn`. But we want functions to be `tree-sitter-something`,
  // not `tree-sitter-dyn-something`.
  #[emacs::module(name = "tree-sitter-dyn", defun_prefix = "tree-sitter")]
  fn init(_: &Env) -> Result<()> {
      Ok(())
  }
- mod_in_name: Whether to use Rust's mod path to construct function names. Defaults to true. For example, supposing that the crate is named parser, a #[defun] named next_child inside mod cursor will have the Lisp name parser-cursor-next-child. This can also be overridden for each individual function, by an option of the same name on #[defun].
Note: Often, no initialization logic is needed. A future version of this crate will support putting #![emacs::module] on the crate, without having to define a no-op function. See Rust's issue #54726.
Writing Functions
You can use the attribute macro #[defun] to export Rust functions to the Lisp runtime, so that Lisp code can call them. The exporting process happens when the module is loaded, even if the definitions are inside another function that is never called, or inside a private mod.
Input Parameters
Each parameter must be one of the following:
- An owned value of a type that implements FromLisp. This is for simple data types that have an equivalent in Lisp.
  /// This docstring will appear in Lisp too!
  #[defun]
  fn inc(x: i64) -> Result<i64> {
      Ok(x + 1)
  }
- A shared/mutable reference. This gives access to data structures that other module functions have created and embedded in the Lisp runtime (through user-ptr objects).
  #[defun]
  fn stash_pop(repo: &mut git2::Repository) -> Result<()> {
      repo.stash_pop(0, None)?;
      Ok(())
  }
- A Lisp Value, or one of its "sub-types" (e.g. Vector). This allows holding off the conversion to Rust data structures until necessary, or working with values that don't have a meaningful representation in Rust, like Lisp lambdas.
  #[defun]
  fn maybe_call(lambda: Value) -> Result<()> {
      if some_hidden_native_logic() {
          lambda.call([])?;
      }
      Ok(())
  }

  #[defun(user_ptr)]
  fn to_rust_vec_string(input: Vector) -> Result<Vec<String>> {
      let mut vec = vec![];
      for e in input {
          vec.push(e.into_rust()?);
      }
      Ok(vec)
  }
- An &Env. This enables interaction with the Lisp runtime. It does not appear in the function's Lisp signature. This is unnecessary if there is already another parameter with type Value, which allows accessing the runtime through Value.env.
  // Note that the function takes an owned `String`, not a reference, which would
  // have been understood as a `user-ptr` object containing a Rust string.
  #[defun]
  fn hello(env: &Env, name: String) -> Result<Value<'_>> {
      env.message(format!("Hello, {}!", name))
  }
Return Value
The return type must be Result<T>, where T is one of the following:
- A type that implements IntoLisp. This is for simple data types that have an equivalent in Lisp.
  /// Return the path to the .git dir.
  /// Return `nil' if the given path is not in a repo,
  /// or if the .git path is not valid utf-8.
  #[defun]
  fn dot_git_path(path: String) -> Result<Option<String>> {
      Ok(git2::Repository::discover(&path).ok().and_then(|repo| {
          repo.path().to_str().map(|s| s.to_owned())
      }))
  }
- An arbitrary type. This allows embedding a native data structure in a user-ptr object, for read-write use cases. It requires the user_ptr option to be specified. If the data is to be shared with background Rust threads, user_ptr(rwlock) or user_ptr(mutex) must be used instead.
  #[defun(user_ptr)]
  fn repo(path: String) -> Result<git2::Repository> {
      Ok(git2::Repository::discover(&path)?)
  }
- A type that implements Transfer. This allows embedding a native data structure in a user-ptr object, for read-only use cases. It requires the user_ptr(direct) option to be specified.
- Value, or one of its "sub-types" (e.g. Vector). This is mostly useful for returning an input parameter unchanged.
See Custom Types for more details on embedding Rust data structures in Lisp's user-ptr objects.
Naming
By default, the function's Lisp name has the form <feature-prefix>[mod-prefix]<base-name>.
- feature-prefix is the feature name followed by -. This can be customized by the name, defun_prefix, and separator options on #[emacs::module].
- mod-prefix is constructed from the function's Rust mod path (with _ and :: replaced by -). This can be turned off crate-wide, or for individual functions, using the option mod_in_name.
- base-name is the function's Rust name (with _ replaced by -). This can be overridden with the option name.
Examples:
// Assuming crate's name is `native_parallelism`.

#[emacs::module(separator = "/")]
fn init(_: &Env) -> Result<()> {
    Ok(())
}

mod shared_state {
    mod thread {
        // Ignore the nested mod's.
        // (native-parallelism/make-thread "name")
        #[defun(mod_in_name = false)]
        fn make_thread(name: String) -> Result<Value<'_>> {
            ..
        }
    }

    mod process {
        // (native-parallelism/shared-state-process-launch "bckgrnd")
        #[defun]
        fn launch(name: String) -> Result<Value<'_>> {
            ..
        }

        // Specify a name explicitly, since Rust identifier cannot contain `:`.
        // (native-parallelism/process:pool "http-client" 2 8)
        #[defun(mod_in_name = false, name = "process:pool")]
        fn pool(name: String, min: i64, max: i64) -> Result<Value<'_>> {
            ..
        }
    }
}
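The naming rule can be restated as a small standalone function. This is only a sketch of the rule as described in this section, not the crate's implementation: lisp_name, kebab, and all parameter names are hypothetical, and it assumes that module segments are always joined to the base name with -.

```rust
/// Replace `_` with `-`, as the naming rule prescribes.
fn kebab(s: &str) -> String {
    s.replace('_', "-")
}

/// Hypothetical helper that assembles `<feature-prefix>[mod-prefix]<base-name>`.
fn lisp_name(
    prefix: &str,      // feature name, or `defun_prefix` if one is given
    separator: &str,   // "-" unless `separator` is overridden
    mod_path: &[&str], // Rust modules enclosing the function, outermost first
    mod_in_name: bool, // whether the mod path contributes to the name
    base_name: &str,   // Rust function name, or the `name` override
) -> String {
    let mut name = format!("{}{}", prefix, separator);
    if mod_in_name {
        for segment in mod_path {
            name.push_str(&kebab(segment));
            name.push('-');
        }
    }
    name.push_str(&kebab(base_name));
    name
}

fn main() {
    let mods = ["shared_state", "process"];
    // prints "native-parallelism/shared-state-process-launch"
    println!("{}", lisp_name("native-parallelism", "/", &mods, true, "launch"));
    // prints "native-parallelism/process:pool"
    println!("{}", lisp_name("native-parallelism", "/", &mods, false, "process:pool"));
    // prints "parser-cursor-next-child"
    println!("{}", lisp_name("parser", "-", &["cursor"], true, "next_child"));
}
```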
Documentation
#[defun] converts Rust's docstring into Lisp's docstring. It also automatically constructs and appends the function's signature to the end of the docstring, so that help modes can correctly display it.
// `(fn X Y)` is automatically appended, so you don't have to manually do so.
// In help modes, the signature will be (add X Y).

/// Add 2 numbers.
#[defun]
fn add(x: usize, y: usize) -> Result<usize> {
    Ok(x + y)
}
Calling Lisp Functions
Frequently-used Lisp functions are exposed as methods on env:
env.intern("defun")?;

env.message("Hello")?;

env.type_of(5.into_lisp(env)?)?;

env.provide("my-module")?;

env.list((1, "str", true))?;
To call arbitrary Lisp functions, use env.call(func, args).
- func can be:
  - A string identifying a named function in Lisp.
  - Any Lisp-callable Value (a symbol with a function assigned, a lambda, a subr). This can also be written as func.call(args).
- args can be:
  - An array, or a slice of Value.
  - A tuple of different types, each satisfying the IntoLisp trait.
// (list "str" 2)
env.call("list", ("str", 2))?;

let list = env.intern("list")?;
// (symbol-function 'list)
let subr = env.call("symbol-function", [list])?;
// (funcall 'list "str" 2)
env.call(list, ("str", 2))?;
// (funcall (symbol-function 'list) "str" 2)
env.call(subr, ("str", 2))?;
subr.call(("str", 2))?; // Like the above, but shorter.

// (add-hook 'text-mode-hook 'variable-pitch-mode)
env.call("add-hook", [
    env.intern("text-mode-hook")?,
    env.intern("variable-pitch-mode")?,
])?;

#[defun]
fn listify_vec(vector: Vector) -> Result<Value> {
    let mut args = vec![];
    for e in vector {
        args.push(e)
    }
    vector.0.env.call("list", &args)
}
Type Conversions
The type Value represents Lisp values:
- They can be copied around, but cannot outlive the Env they come from.
- They are "proxy values": only useful when converted to Rust values, or used as arguments when calling Lisp functions.
Converting a Lisp Value to Rust
This is enabled for types that implement FromLisp. Most built-in types are supported. Note that conversion may fail, so the return type is Result<T>.
let i: i64 = value.into_rust()?; // error if Lisp value is not an integer
let f: f64 = value.into_rust()?; // error if Lisp value is nil

let s = value.into_rust::<String>()?;
let s: Option<&str> = value.into_rust()?; // None if Lisp value is nil
It's better to declare input types for #[defun] than to call .into_rust(), unless delayed conversion is needed.
Converting a Rust Value to Lisp
This is enabled for types that implement IntoLisp. Most built-in types are supported. Note that conversion may fail, so the return type is Result<Value<'_>>.
"abc".into_lisp(env)?;
"a\0bc".into_lisp(env)?; // NulError (Lisp string cannot contain null byte)

5.into_lisp(env)?;
65.3.into_lisp(env)?;

().into_lisp(env)?; // nil
true.into_lisp(env)?; // t
false.into_lisp(env)?; // nil
It's better to declare a return type for #[defun] than to call .into_lisp(env), whenever possible.
Integers
Integer conversion is lossless by default, which means that a module will signal an "out of range" rust-error in cases such as:
- A #[defun] expecting u8 gets passed -1.
- A #[defun] returning u64 returns a value larger than i64::max_value().
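The two cases above correspond to checked integer conversions in plain Rust. The sketch below uses only the standard library and models a Lisp integer as i64, which is an assumption made for illustration. The helper names lossless_u8 and lossless_i64 are hypothetical and are not part of the crate.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for decoding a Lisp integer (modelled as i64)
/// into the `u8` that a #[defun] parameter asks for.
fn lossless_u8(lisp_int: i64) -> Result<u8, String> {
    u8::try_from(lisp_int).map_err(|_| format!("out of range: {}", lisp_int))
}

/// Hypothetical stand-in for encoding a `u64` return value as a Lisp integer.
fn lossless_i64(value: u64) -> Result<i64, String> {
    i64::try_from(value).map_err(|_| format!("out of range: {}", value))
}

fn main() {
    // In-range values convert unchanged.
    assert_eq!(lossless_u8(200), Ok(200));
    // -1 does not fit in a u8, so the checked conversion fails.
    assert!(lossless_u8(-1).is_err());
    // u64::MAX is larger than i64::MAX, so this fails as well.
    assert!(lossless_i64(u64::MAX).is_err());
    // For contrast, a plain `as` cast wraps silently instead of failing.
    assert_eq!(-1i64 as u8, 255);
}
```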
To disable this behavior, use the lossy-integer-conversion feature:
[dependencies.emacs]
features = ["lossy-integer-conversion"]
Support for Rust's NonZero integer types is disabled by default. To enable it, use the nonzero-integer-conversion feature:
[dependencies.emacs]
features = ["nonzero-integer-conversion"]
Strings
By default, no utf-8 validation is done when converting Lisp strings into Rust strings, because the string data returned by Emacs is guaranteed to be a valid utf-8 sequence. If you suspect that an Emacs bug has broken this guarantee, utf-8 validation can be enabled through a feature:
[dependencies.emacs]
features = ["utf-8-validation"]
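The trade-off behind this feature can be seen with the standard library's checked and unchecked conversions. This is a sketch only: which functions the crate actually uses internally is an assumption, and the helper name checked is hypothetical.

```rust
use std::str;

/// Checked conversion: scans the bytes and rejects invalid sequences.
fn checked(bytes: &[u8]) -> Result<&str, str::Utf8Error> {
    str::from_utf8(bytes)
}

fn main() {
    // Valid UTF-8 passes validation.
    assert_eq!(checked("café".as_bytes()).unwrap(), "café");
    // The byte 0xFF never appears in valid UTF-8, so validation fails.
    assert!(checked(&[b'a', 0xFF, b'b']).is_err());
    // The unchecked variant skips the scan. It is only sound when the
    // bytes are already known to be valid, hence the `unsafe`.
    let s = unsafe { str::from_utf8_unchecked(b"abc") };
    assert_eq!(s, "abc");
}
```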
Vectors
Lisp vectors are represented by the type Vector, which can be considered a "sub-type" of Value.
To construct Lisp vectors, use env.make_vector and env.vector, which are efficient wrappers of Emacs's built-in subroutines make-vector and vector.
env.make_vector(5, ())?;

env.vector([1, 2, 3])?;

env.vector((1, "x", true))?;
Embedding Rust Values in Lisp
Speeding up Emacs is one of the goals of dynamic modules. Too many back-and-forth conversions between Rust's data structures and Lisp's can defeat the purpose. The solution to this is embedding Rust data structures in opaque user-ptr Lisp objects.
If a type implements Transfer, its heap-allocated (Box-wrapped) values can be moved into the Lisp runtime, where the GC will become its owner.
Lisp code sees these as opaque "embedded user pointers", whose printed representation is something like #<user-ptr ptr=0x102e10b60 finalizer=0x103c9c390>. For these values to be useful, a Rust module needs to export additional functions to manipulate them.
Since these values are owned by the GC, Rust code can only safely access them through immutable references. Therefore, interior mutability is usually needed. As a result, Transfer is implemented for the smart pointer types RefCell, Mutex, RwLock, Rc, and Arc.
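The ownership hand-off described here follows a common FFI pattern: box the value, give the raw pointer to the host, and let a finalizer rebuild the box when the host's GC collects the object. The following is a minimal standalone sketch of that pattern, not the crate's code. Tracked, into_host, and finalize are hypothetical names, and the "host" is simulated by calling the finalizer by hand.

```rust
use std::cell::Cell;
use std::os::raw::c_void;
use std::rc::Rc;

/// Hypothetical payload that records when it is dropped.
struct Tracked {
    dropped: Rc<Cell<bool>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Box the value and give up ownership; a host runtime would store this pointer.
fn into_host<T>(value: T) -> *mut c_void {
    Box::into_raw(Box::new(value)) as *mut c_void
}

/// Finalizer that a host GC would call exactly once when collecting the object.
/// Safety: `ptr` must come from `into_host::<T>` and must not be used afterwards.
unsafe extern "C" fn finalize<T>(ptr: *mut c_void) {
    // Rebuild the Box so that Rust runs the destructor and frees the memory.
    unsafe { drop(Box::from_raw(ptr as *mut T)) }
}

fn main() {
    let dropped = Rc::new(Cell::new(false));
    let ptr = into_host(Tracked { dropped: dropped.clone() });
    // Rust no longer owns the value, so nothing has been dropped yet.
    assert!(!dropped.get());
    // Simulate the GC collecting the user-ptr.
    unsafe { finalize::<Tracked>(ptr) };
    assert!(dropped.get());
}
```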
To return an embedded value, a function needs to be exported with a user_ptr option:
- user_ptr: Embedding through a RefCell. This is suitable for common use cases, where module functions can borrow the underlying data back for read/write. It is safe because Lisp threads are subject to the GIL. BorrowError/BorrowMutError may be signaled at runtime, depending on how module functions call back into the Lisp runtime.
- user_ptr(rwlock), user_ptr(mutex): Embedding through a RwLock/Mutex. This is suitable for sharing data between module functions (on Lisp threads, with Env access) and pure Rust code (on background threads, without access to an Env).
- user_ptr(direct): Embedding a Transfer value directly. This is suitable for immutable data that will only be read back (not written to) by module functions (writing requires unsafe access, and is discouraged).
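The BorrowError/BorrowMutError case is ordinary RefCell behavior: a second borrow attempted while a mutable borrow is still alive fails at runtime. The standalone sketch below shows this with the standard library only. The helper second_borrow_ok_while_writing is hypothetical, and the re-entrant call is simulated rather than going through a Lisp runtime.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

type Map = HashMap<String, String>;

/// Returns whether a second borrow succeeds while a mutable borrow is held.
/// This simulates a function that holds `&mut Map` and, before returning,
/// triggers another function that wants to borrow the same data.
fn second_borrow_ok_while_writing(cell: &RefCell<Map>) -> bool {
    let mut writer = cell.borrow_mut();
    writer.insert("a".to_owned(), "1".to_owned());
    // `writer` is still alive here, so this attempt yields a BorrowError.
    cell.try_borrow().is_ok()
}

fn main() {
    let cell = RefCell::new(Map::new());
    assert!(!second_borrow_ok_while_writing(&cell));
    // Once the first borrow has ended, borrowing works again.
    assert!(cell.try_borrow_mut().is_ok());
    assert_eq!(cell.borrow().get("a").map(String::as_str), Some("1"));
}
```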
As an example, a module that allows Emacs to use Rust's HashMap may look like this:
use std::collections::HashMap;
use emacs::{defun, Env, Result, Value};

#[emacs::module(name = "rs-hash-map", separator = "/")]
fn init(env: &Env) -> Result<()> {
    type Map = HashMap<String, String>;

    #[defun(user_ptr)]
    fn make() -> Result<Map> {
        Ok(Map::new())
    }

    #[defun]
    fn get(map: &Map, key: String) -> Result<Option<&String>> {
        Ok(map.get(&key))
    }

    #[defun]
    fn set(map: &mut Map, key: String, value: String) -> Result<Option<String>> {
        Ok(map.insert(key, value))
    }

    Ok(())
}
(let ((m (rs-hash-map/make)))
(rs-hash-map/get m "a") ; -> nil
(rs-hash-map/set m "a" "1") ; -> nil
(rs-hash-map/get m "a") ; -> "1"
(rs-hash-map/set m "a" "2") ; -> "1"
(rs-hash-map/get m "a")) ; -> "2"
Notes:
- Value.into_rust() has a runtime type check, which fails with the error 'rust-wrong-type-user-ptr if the value is a user-ptr object of a different type.
- Input parameters with reference types are interpreted as RefCell-embedded user-ptr objects. For other kinds of embedding, you will have to use a Value parameter, and acquire the reference manually, since locking strategy (including deadlock avoidance/detection) should be module-specific.
  use std::sync::RwLock;

  #[defun(user_ptr(rwlock))]
  fn make() -> Result<Map> {
      Ok(Map::new())
  }

  #[defun]
  fn get(v: Value<'_>, key: String) -> Result<Value<'_>> {
      let lock: &RwLock<Map> = v.into_rust()?;
      let map = lock.try_read().map_err(|_| Error::msg("map is busy"))?;
      map.get(&key).into_lisp(v.env)
  }
Lifetime-constrained Types
When a type is constrained by a (non-static) lifetime, its value cannot be embedded unchanged. Before embedding, the lifetime must be soundly elided. In other words, static ownership must be correctly given up.
The typical example is a struct holding a reference to another struct:
pub struct Tree;

pub struct Node<'t> {
    pub tree: &'t Tree,
}

impl Tree {
    pub fn root_node(&self) -> Node<'_> {
        ...
    }
}

impl<'t> Node<'t> {
    pub fn child(&self) -> Node<'t> {
        ...
    }
}
In this case, the lifetime can be elided by turning the static reference into a dynamic ref-counted pointer. The rental crate provides a convenient way to do this:
#[macro_use]
extern crate rental;

use std::{rc::Rc, marker::PhantomData};
use emacs::{defun, Result, Value};

// PhantomData is needed because map_suffix requires a type parameter.
// See https://github.com/jpernst/rental/issues/35.
pub struct PhantomNode<'t, T>(Node<'t>, PhantomData<T>);

impl<'t> PhantomNode<'t, ()> {
    fn child(&self) -> Self {
        PhantomNode(self.0.child(), PhantomData)
    }
}

rental! {
    pub mod inner {
        use std::rc::Rc;

        // Self-referential struct that holds both
        // the actual Node and the ref-counted Tree.
        #[rental(map_suffix = "T")]
        pub struct RentingNode<T: 'static> {
            tree: Rc<super::Tree>,
            node: super::PhantomNode<'tree, T>
        }
    }
}

type RentingNode = inner::RentingNode<()>;

#[defun(user_ptr)]
fn root_node(tree: Value) -> Result<RentingNode> {
    let rc: &Rc<Tree> = tree.into_rust()?;
    Ok(RentingNode::new(rc.clone(), |tree| tree.root_node()))
}

#[defun(user_ptr)]
fn child(node: &RentingNode) -> Result<RentingNode> {
    node.map(|n| n.child())
}
Note that there's no unsafe involved directly, as the soundness proofs are already encapsulated in rental macros.
Error Handling and Signaling
Emacs Lisp's error handling mechanism uses non-local exits. Rust uses the Result enum. emacs-module-rs converts between the 2 at the Rust-Lisp boundaries (more precisely, Rust-C).
The chosen error type is the Error struct from the anyhow crate:
pub type Result<T> = result::Result<T, anyhow::Error>;
Handling Lisp Errors in Rust
When calling a Lisp function, it's usually a good idea to propagate signaled errors with the ? operator, letting higher level (Lisp) code handle them. If you want to handle a specific error, you can use error.downcast_ref:
use emacs::ErrorKind::{self, Signal};

match env.call("insert", &[some_text]) {
    Err(error) => {
        // Handle `buffer-read-only` error.
        if let Some(Signal { symbol, .. }) = error.downcast_ref::<ErrorKind>() {
            let buffer_read_only = env.intern("buffer-read-only")?;
            // `symbol` is a `TempValue` that must be converted to `Value`.
            let symbol = unsafe { symbol.value(env) };
            if env.eq(symbol, buffer_read_only) {
                env.message("This buffer is not writable!")?;
                return Ok(())
            }
        }
        // Propagate other errors.
        Err(error)
    },
    v => v,
}
Note the use of unsafe to extract the error symbol as a Value. The reason is that ErrorKind::Signal is marked Send+Sync, for compatibility with anyhow, while Value is lifetime-bound by env. The unsafe contract here requires the error being handled (and its TempValue) to come from this env, not from another thread, or from global/thread-local storage.
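The downcast-then-decide pattern can be tried without Emacs by using the standard library's Box<dyn Error>, which offers downcast_ref much like anyhow's error type does. In this sketch LispSignal, insert, and insert_or_warn are hypothetical stand-ins for ErrorKind::Signal, env.call("insert", ..), and the handler. Symbols are plain strings here, so no unsafe is needed.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for a Lisp signal carried as a Rust error.
#[derive(Debug)]
struct LispSignal {
    symbol: String,
}

impl fmt::Display for LispSignal {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Lisp signal: {}", self.symbol)
    }
}

impl Error for LispSignal {}

type DynResult<T> = Result<T, Box<dyn Error>>;

/// Hypothetical fallible operation standing in for `env.call("insert", ..)`.
fn insert(buffer: &str) -> DynResult<()> {
    match buffer {
        "read-only" => Err(Box::new(LispSignal { symbol: "buffer-read-only".to_owned() })),
        "missing" => Err("no such buffer".into()),
        _ => Ok(()),
    }
}

/// Handle `buffer-read-only` locally and propagate every other error.
fn insert_or_warn(buffer: &str) -> DynResult<&'static str> {
    match insert(buffer) {
        Ok(()) => Ok("inserted"),
        Err(error) => {
            let read_only = error
                .downcast_ref::<LispSignal>()
                .map_or(false, |signal| signal.symbol == "buffer-read-only");
            if read_only {
                Ok("This buffer is not writable!")
            } else {
                Err(error)
            }
        }
    }
}

fn main() {
    assert_eq!(insert_or_warn("scratch").unwrap(), "inserted");
    assert_eq!(insert_or_warn("read-only").unwrap(), "This buffer is not writable!");
    assert!(insert_or_warn("missing").is_err());
}
```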
Catching Values Thrown by Lisp
This is similar to handling Lisp errors. The only difference is that ErrorKind::Throw is used instead of ErrorKind::Signal.
Signaling Lisp Errors from Rust
The function env.signal allows signaling a Lisp error from Rust code. The error symbol must have been defined, e.g. by the macro define_errors!:
// The parentheses denote parent error signals.
// If unspecified, the parent error signal is `error`.
emacs::define_errors! {
    my_custom_error "This number should not be negative" (arith_error range_error)
}

#[defun]
fn signal_if_negative(env: &Env, x: i16) -> Result<()> {
    if x < 0 {
        return env.signal(my_custom_error, ("associated", "DATA", 7))
    }
    Ok(())
}
Handling Rust Errors in Lisp
In addition to standard errors, Rust module functions can signal Rust-specific errors, which can also be handled by condition-case:
- rust-error: The message is Rust error. This covers all generic Rust-originated errors.
- rust-wrong-type-user-ptr: The message is Wrong type user-ptr. This happens when Rust code is passed a user-ptr of a type it's not expecting. It is a sub-type of rust-error.
  // May signal if `value` holds a different type of hash map,
  // or is a `user-ptr` defined in a non-Rust module.
  let r: &RefCell<HashMap<String, String>> = value.into_rust()?;
Panics
Unwinding from Rust into C is undefined behavior. emacs-module-rs prevents that by using catch_unwind at the Rust-to-C boundary to convert a panic into a Lisp signal/throw of the appropriate type:
- Normally the panic is converted into a Lisp error signal of the type rust-panic. Note that it is not a sub-type of rust-error.
- If the panic value is an ErrorKind, it is converted to the corresponding signal/throw, as if a Result was returned. This allows propagating Lisp's non-local exits through contexts where Result is not appropriate, e.g. callbacks whose types are dictated by 3rd-party libraries, such as tree-sitter.
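The general shape of such a boundary guard can be sketched with std::panic::catch_unwind alone. This is not the crate's implementation: guard is a hypothetical helper, the panic is turned into a plain Err(String) rather than a Lisp signal, and the special handling of ErrorKind payloads is not modelled.

```rust
use std::panic::{self, UnwindSafe};

/// Run `f`, converting a panic into an `Err` that carries the panic message.
fn guard<T>(f: impl FnOnce() -> T + UnwindSafe) -> Result<T, String> {
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually `&str` or `String`.
        if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}

fn main() {
    // A closure that returns normally passes its value through.
    assert_eq!(guard(|| 1 + 1), Ok(2));
    // A panicking closure is stopped at the guard instead of unwinding further.
    let caught = guard(|| -> i32 { panic!("boom") });
    assert_eq!(caught, Err("boom".to_string()));
}
```

The default panic hook still prints the panic message to stderr before the guard returns.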
Testing
You can define tests using ert, then use a bash script to load the module and run the tests. Examples:
For continuous testing during development, run this (requires cargo-watch):
bin/test watch
A future version will have tighter integration with either cargo or Cask.
Live Reloading
Live code reloading is very useful during development. However, Emacs does not support unloading modules. Live reloading thus requires a custom module loader, e.g. emacs-rs-module, which is itself a dynamic module.
To use it, load it in Emacs:
(require 'rs-module)
Then use it to load other modules instead of require or module-load:
;; Will unload the old version of the module first.
(rs-module/load "full/path/to/module.so")
cargo doesn't support installing dynamic libs yet, so you have to include emacs-rs-module as a dev dependency to compile it on your own:
[dev-dependencies]
emacs-rs-module = { version = "0.13.0" }
magit-libgit2 is an example of how to set this all up, to have live-reloading on-save.
A future version will have tighter integration with cargo.
Notes:
- It mainly works on Linux, though possibly only because Linux's dynamic loading system is unsafe (i.e. riddled with UB traps).
- It doesn't work on macOS 10.13+ (High Sierra and up), because macOS doesn't unload dynamic libraries that use TLS (thread-local storage), for safety reasons. See Rust's issue #28794.
- It doesn't work on Windows, since loading the dll prevents writing to its file.