Contributors to this document include (in alphabetical order): Jason Zhijingcheng Yu
Version Information: Draft version. Refer to the commit hash.
1. Introduction
The Capstone project is an effort to explore the design of a new CPU instruction set architecture that achieves multiple security goals including memory safety and isolation with one unified hardware abstraction.
Other formats: This document is also available in the following formats:
1.1. Goals
The ultimate goal of Capstone is to unify the numerous hardware abstracts that have been added as extensions to existing architectures as afterthought mitigations to security vulnerabilities. This goal requires a high level of flexibility and extensibility of the Capstone architecture. More specifically, we aim to support the following in a unified manner.
- Exclusive access
-
Software should be guaranteed exclusive access to certain memory regions if needed. This is in spite of the existence of software traditionally entitled to higher privileges such as the OS kernel and the hypervisor.
- Revocable delegation
-
Software components should be able to delegate authority to other components in a revocable manner. For example, after an untrusted library function has been granted access to a memory region, the caller should be able to revoke this access.
- Dynamically extensible hierarchy
-
The hierarchy of authority should be dynamically extensible, unlike traditional platforms which follow a static hierarchy of hypervisor-kernel-user. This makes it possible to use the same set of abstractions for memory isolation and memory sharing regardless of where a software component lies in the hierarchy.
- Safe context switching
-
A mechanism of context switching without trusting any other software component should be provided. This allows for a minimal TCB if necessary in case of a highly security-critical application.
1.2. Major Design Elements
The Capstone architecture design is based on the idea of capabilities, which are unforgeable tokens that represent authority to perform memory accesses and control flow transfers. Capstone extends the traditional capability model with new capability types including the following.
- Linear capabilities
-
Linear capabilities are guaranteed not to alias with other capabilities. Operations on linear capabilities maintain this property. For example, linear capabilities cannot be duplicated. Instead, they can only be moved around across different registers or between registers and memory. They can hence enable safe exclusive access to memory regions. Capabilities that do not have this property are called non-linear capabilities.
- Revocation capabilities
-
Revocation capabilities cannot be used to perform memory accesses or control flow transfers. Instead, they convey the authority to revoke other capabilities. Each revocation capability is derived from a linear capability and can later be used to revoke (i.e., invalidate) capability derived from the same linear capability. This mechanism enables revocable and arbitrarily extensible chains of delegation of authority.
- Uninitialised capabilities
-
Uninitialised capabilities convey write-only authority to memory. They can be turned into linear capabilities after the memory region has been "initialised", that is, when the whole memory region has been overwritten with fresh data. Uninitialised capabilities enable safe initialisation of memory regions and prevent secret leakage without incurring extra performance overhead.
1.3. Capstone-RISC-V ISA Overview
While Capstone does not assume any specific modern ISA, we choose to propose a Capstone extension to RISC-V due to its open nature and the availability of toolchains and simulators.
The Capstone-RISC-V ISA is a 64-bit RISC-V extension that makes the following types of changes to the base architecture:
-
Each general-purpose register is extended to 129 bits to accomodate 128-bit capabilities.
-
New instructions for manipulating capabilities are added.
-
New instructions for memory accesses using capabilities are added.
-
New instructions for control flow transfers using capabilities are added.
-
Semantics of a small number of existing instructions are changed to support capabilities.
-
Semantics of interrupts and exceptions are changed to support capabilities.
1.4. Assembly Mnemonics
Each Capstone-RISC-V instruction is given a mnemonic prefixed with CS.
.
In contexts where it is clear we are discussing Capstone-RISC-V instructions,
we will omit the CS.
prefix for brevity.
In assembly code, the list of operands to an instruction is supplied following the
instruction mnemonic, with the operands separated by commas, in the order of
rd
, rs1
, rs2
, imm
for any operand the instruction expects.
1.5. Notations
When specifying the semantics of instructions, we use the following notations to represent the type of each operand:
- I
-
Integer register.
- C
-
Capability register.
- S
-
Sign-extended immediate.
- Z
-
Zero-extended immediate.
1.6. Bibliography
The initial design of Capstone has been discussed in the following paper:
-
Capstone: A Capability-based Foundation for Trustless Secure Memory Access by Jason Zhijingcheng Yu, Conrad Watt, Aditya Badole, Trevor E. Carlson, Prateek Saxena. In Proceedings of the 32nd USENIX Security Symposium. Annaheim, CA, USA. August 2023.
2. Programming Model
The Capstone-RISC-V ISA has extended the part of the machine state, including both some registers and the memory, to enable the storage and handling of capabilities.
2.1. Capabilities
2.1.1. Width
The width of a capability is 128 bits. We represent this as
CLEN = 128
. Note that this does not affect the width of a
raw address, which is XLEN = 64
bits.
2.1.2. Fields
Each capability has the following architecturally-visible fields:
Name | Range | Description |
---|---|---|
|
|
The type of the capability:
|
|
|
The memory address the capability points to (to be used for the next memory access) |
|
|
The base memory address of the memory region associated with the capability |
|
|
The end memory address of the memory region associated with the capability |
|
|
The permissions associated with the capability:
|
The range of the perms
field has a partial order ⇐
defined as follows:
<= = { (0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 4), (3, 4)}
2.2. Extension to General-Purpose Registers
The Capstone-RISC-V ISA extends each of the 32 general-purpose
registers, so it contains either a capability or a raw XLEN
-bit
integer.
The type of data contained in a register is maintained and confusion
of the type is not allowed.
In assembly code, the type of data expected in a register operand
is indicated by the alias used for the register, as summarised
in the following table.
XLEN -bit integer |
Capability |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
TODO: talk about what cnull
is.
2.3. Extension to Other Registers
The program counter (pc
) register is extended to contain a
capability.
2.4. Extension to Memory
The memory is addressed using an XLEN
-bit integer at byte-level
granularity.
In addition to raw integers, each CLEN
-bit aligned address can
also store a capability.
The type of data contained in a memory location is maintained and
confusion of the type is not allowed.
2.5. Instruction Set
The Capstone-RISC-V instruction set is based on the RV64G instruction set.
The (uncompressed) instructions are fixed 32-bit wide, and laid out in memory
in little-endian order. In the encoding space of the RV64G instruction set,
Capstone-RISC-V instructions occupies the "custom-2" subset, i.e., the opcode
of all Capstone-RISC-V instructions is 0b1011011
.
Capstone-RISC-V instruction encodings follow two basic formats: R-type and I-type, as described below (more details are also provided in the RISC-V ISA Manual).
R-type instructions receive up to three register operands, and I-type instructions receive up to two register operands and a 12-bit-wide immediate operand.
3. Capability Manipulation Instructions
Capstone provides instructions for creating, modifying, and destroying capabilities. Note that due to the guarantee of provenance of capabilities, those instructions are the only way to manipulate capabilities. In particular, it is not possible to manipulate capabilities by manipulating the content of a memory location or register using other instructions.
3.1. Cursor, Bounds, and Permissions Manipulation
3.1.1. Capability Movement
Capabilities can be moved between registers with the MOVC instruction.
If rs1
does not contain a capability, the instruction will trap. Otherwise,
if rs1
is the same register as rd
, the instruction is a no-op. If rs1
is
not the same register as rd
, rd
will contain the original content of
rs1
, and if the content is a linear capability, rs1
will be set to the content
of cnull
.
3.1.2. Cursor Increment
The CINCOFFSET and CINCOFFSETIMM instructions increment the cursor
of a
capability by a give amount (offset).
For CINCOFFSET, the offset is read from rs2
.
For CINCOFFSETIMM, the offset is the 12-bit sign-extended immediate field
imm
. If the offset is zero, the instructions are semantically equivalent to
MOVC rd, rs1
. Otherwise, the instructions are equal to an atomic execution
of MOVC rd, rs1
followed by an increment of the cursor
field of rd
by
the offset.
3.1.3. Cursor Setter and Getter
The cursor
field of a capability can also be directly set and read with
the SCC and LCC instructions respectively.
3.1.4. Bounds Shrinking
The bounds (base
and end
fields) of a capability can be shrunk with the SHRINK instruction.
The instruction reads rs1
and rs2
and attempts to set the bounds of the capability in
rd
to [rs1, rs2)
.
If rd
does not originally contain a capability or rs1 > rs2
, the instruction will trap.
If the original bounds of the capability in rd
are [base, end)
and
rs1 < base
or rs2 > end
, the instruction will also trap.
Otherwise, the instruction will set the bounds of the capability in rd
to [rs1, rs2)
.
3.1.5. Bounds Splitting
The SPLIT instruction can split a capability into two by splitting the bounds.
TODO: describe this
3.1.6. Permission Tightening
The TIGHTEN instruction tightens the permissions (perms
field) of a capability.
The instruction reads the new permissions from rs1
and attempts to set the perms
field
of the capability in rd
to rs1
. If rd
does not originally contain a capability or
the content of rs1
is outside the range of perms
, the instruction will trap.
If the original perms
field of the capability in rd
is p
and rs1 ⇐ p
does not hold,
the instruction will also trap.
Otherwise, the perms
field of the capability in rd
is set to rs1
.
3.2. Type Manipulation
Some instructions affect the type
field of a capability.
3.2.1. Delinearisation
The DELIN instruction delinearises a linear capability.
The instruction expects a linear capability in rd
. If rd
does not contain
a capability, or the type
field of the capability in rd
is not 0
(linear),
the instruction will trap. Otherwise, the type
field of the capability in rd
is set to 1
(non-linear).
3.2.2. Initialisation
The INIT instruction transforms an uninitialised capability into a linear capability after its associated memory region has been fully initialised (written with new data).
The instruction expects an uninitialized capability in rd
. If rd
does not contain
a capability, or the type
field of the capability in rd
is not 3
(uninitialised),
the instruction will trap. If the end
field and the cursor
field
of the capability in rd
are not equal, the instruction will also trap.
Otherwise, the type
field of the capability in rd
is set to 0
(linear).
TODO: for other instructions that change the cursor, it is also necessary to check the type
field.
3.2.3. Sealing
The SEAL instruction seals a linear capability.
The instruction expects a linear capability in rd
. If rd
does not contain
a capability, or the type
field of the capability in rd
is not 0
(linear),
the instruction will trap. Otherwise, the type
field of the capability in rd
is set to 2
(sealed).
3.3. Dropping
TODO: check whether dropping is actually necessary.
3.4. Revocation
3.4.1. Revocation Capability Creation
The MREV instruction creates a revocation capability.
3.4.2. Revocation Operation
The REVOKE instruction revokes a capability.
4. Memory Access Instructions
4.1. Load/Store with Capabilities
4.2. Load/Store Capabilities
5. Control Flow Instructions
5.1. Jump to Capabilities
5.2. Domain Crossing
5.3. World Switching
6. Adjustments to Existing Instructions
TODO
7. Interrupts and Exceptions
TODO
8. Memory Consistency Model
TODO
Appendix A: Debugging Instructions
TODO
Appendix B: Instruction Listing
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
QUERY |
R |
|
|
I |
- |
- |
- |
DROP |
R |
|
|
C |
- |
- |
- |
RCUPDATE |
R |
|
|
I |
- |
I |
- |
ALLOC |
R |
|
|
I |
- |
I |
- |
REVOKET |
R |
|
|
I |
- |
- |
- |
CAPCREATE |
R |
|
|
- |
- |
C |
- |
CAPTYPE |
R |
|
|
I |
- |
C |
- |
CAPNODE |
R |
|
|
I |
- |
C |
- |
CAPPERM |
R |
|
|
I |
- |
C |
- |
CAPBOUND |
R |
|
|
I |
I |
C |
- |
CAPPRINT |
R |
|
|
I |
- |
- |
- |
TAGSET |
R |
|
|
I |
I |
- |
- |
TAGGET |
R |
|
|
I |
- |
I |
- |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
I |
I |
C |
- |
|
R |
|
|
I |
- |
C |
- |
|
R |
|
|
- |
- |
C |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
I |
- |
C |
- |
|
R |
|
|
C |
I |
C |
- |
|
R |
|
|
- |
- |
C |
- |
|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
- |
- |
C |
- |
|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
C |
- |
- |
- |
|
CAPGET |
R |
|
|
- |
- |
C |
- |
R |
|
|
C |
I |
C |
- |
|
I |
|
- |
C |
- |
C |
S |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
C |
C |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
- |
I |
C |
- |
|
R |
|
|
- |
I |
C |
- |
|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
- |
I |
C |
- |
|
R |
|
|
- |
I |
C |
- |
Appendix C: Assembly Code Examples
TODO
Appendix D: Abstract Binary Interface (Non-Normative)
TODO