Contributors to this document include (in alphabetical order): Jason Zhijingcheng Yu
Version Information: Draft version. Refer to the commit hash.
1. Introduction
The Capstone project is an effort to explore the design of a new CPU instruction set architecture that achieves multiple security goals including memory safety and isolation with one unified hardware abstraction.
Other formats: This document is also available in the following formats:
1.1. Goals
The ultimate goal of Capstone is to unify the numerous hardware abstracts that have been added as extensions to existing architectures as afterthought mitigations to security vulnerabilities. This goal requires a high level of flexibility and extensibility of the Capstone architecture. More specifically, we aim to support the following in a unified manner.
- Exclusive access
-
Software should be guaranteed exclusive access to certain memory regions if needed. This is in spite of the existence of software traditionally entitled to higher privileges such as the OS kernel and the hypervisor.
- Revocable delegation
-
Software components should be able to delegate authority to other components in a revocable manner. For example, after an untrusted library function has been granted access to a memory region, the caller should be able to revoke this access.
- Dynamically extensible hierarchy
-
The hierarchy of authority should be dynamically extensible, unlike traditional platforms which follow a static hierarchy of hypervisor-kernel-user. This makes it possible to use the same set of abstractions for memory isolation and memory sharing regardless of where a software component lies in the hierarchy.
- Safe context switching
-
A mechanism of context switching without trusting any other software component should be provided. This allows for a minimal TCB if necessary in case of a highly security-critical application.
1.2. Major Design Elements
The Capstone architecture design is based on the idea of capabilities, which are unforgeable tokens that represent authority to perform memory accesses and control flow transfers. Capstone extends the traditional capability model with new capability types including the following.
- Linear capabilities
-
Linear capabilities are guaranteed not to alias with other capabilities. Operations on linear capabilities maintain this property. For example, linear capabilities cannot be duplicated. Instead, they can only be moved around across different registers or between registers and memory. They can hence enable safe exclusive access to memory regions. Capabilities that do not have this property are called non-linear capabilities.
- Revocation capabilities
-
Revocation capabilities cannot be used to perform memory accesses or control flow transfers. Instead, they convey the authority to revoke other capabilities. Each revocation capability is derived from a linear capability and can later be used to revoke (i.e., invalidate) capability derived from the same linear capability. This mechanism enables revocable and arbitrarily extensible chains of delegation of authority.
- Uninitialised capabilities
-
Uninitialised capabilities convey write-only authority to memory. They can be turned into linear capabilities after the memory region has been "initialised", that is, when the whole memory region has been overwritten with fresh data. Uninitialised capabilities enable safe initialisation of memory regions and prevent secret leakage without incurring extra performance overhead.
1.3. Capstone-RISC-V ISA Overview
While Capstone does not assume any specific modern ISA, we choose to propose a Capstone extension to RISC-V due to its open nature and the availability of toolchains and simulators.
The Capstone-RISC-V ISA is a 64-bit RISC-V extension that makes the following types of changes to the base architecture:
-
Each general-purpose register is extended to 129 bits to accomodate 128-bit capabilities.
-
New instructions for manipulating capabilities are added.
-
New instructions for memory accesses using capabilities are added.
-
New instructions for control flow transfers using capabilities are added.
-
Semantics of a small number of existing instructions are changed to support capabilities.
-
Semantics of interrupts and exceptions are changed to support capabilities.
1.4. Assembly Mnemonics
Each Capstone-RISC-V instruction is given a mnemonic prefixed with CS.
.
In contexts where it is clear we are discussing Capstone-RISC-V instructions,
we will omit the CS.
prefix for brevity.
In assembly code, the list of operands to an instruction is supplied following the
instruction mnemonic, with the operands separated by commas, in the order of
rd
, rs1
, rs2
, imm
for any operand the instruction expects.
1.5. Notations
When specifying the semantics of instructions, we use the following notations to represent the type of each operand:
- I
-
Integer register.
- C
-
Capability register.
- S
-
Sign-extended immediate.
- Z
-
Zero-extended immediate.
1.6. Bibliography
The initial design of Capstone has been discussed in the following paper:
-
Capstone: A Capability-based Foundation for Trustless Secure Memory Access by Jason Zhijingcheng Yu, Conrad Watt, Aditya Badole, Trevor E. Carlson, Prateek Saxena. In Proceedings of the 32nd USENIX Security Symposium. Annaheim, CA, USA. August 2023.
2. Programming Model
The Capstone-RISC-V ISA has extended the part of the machine state, including both some registers and the memory, to enable the storage and handling of capabilities.
2.1. Capabilities
2.1.1. Width
The width of a capability is 128 bits. We represent this as
CLEN = 128
. Note that this does not affect the width of a
raw address, which is XLEN = 64
bits.
2.1.2. Fields
Each capability has the following architecturally-visible fields:
Name | Range | Description |
---|---|---|
|
|
Whether the capability is valid: |
|
|
The type of the capability:
|
|
|
The memory address the capability points to (to be used for the next memory access) |
|
|
The base memory address of the memory region associated with the capability |
|
|
The end memory address of the memory region associated with the capability |
|
|
The permissions associated with the capability:
|
The range of the perms
field has a partial order ⇐
defined as follows:
<= = { (0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 4), (3, 4) }
We say a capability c
aliases with a capability d
if and only if the intersection
between [c.base, c.end)
and [d.base, d.end)
is non-empty.
For two revocation capabilities c
and d
(i.e., c.type = d.type = 2
),
we say c <t d
if and only if
-
c
aliases withd
-
The creation of
c
was earlier than the creation ofd
In addition to the above fields, an implementation also needs to maintain
sufficient metadata to test the <t
relation.
It will be clear that for any pair of revocation capabilities that alias,
the order of their creations is well-defined.
2.2. Extension to General-Purpose Registers
The Capstone-RISC-V ISA extends each of the 32 general-purpose
registers, so it contains either a capability or a raw XLEN
-bit
integer.
The type of data contained in a register is maintained and confusion
of the type is not allowed, except for x0
/c0
as discussed below.
In assembly code, the type of data expected in a register operand
is indicated by the alias used for the register, as summarised
in the following table.
XLEN -bit integer |
Capability |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
x0
/c0
is a read-only register that can be used both as an
integer and as a capability, depending on the context. When used
as an integer, it has the value 0
.
When used as a capability, it has the value
{ valid = 0, type = 0, cursor = 0, base = 0, end = 0, perms = 0 }
.
Any attempt to write to x0
/c0
will be silently ignored (no
exceptions are raised).
2.3. Extension to Other Registers
The program counter (pc
) register is extended to contain a
capability.
2.4. Extension to Memory
The memory is addressed using an XLEN
-bit integer at byte-level
granularity.
In addition to raw integers, each CLEN
-bit aligned address can
also store a capability.
The type of data contained in a memory location is maintained and
confusion of the type is not allowed.
2.5. Instruction Set
The Capstone-RISC-V instruction set is based on the RV64G instruction set.
The (uncompressed) instructions are fixed 32-bit wide, and laid out in memory
in little-endian order. In the encoding space of the RV64G instruction set,
Capstone-RISC-V instructions occupies the "custom-2" subset, i.e., the opcode
of all Capstone-RISC-V instructions is 0b1011011
.
Capstone-RISC-V instruction encodings follow two basic formats: R-type and I-type, as described below (more details are also provided in the RISC-V ISA Manual).
R-type instructions receive up to three register operands, and I-type instructions receive up to two register operands and a 12-bit-wide immediate operand.
3. Capability Manipulation Instructions
Capstone provides instructions for creating, modifying, and destroying capabilities. Note that due to the guarantee of provenance of capabilities, those instructions are the only way to manipulate capabilities. In particular, it is not possible to manipulate capabilities by manipulating the content of a memory location or register using other instructions.
3.1. Cursor, Bounds, and Permissions Manipulation
3.1.1. Capability Movement
Capabilities can be moved between registers with the MOVC instruction.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability
If no exception is raised:
If rs1
is the same register as rd
, the instruction is a no-op.
If rs1
is not the same register as rd
, rd
will contain the original content of
rs1
, and if the content is not a non-linear capability (i.e., type != 1
),
rs1
will be set to the content of cnull
.
3.1.2. Cursor Increment
The CINCOFFSET and CINCOFFSETIMM instructions increment the cursor
of a
capability by a give amount (offset).
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability -
rs2
does not contain an integer (for CINCOFFSET) -
The capability in
rs1
hastype = 3
(uninitialised)
If no exception is raised:
For CINCOFFSET, the offset is read from rs2
.
For CINCOFFSETIMM, the offset is the 12-bit sign-extended immediate field
imm
. If the offset is 0
, the instructions are semantically equivalent to
MOVC rd, rs1
. Otherwise, the instructions are equal to an atomic execution
of MOVC rd, rs1
followed by an increment of the cursor
field of rd
by
the offset.
3.1.3. Cursor Setter and Getter
The cursor
field of a capability can also be directly set and read with
the SCC and LCC instructions respectively.
For SCC, an exception is raised if any of the following conditions are met:
-
rd
does not contain a capability -
rs1
does not contain an integer -
The capability in
rd
hastype = 3
(uninitialised)
For LCC, an exception is raised if any of the following conditions are met:
-
rs1
does not contain a capability
3.1.4. Bounds Shrinking
The bounds (base
and end
fields) of a capability can be shrunk with the SHRINK instruction.
The instruction reads rs1
and rs2
and attempts to set the bounds of the capability in
rd
to [rs1, rs2)
.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability -
rs1
does not contain an integer -
rs2
does not contain an integer -
rs1 > rs2
-
The original bounds of the capability in
rd
are[base, end)
andrs1 < base
orrs2 > end
3.1.5. Bounds Splitting
The SPLIT instruction can split a capability into two by splitting the bounds.
The instruction reads a capability from rs1
and an integer from rs2
and attempts to split
the capability into two capabilities, one with bounds [base, rs2)
and the other with bounds
[rs2, end)
, assuming the original bounds were [base, end)
.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability -
rs2
does not contain an integer -
The
type
field of the capability inrs1
is3
(uninitialised) -
The original bounds of the capability in
rs1
are[base, end)
andrs2 < base
orrs2 > end
If no exception is raised: The capability in rs1
has its end
field set to rs2
. A new
capability is created with base = rs2
and the other fields equal to those of the original
capability in rs1
. The new capability is written to rd
.
3.1.6. Permission Tightening
The TIGHTEN instruction tightens the permissions (perms
field) of a capability.
The instruction reads the new permissions from rs1
and attempts to set the perms
field
of the capability in rd
to rs1
.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability -
rs1
does not contain an integer -
The content of
rs1
is outside the range ofperms
-
The
perms
field of the capability inrd
isp
andrs1 ⇐ p
does not hold
3.2. Type Manipulation
Some instructions affect the type
field of a capability.
3.2.1. Delinearisation
The DELIN instruction delinearises a linear capability.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability -
The
type
field of the capability inrd
is not0
(linear)
If no exception is raised: The type
field of the capability in rd
is set to 1
(non-linear).
3.2.2. Initialisation
The INIT instruction transforms an uninitialised capability into a linear capability after its associated memory region has been fully initialised (written with new data).
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability -
The
type
field of the capability inrd
is not3
(uninitialised) -
The
end
field and thecursor
field of the capability inrd
are not equal
If no exception is raised: The type
field of the capability in rd
is set to 0
(linear).
3.2.3. Sealing
The SEAL instruction seals a linear capability.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability -
The
type
field of the capability inrd
is not0
(linear)
If no exception is raised: The type
field of the capability in rd
is set to 2
(sealed).
3.3. Dropping
TODO: check whether dropping is actually necessary.
The DROP instruction invalidates a capability.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability
If no exception is raised: The valid
field of the capability in rd
is set to 0
(invalid).
3.4. Revocation
3.4.1. Revocation Capability Creation
The MREV instruction creates a revocation capability.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability -
The
type
field of the capability inrs1
is not0
(linear) -
The
valid
field of the capability inrs1
is0
(invalid)
If no exception is raised: A new capability is created in rd
with the same
base
, end
, perms
, cursor
, and offset
fields as the capability in rs1
.
The type
field of the new capability is set to 2
(revocation).
3.4.2. Revocation Operation
The REVOKE instruction revokes a capability.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability -
The
type
field of the capability inrd
is not2
(revocation) -
The
valid
field of the capability inrd
is0
(invalid)
If no exception is raised:
For all capabilities c
in the system (in either a register or
memory location), its valid
field is set to 0
(invalid)
if any of the following conditions are met:
-
The type field of
c
is not2
(revocation), thevalid
field ofc
is1
(valid), andc
aliases withrd
-
The type field of
c
is2
(revocation), thevalid
field ofc
is1
(valid), andrd <t c
If any c
that is not a non-linear capability (i.e. c.type != 1
) is invalidated,
then the type
field of rd
is set to 3
(uninitialised), and its cursur
field is set
to base
.
Otherwise, the type
field of rd
is set to 0
(linear).
4. Memory Access Instructions
4.1. Load/Store with Capabilities
4.2. Load/Store Capabilities
5. Control Flow Instructions
5.1. Jump to Capabilities
5.2. Domain Crossing
5.3. World Switching
6. Adjustments to Existing Instructions
TODO
7. Interrupts and Exceptions
TODO
8. Memory Consistency Model
TODO
Appendix A: Debugging Instructions (Non-Normative)
TODO
Appendix B: Instruction Listing
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
QUERY |
R |
|
|
I |
- |
- |
- |
RCUPDATE |
R |
|
|
I |
- |
I |
- |
ALLOC |
R |
|
|
I |
- |
I |
- |
REV |
R |
|
|
I |
- |
- |
- |
CAPCREATE |
R |
|
|
- |
- |
C |
- |
CAPTYPE |
R |
|
|
I |
- |
C |
- |
CAPNODE |
R |
|
|
I |
- |
C |
- |
CAPPERM |
R |
|
|
I |
- |
C |
- |
CAPBOUND |
R |
|
|
I |
I |
C |
- |
CAPPRINT |
R |
|
|
I |
- |
- |
- |
TAGSET |
R |
|
|
I |
I |
- |
- |
TAGGET |
R |
|
|
I |
- |
I |
- |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
I |
I |
C |
- |
|
R |
|
|
I |
- |
C |
- |
|
R |
|
|
- |
- |
C |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
I |
- |
C |
- |
|
R |
|
|
C |
I |
C |
- |
|
R |
|
|
- |
- |
C |
- |
|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
- |
- |
C |
- |
|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
C |
- |
- |
- |
|
CAPGET |
R |
|
|
- |
- |
C |
- |
R |
|
|
C |
I |
C |
- |
|
I |
|
- |
C |
- |
C |
S |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
C |
C |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
I |
- |
- |
Appendix C: Assembly Code Examples
TODO
Appendix D: Abstract Binary Interface (Non-Normative)
TODO