Collections
When deciding which data structures to use for an Applet's state, it is important to understand their tradeoffs.
Choosing the wrong structure can create a bottleneck as the application scales, and migrating the state to a different data structure later comes at a cost.
As discussed in State, all the types inside the state must be of WeilType, which means that they are either basic types, collections provided by us, or types that satisfy certain conditions on how they are (de)serialized.
Collections may be provided by us (WeilCollections), by the language's standard library, or by third parties. To decide which one to use, you need to understand how the contract stores and loads them.
Use native collections for small amounts of data that need to be accessed all together, and SDK collections for large amounts of data that do not need to be accessed all together.
State (De)Serialization
Each time a contract call is executed, the first thing it does is read the state and deserialize it into memory. Once the call finishes, it serializes the state and writes it back to the database. This process has different results for Native Collections (those provided by the language) and Weil-Collections (those provided by our SDKs).
Native Collections
Rust: those implementing the WeilType trait:
- Vec<T>
- BTreeMap<K, V>
- BTreeSet<V>
Go: those that have deterministic (e.g., ordered) serialization:
- slices
- jsonmap, which may be used as a set as well
AssemblyScript: the following Weil-Collections are implemented:
- WeilSet<T>
- WeilVec<T>
- WeilMap<K, V>
Other collections should be supported by the json-as library. For example, Map<string, string> is supported while Set<string> is not.
C++: those that have deterministic (e.g., ordered) serialization:
- vector
- map
- set
All entries in a native collection are serialized into a single value and stored together in the state. This means that every time a function executes, the SDK reads and deserializes all entries in the native collection. This leads to the following conclusions and usage guidance:
- Native collections are useful if you plan to store small amounts of data that need to be accessed all together.
- As a native collection grows, deserializing it costs more. If the collection grows too large, your state might not fit in memory, which would make the function execution exit with a panic.
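The cost described above can be pictured with a minimal sketch that uses only the Rust standard library. The functions serialize_native, deserialize_native and read_one are hypothetical stand-ins, not SDK functions, and the hand-rolled JSON-like format is only an assumption for illustration. The point is that reading a single entry still requires parsing the whole stored value.

```rust
/// Hypothetical stand-in for how a native collection is persisted:
/// every entry ends up inside one serialized value.
fn serialize_native(values: &[u32]) -> String {
    let items: Vec<String> = values.iter().map(|v| v.to_string()).collect();
    format!("[{}]", items.join(","))
}

/// Rebuilds the whole collection from the single stored value.
fn deserialize_native(serialized: &str) -> Vec<u32> {
    serialized
        .trim_matches(|c| c == '[' || c == ']')
        .split(',')
        .filter(|s| !s.trim().is_empty())
        .map(|s| s.trim().parse().expect("valid u32"))
        .collect()
}

/// Reading one entry still pays for deserializing all of them.
fn read_one(serialized: &str, index: usize) -> Option<u32> {
    let all = deserialize_native(serialized);
    all.get(index).copied()
}
```

The work done by read_one grows with the length of the stored value, regardless of which index is requested.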
Weil Collections
The Contract SDK exposes collections that have interfaces similar to native collections, but which are optimized for random access to large amounts of data.
Weil-Collections are instantiated using an id of type WeilId, which is used as an index to split the data into chunks.
Rust:
let index = WeilMap::new(WeilId(0));
let records = WeilVec::new(WeilId(1));
AssemblyScript:
const map: WeilMap<string, Box<u8>> = new WeilMap<string, Box<u8>>(new WeilId(0))
const records: WeilVec<CustomRecordClass> = new WeilVec<CustomRecordClass>(new WeilId(1))
C++:
collections::WeilMap<std::string, std::string> index(0);
collections::WeilMap<int, std::string> records(1);
The id is combined with a key of the collection (e.g. the index of a vector/slice) to reference the collection's elements individually. This way, a Weil-Collection can read and write only the entries it really needs, in a deferred (lazy) way.
This leads to the following conclusion and usage guidance:
- Weil-Collections are useful when you plan to store large amounts of data that do not need to be accessed all together.
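The following sketch illustrates per-entry lazy access with a mock built on the Rust standard library. MockStorage and LazyVec are hypothetical names and do not reflect the real WeilVec implementation. The key format "<id>_<index>" is an assumption that mirrors the illustrative layout used in this page's serialization example.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for persistent key-value storage.
/// It counts reads so the access pattern is visible.
#[derive(Default)]
struct MockStorage {
    entries: HashMap<String, u32>,
    reads: usize,
}

impl MockStorage {
    fn write(&mut self, key: String, value: u32) {
        self.entries.insert(key, value);
    }

    fn read(&mut self, key: &str) -> Option<u32> {
        self.reads += 1;
        self.entries.get(key).copied()
    }
}

/// Hypothetical lazy vector: it holds only its id and length,
/// never the elements themselves.
struct LazyVec {
    id: u8,
    len: usize,
}

impl LazyVec {
    fn new(id: u8) -> Self {
        LazyVec { id, len: 0 }
    }

    /// Combines the collection id with the element index (assumed format).
    fn key(&self, index: usize) -> String {
        format!("{}_{}", self.id, index)
    }

    fn push(&mut self, storage: &mut MockStorage, value: u32) {
        storage.write(self.key(self.len), value);
        self.len += 1;
    }

    /// Touches storage only for the one entry that is requested.
    fn get(&self, storage: &mut MockStorage, index: usize) -> Option<u32> {
        if index >= self.len {
            return None;
        }
        storage.read(&self.key(index))
    }
}
```

In this mock, fetching one element costs exactly one storage read no matter how many elements the collection holds.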
Never use the same WeilId for two different collections, even if the previous one has been deleted! Reusing a WeilId leads to undefined behavior and can have catastrophic effects on the contract state.
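One way to picture the risk is with a small sketch using only the Rust standard library. The helper entry_key and its "<id>_<index>" format are assumptions for illustration, not the SDK's actual key scheme. If two logically different collections derive their storage keys from the same id, their entries land on the same keys and overwrite each other.

```rust
use std::collections::HashMap;

/// Assumed key scheme: collection id combined with the element index.
fn entry_key(id: u8, index: usize) -> String {
    format!("{}_{}", id, index)
}

/// Writes entry 0 of two logically different collections into one mock
/// storage and returns that storage for inspection.
fn write_first_entries(id_a: u8, id_b: u8) -> HashMap<String, String> {
    let mut storage = HashMap::new();
    // First collection, e.g. a list of names.
    storage.insert(entry_key(id_a, 0), "alice".to_string());
    // Second collection, e.g. a list of counters.
    storage.insert(entry_key(id_b, 0), "42".to_string());
    storage
}
```

With distinct ids the two entries coexist. With equal ids the second write silently replaces the first, so the first collection would later read a value of the wrong kind.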
Serialization Example
Rust
Consider a vector with values [1, 2, 3, 4].
If a native collection is used to store it, it will be serialized into the JSON string "[1, 2, 3, 4]" in Rust.
If a WeilVec is used instead of a native Vec, it will be serialized as its WeilId.
That is, if it was initialized as WeilVec::new(WeilId(i)), then it is serialized as i.
The items in the collection will be saved as:
i_0: 1
i_1: 2
i_2: 3
i_3: 4
Go
We use the standard Go serialization to serialize the Applet's state.
Consider a vector with values [1, 2, 3, 4].
If a slice is used to store it, it will be serialized into the JSON string "[1, 2, 3, 4]" in Go.
If a WeilVec is used instead of a native slice, it will be serialized as its state_id.
That is, if it was initialized as *collections.NewWeilVec[uint32](*collections.NewWeilId(i)), then it is serialized using just its id, as i.
The items in the collection will be saved as:
i_0: 1
i_1: 2
i_2: 3
i_3: 4
AssemblyScript
We use the json-as library for serialization and deserialization in AssemblyScript.
Consider a vector with values [1, 2, 3, 4].
If an array is used to store it, it will be serialized into the JSON string "[1, 2, 3, 4]" by json-as.
If a WeilVec is used instead of an array, it will be serialized as its WeilId.
That is, if it was initialized as new WeilVec<u8>(new WeilId(i)), then it is serialized as i.
The items in the collection will be saved as:
i_0: 1
i_1: 2
i_2: 3
i_3: 4
C++
We use the nlohmann json library for serialization and deserialization in C++.
Consider a vector with values [1, 2, 3, 4].
If a native collection is used to store it, it will be serialized into the JSON string "[1, 2, 3, 4]" in C++.
If a WeilVec is used instead of a native std::vector, it will be serialized as its state_id.
That is, if it was initialized as collections::WeilVec(i), then it is serialized using just its id, as i.
The items in the collection will be saved as:
i_0: 1
i_1: 2
i_2: 3
i_3: 4
When the collection is deserialized, in the case of native collections, the whole vector is rebuilt.
In the case of WeilVec, only the WeilId is loaded. The actual element is deserialized and loaded, based on its index, only when it is accessed.
Usage Instructions
The actual Weil-Collection API can be found here.
Here we discuss how to use the collections in your Applet.
Generally, Weil-Collections are used at the outermost level of the contract state, since that is where the data can reach a scale that might not fit in memory.
The inner attributes can use native collections, balancing the trade-off between in-memory space occupied and execution time.
For example, a smart contract might have a map whose keys are wallet addresses and whose values are maps from token to balance.
The following are two ways to implement such a data structure:
Rust
// Both outer and inner map using `WeilMap<K, V>`
struct ContractState {
    balances: WeilMap<Address, TokenBalances>,
}
struct TokenBalances(WeilMap<Token, u64>);
// Outer map using `WeilMap<K, V>` and inner map using `BTreeMap<K, V>`
struct ContractState {
    balances: WeilMap<Address, TokenBalances>,
}
struct TokenBalances(BTreeMap<Token, u64>);
Go
// Both outer and inner map using `WeilMap[K, V]`
type ContractState struct {
    Balances collections.WeilMap[Address, TokenBalances] `json:"balances"`
}
type TokenBalances struct {
    Balances collections.WeilMap[Token, uint] `json:"balances"`
}
// Outer map using `WeilMap[K, V]` and inner map using a deterministically serializable map.
type ContractState struct {
    Balances collections.WeilMap[Address, TokenBalances] `json:"balances"`
}
type TokenBalances struct {
    Balances jsonmap.Map `json:"balances"`
}
AssemblyScript
// Both outer and inner map using `WeilMap<K, V>`
class ContractState {
    balances: WeilMap<Address, TokenBalances>
}
type TokenBalances = WeilMap<Token, u64>
// Outer map using `WeilMap<K, V>` and inner map using `Map<K, V>`
class ContractState {
    balances: WeilMap<Address, TokenBalances>
}
type TokenBalances = Map<Token, u64>
C++
// Both outer and inner map using `WeilMap<K, V>`
// TokenBalances is declared first so that ContractState can refer to it.
struct TokenBalances {
    collections::WeilMap<Token, int> tokenBalances;
};
struct ContractState {
    collections::WeilMap<Address, TokenBalances> balances;
};
// Outer map using `WeilMap<K, V>` and inner map using `std::map<K, V>`
struct TokenBalances {
    std::map<Token, int> mp;
};
struct ContractState {
    collections::WeilMap<Address, TokenBalances> balances;
};
Both approaches are correct, but the second one optimizes the trade-off between in-memory usage and performance. Remember that each lazy get or set operation on a Weil-Collection is potentially a call to persistent storage, while a standard collection is loaded into memory all at once. The outer map can therefore be a Weil-Collection, since it scales with the number of wallets hosted by the blockchain platform, which could be in the millions or billions. The inner map can be stored as a standard B-Tree map, since it only holds the tokens owned by one wallet, which might be a few hundred or a few thousand at most.
By carefully considering the scale that each attribute of the contract state can reach, we can implement efficient collection-based data structures.
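To make the trade-off concrete, here is a rough sketch that counts storage accesses when summing all balances of one wallet under the two layouts. It uses only the Rust standard library. PerTokenStorage, PerWalletStorage and both functions are hypothetical models of the layouts, not SDK types, and the read counts are a simplification that assumes every lazy lookup reaches persistent storage.

```rust
use std::collections::{BTreeMap, HashMap};

/// Layout 1 (hypothetical model): lazy inner map,
/// one storage entry per (wallet, token) pair.
type PerTokenStorage = HashMap<(String, String), u64>;

/// Layout 2 (hypothetical model): native inner map,
/// one storage entry per wallet holding all of its tokens.
type PerWalletStorage = HashMap<String, BTreeMap<String, u64>>;

/// Sums a wallet's balances under layout 1; returns (total, storage reads).
fn total_per_token(storage: &PerTokenStorage, wallet: &str, tokens: &[&str]) -> (u64, usize) {
    let mut total: u64 = 0;
    let mut reads: usize = 0;
    for token in tokens {
        reads += 1; // every token lookup is its own storage access
        if let Some(balance) = storage.get(&(wallet.to_string(), token.to_string())) {
            total += *balance;
        }
    }
    (total, reads)
}

/// Sums a wallet's balances under layout 2; returns (total, storage reads).
fn total_per_wallet(storage: &PerWalletStorage, wallet: &str) -> (u64, usize) {
    // A single storage access loads the wallet's whole inner map.
    let total = storage
        .get(wallet)
        .map(|inner| inner.values().sum::<u64>())
        .unwrap_or(0);
    (total, 1)
}
```

Under these assumptions, layout 2 needs one access per wallet while layout 1 needs one per token. That favors the native inner map as long as a wallet's token list stays small enough to load comfortably.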