Decode Solana Transactions on a budget
Creating a minimal, 45x smaller library to decode Solana transactions in JavaScript.
Background
Technologies used by Solana
- Anchor is a framework for Solana smart contracts. It also defines the IDL (Interface Definition Language) specification that describes a program's interface.
- Solana serializes data using Borsh (Binary Object Representation Serializer for Hashing).
Use case: on the frontend we want to parse Jupiter Swap Events. You can see these events when viewing a transaction on Solscan.
Even when using getParsedTransaction from web3.js, you will notice that the event data is still in its raw, base58-encoded form.
TypeScript Ecosystem
Conveniently, Jupiter provides a package to parse events.
The cost
The dependency graph is huge.
For parsing specifically, these are the main dependencies:
@jup-ag/instruction-parser
→ @coral-xyz/anchor
→ @coral-xyz/borsh
→ buffer-layout
We could go down a layer and use @coral-xyz/anchor, which allows you to use an IDL to parse, but it is still large.
Issues
- Large bundle size - 618 kB -> 187 kB (gzip). Dependencies have no ESM support.
- Uses many deps - bn.js (native BigInt exists), bs58 (modern equivalent: @scure/base).
- Usage of Buffer - Technically web3.js has it anyway, but it’s removed in the ongoing rewrite.
- Class based - Non-optimal tree shaking; the class has additional code, such as instruction parsing, that can’t be tree-shaken away.
- Have to include the entire IDL - Jupiter’s IDL is 18 kB, and don’t forget the IDL parser itself. Feature or overhead?
Minimal parser for Swap Events
Our scope is limited: we only need to decode, and only for one type of event, Jupiter’s Swap event. With this in mind, let’s figure out the minimal steps required.
- Get data for swap transaction 2a4EpB...
We filter for the instruction data that belongs to the Jupiter Program ID JUP6Lkb...
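As a rough sketch of that filtering step, the helper below walks a parsed transaction and collects the base58 `data` strings of instructions that target a given program. The function name `findInstructionData` is hypothetical, and the object shape (`transaction.message.instructions` plus `meta.innerInstructions`) is assumed to match what getParsedTransaction returns; the program ID is passed in rather than hard-coded.

```javascript
// Hypothetical helper: collect the base58 `data` strings of every instruction
// (top-level and inner) that targets the given program ID.
function findInstructionData(parsedTx, programId) {
  const topLevel = parsedTx?.transaction?.message?.instructions ?? [];
  const inner = (parsedTx?.meta?.innerInstructions ?? []).flatMap(
    (group) => group.instructions ?? []
  );
  return [...topLevel, ...inner]
    // web3.js returns PublicKey objects while raw JSON-RPC returns strings;
    // String() covers both because PublicKey#toString() yields base58.
    .filter((ix) => String(ix.programId) === programId)
    // Fully parsed instructions carry `parsed` instead of `data`; skip those.
    .filter((ix) => typeof ix.data === "string")
    .map((ix) => ix.data);
}
```

It would be called with the result of getParsedTransaction and the full Jupiter program ID. Each returned string is still base58 encoded at this point.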
- Decode from base58
On Solscan the Instruction Data Raw is displayed in hex by default. Converting our decoded bytes to hex as well makes it easier to visualise the binary structure.
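To make the base58 step concrete, here is a dependency-free sketch of a decoder built on native BigInt, plus a hex formatter for comparing against Solscan. This is an illustration only: the bundle measured in this post uses @scure/base for base conversions, and the names `base58Decode` and `toHex` are made up for this example.

```javascript
const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Decode a base58 string into bytes using BigInt arithmetic.
function base58Decode(str) {
  let value = 0n;
  for (const ch of str) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) throw new Error(`Invalid base58 character: ${ch}`);
    value = value * 58n + BigInt(digit);
  }
  // Each leading "1" stands for one leading zero byte.
  let leadingZeros = 0;
  while (leadingZeros < str.length && str[leadingZeros] === "1") leadingZeros++;
  // Emit the number as big-endian bytes.
  const bytes = [];
  while (value > 0n) {
    bytes.unshift(Number(value & 0xffn));
    value >>= 8n;
  }
  return Uint8Array.from([...new Array(leadingZeros).fill(0), ...bytes]);
}

// Format bytes as a lowercase hex string, two characters per byte.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

console.log(toHex(base58Decode("11233QC4"))); // "0000287fb4cd"
```

BigInt keeps the code short, at the cost of being slower than a table-driven decoder on long inputs. For instruction data of a few hundred bytes that trade-off is usually fine.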
- The first 8 bytes are the instruction discriminator, an identifier. We can skip it because we only care about the Swap Events emitted (source).
- This leaves us with the event data. First, we need to determine which event type to deserialize, as there may be multiple possible events.
To identify events, the first 8 bytes of the event data hold an event discriminator.
It consists of the first 8 bytes of the SHA-256 hash of the event signature. The signature is event:${event_name} (source). You can try this out on CyberChef.
Now we can identify Swap Events and deserialize with the appropriate struct.
- Deserializing the data
From Jupiter’s Program IDL we can determine the struct. The struct is quite simple. No need for an IDL parser, we can declare this in code directly.
With this struct in mind, we can write deserializers for the specific data types and combine them to parse the event according to its structure.
We can use modern JavaScript functionality to do this without bringing in large dependencies.
Update 19 Oct: web3.js v2 has landed and exports decoders that we can use. I highly recommend using those instead.
- Deserialize public key - a 32-byte type
- Deserialize little-endian uint64 - an 8-byte type
Combining both, we can now successfully deserialize the Swap Event with only minimal code and dependencies.
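Putting the two deserializers together might look like the following sketch. The field names and order of `SwapEvent` (amm, inputMint, inputAmount, outputMint, outputAmount) are an assumption and should be checked against Jupiter's IDL. The struct-as-array declaration and the names `publicKey`, `u64`, and `decodeStruct` are illustrative only, not the AnchorES API. Public keys are printed with a small BigInt-based base58 encoder in place of @scure/base.

```javascript
const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Encode bytes as base58 so public keys print in their usual form.
function base58Encode(bytes) {
  let value = 0n;
  for (const byte of bytes) value = (value << 8n) | BigInt(byte);
  let out = "";
  while (value > 0n) {
    out = ALPHABET[Number(value % 58n)] + out;
    value /= 58n;
  }
  // Each leading zero byte is written as "1".
  for (const byte of bytes) {
    if (byte !== 0) break;
    out = "1" + out;
  }
  return out;
}

// Field types: a fixed size in bytes and a reader over a DataView.
const publicKey = {
  size: 32,
  read: (view, offset) =>
    base58Encode(new Uint8Array(view.buffer, view.byteOffset + offset, 32)),
};
const u64 = {
  size: 8,
  // `true` selects little-endian byte order; the result is a native BigInt.
  read: (view, offset) => view.getBigUint64(offset, true),
};

// ASSUMPTION: field names and order; verify against the program IDL.
const SwapEvent = [
  ["amm", publicKey],
  ["inputMint", publicKey],
  ["inputAmount", u64],
  ["outputMint", publicKey],
  ["outputAmount", u64],
];

// Read each field in declaration order, advancing by the field's size.
function decodeStruct(struct, bytes) {
  const size = struct.reduce((total, [, type]) => total + type.size, 0);
  if (bytes.byteLength < size) {
    throw new RangeError(`Expected at least ${size} bytes, got ${bytes.byteLength}`);
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const result = {};
  let offset = 0;
  for (const [name, type] of struct) {
    result[name] = type.read(view, offset);
    offset += type.size;
  }
  return result;
}
```

`decodeStruct(SwapEvent, payload)` expects the bytes that follow the two 8-byte discriminators. Amounts come back as BigInt, so they need explicit conversion before being mixed with regular numbers.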
Comparisons
The impact of this is significant.
- @jup-ag/instruction-parser - Bundle size is 618 kB -> 187 kB (gzip) | bundlejs
- @coral-xyz/anchor - Bundle size is 426 kB -> 124 kB (gzip) | bundlejs
- Custom binary decoder - Bundle size is 7.82 kB -> 3.48 kB (gzip) | bundlejs
  - 92% of which is libraries for base conversions and hashing: @scure/base and @noble/hashes
Notable mentions
These are good references that were useful for the de/serialization implementations:
- Using @dao-xyz/borsh to decode | bundlejs | Bundle size is 17.8 kB -> 6.3 kB (gzip)
  - Uses decorators, which might not work with certain builds and I’m not a fan
- An official borsh-ts rewrite exists, but it’s not published on npm
  - Addresses some issues, but still class based
- @hackbg/borshest looks good, unsure about the API
Conclusion
I am unsure whether this approach will run into issues with more complex data types or scenarios. I made an experimental library, AnchorES, with a simple API and an easy way to extend it or provide new structs to decode. The struct declaration is inspired by validation libraries.