Managing storage in a Java application with IPFS
In this article, we learn how to interact with IPFS (InterPlanetary File System) in Java using the official java-ipfs-http-client library. This library connects to an IPFS node and wraps most of the operations offered by the HTTP API.
The following diagram shows a Java program connected to the API server of an IPFS node via the java-ipfs-http-client library. The node exposes three interfaces:
- API server (default port: 5001): Full API
- Gateway server (default port: 8080): Read Only API (access to data only)
- P2P (default port: 4001): Peer-to-peer interface
Prerequisites
To run this tutorial, we must have the following installed:
- Java programming language (> 8)
java -version
- A package and dependency manager, for example Maven or Gradle
- An IDE (integrated development environment); this tutorial uses Eclipse
- A running IPFS node (> 0.4.x)
See the following article to learn how to install an IPFS node (go-ipfs).
Dependencies
To get started, import the java-ipfs-http-client dependency.
Maven
Using Maven, we first need to configure the repository that hosts the dependency and then import the dependency. Add the code below before the closing </project> tag:
1 | <properties> |
Gradle
The equivalent using Gradle:
1 | dependencies { |
Connect to IPFS
Once we’ve imported java-ipfs-http-client, the first step of our application is connecting to an IPFS node.
Connect by host and port
We can connect by a host and port like this:
IPFS ipfs = new IPFS("localhost", 5001);
Connect by multiaddr
It is also possible to connect by multiaddr. A multiaddr represents a self-describing network address.
Multiaddr is a format for encoding addresses from various well-established network protocols. It is useful to write applications that future-proof their use of addresses and allow multiple transport protocols and addresses to coexist.
IPFS ipfs = new IPFS("/ip4/127.0.0.1/tcp/5001");
If the IPFS node sits behind a proxy with SSL (like Infura), we can configure java-ipfs-http-client to use https rather than http. In that case, a multiaddr is required.
IPFS ipfs = new IPFS("/dnsaddr/ipfs.infura.io/tcp/5001/https");
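To make the idea of a self-describing address more concrete, here is a minimal sketch that turns the simple multiaddrs used above into a plain base URL. The class MultiaddrSketch and its method toBaseUrl are hypothetical helpers written for this illustration; they are not part of java-ipfs-http-client, and they only handle the host/tcp/optional-https shape shown in this article.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; the library ships its own multiaddr parser.
class MultiaddrSketch {

    // Protocols whose value is a host name or IP address (a simplifying assumption).
    private static final List<String> HOST_PROTOCOLS =
            Arrays.asList("ip4", "ip6", "dns", "dns4", "dns6", "dnsaddr");

    /** Converts a multiaddr such as /ip4/127.0.0.1/tcp/5001 into a base URL. */
    static String toBaseUrl(String multiaddr) {
        // A multiaddr is a chain of /protocol/value pairs; parts[0] is empty
        // because the string starts with '/'.
        String[] parts = multiaddr.split("/");
        if (parts.length < 5 || !parts[0].isEmpty()
                || !HOST_PROTOCOLS.contains(parts[1]) || !"tcp".equals(parts[3])) {
            throw new IllegalArgumentException("Unsupported multiaddr: " + multiaddr);
        }
        String host = "ip6".equals(parts[1]) ? "[" + parts[2] + "]" : parts[2];
        int port = Integer.parseInt(parts[4]);
        // A trailing /https component switches the scheme; anything else means http.
        String scheme = (parts.length > 5 && "https".equals(parts[5])) ? "https" : "http";
        return scheme + "://" + host + ":" + port;
    }

    public static void main(String[] args) {
        System.out.println(toBaseUrl("/ip4/127.0.0.1/tcp/5001"));                // http://127.0.0.1:5001
        System.out.println(toBaseUrl("/dnsaddr/ipfs.infura.io/tcp/5001/https")); // https://ipfs.infura.io:5001
    }
}
```

Each component names its own protocol, which is why the same string format can describe an IPv4 address, a DNS name, or a TLS-wrapped endpoint without any out-of-band convention.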
Add content to IPFS
When adding a file to the IPFS network, the file is uploaded to the IPFS node we are connected to and stored in its local datastore. This operation returns a unique identifier of the file called a “multihash” (for example: Qmaisz6NMhDB51cCvNWa1GMS7LU1pAxdF4Ld6Ft9kZEP2a).
We use the ipfs.add(NamedStreamable file): List<MerkleNode> method to store content on the IPFS node we are connected to. This method takes a NamedStreamable or a List<NamedStreamable> as input. NamedStreamable has four different implementations:
- FileWrapper wraps a java.io.File
- InputStreamWrapper wraps a java.io.InputStream
- ByteArrayWrapper wraps a byte[]
- DirWrapper wraps a (String name, List<NamedStreamable> children) to describe a hierarchical file structure
We can also add optional parameters to the method:
- wrap [boolean]: Wrap files into a directory.
- hashOnly [boolean]: Only chunk and hash; do not write to the datastore.
Finally, the method returns a list of MerkleNode objects, which represent the content-addressable objects just added to the IPFS network.
File (FileWrapper)
We can use NamedStreamable.FileWrapper to pass a java.io.File to IPFS.
1 | try { |
InputStream (InputStreamWrapper)
If you are dealing with a java.io.InputStream, use NamedStreamable.InputStreamWrapper:
1 | try { |
Byte Array (ByteArrayWrapper)
To store a byte[], use NamedStreamable.ByteArrayWrapper.
1 | try { |
Directory (DirWrapper)
Finally, to store files inside folders, use NamedStreamable.DirWrapper. For example, with the folder structure below:
1 | folder |
Use:
1 | try { |
MerkleNode
IPFS is a peer-to-peer network essentially used to share linked Objects from a giant Merkle tree. Adding a file or a directory to IPFS returns the new dedicated branch of the Merkle tree, composed of one or more linked Objects. We represent these branches in Java as a List<MerkleNode>.
A MerkleNode is composed of the following information:
- hash (multihash): a unique identifier of the Object within IPFS
- name (optional): Name of the object (usually the folder or filename)
- size (optional): Size of the object
- links (zero or more): A list of child Objects
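The four fields above can be pictured with a small stand-in class. This is only a sketch: SimpleNode is a hypothetical class written for this illustration, not the library's MerkleNode, and the hashes and file names below are placeholders rather than real IPFS values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; the library's MerkleNode class differs.
class SimpleNode {
    final String hash;                                 // unique identifier of the Object
    final String name;                                 // optional, may be null
    final Integer size;                                // optional, may be null
    final List<SimpleNode> links = new ArrayList<>();  // zero or more child Objects

    SimpleNode(String hash, String name, Integer size) {
        this.hash = hash;
        this.name = name;
        this.size = size;
    }

    /** Adds a child and returns this node so that calls can be chained. */
    SimpleNode link(SimpleNode child) {
        links.add(child);
        return this;
    }

    /** Counts this node plus every node reachable through its links. */
    int countNodes() {
        int total = 1;
        for (SimpleNode child : links) {
            total += child.countNodes();
        }
        return total;
    }

    /** Renders the branch one node per line, indented by depth. */
    String render(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(name == null ? hash : name).append('\n');
        for (SimpleNode child : links) {
            sb.append(child.render(depth + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder hashes and names, chosen only for this example.
        SimpleNode folder = new SimpleNode("placeholder-hash-folder", "folder", null)
                .link(new SimpleNode("placeholder-hash-a", "a.txt", 12))
                .link(new SimpleNode("placeholder-hash-b", "b.txt", 34));
        System.out.print(folder.render(0));
        System.out.println("nodes: " + folder.countNodes());
    }
}
```

A single file maps to one node with no links, while a directory maps to a node whose links point at its entries, which is why adding a directory can return several linked nodes.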
MultiHash
Multihash (GitHub) is a self-describing hash used to uniquely identify and locate an object in the IPFS Merkle tree. It is usually represented in Base58, but we can also represent it in hexadecimal.
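A multihash consists of three parts: a code identifying the hash function, the length of the digest in bytes, and the digest itself.
For example (in hexadecimal), a multihash starting with 1220 uses sha2-256 (code 12) and carries a 32-byte digest (length 20).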
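The layout can be checked with a few lines of plain Java. The sketch below builds a sha2-256 multihash by hand and reads its parts back; MultihashLayoutSketch and its methods are hypothetical names for this illustration and do not belong to the library. Note that the identifier IPFS returns for a file is computed over the node IPFS builds from the content, not over the raw bytes, so this example shows only the byte layout and will not reproduce the hash of a real IPFS object.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper for illustration; use the library's Multihash class in real code.
class MultihashLayoutSketch {
    static final int SHA2_256 = 0x12; // multihash code for sha2-256

    /** Builds <function code><digest length><digest> for the given bytes. */
    static byte[] sha256Multihash(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            byte[] out = new byte[2 + digest.length];
            out[0] = (byte) SHA2_256;       // codes below 128 fit in one byte
            out[1] = (byte) digest.length;  // 32 for sha2-256, i.e. 0x20
            System.arraycopy(digest, 0, out, 2, digest.length);
            return out;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    static int functionCode(byte[] multihash) { return multihash[0] & 0xff; }

    static int digestLength(byte[] multihash) { return multihash[1] & 0xff; }

    static byte[] digest(byte[] multihash) {
        return Arrays.copyOfRange(multihash, 2, multihash.length);
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] mh = sha256Multihash("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("hex:      " + toHex(mh));
        System.out.println("function: 0x" + Integer.toHexString(functionCode(mh)));
        System.out.println("length:   " + digestLength(mh));
    }
}
```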
Read a Base58 hash to Multihash
Multihash multihash = Multihash.fromBase58("QmT78zSuBmuS4z925WZfrqQ1qHaJ56DQaTfyMUF7F8ff5o");
Read a Base16 (hexadecimal) hash to Multihash
Multihash multihash = Multihash.fromHex("122046d44814b9c5af141c3aaab7c05dc5e844ead5f91f12858b21eba45768b4ce");
Convert a Multihash to Base58
String hash = multihash.toBase58();
Convert a Multihash to Base16
String hash = multihash.toHex();
Convert a Multihash to a byte array
byte[] hash = multihash.toBytes();
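To show what the Base58 conversions do with the raw bytes, here is a minimal encoder and decoder using only the JDK. Base58Sketch is a hypothetical class for this illustration, and it assumes the Bitcoin Base58 alphabet; in application code the library's own Multihash methods shown above are the better choice.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration; assumes the Bitcoin Base58 alphabet.
class Base58Sketch {
    private static final String ALPHABET =
            "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";
    private static final BigInteger BASE = BigInteger.valueOf(58);

    static String encode(byte[] input) {
        // Treat the bytes as one big unsigned number and write it in base 58.
        BigInteger n = new BigInteger(1, input);
        StringBuilder sb = new StringBuilder();
        while (n.signum() > 0) {
            BigInteger[] quotientAndRemainder = n.divideAndRemainder(BASE);
            sb.append(ALPHABET.charAt(quotientAndRemainder[1].intValue()));
            n = quotientAndRemainder[0];
        }
        // Each leading zero byte is encoded as a leading '1'.
        for (int i = 0; i < input.length && input[i] == 0; i++) {
            sb.append('1');
        }
        return sb.reverse().toString();
    }

    static byte[] decode(String text) {
        BigInteger n = BigInteger.ZERO;
        for (char c : text.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("Not a Base58 character: " + c);
            }
            n = n.multiply(BASE).add(BigInteger.valueOf(digit));
        }
        // toByteArray() may prepend a zero sign byte; zero itself has no magnitude bytes.
        byte[] magnitude = n.signum() == 0 ? new byte[0] : n.toByteArray();
        int skip = (magnitude.length > 1 && magnitude[0] == 0) ? 1 : 0;
        int leadingZeros = 0;
        while (leadingZeros < text.length() && text.charAt(leadingZeros) == '1') {
            leadingZeros++;
        }
        byte[] out = new byte[leadingZeros + magnitude.length - skip];
        System.arraycopy(magnitude, skip, out, leadingZeros, magnitude.length - skip);
        return out;
    }

    public static void main(String[] args) {
        // A sha2-256 multihash: prefix 0x12 0x20 followed by 32 digest bytes (zeros here).
        byte[] multihash = new byte[34];
        multihash[0] = 0x12;
        multihash[1] = 0x20;
        String text = encode(multihash);
        System.out.println(text + " (" + text.length() + " characters)");
    }
}
```

Because every sha2-256 multihash starts with the bytes 12 20, its Base58 form always begins with Qm and is 46 characters long, which matches the identifiers shown throughout this article.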
Read content from IPFS
In order to read a file on the IPFS network, we need to pass the hash (multihash) of the Object we want to retrieve. Then IPFS finds and retrieves the file from the closest peer hosting the file via the peer-to-peer network and a Distributed Hash Table.
Using java-ipfs-http-client, there are two ways to read content from the IPFS network.
Read content into a Byte array
The most common way to find and read content from IPFS for a given hash is to use the method ipfs.cat(<hash>): byte[]:
1 | try { |
It’s also possible to retrieve a file from a directory structure by passing the path of the file, like this: ipfs.cat(<hash>, <path>): byte[]
1 | try { |
Read content into a stream
The second way is to use the method ipfs.catStream(<hash>): InputStream, which exposes the response as a stream.
1 | try{ |
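Reading through a stream pays off when the content is too large to hold comfortably in memory. As a sketch of how such a stream is typically consumed, the helper below copies any InputStream to an OutputStream in fixed-size chunks. StreamCopySketch is a hypothetical class for this illustration, and the example feeds it an in-memory stand-in stream instead of a live call to the node.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; works with any InputStream.
class StreamCopySketch {

    /** Copies the stream in 8 KiB chunks and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the stream returned by the node; in the article's setting
        // this would come from the catStream call described above.
        InputStream content =
                new ByteArrayInputStream("hello IPFS".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(content, sink);
        System.out.println(copied + " bytes copied");
    }
}
```

In a real program the sink would usually be a FileOutputStream or a network response, so only one buffer of data is held in memory at a time.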
Pin/Unpin content
Adding a file to IPFS only creates a copy of the file in one location (your node), so the file is readable from other nodes only while your node stays online. Pinning is the action of replicating a file (already available somewhere on the network) to our local node.
This method is useful to bring speed and high availability to a file.
Pin
The method ipfs.pin.add(<hash>): void pins a file by hash on our node.
1 | try { |
Pinning an Object linked to other Objects (children) such as a directory automatically pins all the subsequent children.
Unpin
The reverse operation is also possible with the method ipfs.pin.rm(<hash>, <recursive>): void, which removes a file from our node.
1 | try { |
We can use the flag recursive [boolean] to also unpin all the objects linked from the object identified by the hash (default true).
List
Finally, we can list all the content hosted on our local node with the method ipfs.pin.ls(<pinType>): Map<Multihash, Object>:
1 | try { |
We can request different types of pinned keys to list:
- all: All Objects
- direct: Objects pinned directly
- indirect: Objects referenced by recursive pins
- recursive: Roots of recursive pins (like direct, but also pins the children of the object)
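The relationship between recursive and indirect pins can be modelled in a few lines. The sketch below is a toy model, not the node's implementation: PinModelSketch is a hypothetical class, and the object names stand in for real hashes. Given the links between objects and the set of recursive roots, it computes which objects end up pinned indirectly.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model for illustration; names stand in for real multihashes.
class PinModelSketch {

    /**
     * Returns every object reachable from a recursive root through links,
     * excluding the roots themselves (those are reported as "recursive").
     */
    static Set<String> indirectPins(Map<String, List<String>> links, Set<String> recursiveRoots) {
        Set<String> indirect = new TreeSet<>();
        Deque<String> toVisit = new ArrayDeque<>();
        for (String root : recursiveRoots) {
            toVisit.addAll(links.getOrDefault(root, Collections.<String>emptyList()));
        }
        while (!toVisit.isEmpty()) {
            String hash = toVisit.pop();
            // Skip roots and anything already recorded to avoid repeated work.
            if (recursiveRoots.contains(hash) || !indirect.add(hash)) {
                continue;
            }
            toVisit.addAll(links.getOrDefault(hash, Collections.<String>emptyList()));
        }
        return indirect;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = new HashMap<>();
        links.put("folder", Arrays.asList("fileA", "subfolder"));
        links.put("subfolder", Arrays.asList("fileB"));

        Set<String> recursiveRoots = new HashSet<>(Arrays.asList("folder"));
        System.out.println(indirectPins(links, recursiveRoots)); // [fileA, fileB, subfolder]
    }
}
```

Pinning only the folder is therefore enough to keep its whole subtree on the node, and unpinning the folder recursively releases the same set.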
IPNS
IPNS stands for “InterPlanetary Naming System” and represents a global mutable namespace accessible from anywhere on the IPFS network to assign a name against a hash (similar to a DNS server assigning a name against a server IP). This can be useful when we want to share a link of a mutable object.
Let’s say, for example, we want to host an article on IPFS. The article (version 1) has a unique hash, but if we update the article and host it on IPFS again, the hash is different and we have to share the new hash. IPNS prevents this issue: we link a name to a hash and share the name instead. We can update the hash behind the name as often as we want, so when the article changes, we only have to update the name resolution to point to the latest version.
Note: IPNS is still a work in progress and is slow to use; it takes approximately 1-2 minutes to publish a name.
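The idea of a stable name pointing at a changing hash can be sketched with an in-memory map. This is a toy model only: NameRegistrySketch is a hypothetical class, the hashes are placeholders, and real IPNS distributes records signed with the key over the network rather than keeping them in one process.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model for illustration; real IPNS records are signed and distributed.
class NameRegistrySketch {
    private final Map<String, String> records = new HashMap<>();

    /** Points the name at a hash, replacing any previous value. */
    void publish(String name, String hash) {
        records.put(name, hash);
    }

    /** Returns the hash the name currently points at, if any. */
    Optional<String> resolve(String name) {
        return Optional.ofNullable(records.get(name));
    }

    public static void main(String[] args) {
        NameRegistrySketch registry = new NameRegistrySketch();

        // Placeholder hashes for two versions of the same article.
        registry.publish("my-article", "placeholder-hash-v1");
        System.out.println(registry.resolve("my-article").get()); // placeholder-hash-v1

        // The article changes, so its hash changes, but the shared name does not.
        registry.publish("my-article", "placeholder-hash-v2");
        System.out.println(registry.resolve("my-article").get()); // placeholder-hash-v2
    }
}
```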
Keys
IPNS is based on a distributed Public Key Infrastructure (PKI). To get started, we need a keypair available on our IPFS node.
Each keypair stores one key/value pair, where the key represents the “name” and the value is the hash to resolve.
Generate a key
First, we need to generate a keypair using the method ipfs.key.gen(name, type, size): KeyInfo.
1 | try { |
This method returns a KeyInfo object composed of the name and the id (multihash) of the key. The id is the name we can later use to resolve a hash.
Delete a key
It is also possible to remove a key using ipfs.key.rm(keyName): void.
1 | try { |
List all keys
The method ipfs.key.list() allows us to list all keys available on our node.
1 | try { |
The self key represents the default key generated when we launch IPFS for the first time.
Publish
Once we have a dedicated keypair available, we can publish a hash against it using the method ipfs.name.publish(hash, keyName):
1 | try { |
Note that this operation is particularly slow and can take up to two minutes to execute.
Resolve
Just like with DNS, reading an object from an IPNS name is a two-step process:
- Resolve the hash against the name
- Read the content from the hash
1 | try { |
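Between the two steps, the resolved value usually has to be turned back into a bare hash. The sketch below assumes that resolution yields a path of the form /ipfs/<hash>, optionally followed by a sub-path; IpnsPathSketch and hashFromPath are hypothetical names for this illustration and are not part of the library.

```java
// Hypothetical helper for illustration; assumes resolved values look like /ipfs/<hash>[/sub/path].
class IpnsPathSketch {
    private static final String PREFIX = "/ipfs/";

    /** Extracts the hash from a resolved path such as /ipfs/Qm.../file.txt. */
    static String hashFromPath(String resolvedPath) {
        if (!resolvedPath.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not an /ipfs/ path: " + resolvedPath);
        }
        String rest = resolvedPath.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        // Keep only the first path segment, which is the hash itself.
        return slash < 0 ? rest : rest.substring(0, slash);
    }

    public static void main(String[] args) {
        String resolved = "/ipfs/Qmaisz6NMhDB51cCvNWa1GMS7LU1pAxdF4Ld6Ft9kZEP2a";
        System.out.println(hashFromPath(resolved));
    }
}
```

The extracted string can then be parsed with Multihash.fromBase58 and passed to the read methods described earlier.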
Other operations
The java-ipfs-http-client library wraps many other API operations available on the node.
Node version
To get the version of the node we are connected to, the library provides the method ipfs.version(): String
1 | try { |
Node peers
To retrieve the list of peers connected to our local node:
List<Peer> peers = ipfs.swarm.peers();