Accelerating Performance of the Algorand Node API

It’s been a few months since we launched our Algorand API as a Service. In that time, we have gotten a lot of great feedback on what is working well and what still needs more work.

One thing we have heard and observed ourselves is that certain operations time out and don’t return any data. Digging into this, we found that one of the queries we provide has highly variable performance. Our API service fronts pools of Algorand nodes, and calls are serviced by the Algorand node REST API on these nodes. The performance issues we are seeing come from a specific query to the node REST API itself.

The /v1/account/{address}/transactions Endpoint

The REST endpoint that has the performance issues is /v1/account/{address}/transactions. This query is only available if you run a full archival indexer node.

The purpose of this endpoint is to be able to query the transaction history for a given account. It is a useful query from a developer point of view as you could use it to, for example, populate a transaction history when looking at an account in a wallet or other application.

Sometimes, this query returns quickly and sometimes, the query times out. By default, the query has a max result set of 100. The current behavior of the query is that, given an account, the indexer node will start walking backwards from the head of the chain looking for transactions to fill its result bucket that meet the query constraints.

This becomes problematic for accounts with relatively low transaction volume. If there has been a lot of transaction activity on the account, the query will reach the result set limit quickly and return. If there has not been a lot of transaction activity, it will keep walking backwards — all the way to genesis — looking for transactions that meet the query criteria. In this second, low-transaction activity scenario, the query generally times out and also has the side effect of making the node unresponsive to other queries.
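The difference between the two cases can be illustrated with a small simulation. This is only a sketch of the behavior described above, not the node's actual implementation: the chain is modeled as a hypothetical list of blocks, each holding the addresses that transacted in it, and walk_back is an illustrative name.

```python
from typing import List, Tuple


def walk_back(chain: List[List[str]], address: str,
              max_results: int = 100) -> Tuple[List[int], int]:
    """Scan from the head of the chain towards genesis.

    Returns the block heights where `address` was found and the number
    of blocks that had to be scanned before the search stopped.
    """
    found: List[int] = []
    scanned = 0
    for height in range(len(chain) - 1, -1, -1):
        scanned += 1
        for tx_address in chain[height]:
            if tx_address == address:
                found.append(height)
                if len(found) == max_results:
                    # Result bucket is full: stop early.
                    return found, scanned
    # Genesis reached without filling the bucket.
    return found, scanned


# Toy chain: "busy" transacts in every block, "quiet" in only two.
chain = [["busy"] for _ in range(1000)]
chain[10].append("quiet")
chain[500].append("quiet")

print(walk_back(chain, "busy")[1])   # 100 blocks scanned
print(walk_back(chain, "quiet")[1])  # 1000 blocks scanned (the whole chain)
```

In this toy model the busy account fills its bucket after 100 blocks, while the quiet account forces a scan of every block back to genesis.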

This second scenario is problematic, since users are actively hitting this endpoint, which starts taking nodes in our node pools offline for periods of time. While we have plenty of capacity in our node pools, if enough of these queries were received in rapid succession, we could suffer an outage due to a kind of denial of service situation.

Even for accounts with a large transaction volume, parameters that restrict the scope of the search, such as fromDate and toDate, can make the query slow enough that the service suffers.

Improving the Endpoint with a Backend Datastore

First, it is important to recognize that the node can’t be optimized for every situation. It already performs a variety of different roles including supporting consensus, relaying, etc. However, given the current behavior of this query, we must provide an improved path to our customers so they can reliably retrieve transaction data.

We have decided to replace the backend handler for this query with a datastore that is optimized to return results much more quickly and efficiently.

This backend datastore is based on AWS Aurora and includes a set of AWS Lambda data management routines that keep it reliably in sync with the Algorand TestNet, MainNet, and BetaNet. Queries coming into this particular endpoint will be serviced by this new datastore. All other queries will remain serviced by our node pools.
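The routing split might look something like the sketch below. The backend labels and the choose_backend function are hypothetical names used for illustration; the only detail taken from this post is the endpoint path.

```python
import re

# Matches /v1/account/{address}/transactions, the one endpoint that is
# served from the datastore. The address is matched loosely on purpose.
TX_HISTORY_PATH = re.compile(r"^/v1/account/[^/]+/transactions$")


def choose_backend(request_target: str) -> str:
    """Return which (hypothetical) backend should service a request."""
    # Ignore any query string such as ?fromDate=...&toDate=...
    path = request_target.split("?", 1)[0]
    if TX_HISTORY_PATH.match(path):
        return "datastore"
    return "node-pool"


print(choose_backend("/v1/account/SOMEADDRESS/transactions"))  # datastore
print(choose_backend("/v1/status"))                            # node-pool
```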

GET STARTED WITH PURESTAKE’S API SERVICE FOR ALGORAND

Philosophical Questions

The direct node API represents the truth in terms of the state of the Algorand blockchain. The downside to servicing this query with an alternate third-party data store is that there is a chance that the third-party will return the wrong data due to a bug or other problem with the infrastructure.

At PureStake, we spent some time debating this situation. We could have just turned off the endpoint, but that didn’t seem like a good solution, and certainly not helpful to developers trying to build on Algorand. After all, this is a useful query to have available for a variety of use cases.

Additionally, we had to consider that the way the query works against the node isn’t necessarily what most developers want. Even for the scenarios where the query returns — accounts with high transaction activity — you only get the last 100 transactions, not all of them.

In the spirit of providing a solution to the intent behind this endpoint, we decided to move ahead with offering a more performant version of this query, while at the same time making it compatible with the available SDKs. However, we will clearly distinguish between queries that are serviced by the node and ones that are serviced by other infrastructure we are running. We will mark this particular query in the response headers and in our portal as being serviced differently from the other node API queries. We feel this approach strikes a good compromise and helps developers building on Algorand.

Comparative Performance of the New Endpoint

To take a concrete example, the following query when run against a TestNet indexer node will time out (on a reasonably spec’d machine):

GET /v1/account/EWZYOHWLR2C44MDIPNZMOGZMAAY66BWI2ALGJB2NE22TWO7YGNCS7NTFVQ/transactions

Account EWZYOHWLR2C44MDIPNZMOGZMAAY66BWI2ALGJB2NE22TWO7YGNCS7NTFVQ has 2 transactions in its history, fewer than the 100 max results query limit, so the node will walk all the way back towards genesis looking for more transactions. The query times out after 30 seconds for API users, but will continue running on the node until complete. In a test for this article, the query was manually stopped after 25 minutes. I/O (IOPS) on the node climbs very high while the query is running and consumes the allotted burst capacity, eventually locking the machine up.

Backed by the new datastore, queries to this endpoint generally return in under 1 second, or in 3 seconds or less from a cold Lambda start.

Next Steps and Future Direction

It isn’t possible for the node to achieve high performance for all queries.

PureStake’s new datastore will be the basis of a new query-optimized set of endpoints that will be offered alongside the node-based APIs. These APIs would be well-suited to certain types of applications, such as explorers, wallets, etc. However, users will have the choice to opt for the node-backed APIs or the query-optimized APIs, depending on their preferences.

PureStake will likely continue to create different kinds of optimized data stores over time in order to support different types of queries and use cases. A de-normalized data warehouse is another obvious optimization for aggregate and over-time type data queries that aren’t possible with the current node API.

We may remove our query-optimized endpoint for this specific node API in the future, if the performance and behavior of the underlying node API changes.

Are there other APIs or features you would like to see added to the PureStake API services application? Reach out to us and let us know.

 

Advantages of Buying Blockchain Infrastructure-as-a-Service vs. BUIDLing It Yourself

In everyday life, we are faced with decisions to either buy readymade solutions or to build something from scratch.

Whether it’s a large purchase like buying a home, or something considerably smaller like choosing between two couches, there are pros and cons to each side. Do you want to stay up until 2:30 in the morning putting together a couch? For the right price, a lot of folks would say, “Absolutely,” while others would say, “No shot!”

With the rise of blockchain as a viable platform, the business community seems to face this question at all levels. Cost, effort, risk, focus, and quality all factor into every decision a company makes, including whether to build the infrastructure that runs these applications and platforms, or to pursue a third-party vendor that offers blockchain infrastructure-as-a-service.

Risk vs. Reward

The allure of blockchain is real: the technology as a whole promises dramatic cost savings (up to 70%!) to banks and financial institutions. Since up to two-thirds of those costs are attributable to infrastructure, it’s imperative to pursue an infrastructure strategy that captures as much of that cost savings as possible.

But blockchain projects can be deceptively costly. A recent report on government-sponsored blockchain projects revealed that the median project cost is $10-13 million.

At first glance, building infrastructure in-house seems like the most cost-effective way to approach the blockchain: there are no licensing fees, and your company is in complete control.

Of course, there are always trade-offs: an in-house infrastructure project is very taxing on your organization’s resources.

Your team must have the time and operational skillset to build out a secure infrastructure that is scalable enough to support your blockchain network of choice. Those skills are hard to come by: blockchain skills are among the most sought-after. The rates of freelance blockchain developers hover around $81-100 per hour in the US, sometimes going as high as $140+ per hour.

It’s easy to underestimate how much time it will take to create, and whether your team really has the skills and ability to create the offering.

In addition to infrastructure, you’ll also need to build or secure vendors that can address storage needs, network speeds, encryption, smart contract development, UX/UI, and more. Each of those initiatives is going to require additional dedicated budget.

The question then becomes: what kind of advantage does this create? Much of this will depend upon the number of partially or fully decentralized applications (DApps) that you plan to run on it, and how many of them can and will share the same underlying infrastructure.

The ‘aaS’ Revolution

When evaluating your options for a new project, the project planning stage is always tricky. You’ll need to do a full scoping of the project, allocate responsibilities, and create a vendor vetting process.

In the past it was easy: you went out, bought some software and hardware, and got to building. But once the internet made it easier for companies to provide ‘as-a-service’ offerings, it added a layer of complexity for IT and engineering teams as to what options made sense for their project or organization. Salesforce began to displace massive Oracle on-premises implementations. Broadsoft started to displace PBXs and made phone closets a thing of the past. The list goes on and on with applications that replaced their on-prem brethren from years prior because the ongoing maintenance and upkeep was a headache for IT teams to manage. Why keep all of the infrastructure under your management when you could push all that work onto an infrastructure-as-a-service company’s plate, since their primary focus was supporting that exact technology?

This is great from an IT perspective, but what about for engineers and developers? Don’t they need to be able to store their code and applications locally? Don’t they need to own all of the pieces that tie in to their application? Oh and security, THE SECURITY!

Sorry for being dramatic, but the answer is no. These are all valid concerns; however, many of them can be addressed by working with the right service provider for your needs.

Uber is a great example of leveraging third-party service platforms to create an application. Did they need to go out and create a maps platform? Nope, they used Google for routing and tracking. Did they need to go out and spin up their own messaging and voice servers? Nope, they used Twilio for their communication services. They took a buy-centric approach, which enabled them to focus on their core application and removed the need to focus on things outside of their core skill set.

How We Apply This to Blockchain

How difficult is it to build? How costly is it to manage? Do we have the skillset to support it? These are all questions that companies ask themselves when looking at making an investment for any kind of infrastructure.

On top of the infrastructure, it only takes a few minutes to realize that DevOps is really hard to do well. Making sure that the investments you’re making align with your team’s skill set is critical for your success. So if you’re looking around, saying “We need to bring in DevOps engineers for our Algorand project,” then HARD STOP! Read on.

PureStake was created with this exact use case in mind. We provide secure and scalable blockchain infrastructure-as-a-service to help everyone from investors to developers better interact with the Algorand network. We’ve recently launched an API service that will provide an on-ramp to Algorand for any application looking to build on their pure proof of stake network. We offer a variety of subscriptions so that, regardless of size or budget (we have free, and free is good), you’ll be able to utilize our service and start interacting with the Algorand network within minutes.

 

Getting Started with the Algorand REST API and the PureStake API Service

Since Algorand’s MainNet launch, PureStake has focused on building and delivering highly performant infrastructure to support early adopters of the Algorand network. Earlier this year, we launched an Algorand infrastructure service offering that delivers relay and participation nodes as a service targeted at early supporters and customers that want to be active participants in the network. However, we found that these services are not ideally matched to the needs of developers building applications on top of the Algorand network. So we’ve released a new Algorand API service that is specifically designed to help developers get started with Algorand REST APIs quickly and easily.

The Need for an Algorand API Service

PureStake’s Algorand infrastructure services are centered around managing the lifecycle of Algorand relay and participation nodes in an automated and secure way. Managed relay and participation nodes make sense for customers that want to — or have an obligation to — support running the network, but don’t fulfill the needs of developers who are writing applications that interact with the Algorand blockchain.

For DApp developers, the nodes are a means to an end — a way of reading data from or sending transactions to the network. They need a simpler alternative to running their own nodes, which can be costly and time-consuming.

The PureStake API service simplifies interactions with the Algorand network by hiding the complexity of running and managing nodes from the user.

Why Running Algorand Nodes Can Be Challenging

Developers always have the option of downloading and running their own nodes. However, running Algorand nodes requires both significant infrastructure investment and the right operational skills.

For example, most development use cases dictate running full archival transaction indexer nodes to achieve the best possible performance for querying transactions. The storage and sync time requirements for this type of node quickly increase as the block height increases. In the case of the Algorand MainNet, which has been live for about two months and has a block height of 1.4M blocks (as of August 2019), a transaction indexer node requires at least 20GB of storage. However, since the index database grows as a function of the number of transactions, we can expect the storage growth rate to significantly increase as transaction volume on the MainNet increases, further expanding the storage requirements.

The PureStake API Service Simplifies Interactions with the Algorand REST APIs

The API service is a natural extension of PureStake’s Algorand infrastructure platform that we built to support Algorand relay and participation nodes. Our platform uses an infrastructure-as-code approach to deploy security, networking, cloud configurations, compute, storage, and other elements into cloud data centers in an automated fashion.

The API service, which was built on top of this platform, is spread across multiple cloud data centers and features an API network layer, an API management layer, a caching layer, a load balancing layer, and a node backend layer. Each of these layers is fully redundant and managed/monitored 24×7.

How the API Service Works

The API network access layer is supported by a worldwide edge network with many peering points for request ingress, where requests are then privately routed to one of our POPs. In the POP, the API management layer then handles authentication, authorization, accounting, and further routing of API requests. It will check received requests for a valid API key header, whether the request is valid according to the account requests limits, and other security checks. It then logs the request, which is used to power end user-facing features such as endpoint utilization charts in the API dashboard. The management layer then routes API requests to backend services that can handle them.
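The checks performed by the management layer could be sketched roughly as follows. This is an illustration of the flow described above, not PureStake's implementation: the Gateway and Account classes, the per-account counter, and the HTTP status codes are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Account:
    request_limit: int  # assumed per-account request allowance
    used: int = 0


@dataclass
class Gateway:
    accounts: Dict[str, Account]  # keyed by API key (hypothetical store)
    log: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, headers: Dict[str, str], path: str) -> int:
        """Authenticate, apply limits, log, and accept or reject a request."""
        # Simplification: a real gateway would match header names
        # case-insensitively.
        key = headers.get("X-API-Key")
        account = self.accounts.get(key) if key else None
        if account is None:
            return 403  # assumed status for a missing or unknown key
        if account.used >= account.request_limit:
            return 429  # assumed status for an exhausted request limit
        account.used += 1
        # The log could later feed utilization charts in a dashboard.
        self.log.append((key, path))
        return 200  # request may be routed on to a backend


gateway = Gateway(accounts={"demo-key": Account(request_limit=2)})
print(gateway.handle({}, "/v1/status"))                         # 403
print(gateway.handle({"X-API-Key": "demo-key"}, "/v1/status"))  # 200
```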

Some queries will be handled by a high-performance cache. Other queries will be routed to the load balancer layer, which has awareness of the node resources available and routes requests on to an available Algorand node. The node layer has pools of Algorand transaction indexer nodes that can be swapped out and maintained without any downtime. These nodes are patched and updated with the latest Algorand node software as new versions are available.
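One way to picture how nodes can be swapped out without downtime is a pool that only hands out nodes currently marked as available. The sketch below assumes simple round-robin selection and uses hypothetical names (NodePool, take_out, put_back); the actual balancing strategy is not described in this post.

```python
from typing import List, Set


class NodePool:
    """Round-robin selection over the nodes currently in service."""

    def __init__(self, nodes: List[str]) -> None:
        self.nodes = list(nodes)
        self.available: Set[str] = set(nodes)
        self._cursor = 0

    def take_out(self, node: str) -> None:
        """Remove a node from rotation, e.g. for patching."""
        self.available.discard(node)

    def put_back(self, node: str) -> None:
        """Return a known node to rotation after maintenance."""
        if node in self.nodes:
            self.available.add(node)

    def pick(self) -> str:
        """Return the next available node, skipping any out of rotation."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._cursor % len(self.nodes)]
            self._cursor += 1
            if node in self.available:
                return node
        raise RuntimeError("no available nodes")


pool = NodePool(["node-a", "node-b", "node-c"])
pool.take_out("node-b")  # node-b is being upgraded
print(pool.pick(), pool.pick(), pool.pick())  # node-a node-c node-a
```

Requests keep flowing to the remaining nodes while one is out of rotation, which is the property the node layer relies on for maintenance.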

What is the Difference Between the PureStake APIs and the Algorand REST APIs?

The PureStake API Service supports the official Algorand node API in the same form as exposed by the Algorand node software, which adds consistency and makes it easy to move off our service and back to self-managed nodes if needed. This design choice was an intentional one, since many proprietary APIs create vendor lock-in for their users. The only differences between our API service and the official Algorand REST APIs are the addition of the X-API-Key header that we require to secure access to our service, and the removal of the API that provides metrics about the nodes themselves. Through this approach, our users have the freedom to move between API services and self-managed nodes as needed.

Currently, our API service supports the Algod API, but not the KMD API. The Algod API can be used for querying historical information from the blockchain, getting information about blocks and transactions, and sending transactions. The KMD API, by contrast, is used for wallet management, key management, and signing transactions with private keys. We have intentionally chosen not to expose the KMD API, as we do not want any customer secrets or keys on our servers. However, customers can manage secrets and sign transactions within their applications, and post signed transactions to our API.

How the PureStake API Service Impacts the Decentralization of the Algorand Network

An essential property of the Algorand network is its decentralization. PureStake is a centralized company providing a centralized service to access that decentralized network. At first glance, it may seem like a centralized service could threaten the decentralized nature of the network (particularly if all or most of the access to the Algorand network happens through the service). Similar concerns have been raised in the Ethereum community in relation to the large number of applications relying on the Infura service to access the Ethereum network. While it may seem counterintuitive, centralized services can actually serve to support and promote the best interests of decentralized networks such as Algorand.

The first thing to point out is that this decentralization risk is not a design or protocol-level risk. No one is forced to use the service. Anyone using the service can leave and run their own nodes at any time.

In fact, there is no reason decentralization can’t proceed normally with lighter weight nodes. Algorand is going to great lengths to make sure nodes supporting the consensus mechanism do not have large storage or other infrastructure requirements. So, if someone just wants to get current account balances, submit transactions, and support consensus, they can do this with non-archival participation nodes that have much lower requirements and may not require a service provider. In addition, the upcoming vault improvements to Algorand will greatly reduce the sync time for participation nodes as well. The developer use case specifically lends itself to larger infrastructures and a service provider approach.

Secondly, Algorand needs developers, applications, and utilization of the network to be successful in the long term. The PureStake API service makes the on-ramp for developers substantially easier and will help grow the utilization and traction of the network. While there may be a hypothetical form of centralization risk in the future if the service is wildly successful, this possible future risk is far outweighed by the direct benefits to the Algorand community in helping to get traction that drives transaction volume and network utilization. In a future with more developers, applications, and network utilization, we expect competitive developer-oriented services to enter the market, which will continue to fragment the market.

Future Expansion and Long-Term Vision for the API Service

Support for the base Algorand node API is the first step for our API service. In the future, potential enhancements could include:

Additional Query-Optimized Data Stores: Taking Algorand block and transaction data and loading it into relational or NoSQL datastores opens possibilities for much more performant queries across the historical data set. These optimized data stores could be used to improve the performance of node API requests or to power net-new APIs.

Eventing Infrastructure: The idea would be to provide support for subscriptions for certain types of events, and to receive callbacks whenever they occur. DApp developers frequently implement these backend infrastructural features to improve the performance of their applications.

Getting Started with the PureStake API Service

Users can register for a free account at https://developer.purestake.io/.

Once logged into the API Services application, users will have access to an API key that is unique to their user account in their dashboard. This API key needs to be added as the value of the X-API-Key request header for any API requests being made to the PureStake service.
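As a minimal sketch of attaching the header, the snippet below builds (but does not send) a request using Python's standard library. The base URL and API key are placeholders, not real PureStake values; use the ones shown in your dashboard.

```python
import urllib.request


def build_request(base_url: str, path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the API key in the X-API-Key header."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"X-API-Key": api_key},
        method="GET",
    )


# Placeholder values for illustration only.
request = build_request("https://api.example.invalid", "/v1/status", "YOUR_API_KEY")

# urllib stores header names in capitalized form ("X-api-key"); HTTP header
# names are case-insensitive, so this is equivalent on the wire.
print(request.full_url)
print(request.get_header("X-api-key"))

# To actually send it:
# with urllib.request.urlopen(request, timeout=30) as response:
#     body = response.read()
```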

Once you have logged in, there are examples of how to do this and how to use the API at https://developer.purestake.io/code-samples and in our GitHub repo at https://github.com/PureStake/api-examples.

Do you have a question or an idea for a useful enhancement to the API service? Feel free to reach out to us!