Estimating the Cost of xAPI: Using Data Simulation to Budget Your Implementation
Factors that Drive Increased Costs
One of the most frustrating things about xAPI is that it is difficult to gauge how much an implementation is going to cost. This is due to several factors.
xAPI Data Statement Size
xAPI statements vary widely in size. A statement can be extremely simple, containing only the minimum information needed to be conformant, or it can carry very large extensions. Larger statements take more storage and bandwidth, so heavy use of extensions can drive higher costs.
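To make the size difference concrete, here is a minimal Python sketch that compares the serialized size of a bare statement with the same statement carrying a large extension. The extension IRI, the telemetry payload, and the example.org identifiers are all hypothetical and exist only for illustration; your own statements will look different.

```python
import json

# A minimal statement: only actor, verb, and object.
minimal = {
    "actor": {"mbox": "mailto:learner@example.org"},
    "verb": {"id": "http://adlnet.gov/expapi/verbs/completed"},
    "object": {"id": "https://example.org/activities/course-101"},
}

# The same statement plus a hypothetical context extension that carries
# 500 made-up telemetry samples (an assumption, not a real profile).
extended = dict(minimal)
extended["context"] = {
    "extensions": {
        "https://example.org/extensions/telemetry": [
            {"t": i, "x": 0.0, "y": 0.0, "z": 0.0} for i in range(500)
        ]
    }
}


def statement_bytes(statement: dict) -> int:
    """Return the size of a statement as compact UTF-8 encoded JSON."""
    return len(json.dumps(statement, separators=(",", ":")).encode("utf-8"))


minimal_size = statement_bytes(minimal)
extended_size = statement_bytes(extended)
print(f"minimal:  {minimal_size} bytes")
print(f"extended: {extended_size} bytes")
print(f"ratio:    {extended_size / minimal_size:.1f}x")
```

Running the same measurement over a sample of your real or simulated statements gives an average size that you can carry into later cost estimates.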
xAPI Data Volume and Throughput
Different use cases generate very different amounts of data. For example, the amount of data collected from an LMS will typically be far less than the amount of data captured by a VR experience. The difference in volume (how much total data is produced) and throughput (how fast and how often it hits your Learning Record Stores, or LRSs) will both have an impact on cost.
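The distinction between volume and throughput can be sketched with some back-of-the-envelope arithmetic. Every number below (learner counts, statements per day, statement sizes, active hours) is an invented assumption for illustration, not a measurement of any real LMS or VR system.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """A rough description of one xAPI data source (all values assumed)."""
    learners: int
    statements_per_learner_per_day: float
    avg_statement_bytes: int
    active_hours_per_day: float

    def statements_per_day(self) -> float:
        return self.learners * self.statements_per_learner_per_day

    def bytes_per_day(self) -> float:
        # Volume: how much total data is produced each day.
        return self.statements_per_day() * self.avg_statement_bytes

    def statements_per_second(self) -> float:
        # Throughput: average arrival rate during active hours only.
        return self.statements_per_day() / (self.active_hours_per_day * 3600)


# Hypothetical workloads for comparison.
lms = Workload(learners=1000, statements_per_learner_per_day=20,
               avg_statement_bytes=1000, active_hours_per_day=8)
vr = Workload(learners=1000, statements_per_learner_per_day=20000,
              avg_statement_bytes=4000, active_hours_per_day=8)

for name, w in (("LMS", lms), ("VR", vr)):
    print(f"{name}: {w.bytes_per_day() / 1e9:.2f} GB/day, "
          f"{w.statements_per_second():.1f} statements/s")
```

Averages hide bursts, so peak throughput in a real system may be considerably higher than the figures this kind of estimate produces.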
xAPI Data Design Quality
It is possible to produce xAPI data that is conformant but still junk. Poorly conceived data of this kind can create bottlenecks in your system that affect operational data, which in turn can cause costs to rise.
In addition to these factors, there are a host of other matters that cause costs to increase, including platform and deployment decisions, multi-regional deployment, the decision to federate LRSs, database choices, and licensing matters.
How to Estimate the Cost of xAPI
To account for xAPI statement size, volume and throughput, and data design quality when pricing an xAPI solution, we recommend building a synthetic data set representative of the data that you expect to run through your system. The easiest way to do this is to leverage DATASIM, an Apache 2.0 licensed open source synthetic data generator designed exclusively for xAPI. With DATASIM, you can design and build a dataset of any scale for use in end-to-end system load testing. And unlike generic dummy data, DATASIM’s synthetic xAPI data can be modeled to the exact specifications of the actual xAPI data that you’ll be collecting, so you get a far more realistic idea of what your costs will actually be.
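Once you have a sample of synthetic statements, a first-pass estimate can be as simple as measuring the sample and extrapolating. The sketch below assumes the statements have already been loaded into a Python list; the inline sample is a stand-in for generated data, and the daily statement count and price per GB are placeholder values you would replace with your own figures.

```python
import json
from statistics import mean


def summarize_sample(statements: list[dict]) -> dict:
    """Measure the compact JSON size of each statement in a sample."""
    sizes = [
        len(json.dumps(s, separators=(",", ":")).encode("utf-8"))
        for s in statements
    ]
    return {"count": len(sizes), "mean_bytes": mean(sizes), "max_bytes": max(sizes)}


def projected_monthly_storage_cost(mean_bytes: float, statements_per_day: float,
                                   price_per_gb_month: float, days: int = 30) -> float:
    """Extrapolate raw storage cost for one month of new statements."""
    gigabytes = mean_bytes * statements_per_day * days / 1e9
    return gigabytes * price_per_gb_month


# Stand-in for a sample of synthetic statements (hypothetical content).
sample = [
    {
        "actor": {"mbox": f"mailto:learner{i}@example.org"},
        "verb": {"id": "http://adlnet.gov/expapi/verbs/completed"},
        "object": {"id": "https://example.org/activities/course-101"},
    }
    for i in range(100)
]

summary = summarize_sample(sample)
# 1,000,000 statements/day and 0.10 per GB-month are placeholder assumptions.
cost = projected_monthly_storage_cost(summary["mean_bytes"], 1_000_000, 0.10)
print(summary)
print(f"projected raw storage cost for one month of data: {cost:.2f}")
```

This covers raw payload storage only. Database indexes, replication, compute, and network costs are not modeled here, which is why end-to-end load testing with the synthetic data set remains the more reliable guide.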
How to Use DATASIM
DATASIM is available on GitHub and is free to use.
When designing your dataset, you will create a simulation specification that aligns to an xAPI Profile representative of your use case. DATASIM generates data based on the patterns available in that xAPI Profile. You can customize the simulation with specific actors, learners, or instructors (we call these Personae) and with alignments, which give more weight to the parts of your xAPI Profile patterns that you expect to be most representative of data generation in your real-world scenario.
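The idea of weighting some patterns more heavily than others can be illustrated with a generic weighted random draw. This is not DATASIM's algorithm or input format; the pattern names and weights below are hypothetical, and the sketch only shows how weights shift the mix of generated activity.

```python
import random
from collections import Counter

# Hypothetical pattern names and weights (not taken from any real profile).
pattern_weights = {
    "watch-video": 0.7,
    "take-quiz": 0.2,
    "read-document": 0.1,
}

# A seeded generator keeps the illustration repeatable.
rng = random.Random(42)

# Draw 10,000 pattern choices, biased by the weights above.
draws = rng.choices(
    population=list(pattern_weights),
    weights=list(pattern_weights.values()),
    k=10_000,
)

counts = Counter(draws)
for pattern, count in counts.most_common():
    print(f"{pattern}: {count / len(draws):.1%} of generated activity")
```

In a real simulation, the weights you choose should reflect how often you expect each kind of activity to occur, since that mix determines the shape of the data your system has to absorb.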
Use of DATASIM requires some knowledge of xAPI Profiles and the ability to choose or design an xAPI Profile that is representative of your use case. If you need help designing an xAPI Profile, you can use the Centriph platform. Centriph is an xAPI Profile data authoring platform featuring tools for both the business user and the programmer. It is in beta and is free to use through the end of the year.
If you need help designing xAPI Profiles or using DATASIM, let us know and we’d be happy to assist. A major part of our work here at Yet Analytics is designing and implementing xAPI Profiles.