What is FAPEC?
The DAPCOM website has several pages describing FAPEC, as well as publications and a flyer. But to make it short: what is FAPEC, and what are its benefits?
FAPEC is a data compression algorithm implemented as a software application.
That is, we simply shrink your data so that it needs less disk space or transfer time.
Other solutions, such as Zip, BZip2, RAR, 7-Zip or (more recently) Zstandard, have been widely used for “generic” data compression for years. There are also solutions aimed at “specific” types of data, such as JPEG (for images), MP3 (for sound and music) or MPEG-4 (for video), some of which introduce losses in the data so as to compress better at the cost of a slight quality reduction. In short, there is a large variety of data compression software.
So what makes FAPEC different?
FAPEC is a staged data compressor:
It consists of a pre-processing stage, adapted to the type of data being compressed, followed by an entropy coding stage based on our patented technology, which performs a fast statistical analysis to select the most efficient binary codes.
Other solutions simply perform an exhaustive search for repeated strings or values, which can be quite slow. Or they are restricted to a specific type of data and cannot be applied to other types.
With this staged approach, FAPEC is able to efficiently handle a wide variety of data – all in a single, lightweight, fast and multi-platform software program.
You can either let FAPEC detect the most suitable pre-processing stage and options, or you can set them yourself.
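The staged idea described above can be illustrated with a small, generic sketch. It is not FAPEC's algorithm: the pre-processing here is a plain delta filter, zlib is used only as a stand-in for the coding stage, and the sine-wave "sensor" series and all function names are assumptions made for the example.

```python
import math
import struct
import zlib


def delta_encode(samples):
    """Pre-processing stage: replace each sample by its difference from the previous one."""
    previous = 0
    residuals = []
    for value in samples:
        residuals.append(value - previous)
        previous = value
    return residuals


def delta_decode(residuals):
    """Inverse of delta_encode: a running sum restores the original samples."""
    total = 0
    samples = []
    for residual in residuals:
        total += residual
        samples.append(total)
    return samples


def pack(values):
    """Serialize integers as little-endian signed 32-bit values."""
    return struct.pack(f"<{len(values)}i", *values)


# Synthetic time series (assumed test data): a slowly varying sine wave.
samples = [int(20000 * math.sin(i / 200.0)) for i in range(50000)]
residuals = delta_encode(samples)

# zlib is only a stand-in for the second (coding) stage.
raw_size = len(pack(samples))
direct_size = len(zlib.compress(pack(samples)))
staged_size = len(zlib.compress(pack(residuals)))

# The pre-processing stage is lossless: decoding restores the input exactly.
assert delta_decode(residuals) == samples

print(f"raw: {raw_size} bytes")
print(f"zlib only: {direct_size} bytes")
print(f"delta + zlib: {staged_size} bytes")
```

On smooth data like this, the residuals are small numbers that are much easier to code than the original values, which is why matching the pre-processing to the data type pays off.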
What are the real benefits?
Users basically care about two indicators: compression ratio and speed.
In many solutions, there is a trade-off between the two: you can either compress better but slower, or compress worse but faster.
By applying the appropriate algorithms to your specific data, FAPEC can break this trade-off and achieve high compression ratios at high speeds.
Even more importantly: by adapting better to your data, FAPEC can, depending on the case, achieve significantly higher ratios than other algorithms.
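The usual ratio-versus-speed trade-off mentioned above can be observed with any general-purpose compressor. The sketch below uses Python's standard zlib module at three compression levels; the log-like input text is made up for the example, and the timings will vary from machine to machine. It does not involve FAPEC itself.

```python
import time
import zlib


def measure(data, level):
    """Return (compression ratio, elapsed seconds) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(data) / len(compressed), elapsed


# Assumed sample input: repetitive, log-like text.
lines = [
    f"12:{i % 60:02d}:00 sensor={i % 17} value={i * 37 % 1000}\n"
    for i in range(100000)
]
data = "".join(lines).encode("ascii")

# Level 1 favours speed, level 9 favours ratio, level 6 is the zlib default.
results = {level: measure(data, level) for level in (1, 6, 9)}
for level, (ratio, seconds) in results.items():
    print(f"level {level}: ratio {ratio:.2f}, {seconds * 1000:.1f} ms")
```

Higher levels typically give a better ratio at the cost of more time, which is the balance that a data-adapted approach aims to avoid.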
…more specifically?
These are some of the data types where FAPEC excels:
- Binary files with time series, such as sensor measurements (temperature, pressure, brightness, energy flux, power…), either as integer or floating-point values
- Multi-dimensional data (binary values arranged in a table or matrix)
- Raw multi-band images, such as color pictures, and especially multispectral or hyperspectral imagery
- Log files, such as those generated by data processing systems
- Tabular text data, such as CSV files or text files with LiDAR or point cloud data
- Genomics data, such as FASTQ and VCF files
- As an example of a tailored professional stage, water column data files generated by Kongsberg Maritime multibeam echosounders
For some of these cases (such as images), FAPEC offers a lossy option, which slightly reduces the data quality to achieve higher ratios. By default, compression is lossless.
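To give a feeling for how a lossy option trades quality for ratio, here is a minimal generic sketch. It is an assumption-laden illustration, not FAPEC's lossy method: it simply drops the lowest bits of synthetic 16-bit samples before compressing them with zlib (used as a stand-in coder). The function names, the test signal and the choice of 4 dropped bits are all hypothetical.

```python
import math
import random
import struct
import zlib


def quantize(samples, dropped_bits):
    """Lossy step: discard the lowest bits of each sample."""
    return [value >> dropped_bits for value in samples]


def dequantize(codes, dropped_bits):
    """Approximate reconstruction: scale back and re-centre within the step."""
    half_step = (1 << dropped_bits) >> 1
    return [(code << dropped_bits) + half_step for code in codes]


def compressed_size(values):
    """Size after packing as unsigned 16-bit integers and applying zlib."""
    return len(zlib.compress(struct.pack(f"<{len(values)}H", *values)))


# Synthetic noisy signal (assumed test data); fixed seed for repeatability.
rng = random.Random(0)
samples = [
    int(30000 + 20000 * math.sin(i / 50.0)) + rng.randint(-50, 50)
    for i in range(50000)
]

DROPPED_BITS = 4  # hypothetical quality setting: quantization step of 16
codes = quantize(samples, DROPPED_BITS)
restored = dequantize(codes, DROPPED_BITS)

lossless_size = compressed_size(samples)
lossy_size = compressed_size(codes)
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print(f"lossless: {lossless_size} bytes")
print(f"lossy: {lossy_size} bytes, maximum error {max_error}")
```

The reconstruction error is bounded by half the quantization step, so the quality loss is controlled while the noisy low bits no longer need to be stored.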
How much can we gain from FAPEC?
It depends on the specific kind of data, and it can also vary with each file or data block.
In general, at least on the data types mentioned above, FAPEC can achieve compression ratios 10-20% higher than other solutions like Zip, and in some cases it can even double them.
Speed is also important: even in single-threaded mode, FAPEC typically compresses much faster than other solutions, in some cases 10 times faster. Decompression speed is also excellent, exceeding 1 GB/s in some cases.
If you want to know for sure how much you can get from FAPEC, you can simply test it yourself!
What else does FAPEC offer?
- Chunk-based operation: if your compressed file gets corrupted, FAPEC will try to recover it, minimizing data loss.
- Multi-file: you can store over 8 million files or folders in a single FAPEC archive.
- Multi-thread: do you have a many-core processor? FAPEC can use up to 62 threads for lightning speed.
- Encryption: AES-256 (requiring OpenSSL libraries) or our own implementation of the XXTEA algorithm.
- License-enforced privacy: you can generate FAPEC archives that can only be decompressed with your license.
- On-the-fly statistics generation: while compressing each file, FAPEC can generate a log file with the partial ratios obtained for each data chunk. Some stages generate additional statistics on the data contents. This provides a kind of digest of the data complexity, which makes it possible, for example, to quickly detect some features in the data.
- DAPCOM support: we will help you to achieve the best compression results on your data. We can also design and implement specific pre-processing stages for your case!
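The chunk-based operation and the per-chunk statistics listed above can be pictured with a small generic sketch. It is not FAPEC's archive format: the header layout, the use of zlib and CRC-32, and all names are assumptions made for illustration, and it only handles corruption inside a chunk's payload (not in the headers).

```python
import struct
import zlib

# Hypothetical per-chunk header: original length, compressed length, CRC-32.
HEADER = struct.Struct("<III")


def compress_chunks(data, chunk_size):
    """Compress each chunk independently; return archive bytes and per-chunk ratios."""
    archive = bytearray()
    ratios = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        payload = zlib.compress(chunk)
        archive += HEADER.pack(len(chunk), len(payload), zlib.crc32(payload))
        archive += payload
        ratios.append(len(chunk) / len(payload))
    return bytes(archive), ratios


def decompress_chunks(archive):
    """Restore intact chunks; a damaged chunk is replaced by zero bytes of equal length."""
    output = bytearray()
    damaged = []
    position = 0
    index = 0
    while position < len(archive):
        original_len, payload_len, checksum = HEADER.unpack_from(archive, position)
        position += HEADER.size
        payload = archive[position:position + payload_len]
        position += payload_len
        if zlib.crc32(payload) == checksum:
            output += zlib.decompress(payload)
        else:
            output += bytes(original_len)  # placeholder for the lost chunk
            damaged.append(index)
        index += 1
    return bytes(output), damaged


# Assumed sample input: 20000 short text records (340000 bytes, 6 chunks).
data = b"".join(f"record {i:06d} ok\n".encode("ascii") for i in range(20000))
archive, ratios = compress_chunks(data, chunk_size=64 * 1024)
for index, ratio in enumerate(ratios):
    print(f"chunk {index}: ratio {ratio:.2f}")

# Simulate corruption: flip one byte inside the payload of the second chunk.
first_payload_len = HEADER.unpack_from(archive, 0)[1]
second_payload_start = 2 * HEADER.size + first_payload_len
corrupted = bytearray(archive)
corrupted[second_payload_start + 10] ^= 0xFF

restored, damaged = decompress_chunks(bytes(corrupted))
print(f"damaged chunks: {damaged}")
```

Because every chunk is compressed on its own, damage stays confined to the affected chunk, and the list of per-chunk ratios doubles as a rough profile of how the data complexity changes along the file.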
Where can I run FAPEC?
FAPEC is mostly implemented in ANSI C with some POSIX extensions.
You can run it on Linux, macOS and Windows; on x86 (32 or 64 bits), ARM (32 or 64 bits) or PowerPC; in little or big endian.
It is lightweight (less than 1 MB), and you can run it on low-end computers with slow processors and little RAM. By selecting a small enough chunk size you can run it with less than 1 MB of RAM.
FAPEC can actually run on almost any computer, from satellites to supercomputers.
How do I use FAPEC?
You can use it from the command line, as an executable program. You can invoke it on files or on streams (standard input/output).
It can also be invoked through its C API, on files or on memory buffers. Thus, you can integrate FAPEC into your own software.
We will soon offer Java (through JNI) and Python APIs, as well as FAPEC integration in HDF5, NetCDF and FITS. We are also preparing a Graphical User Interface, implemented in Java for better portability.