Snappy compressor/decompressor for Java
ossdev07/snappy-java
snappy-java is a Java port of snappy (http://code.google.com/p/snappy/), a fast C++ compressor/decompressor developed by Google.
- Fast compression/decompression at around 200~400MB/sec.
- Low memory usage. SnappyOutputStream uses only 32KB+ by default.
- JNI-based implementation to achieve comparable performance to the native C++ version.
- Although snappy-java uses JNI, it can be used safely with multiple class loaders (e.g. Tomcat, etc.).
- Compression/decompression of Java primitive arrays (float[], double[], int[], short[], long[], etc.)
  - To improve the compression ratios of these arrays, you can use a fast data-rearrangement implementation (BitShuffle) before compression
- Portable across various operating systems; snappy-java contains native libraries built for Windows/Mac/Linux (64-bit). snappy-java loads one of these libraries according to your machine environment (it looks at the system properties os.name and os.arch).
- Simple usage. Add the snappy-java-(version).jar file to your classpath. Then call compression/decompression methods in org.xerial.snappy.Snappy.
- Framing-format support (since version 1.1.0)
- OSGi support
- Apache License Version 2.0. Free for both commercial and non-commercial use.
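The feature list says snappy-java picks a bundled native library from the os.name and os.arch system properties. A minimal sketch of that kind of lookup follows, using only the JDK. The class name, the OS family names, and the "native/..." folder layout are hypothetical and chosen for illustration; this is not snappy-java's actual loader code.

```java
import java.util.Locale;

/** Hypothetical illustration of platform detection; not snappy-java's loader. */
final class PlatformSketch {

    /** Maps an os.name value to a coarse OS family (assumed names). */
    static String osFamily(String osName) {
        String n = osName.toLowerCase(Locale.ROOT);
        if (n.contains("windows")) return "Windows";
        if (n.contains("mac") || n.contains("darwin")) return "Mac";
        if (n.contains("linux")) return "Linux";
        // Unknown systems: strip non-word characters and use the name as-is.
        return osName.replaceAll("\\W", "");
    }

    /** Builds an assumed resource folder such as "native/Linux/amd64". */
    static String nativeFolder(String osName, String osArch) {
        return "native/" + osFamily(osName) + "/" + osArch;
    }

    public static void main(String[] args) {
        // These two system properties are the ones named in the feature list.
        String osName = System.getProperty("os.name");
        String osArch = System.getProperty("os.arch");
        System.out.println(nativeFolder(osName, osArch));
    }
}
```

Running the sketch prints the folder that such a loader would search on the current machine, which can help when checking whether a platform is one of the bundled ones.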
Snappy's main target is very high-speed compression/decompression with reasonable compression size. So the compression ratio of snappy-java is modest and about the same as LZF (ranging 20%-100% according to the dataset).
Here are some benchmark results, comparing snappy-java and the other compressors LZO-java/LZF/QuickLZ/Gzip/Bzip2. Thanks to Tatu Saloranta @cotowncoder for providing the benchmark suite.
- The benchmark result indicates snappy-java is the fastest compressor/decompressor in Java: http://ning.github.com/jvm-compressor-benchmark/results/canterbury-roundtrip-2011-07-28/index.html
- The decompression speed is twice as fast as the others: http://ning.github.com/jvm-compressor-benchmark/results/canterbury-uncompress-2011-07-28/index.html
The current stable version is available from here:
- Release version: http://central.maven.org/maven2/org/xerial/snappy/snappy-java/
- Snapshot version (the latest beta version): https://oss.sonatype.org/content/repositories/snapshots/org/xerial/snappy/snappy-java/
- Snappy-java is available from Maven's central repository: http://central.maven.org/maven2/org/xerial/snappy/snappy-java
Add the following dependency to your pom.xml:
    <dependency>
      <groupId>org.xerial.snappy</groupId>
      <artifactId>snappy-java</artifactId>
      <version>(version)</version>
      <type>jar</type>
      <scope>compile</scope>
    </dependency>

For sbt, add the following to your build definition:

    libraryDependencies += "org.xerial.snappy" % "snappy-java" % "(version)"
First, import org.xerial.snappy.Snappy in your Java code:

    import org.xerial.snappy.Snappy;

Then use Snappy.compress(byte[]) and Snappy.uncompress(byte[]):

    String input = "Hello snappy-java! Snappy-java is a JNI-based wrapper of "
        + "Snappy, a fast compressor/decompressor.";
    byte[] compressed = Snappy.compress(input.getBytes("UTF-8"));
    byte[] uncompressed = Snappy.uncompress(compressed);
    String result = new String(uncompressed, "UTF-8");
    System.out.println(result);

In addition, high-level methods (Snappy.compress(String), Snappy.compress(float[] ..), etc.) and low-level ones (e.g. Snappy.rawCompress(..), Snappy.rawUncompress(..), etc.), which minimize memory copies, can be used.
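As a rough sketch of the higher-level helpers, the example below round-trips a String and an int[] and then compresses into a caller-supplied buffer. It assumes snappy-java is on the classpath; the class and method names of the sketch itself (SnappyMethodsSketch, roundTripString, and so on) are made up for illustration. The buffer variant uses the offset-based Snappy.compress overload together with Snappy.maxCompressedLength rather than rawCompress.

```java
import java.io.IOException;
import java.util.Arrays;
import org.xerial.snappy.Snappy;

/** Illustrative only; requires the snappy-java jar on the classpath. */
final class SnappyMethodsSketch {

    /** Round-trips text through the String helpers. */
    static String roundTripString(String text) throws IOException {
        byte[] compressed = Snappy.compress(text);
        return Snappy.uncompressString(compressed);
    }

    /** Round-trips an int[] without converting it to bytes by hand. */
    static int[] roundTripInts(int[] values) throws IOException {
        byte[] compressed = Snappy.compress(values);
        return Snappy.uncompressIntArray(compressed);
    }

    /** Compresses into a preallocated buffer, then trims it to the real size. */
    static byte[] compressIntoBuffer(byte[] input) throws IOException {
        byte[] buffer = new byte[Snappy.maxCompressedLength(input.length)];
        int size = Snappy.compress(input, 0, input.length, buffer, 0);
        return Arrays.copyOf(buffer, size);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTripString("Hello snappy-java!"));
        System.out.println(Arrays.toString(roundTripInts(new int[] {1, 3, 34, 43, 34})));
        System.out.println(compressIntoBuffer(new byte[1024]).length + " bytes");
    }
}
```

Reusing one output buffer across many calls is the main reason to prefer the offset-based form; the simple byte[]-returning methods allocate a new array each time.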
Stream-based compressor/decompressor SnappyOutputStream/SnappyInputStream are also available for reading/writing large data sets. SnappyFramedOutputStream/SnappyFramedInputStream can be used for the framing format.
- See also Javadoc API
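A minimal round trip through the stream classes might look like the sketch below. It assumes snappy-java is on the classpath and uses in-memory byte arrays in place of real files; the sketch's own class and method names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.xerial.snappy.SnappyInputStream;
import org.xerial.snappy.SnappyOutputStream;

/** Illustrative only; requires the snappy-java jar on the classpath. */
final class SnappyStreamSketch {

    /** Writes data through a SnappyOutputStream and returns the result. */
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (SnappyOutputStream out = new SnappyOutputStream(sink)) {
            out.write(data);
        } // closing flushes the last block
        return sink.toByteArray();
    }

    /** Reads everything back through a SnappyInputStream. */
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (SnappyInputStream in =
                 new SnappyInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                sink.write(chunk, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "large data set stand-in".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

For real workloads the ByteArray streams would be replaced by file or socket streams, so that the data never has to fit in memory at once.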
SnappyOutputStream and SnappyInputStream use the [magic header:16 bytes]([block size:int32][compressed data:byte array])* format. You can read the result of Snappy.compress with SnappyInputStream, but you cannot read the compressed data generated by SnappyOutputStream with Snappy.uncompress. SnappyHadoopCompatibleOutputStream does not emit a file header but writes out the current block size as a preamble to each block. Here is the data format compatibility matrix:
Write\Read | Snappy.uncompress | SnappyInputStream | SnappyFramedInputStream | org.apache.hadoop.io.compress.SnappyCodec |
---|---|---|---|---|
Snappy.compress | ok | ok | x | x |
SnappyOutputStream | x | ok | x | x |
SnappyFramedOutputStream | x | x | ok | x |
SnappyHadoopCompatibleOutputStream | x | x | x | ok |
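To make the [magic header:16 bytes]([block size:int32][compressed data:byte array])* layout concrete, the sketch below walks a stream of that shape and lists the block sizes. It uses only the JDK and a synthetic, zero-filled stream, so it does not decompress anything. Reading the block size as a big-endian int32 is an assumption of this sketch, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Walks the header-plus-blocks layout; a sketch, not snappy-java's reader. */
final class BlockLayoutSketch {
    static final int HEADER_SIZE = 16; // magic header length stated above

    /** Returns the size of each block, skipping the header and payloads. */
    static List<Integer> blockSizes(byte[] stream) throws IOException {
        List<Integer> sizes = new ArrayList<>();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(stream))) {
            in.readFully(new byte[HEADER_SIZE]);
            while (in.available() > 0) {
                int size = in.readInt(); // assumed big-endian int32
                if (size < 0 || size > in.available()) {
                    throw new IOException("implausible block size: " + size);
                }
                in.readFully(new byte[size]); // skip the compressed payload
                sizes.add(size);
            }
        }
        return sizes;
    }

    /** Builds a synthetic stream with a zero-filled header and payloads. */
    static byte[] syntheticStream(int... payloadSizes) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.write(new byte[HEADER_SIZE]);
            for (int size : payloadSizes) {
                out.writeInt(size);
                out.write(new byte[size]);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(blockSizes(syntheticStream(3, 0, 5))); // prints [3, 0, 5]
    }
}
```

The extra header and per-block size fields are why Snappy.uncompress, which expects a single raw compressed buffer, cannot read SnappyOutputStream output directly.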
BitShuffle is an algorithm that reorders data bits (shuffle) for efficient compression (e.g., a sequence of integers, float values, etc.). To use BitShuffle routines, import org.xerial.snappy.BitShuffle:

    import java.util.Arrays;
    import org.xerial.snappy.BitShuffle;
    import org.xerial.snappy.Snappy;

    int[] data = new int[] {1, 3, 34, 43, 34};
    // Rearrange the bits before compression
    byte[] shuffledByteArray = BitShuffle.shuffle(data);
    byte[] compressed = Snappy.compress(shuffledByteArray);
    byte[] uncompressed = Snappy.uncompress(compressed);
    // Restore the original int[] from the uncompressed bytes
    int[] result = BitShuffle.unshuffleIntArray(uncompressed);
    System.out.println(Arrays.toString(result));

Shuffling and unshuffling of primitive arrays (e.g., short[], long[], float[], double[], etc.) are supported. See Javadoc for the details.
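To give some intuition for why reordering bits helps, here is a naive bit-plane transposition written with only the JDK. It is a simplified illustration of the general idea and is not the byte layout that BitShuffle produces; the class and method names are hypothetical. Bit b of every element is stored together, so arrays of small values end up with long runs of zero bytes that a compressor handles well.

```java
import java.util.Arrays;

/** Naive bit-plane shuffle for int[]; an illustration, not BitShuffle itself. */
final class NaiveBitPlaneSketch {

    /** Stores bit b of every element contiguously, for b = 0..31. */
    static byte[] shuffle(int[] values) {
        int n = values.length;
        byte[] out = new byte[n * 4];
        for (int bit = 0; bit < 32; bit++) {
            for (int i = 0; i < n; i++) {
                if (((values[i] >>> bit) & 1) != 0) {
                    int pos = bit * n + i; // position in the output bit stream
                    out[pos >>> 3] |= (byte) (1 << (pos & 7));
                }
            }
        }
        return out;
    }

    /** Inverse of shuffle: gathers each element's bits back together. */
    static int[] unshuffle(byte[] shuffled) {
        int n = shuffled.length / 4;
        int[] out = new int[n];
        for (int bit = 0; bit < 32; bit++) {
            for (int i = 0; i < n; i++) {
                int pos = bit * n + i;
                if (((shuffled[pos >>> 3] >>> (pos & 7)) & 1) != 0) {
                    out[i] |= 1 << bit;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 34, 43, 34};
        byte[] shuffled = shuffle(data);
        // All values fit in 6 bits, so only the first few bytes are non-zero.
        System.out.println(Arrays.toString(shuffled));
        System.out.println(Arrays.toString(unshuffle(shuffled)));
    }
}
```

In the unshuffled int[] every element carries three zero bytes interleaved with one data byte; after the transposition those zeros form one contiguous tail.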
If you have snappy-java-(VERSION).jar in the current directory, use the -classpath option as follows:

    $ javac -classpath ".;snappy-java-(VERSION).jar" Sample.java   # in Windows

or

    $ javac -classpath ".:snappy-java-(VERSION).jar" Sample.java   # in Mac or Linux

Post bug reports or feature requests to the Issue Tracker: https://github.com/xerial/snappy-java/issues
The public discussion forum is here: Xerial Public Discussion Group
snappy-java uses sbt (simple build tool for Scala) as a build tool. Here is a simple usage:

    $ ./sbt            # enter sbt console
    > ~test            # run tests upon source code change
    > ~test-only *     # run tests that match a given name pattern
    > publishM2        # publish jar to $HOME/.m2/repository
    > package          # create jar file
    > findbugs         # produce findbugs report in target/findbugs
    > jacoco:cover     # report the code coverage of tests to target/jacoco folder

If you need to see detailed debug messages, launch sbt with the -Dloglevel=debug option:

    $ ./sbt -Dloglevel=debug

For the details of sbt usage, see my blog post: Building Java Projects with sbt
See the build instruction. Building from the source code is an option when your OS platform and CPU architecture are not supported. To build snappy-java, you need Git, JDK (1.6 or higher), a g++ compiler (mingw in Windows), etc.

    $ git clone https://github.com/xerial/snappy-java.git
    $ cd snappy-java
    $ make

When building on Solaris, use gmake:

    $ gmake

The resulting file target/snappy-java-$(version).jar additionally contains the native library built for your platform.
Simply put the snappy-java jar in the WEB-INF/lib folder of your web application (e.g. on Tomcat). The usual JNI-library-specific problem no longer exists, since snappy-java version 1.0.3 or higher can be loaded by multiple class loaders.
Prepare an org-xerial-snappy.properties file (under the root path of your library) in Java's property file format. Here is a list of the available properties:
- org.xerial.snappy.lib.path (directory containing the snappyjava native library)
- org.xerial.snappy.lib.name (library file name)
- org.xerial.snappy.tempdir (temporary directory to extract a native library bundled in snappy-java)
- org.xerial.snappy.use.systemlib (if this value is true, use the system-installed libsnappyjava.so, looking in the path specified by java.library.path)
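As a sketch of what such a file could contain, the example below uses java.util.Properties to render and re-read two of the keys listed above in Java's property file format. The values (/var/tmp/snappy and false) and the sketch's class and method names are placeholders chosen for illustration, not recommended settings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;

/** Renders example org-xerial-snappy.properties content; values are placeholders. */
final class SnappyPropertiesSketch {

    /** Produces property-file text for two of the documented keys. */
    static String render(String tempDir, boolean useSystemLib) throws IOException {
        Properties props = new Properties();
        props.setProperty("org.xerial.snappy.tempdir", tempDir);
        props.setProperty("org.xerial.snappy.use.systemlib",
                          Boolean.toString(useSystemLib));
        StringWriter text = new StringWriter();
        props.store(text, "org-xerial-snappy.properties (example values)");
        return text.toString();
    }

    /** Parses property-file text back into a Properties object. */
    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        String fileContent = render("/var/tmp/snappy", false);
        System.out.print(fileContent);
        System.out.println(parse(fileContent).getProperty("org.xerial.snappy.tempdir"));
    }
}
```

The printed text is what would be saved as org-xerial-snappy.properties; each setting is a plain key=value line.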
Snappy-java is developed by Taro L. Saito. Twitter: @taroleo