Universal Binary JSON Java Library
http://ubjson.org


About this project...
---------------------
This code base is actively under development and implements the latest
specification of Universal Binary JSON (Draft 8).

I/O is handled through the following core classes:

* UBJOutputStream
* UBJInputStream
* UBJInputStreamParser

Additionally, if you are working with Java's NIO and need byte[]-based
results, you can wrap any of the above I/O classes around one of the highly
optimized custom byte[]-stream impls:

* ByteArrayInputStream (optimized for reuse, not from the JDK)
* ByteArrayOutputStream (optimized for reuse, not from the JDK)

If you are working with NIO and want maximum performance by using (and reusing)
direct ByteBuffers along with the UBJSON stream impls, take a look at the:

* ByteBufferInputStream
* ByteBufferOutputStream

classes. You can wrap any ByteBuffer source or destination with this stream type,
then wrap that stream type with a UBJSON stream impl and utilize the full
potential of Java's NIO with Universal Binary JSON without giving yourself an
ulcer.

This allows you to re-use the streams over and over again in a pool of reusable
streams for high-performance I/O with no object creation and garbage collection
overhead; a perfect match for high frequency NIO-based communication.

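As a rough illustration of the wrapping idea, the sketch below puts an
InputStream on top of a ByteBuffer using only JDK classes. It is not the
library's ByteBufferInputStream; the class name BufferBackedInputStream and
the setSource method are hypothetical and exist only for this example.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical stand-in for illustration; NOT the library's ByteBufferInputStream. */
final class BufferBackedInputStream extends InputStream {
    private ByteBuffer source;

    BufferBackedInputStream(ByteBuffer source) {
        setSource(source);
    }

    /** Re-points this stream at another buffer so one instance can be pooled and reused. */
    void setSource(ByteBuffer source) {
        if (source == null) {
            throw new IllegalArgumentException("source cannot be null");
        }
        this.source = source;
    }

    @Override
    public int read() {
        // Mask to 0..255 so a negative byte is never mistaken for end-of-stream (-1).
        return source.hasRemaining() ? (source.get() & 0xFF) : -1;
    }

    @Override
    public int read(byte[] dst, int off, int len) {
        if (off < 0 || len < 0 || len > dst.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) {
            return 0;
        }
        if (!source.hasRemaining()) {
            return -1;
        }
        int count = Math.min(len, source.remaining());
        source.get(dst, off, count); // bulk copy; works for heap and direct buffers
        return count;
    }

    @Override
    public int available() {
        return source.remaining();
    }
}
```

Because setSource swaps the underlying buffer, a single instance could be kept
in a pool and pointed at each incoming buffer in turn, which is the reuse
pattern described above.
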
All of the core I/O classes have been stable for a while, with tweaks over the
last few months to tighten performance and make the error messages more
informative.

More Java-convenient reflection-based I/O classes are available in the
org.ubjson.io.reflect package, but they are under active development.

There are other efforts (like utilities) in other parts of the source
tree. This project intends to eventually contain a multitude of UBJSON
abstraction layers, I/O methods and utilities.


Changelog
---------
02-10-12
* Added ByteBuffer input and output stream impls as complements to the
re-usable byte[] stream impls.

Provides a fast translation layer between standard Java stream-IO and the
new Buffer-based I/O in NIO (including transparent support for using
ultra-fast direct ByteBuffers).

* Optimized some of the read/write methods by removing unnecessary bounds
checks that are done by subsequent Input or Output stream impls themselves.

02-09-12
* Fixed bug with readHugeAsBigInteger returning an invalid value and not
treating the underlying bytes as a string-encoded value.

* Removed implicit buffer.flip() at the end of StreamDecoder; there is no
way to know what the caller had planned for the buffer before reading all
the data back out. Also, the flip was in the wrong place: in the case of
an empty decode request (length=0) the flip would not have been performed,
providing the caller with a "full" buffer of nothing.

* Rewrote all readHugeXXX method impls; huge values can now be read in as a
simple Number (BigInteger or BigDecimal) as well as raw bytes and even
decoded chars. Additionally, the methods can accept and use existing
buffers to write into, allowing for tighter optimizations.

* Rewrote all readStringXXX methods using the same optimizations and
flexibility that the readHuge methods now use.

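To make the "string-encoded value" point concrete, here is a small JDK-only
sketch contrasting two ways of turning the same bytes into a BigInteger. The
class and method names are hypothetical, and whether the original bug took
exactly this form is an assumption; the changelog only says the bytes were not
being treated as a string-encoded value.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration; not the library's readHugeAsBigInteger. */
final class HugeDecodeSketch {
    /** Treats the payload as UTF-8 text digits, e.g. the bytes of "12345". */
    static BigInteger fromStringEncoded(byte[] payload) {
        return new BigInteger(new String(payload, StandardCharsets.UTF_8));
    }

    /**
     * Treats the payload as a big-endian two's-complement number. For a
     * text payload this yields an unrelated value.
     */
    static BigInteger fromBinary(byte[] payload) {
        return new BigInteger(payload);
    }
}
```

For the UTF-8 bytes of "12345", fromStringEncoded returns 12345, while
fromBinary interprets the character codes themselves as one large binary
number and returns something else entirely.
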
02-07-12
More Memory and CPU optimizations across all the I/O impls.

* StreamDecoder was rewritten to no longer create a ByteBuffer on every
invocation and instead re-use the same one to decode from on every single call.

* StreamDecoder now requires the caller to pass in a CharBuffer instance to hold
the result of the decode operation. This avoids the creation of a CharBuffer
and allows for large-scale optimization by re-using existing buffers between
calls.

* StreamEncoder was rewritten to no longer create a ByteBuffer on every
invocation either and now re-uses the same single instance over and over
again.

* UBJOutputStream writeHuge and writeString series of methods were all
rewritten to accept a CharBuffer in the rawest form (no longer char[]) to stop
hiding the fact that the underlying encode operation required one.

This gives the caller an opportunity to cache and re-use CharBuffers over
and over again if they can; otherwise this just pushes the CharBuffer.wrap()
call up to the caller instead of hiding it secretly in the method impl under
the guise of accepting a raw char[] (that it couldn't use directly).

For callers that can re-use buffers, this enables big performance gains
that were previously impossible.

* UBJInputStream added readHuge and readString methods that accept an existing
CharBuffer argument to make use of the optimizations made in the Stream encoder
and decoder impls.

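The buffer-reuse pattern described in this entry can be sketched with the
JDK's CharsetDecoder. This is a minimal sketch, not the library's
StreamDecoder: the class name, the decode signature and the choice of UTF-8
are assumptions made for the example. One scratch ByteBuffer lives for the
lifetime of the object, and the caller supplies (and later flips) the
destination CharBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of a decoder that allocates nothing per call. */
final class ReusableDecoderSketch {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    private final ByteBuffer scratch; // allocated once, reused on every call

    ReusableDecoderSketch(int scratchSize) {
        // A UTF-8 sequence is at most 4 bytes, so 4 guarantees forward progress.
        if (scratchSize < 4) {
            throw new IllegalArgumentException("scratchSize must be >= 4");
        }
        scratch = ByteBuffer.allocate(scratchSize);
    }

    /** Decodes src[off, off+len) into dest. The caller decides when to flip dest. */
    void decode(byte[] src, int off, int len, CharBuffer dest) {
        if (len == 0) {
            return; // nothing to decode; dest is left untouched
        }
        decoder.reset();
        scratch.clear();
        int end = off + len;
        while (off < end) {
            int count = Math.min(scratch.remaining(), end - off);
            scratch.put(src, off, count);
            off += count;
            scratch.flip();
            check(decoder.decode(scratch, dest, off == end));
            scratch.compact(); // keep the bytes of a multi-byte char split across chunks
        }
        check(decoder.flush(dest));
    }

    private static void check(CoderResult result) {
        if (result.isOverflow()) {
            throw new IllegalArgumentException("dest is too small");
        }
        if (result.isError()) {
            throw new IllegalArgumentException("undecodable input: " + result);
        }
    }
}
```

A caller would typically keep one CharBuffer around, call clear() on it before
each decode, and flip() it afterwards to read the characters back out.
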
01-15-12
Huge performance boost for deserialization!

StreamDecoder previously used separate read and write buffers for decoding
bytes to chars including the resulting char[] that was returned to the caller.
This design required at least 1 full array copy before returning a result in
the best case and 2 full array copies before returning the result in the
worst case.

The rewrite removed the need for a write buffer entirely as well as ALL array
copies; in the best OR worst case they never occur anymore.

Raw performance boost of roughly 25% in all UBJ I/O classes as a result.

12-01-11 through 01-24-12
A large amount of work has continued on the core I/O classes (stream impls)
to help make them not only faster and more robust, but also more helpful.
When errors are encountered in the streams, they are reported along with the
stream positions. This is critical for debugging problems with corrupt
formats.

Also provided ByteArray I/O stream classes that have the potential to provide
HUGE performance boosts for high frequency systems.

Both these classes (ByteArrayInputStream and ByteArrayOutputStream) are
reusable and when wrapped by a UBJInputStream or UBJOutputStream, the top
level UBJ streams implicitly become reusable as well.

Reusing the streams not only saves on object creation/GC cleanup but also
allows the caller to re-use the temporary byte[] used to translate to and
from the UBJ format, avoiding object churn entirely!

This optimized design was chosen to be intentionally performant when combined
with NIO implementations, as ByteBuffers can be used to wrap() existing
outbound buffers (avoiding the most expensive part of a buffer) or use
array() to get access to the underlying buffer that needs to be written to
the stream.

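A minimal sketch of the reuse-plus-wrap() idea, using only JDK classes. This
is not the library's ByteArrayOutputStream; the class name ReusableByteSink
and its methods are hypothetical. The point is that reset() keeps the
allocated array and ByteBuffer.wrap() exposes the written bytes without
copying them.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical stand-in for illustration; NOT the library's ByteArrayOutputStream. */
final class ReusableByteSink extends OutputStream {
    private byte[] data;
    private int length;

    ReusableByteSink(int initialCapacity) {
        data = new byte[initialCapacity];
    }

    @Override
    public void write(int b) {
        ensureCapacity(length + 1);
        data[length++] = (byte) b;
    }

    @Override
    public void write(byte[] src, int off, int len) {
        if (off < 0 || len < 0 || len > src.length - off) {
            throw new IndexOutOfBoundsException();
        }
        ensureCapacity(length + len);
        System.arraycopy(src, off, data, length, len);
        length += len;
    }

    /** Forgets the contents but keeps the allocated array for the next message. */
    void reset() {
        length = 0;
    }

    int length() {
        return length;
    }

    /** Zero-copy view of the bytes written so far, e.g. for a channel write. */
    ByteBuffer asByteBuffer() {
        return ByteBuffer.wrap(data, 0, length);
    }

    private void ensureCapacity(int needed) {
        // Grow only when the current array genuinely cannot hold the data.
        if (needed > data.length) {
            data = Arrays.copyOf(data, Math.max(needed, data.length * 2));
        }
    }
}
```

After each message is sent, calling reset() lets the same instance (and the
same backing array) serve the next message.
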
In the case of direct ByteBuffers, there is no additional overhead added
because the calls to get or put are required anyway to pull or push the
values from the native memory location.

This approach allows the fastest implementation of Universal Binary JSON
I/O possible in the JVM whether you are using the standard IO (stream)
classes or the NIO (ByteBuffer) classes in the JDK.

Some ancillary work on UBJ-based command line utilities (viewers, converters,
etc.) has begun as well.

11-28-11
* Fixed UBJInputStreamParser implementation; nextType now correctly implements
the logic to skip the existing element (if called back to back) and validates
the marker type it encounters before returning it to the caller.

* Modified IObjectReader contract; a Parser implementation is required to
make traversing the UBJ stream possible without knowing what type of element
is next.

11-27-11
* Streamlined ByteArrayOutputStream implementation to ensure the capacity
of the underlying byte[] is never grown unless absolutely necessary.

* Rewrote class Javadoc for ByteArrayOutputStream to include a code snippet
on how to use it.

11-26-11
* Fixed major bug in how 16, 32 and 64-bit integers are re-assembled when
read back from binary representations.

* Added a numeric test to specifically catch this error if it ever pops up
again.

* Optimized reading and writing of numeric values in Input and Output
stream implementations.

* Optimized ObjectWriter implementation by streamlining the numeric read/write
logic and removing the sub-optimal force-compression of on-disk storage.

* Fixed ObjectWriter to match exactly with the output written by
UBJOutputStream.

* Normalized all testing between I/O classes so they use the same classes
to ensure parity.

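For readers unfamiliar with the integer re-assembly problem, here is a
JDK-only sketch of rebuilding big-endian 16, 32 and 64-bit values from raw
bytes. The changelog does not say what the actual bug was; forgetting to mask
each byte (so that negative bytes sign-extend and corrupt the higher bits) is
a common pitfall and is assumed here purely for illustration. The class and
method names are hypothetical.

```java
/** Hypothetical illustration of big-endian integer re-assembly. */
final class BigEndianSketch {
    static short readInt16(byte[] b, int off) {
        // "& 0xFF" strips the sign extension that Java applies when widening a byte.
        return (short) (((b[off] & 0xFF) << 8) | (b[off + 1] & 0xFF));
    }

    static int readInt32(byte[] b, int off) {
        return ((b[off] & 0xFF) << 24)
             | ((b[off + 1] & 0xFF) << 16)
             | ((b[off + 2] & 0xFF) << 8)
             |  (b[off + 3] & 0xFF);
    }

    static long readInt64(byte[] b, int off) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            // The long mask (0xFFL) keeps the arithmetic in 64 bits.
            value = (value << 8) | (b[off + i] & 0xFFL);
        }
        return value;
    }
}
```

A regression test for this kind of code usually includes values whose low
bytes are 0x80 or higher, since those are the ones that expose a missing mask.
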
11-10-11
* DRAFT 8 Spec Support Added.

* Added support for the END marker (readEnd) to the InputStreams which is
required for proper unbounded-container support.

* Added support for the END marker (writeEnd) to UBJOutputStream.

UBJInputStreamParser must be used to properly support unbounded containers
because you never know when the 'E' marking the end will be encountered;
the caller needs to pay attention to the type marker that nextType()
returns and respond accordingly.

* Added readHugeAsBytes to InputStream implementations, allowing the bytes
used to represent a HUGE to be read in their raw form with no decoding.

This can save on processing as BigInteger and BigDecimal do their own decoding
of byte[] directly.

* Added readHugeAsChars to InputStream implementations, allowing a HUGE
value to be read in as a raw char[] without trying to decode it to a
BigInteger or BigDecimal.

* Added writeHuge(char[]) to support writing out HUGE values directly from
their raw char[] form without trying to decode from a BigInteger or
BigDecimal.

* readArrayLength and readObjectLength were modified to return -1 when an
unbounded container length is encountered (255).

* Fixed UBJInputStreamParser.nextType to correctly skip past any NOOP
markers found in the underlying stream before returning a valid next type.

* Normalized all reading of "next byte" to the singular
UBJInputStream.nextMarker method -- correctly skips over NOOP until end of
stream OR until the next valid marker byte, then returns it.

* Modified readNull behavior to have no return type. It is already designed
to throw an exception when 'NULL' isn't found, no need for the additional
return type.

* Began work on a simple abstract representation of the UBJ data types as
objects that can be assembled into maps and lists and written/read easily
using the IO package.

This is intended to be a higher level of abstraction than the IO streams,
but lower level (and faster) than the reflection-based work that inspects
user-provided classes.

* Refactored StreamDecoder and StreamEncoder into the core IO package,
because they are part of core IO.

* Refactored StreamParser into the io.parser package to more clearly denote
its relationship to the core IO classes. It is a slightly higher level
abstraction on top of the core IO; having it alongside the core IO classes
while .reflect was a subpackage was confusing and suggested that
StreamParser was somehow intended as a swappable replacement for
UBJInputStream, which is not how it is intended to be used.

* Refactored org.ubjson.reflect to org.ubjson.io.reflect to more correctly
communicate the relationship -- the reflection-based classes are built on
the core IO classes and are just a higher-level abstraction for interacting
with UBJSON.

* Renamed IDataType to IMarkerType to follow the naming convention for the
marker bytes set forth by the spec doc.

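The NOOP-skipping and unbounded-container behavior described in this entry
can be sketched as a tiny pull loop over a ByteBuffer. This is not the
library's UBJInputStreamParser. The 'E' end marker and the 255 unbounded
length come from the notes above; the 'N' (NOOP), 'a' (array) and 'B' (int8)
marker bytes are my reading of Draft 8 and should be treated as assumptions,
as should every class and method name here.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of reading an unbounded array of int8 values. */
final class UnboundedArraySketch {
    static final int NOOP = 'N';      // assumed Draft 8 marker
    static final int ARRAY = 'a';     // assumed Draft 8 marker
    static final int INT8 = 'B';      // assumed Draft 8 marker
    static final int END = 'E';       // from the changelog
    static final int UNBOUNDED = 255; // from the changelog

    /** Returns the next marker that is not a NOOP, or -1 at end of data. */
    static int nextMarker(ByteBuffer in) {
        while (in.hasRemaining()) {
            int marker = in.get() & 0xFF;
            if (marker != NOOP) {
                return marker;
            }
        }
        return -1;
    }

    static List<Integer> readInt8Array(ByteBuffer in) {
        if (nextMarker(in) != ARRAY || !in.hasRemaining()) {
            throw new IllegalStateException("expected array at position " + in.position());
        }
        int length = in.get() & 0xFF;
        boolean unbounded = (length == UNBOUNDED);
        List<Integer> values = new ArrayList<>();
        // A bounded array stops at its declared length; an unbounded one stops at END.
        while (unbounded || values.size() < length) {
            int marker = nextMarker(in);
            if (unbounded && marker == END) {
                break;
            }
            if (marker != INT8 || !in.hasRemaining()) {
                throw new IllegalStateException(
                        "unexpected marker " + marker + " at position " + in.position());
            }
            values.add((int) in.get());
        }
        return values;
    }
}
```

Including the buffer position in the exception message mirrors the idea of
reporting errors along with stream positions.
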
10-14-11
* ObjectWriter rewritten and works correctly. Tested with the example test
data and wrote out the compressed and uncompressed format files correctly
from their original object representation.

* Added support for reading and writing huge values as BigInteger as well
as BigDecimal.

* Added automatic numeric storage compression support to ObjectWriter - based
on the numeric value (regardless of type) the value will be stored as the
smallest possible representation in the UBJ format if requested.

* Added mapping support for BigDecimal, BigInteger, AtomicInteger and
AtomicLong to ObjectWriter.

* Added readNull and readBoolean to the UBJInputStream and
UBJInputStreamParser implementations to make the API feel complete and
more natural to use.

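The "smallest possible representation" rule for integral values can be
sketched as a simple range check. This is an assumption about how such
compression might work, not a copy of ObjectWriter; the class, enum and
method names are hypothetical, and floating point values are left out.

```java
/** Hypothetical illustration of picking the narrowest integer width for a value. */
final class SmallestIntSketch {
    enum Width { INT8, INT16, INT32, INT64 }

    /** Chooses a width from the value alone, regardless of its declared Java type. */
    static Width smallestWidth(long value) {
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return Width.INT8;
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return Width.INT16;
        }
        if (value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE) {
            return Width.INT32;
        }
        return Width.INT64;
    }
}
```

Under this rule a Long field holding 42 would be written as a single-byte
integer rather than eight bytes.
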
10-10-11
* com.ubjson.io AND com.ubjson.io.charset are finalized against the
Draft 8 specification.

* The lowest level UBJInput/OutputStream classes were tightened up to run as
fast as possible showing an 800ns-per-op improvement in speed.

* Profiled core UBJInput/OutputStream classes using HPROF for a few million
iterations and found no memory leaks and no performance traps; everything at
that low level is as tight as it can be.

* Stream-parsing facilities were moved out of the overloaded UBJInputStream
class and into their own subclass called UBJInputStreamParser which operates
exactly like a pull-parsing scenario (calling nextType then switching on the
value and pulling the appropriate value out of the stream).

* More example testing/benchmarking data checked into /test/java/com/ubjson/data

Will begin reworking the Reflection based Object mapping in the
org.ubjson.reflect package now that the core IO classes are finalized.

* Removed all old scratch test files from the org.ubjson package; they were
confusing.

09-27-11
* Initial check-in of core IO classes to read/write spec.


Status
------
Using the standard UBJInputStream, UBJInputStreamParser and UBJOutputStream
implementations to manually read/write UBJ objects is stable and tuned for
optimal performance.

Automatic mapping of objects to/from UBJ format via the reflection-based
implementation is not tuned yet. Writing is implemented, but not tuned for
optimal performance and reading still has to be written.

* org.ubjson.io - STABLE
* org.ubjson.io.parser - STABLE
* org.ubjson.io.reflect - ALPHA
* org.ubjson.model - BETA


License
-------
This library is released under the Apache 2 License. See LICENSE.


Description
-----------
This project represents (the official?) Java implementation of the
Universal Binary JSON specification: http://ubjson.org


Example
-------
Coming soon...


Performance
-----------
Coming soon...


Reference
---------
Universal Binary JSON Specification - http://ubjson.org
JSON Specification - http://json.org


Contact
-------
If you have questions, comments or bug reports for this software please contact
us at: software@thebuzzmedia.com