Example Analysis of Hadoop Serialization and Java Serialization

2025-07-01 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31 report

This article walks through an example comparison of Hadoop serialization and Java serialization. Many readers are not familiar with how the two differ, so the examples are shared here for reference.

The Java serialization mechanism converts objects into contiguous byte data, which can be later restored (deserialized) into the original objects.

In Java, for an instance of a class to be serializable, the class must implement the Serializable interface. Serializable is a marker interface with no methods, defined as follows:

public interface Serializable {}
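Because the interface is only a marker, its effect is easiest to see by leaving it out. The following is a minimal sketch, not part of the original example: the classes Plain and Tagged and the helper canSerialize() are hypothetical names introduced for illustration. It relies on ObjectOutputStream rejecting an untagged object with NotSerializableException.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class MarkerInterfaceDemo {
    // Hypothetical class that does NOT implement Serializable.
    static class Plain {
        int value = 1;
    }

    // Same field, but tagged with the marker interface.
    static class Tagged implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 1;
    }

    // Returns true if ObjectOutputStream accepts the object, false if it
    // rejects it because the class lacks the Serializable marker.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Plain:  " + canSerialize(new Plain()));   // false
        System.out.println("Tagged: " + canSerialize(new Tagged()));  // true
    }
}
```

The two classes have identical fields; the only difference is the implements clause, which is all the serialization mechanism checks for.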

Define a class Block1 that implements the Serializable interface:

import java.io.Serializable;

class Block1 implements Serializable {
    private int one = 1;
    private int two = 2;
    private int three = 3;

    @Override
    public String toString() {
        return "Block1 [one=" + one + ", two=" + two + ", three=" + three + "]";
    }
}

Define a class JavaSerializeTest to test the Java serialization mechanism:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class JavaSerializeTest {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Block1 block = new Block1();
        ByteArrayOutputStream baos = null;
        ObjectOutputStream oos = null;
        ObjectInputStream ois = null;
        try {
            // Create the ByteArrayOutputStream baos
            baos = new ByteArrayOutputStream();
            // Decorate baos to get the ObjectOutputStream oos
            oos = new ObjectOutputStream(baos);
            // Serialize block into baos
            oos.writeObject(block);
            oos.flush();
            // Get the byte array from the byte array output stream baos
            byte[] bytes = baos.toByteArray();
            System.out.println("Serialize Block1 objects to byte arrays, byte array length: " + bytes.length);
            // Wrap bytes in a ByteArrayInputStream and decorate it as the ObjectInputStream ois
            ois = new ObjectInputStream(new ByteArrayInputStream(bytes));
            // Call readObject() on ois to deserialize and get back a Block1 object
            Block1 block1 = (Block1) ois.readObject();
            System.out.println("Byte array deserialization, reverting to Block1 object: " + block1);
        } finally {
            // Close the streams
            if (ois != null) ois.close();
            if (oos != null) oos.close();
        }
    }
}

Console output:

Serialize Block1 objects to byte arrays, byte array length: 72

Byte array deserialization, reverting to Block1 object: Block1 [one=1, two=2, three=3]

ObjectOutputStream provides a family of writeX() methods for primitive values and strings, including writeInt(), writeLong(), writeFloat(), and writeUTF().
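As a rough illustration of these methods, the sketch below writes one value with each of the four calls and reads them back with the matching readX() methods of ObjectInputStream. The class name WriteXDemo, its helper methods, and the sample values are assumptions made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class WriteXDemo {
    // Writes an int, a long, a float and a string with the writeX() methods.
    static byte[] writeValues() {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // Closing the stream flushes its internal buffer into baos.
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeInt(1);
            oos.writeLong(2L);
            oos.writeFloat(3.5f);
            oos.writeUTF("hadoop");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Reads the values back in the same order they were written.
    static String readValues(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            int i = ois.readInt();
            long l = ois.readLong();
            float f = ois.readFloat();
            String s = ois.readUTF();
            return i + " " + l + " " + f + " " + s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readValues(writeValues())); // prints "1 2 3.5 hadoop"
    }
}
```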

Java API:

public final void writeObject(Object obj) throws IOException

Writes the specified object to the ObjectOutputStream. The class of the object, the signature of the class, and the values of the non-transient and non-static fields of the class and all its parent types are written to the stream.
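The "non-transient" part of that description can be checked with a small round trip. This is a sketch under assumptions: the Account class, its field values, and the roundTrip() helper are hypothetical and do not come from the original article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class TransientDemo {
    // Hypothetical class with one ordinary field and one transient field.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "alice";
        transient String password = "secret"; // skipped by writeObject()
    }

    // Serializes the object to a byte array and deserializes a copy from it.
    static Account roundTrip(Account original) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(original);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
            return (Account) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account());
        // The transient field was never written, and field initializers are
        // not re-run during deserialization, so it comes back as null.
        System.out.println("user=" + copy.user + ", password=" + copy.password);
    }
}
```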

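Java's serialization mechanism is heavyweight: along with the field values, it writes the class name, the class signature, and a description of each field. As a result, the Block1 object block, which has only three int fields (12 bytes of data in total), takes 72 bytes after serialization (the exact figure depends on the fully qualified class name, because the name is written into the stream). Hadoop therefore defines its own, more compact serialization mechanism.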
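To see where the extra bytes go, one can look for the class and field names inside the serialized stream and compare against the same three ints written with a plain DataOutputStream. The sketch below is illustrative only: Point3 is a hypothetical stand-in for a three-int class, and the helper names are assumptions. No byte count is given for the Java-serialized form because it varies with the class name.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class OverheadDemo {
    // Hypothetical class with three int fields (12 bytes of actual data).
    static class Point3 implements Serializable {
        private static final long serialVersionUID = 1L;
        private int one = 1;
        private int two = 2;
        private int three = 3;
    }

    // Standard Java serialization: metadata plus field values.
    static byte[] javaSerialize(Object obj) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Only the raw values: 4 bytes per int, nothing else.
    static byte[] rawInts(int... values) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            for (int v : values) {
                dos.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Checks whether an ASCII string appears verbatim in the byte array.
    static boolean containsText(byte[] bytes, String text) {
        return new String(bytes, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] full = javaSerialize(new Point3());
        System.out.println("Java serialization: " + full.length + " bytes");
        System.out.println("Raw ints only:      " + rawInts(1, 2, 3).length + " bytes");
        System.out.println("Class name in stream: "
                + containsText(full, Point3.class.getName()));
        System.out.println("Field name in stream: " + containsText(full, "three"));
    }
}
```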

In Hadoop, for an instance of a class to be serialized, the class must implement the Writable interface.

The Writable interface has two methods, write() for serialization and readFields() for deserialization, defined as follows:

public interface Writable {
    /**
     * Serializes the fields of this object to the output stream DataOutput out.
     */
    void write(DataOutput out) throws IOException;

    /**
     * Reads field values from the input stream DataInput in and repopulates
     * this object with them; this is the deserialization operation.
     */
    void readFields(DataInput in) throws IOException;
}
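Since both methods work against the DataOutput and DataInput interfaces from java.io, converting any such object to and from a byte array takes only a pair of small helpers. The sketch below is an assumption-laden illustration: SimpleWritable is a local stand-in that mirrors the two methods so the code runs without Hadoop on the classpath, and Pair, toBytes(), and fromBytes() are hypothetical names, not Hadoop APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WritableHelperDemo {
    // Local stand-in (assumption) mirroring the two methods of Hadoop's
    // Writable interface, so this sketch has no Hadoop dependency.
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical value type with an int and a String field.
    static class Pair implements SimpleWritable {
        int id;
        String name = "";

        Pair() { }

        Pair(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            id = in.readInt();
            name = in.readUTF();
        }
    }

    // Serializes any SimpleWritable into a byte array.
    static byte[] toBytes(SimpleWritable w) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            w.write(dos);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Fills an existing instance from the bytes and returns it.
    static <T extends SimpleWritable> T fromBytes(T target, byte[] bytes) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes))) {
            target.readFields(dis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new Pair(7, "hdfs"));
        Pair copy = fromBytes(new Pair(), bytes);
        System.out.println(bytes.length + " bytes, id=" + copy.id + ", name=" + copy.name);
    }
}
```

Note that readFields() fills in an object that already exists rather than creating a new one, so a caller can reuse a single instance across many records.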

Define a class Block2 that implements the Writable interface:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

class Block2 implements Writable {
    private int one = 1;
    private int two = 2;
    private int three = 3;

    /**
     * Serializes the fields of this object to the output stream DataOutput out.
     */
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(one);
        out.writeInt(two);
        out.writeInt(three);
    }

    /**
     * Reads field values from the input stream DataInput in and repopulates
     * this object with them; this is the deserialization operation.
     */
    @Override
    public void readFields(DataInput in) throws IOException {
        one = in.readInt();
        // To make the effect of deserialization visible, two and three are swapped on purpose
        three = in.readInt(); // receives the value written for two, so three = 2
        two = in.readInt();   // receives the value written for three, so two = 3
    }

    @Override
    public String toString() {
        return "Block2 [one=" + one + ", two=" + two + ", three=" + three + "]";
    }
}

PS: the order of the out.writeX(x) calls in write() and the order of the x = in.readX() calls in readFields() must match; otherwise the data will not be restored correctly. Block2 deliberately swaps two and three in readFields() to make this effect visible.
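With fields of different widths, a mismatched read order does more than swap values: it corrupts them. The following is a minimal sketch using plain DataOutputStream and DataInputStream; the class FieldOrderDemo, its methods, and the sample values are hypothetical and chosen only to make the corruption easy to follow.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FieldOrderDemo {
    // Writes an int (4 bytes) followed by a long (8 bytes).
    static byte[] write(int id, long count) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeInt(id);
            dos.writeLong(count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Correct: reads in the same order as write(). Returns {id, count}.
    static long[] readSameOrder(byte[] bytes) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int id = dis.readInt();
            long count = dis.readLong();
            return new long[] {id, count};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Wrong: reads the long first, so the 8 bytes it consumes are the int
    // plus the first half of the long. Returns {id, count}.
    static long[] readSwapped(byte[] bytes) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes))) {
            long count = dis.readLong();
            int id = dis.readInt();
            return new long[] {id, count};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = write(5, 1L);
        long[] ok = readSameOrder(bytes);
        long[] bad = readSwapped(bytes);
        System.out.println("same order: id=" + ok[0] + ", count=" + ok[1]);   // id=5, count=1
        System.out.println("swapped:    id=" + bad[0] + ", count=" + bad[1]); // id=1, count=21474836480
    }
}
```

No exception is thrown in the swapped case because the total number of bytes still matches, which is why this kind of mistake tends to go unnoticed until the values are inspected.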

Define a class HadoopSerializeTest to test the Hadoop serialization mechanism:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class HadoopSerializeTest {
    public static void main(String[] args) throws IOException {
        Block2 block = new Block2();
        ByteArrayOutputStream baos = null;
        DataOutputStream dos = null;
        DataInputStream dis = null;
        try {
            // Create the ByteArrayOutputStream baos
            baos = new ByteArrayOutputStream();
            // Decorate baos to get the DataOutputStream dos
            dos = new DataOutputStream(baos);
            // Serialize block into baos
            block.write(dos);
            // Get the byte array from baos
            byte[] bytes = baos.toByteArray();
            System.out.println("Serialize Block2 objects to byte arrays, byte array length: " + bytes.length);
            // Wrap bytes in a ByteArrayInputStream and decorate it as the DataInputStream dis
            dis = new DataInputStream(new ByteArrayInputStream(bytes));
            Block2 block1 = new Block2();
            System.out.println("Undeserialized Block2 object: " + block1);
            // Call readFields(DataInput) on block1 to deserialize; it swaps the values of two and three
            block1.readFields(dis);
            System.out.println("Byte array deserialization, reverting to Block2 object: " + block1);
        } finally {
            // Close the streams
            if (dis != null) dis.close();
            if (dos != null) dos.close();
        }
    }
}

Console output:

Serialize Block2 objects to byte arrays, byte array length: 12

Undeserialized Block2 object: Block2 [one=1, two=2, three=3]

Byte array deserialization, reverting to Block2 object: Block2 [one=1, two=3, three=2]

Because Block2.write() emits only the three int values, the byte array generated by serialization is just 12 bytes. Compared with the 72 bytes produced by Java's serialization mechanism, Hadoop's serialized form is far more compact, and with no class metadata to write or parse it is also cheaper to produce and read back.

That concludes this example analysis of Hadoop serialization and Java serialization. Thank you for reading; I hope it was helpful.
