Saturday, 8 April 2017

Custom Stream Framework – Open Source

     Streams are a powerful tool for transferring large volumes of data with a minimal memory footprint. Applications built on streams perform better and scale more easily. On the flip side, streams are tricky to use and demand more time and effort during development. As a result, they are often overlooked when code is first written and are considered only once performance bottlenecks appear.
     Given how difficult streams can be to use during development, I have created a few custom wrapper classes that remove the usual roadblocks. They also provide some common stream utility functions. This post focuses on how to use these utilities, not on the implementation details of the wrapper classes. The source code is available for download!

Binary            : StreamFramework.jar

The following utilities are available in the stream framework.
  1. Simple Read - The InputStreamReadWrapper class simplifies reading with an additional next() method, which reads data from the stream and returns a boolean indicating whether any data was read. The getBytes() method returns the bytes read by the last next() call.
  2. Limited Read - The InputStreamReadWrapper constructor accepts a long value that sets the maximum amount of data to read from the stream. The setReadLimit() method does the same job.
  3. Restricted Read Line - InputStreamReadWrapper has a readLine(lineLengthToCheck) method that grabs the specified length of data from the stream and checks whether it contains a newline character. It puts the data after the newline character back into the stream. The restricted read line exists to avoid memory issues when a huge amount of data arrives in a single line. The getBytes() method returns the line data read from the stream.
  4. Calculate MD5 - The InputStreamReadWrapper constructor has an option to enable MD5 calculation. The getMessageDigestValue() method returns the MD5 value of the data read from the stream.
  5. Convert Data to Stream - The InputStreamReadWrapper constructor also accepts a DataLoader, which helps convert data to a stream.
  6. Convert InputStream to GZippedStream - GZippedStream inherits from InputStreamReadWrapper and has a constructor that accepts an InputStream. Reading from a GZippedStream object yields the gzipped data of that InputStream.
  7. Generate a gzipped stream while reading an InputStream - InputStreamGzipReadWrapper inherits from InputStreamReadWrapper and has a constructor that accepts an InputStream and a custom DataReader. The InputStreamGzipReadWrapper object can be read like a normal InputStream while, at the same time, the gzipped data is delivered to the DataReader implementation.
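
The restricted read line in item 3 can be pictured with the JDK's PushbackInputStream. The sketch below is only an illustration of the idea, not the framework's code: the class name RestrictedLineReader and its fixed window size are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.Arrays;

// Hypothetical helper for illustration; not the StreamFramework implementation.
class RestrictedLineReader {
    private final PushbackInputStream in;
    private final int window;

    RestrictedLineReader(InputStream source, int window) {
        // The pushback buffer must be able to hold everything read past a newline.
        this.in = new PushbackInputStream(source, window);
        this.window = window;
    }

    /**
     * Reads at most `window` bytes and returns the bytes before the first newline.
     * If the window holds no newline, the whole window is returned, so a very
     * long line arrives in bounded pieces. Returns null at end of stream.
     */
    byte[] readLine() throws IOException {
        byte[] buf = new byte[window];
        int filled = 0;
        while (filled < window) {
            int n = in.read(buf, filled, window - filled);
            if (n == -1) {
                break;
            }
            filled += n;
        }
        if (filled == 0) {
            return null;
        }
        for (int i = 0; i < filled; i++) {
            if (buf[i] == '\n') {
                // Put the bytes after the newline back for the next call.
                in.unread(buf, i + 1, filled - i - 1);
                return Arrays.copyOf(buf, i);
            }
        }
        return Arrays.copyOf(buf, filled);
    }
}
```

Because the window is fixed, memory use stays bounded even when the input contains no newline at all.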


The sections below walk through each utility in the Stream Framework.


 1. Converting InputStream to InputStreamReadWrapper:

InputStreamReadWrapper is a subclass of InputStream that wraps extra functionality to make read operations easier. Instantiating InputStreamReadWrapper provides the following:
  1.      Easy reading of data using the next() and getBytes() methods.
  2.      An option to read only a specified length of data from the stream.
  3.      The MD5 value of the data once it has been read.
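
For readers curious how an MD5 value can be produced as a side effect of reading, the JDK already offers DigestInputStream. The sketch below uses that standard class. Whether InputStreamReadWrapper works the same way internally is not stated here, and the Md5WhileReading class and md5Hex method are names invented for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not the StreamFramework implementation.
class Md5WhileReading {
    /** Drains the stream and returns the MD5 of everything read, as lowercase hex. */
    static String md5Hex(InputStream source) throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (DigestInputStream in = new DigestInputStream(source, md5)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // A real caller would process the chunk here;
                // the digest is updated as a side effect of each read.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```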

Class Diagram

Read bytes from InputStream:

Below is the simplest code snippet for reading bytes from an InputStream. It requires declaring an int and a byte array, plus a while loop that assigns the length of the data read and checks that the value is not -1. After each read, only the bytes up to the read length are valid, so only that portion of the byte array should be used.


int read;
final byte[] data = new byte[1024 * 128];
while ((read = fileInputStream.read(data)) != -1) {
    fileOutStream.write(data, 0, read);
}








Read bytes using Stream Framework:

         All of the above complexity is hidden inside the InputStreamReadWrapper class. It takes care of the variable declarations and works much like the next() method of a database result set. The next() method loads the next chunk of data into the byte array and returns a boolean indicating whether data was loaded. The getBytes() method returns the bytes read from the stream.


InputStreamReadWrapper insWrap = new InputStreamReadWrapper(fileInputStream);
while (insWrap.next()) {
    fileOutStream.write(insWrap.getBytes());
}
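
To make the next()/getBytes() pattern concrete, here is a minimal sketch of how such a wrapper could be built on a plain InputStream. It is a stand-in written for this post, assuming a fixed chunk size. The ChunkReader name is hypothetical and the real InputStreamReadWrapper may be implemented differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical stand-in for illustration; the real InputStreamReadWrapper may differ.
class ChunkReader {
    private final InputStream in;
    private final byte[] buffer;
    private int lastRead = -1;

    ChunkReader(InputStream in, int chunkSize) {
        this.in = in;
        this.buffer = new byte[chunkSize];
    }

    /** Loads the next chunk; returns false once the stream is exhausted. */
    boolean next() throws IOException {
        lastRead = in.read(buffer);
        return lastRead != -1;
    }

    /** Returns a copy of only the bytes loaded by the last successful next(). */
    byte[] getBytes() {
        return lastRead <= 0 ? new byte[0] : Arrays.copyOf(buffer, lastRead);
    }
}
```

Returning a trimmed copy is what lets the caller write getBytes() directly without tracking the read length, at the cost of one extra array copy per chunk.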












Controlled Read using Stream Framework:

Consider a case where you want to copy the first 1000 bytes to one output stream and the remaining bytes to another.


// Copy the first 1000 bytes to the first output stream.
InputStreamReadWrapper insWrap = new InputStreamReadWrapper(fileInputStream, 1000L);
Util.writeTo(insWrap, file1OutStream);
file1OutStream.close();

// Copy the remaining bytes to the second output stream.
insWrap = new InputStreamReadWrapper(fileInputStream);
Util.writeTo(insWrap, file2OutStream);
file2OutStream.close();

insWrap.close();
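
The read limit can be understood as a thin filter that reports end of stream after a fixed number of bytes while leaving the underlying stream positioned for the next reader. The sketch below shows one way to do that with the JDK's FilterInputStream. LimitedInputStream is a hypothetical name, and this is not the framework's own code.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not the StreamFramework implementation.
class LimitedInputStream extends FilterInputStream {
    private long remaining;

    LimitedInputStream(InputStream in, long limit) {
        super(in);
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // limit reached: behave like end of stream
        }
        int b = super.read();
        if (b != -1) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (remaining <= 0) {
            return -1;
        }
        // Never ask the underlying stream for more than the limit allows.
        int n = super.read(b, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    @Override
    public void close() {
        // Deliberately leaves the underlying stream open so the rest can still be read.
    }
}
```

Since the filter never reads past its limit, a second reader on the same underlying stream starts exactly at byte 1001 in the scenario above.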








 2.  Convert Data to InputStream:

     When a large amount of data must be sent to a server or another process, we normally end up holding all of it in memory before sending starts. With a PipedInputStream and a PipedOutputStream, data can be written at one end and read at the other end by a different thread. This piped input and output stream usage is wrapped inside the InputStreamReadWrapper class. It removes the need to wait for the full data to be generated or to hold it all in memory: data can be written to the OutputStream as soon as a chunk is available.

                                                         Feature description diagram


The code snippet below shows a sample implementation of converting data to a stream. The DataLoader interface declares a load method that the implementing class must define; its OutputStream parameter is used to write data into the stream.


DataLoader dataLoader = new DataLoaderImpl();
InputStreamReadWrapper ins = new InputStreamReadWrapper(dataLoader);
Util.writeTo(ins, ffOutStream);

class DataLoaderImpl implements DataLoader {
    @Override
    public void load(OutputStream out) throws IOException {
        for (int i = 0; i < 1000; i++) {
            out.write(("Test data " + i + "\n").getBytes("UTF-8"));
        }
        out.close();
    }
}
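
The pipe mechanics behind this feature can be sketched with the JDK classes alone. The example below is an approximation for illustration. ChunkProducer is a hypothetical analogue of DataLoader, ProducerStreams is an invented name, and the 64 KB pipe size is an arbitrary choice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical analogue of the framework's DataLoader interface.
interface ChunkProducer {
    void produce(OutputStream out) throws IOException;
}

// Hypothetical helper for illustration; not the StreamFramework implementation.
class ProducerStreams {
    /** Returns a stream whose content is generated on a background thread. */
    static InputStream fromProducer(ChunkProducer producer) throws IOException {
        PipedInputStream in = new PipedInputStream(64 * 1024);
        PipedOutputStream out = new PipedOutputStream(in);
        Thread writer = new Thread(() -> {
            // Closing the pipe signals end of stream to the reader.
            try (OutputStream o = out) {
                producer.produce(o);
            } catch (IOException e) {
                // Sketch only: a fuller version would hand this failure to the reader
                // instead of letting the stream end early.
            }
        }, "producer-writer");
        writer.setDaemon(true);
        writer.start();
        return in;
    }
}
```

The writer blocks whenever the pipe buffer is full, so at most one buffer of generated data sits in memory at a time.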










 3. Convert InputStream to GZippedInputStream:

     Converting an InputStream to a gzipped InputStream is a very useful feature. It is provided by the GZippedStream class, which extends InputStreamReadWrapper and so inherits its functions as well. The GZippedStream constructor uses PipedInputStream, PipedOutputStream and GZIPOutputStream to do the work.

Feature description diagram

Class Diagram
                                                                            
          The code snippet below shows a sample implementation of converting an InputStream to a gzipped InputStream. The GZippedStream constructor takes the InputStream as a parameter and creates a PipedOutputStream connected to a PipedInputStream. The PipedOutputStream is passed to the constructor of GZIPOutputStream. Writing data to the GZIPOutputStream and reading from the PipedInputStream then gives the feature we need. The write and read operations must run in different threads to avoid a deadlock, which is the general rule when using piped streams.










GZippedStream zs = new GZippedStream(fileInputStream);
Util.writeTo(zs, fileOutStream);
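
The wiring described above (source, GZIPOutputStream, PipedOutputStream, PipedInputStream, with the compression on its own thread) can be sketched as follows. This is a reconstruction from the description, not the GZippedStream source. GzipPipe and its gzip method are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not the StreamFramework implementation.
class GzipPipe {
    /** Returns a stream that yields the gzipped form of the given source. */
    static InputStream gzip(InputStream source) throws IOException {
        PipedInputStream compressed = new PipedInputStream(64 * 1024);
        PipedOutputStream pipeOut = new PipedOutputStream(compressed);
        Thread writer = new Thread(() -> {
            // Resources close in reverse order: gzip trailer first, then the pipe, then the source.
            try (InputStream in = source;
                 OutputStream pipe = pipeOut;
                 GZIPOutputStream gz = new GZIPOutputStream(pipe)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    gz.write(buf, 0, n);
                }
            } catch (IOException e) {
                // Sketch only: a fuller version would report this failure to the reader.
            }
        }, "gzip-writer");
        writer.setDaemon(true);
        writer.start();
        return compressed;
    }
}
```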










 4. Convert InputStream to GzippedInputStream while reading InputStream:

With this feature, reading an InputStream also produces the gzipped form of the data in parallel. This is very useful when the data needs to be validated but the content should be stored in compressed format. The implementation involves piped streams and threads inside the InputStreamGzipReadWrapper class, which extends InputStreamReadWrapper to inherit its other functions.

             Feature description diagram   
 Class Diagram

The code snippet below shows a sample implementation of producing gzipped data while reading the base InputStream.


DataReader gzipDataReader = new GzipDataReaderImpl();
InputStreamGzipReadWrapper dataInputStream
      = new InputStreamGzipReadWrapper(fileInputStream, gzipDataReader);
while (dataInputStream.next()) {
    dataInputStream.getBytes(); // validate the data here
}

class GzipDataReaderImpl implements DataReader {
    @Override
    public void readBytes(byte[] bytes, int off, int length) {
        try {
            // Receives the gzipped bytes and stores them.
            ffOutStream.write(bytes, off, length);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
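
The same "read plain, store gzipped" effect can be illustrated without pipes or threads. The sketch below deliberately swaps the framework's design for a simpler single-threaded tee: every byte handed to the reader is also written to a GZIPOutputStream on a sink. GzipTeeInputStream is a hypothetical class for this example and does not use the DataReader callback.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical single-threaded alternative for illustration;
// the framework's InputStreamGzipReadWrapper uses pipes and threads instead.
class GzipTeeInputStream extends FilterInputStream {
    private final GZIPOutputStream gzip;

    GzipTeeInputStream(InputStream in, OutputStream compressedSink) throws IOException {
        super(in);
        this.gzip = new GZIPOutputStream(compressedSink);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            gzip.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        if (n > 0) {
            gzip.write(b, off, n); // mirror exactly the bytes handed to the caller
        }
        return n;
    }

    @Override
    public boolean markSupported() {
        return false; // re-reading after reset() would compress bytes twice
    }

    @Override
    public void close() throws IOException {
        try {
            gzip.close(); // writes the gzip trailer and closes the sink
        } finally {
            super.close();
        }
    }
}
```

The compressed output is complete only after close(), because the gzip trailer is written at that point.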






 5. Usage of Util.writeTo(InputStream, OutputStream):

     Copies the entire InputStream to the OutputStream.

 6. Usage of Util.byteIndexOf(byte[] sourceData, byte[] searchData):
       Searches for a sequence of bytes in the sourceData byte array and returns the index of its first occurrence. If searchData is not present in sourceData, it returns -1. This avoids the cost of converting the bytes to a String just to check for a specific word or string.
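
A byte-level search of this kind can be written as a straightforward scan. The sketch below shows the contract described above (first index, or -1) with a naive algorithm. ByteSearch.indexOf is a hypothetical name, and returning 0 for an empty search array is an assumption borrowed from String.indexOf, not documented framework behaviour.

```java
// Hypothetical helper for illustration; not the StreamFramework implementation.
class ByteSearch {
    /** Returns the index of the first occurrence of search in source, or -1 if absent. */
    static int indexOf(byte[] source, byte[] search) {
        if (search.length == 0) {
            return 0; // assumption: mirrors String.indexOf("")
        }
        for (int i = 0; i <= source.length - search.length; i++) {
            int j = 0;
            while (j < search.length && source[i + j] == search[j]) {
                j++;
            }
            if (j == search.length) {
                return i;
            }
        }
        return -1;
    }
}
```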


 Testing the Performance:

Read API:
        The comparison below covers plain reading of data from the stream. Execution time, memory usage and CPU usage were captured with the Java VisualVM tool. Since InputStreamReadWrapper is a wrapper around InputStream, we cannot expect it to be faster, but it should not degrade the existing InputStream performance. In the test below, which reads a 300 MB file with InputStream and with InputStreamReadWrapper, performance is not noticeably degraded.

Read 300 MB file using InputStream                 Execution Time: 6067

Read 300 MB file using InputStreamReadWrapper      Execution Time: 6164


Conclusion:

     This stream framework reduces the coding effort involved in using streams. Anyone can experiment with streams however they like and make full use of these features.



