International Journal of Scientific & Engineering Research, Volume 3, Issue 3, March 2012

ISSN 2229-5518

An Improved Data Compression Method for General Data

Salauddin Mahmud

Abstract — Data compression is useful in many fields, and particularly useful in communications, because it enables devices to transmit or store the same amount of data in fewer bits. There are a variety of data compression techniques, but only a few have been standardized. This paper proposes a new data compression method for general data based on a logical truth table, in which two bits of data can be represented by one bit in both wired and wireless networks. The proposed technique is therefore efficient for wired and wireless networks. Compression algorithms are evaluated in terms of the amount of compression achieved, algorithm efficiency, and susceptibility to error. While algorithm efficiency and susceptibility to error are relatively independent of the characteristics of the source ensemble, the amount of compression achieved depends to a great extent on the characteristics of the source.

Index terms: Data compression, Truth Table, Compression Ratio, Compression Factor.

—————————— ——————————


Data compression is a common and indispensable method in the digital world, where transmission bandwidth is limited and data must be transmitted quickly over a network, so we are bound to compress data. Compression is helpful because it reduces data length, saving resources such as hard disk space or transmission bandwidth. On the downside, compressed data must be decompressed to be used, and this extra processing may be harmful to some applications [4]. A data compression feature can help reduce the size of a database as well as improve the performance of I/O-intensive workloads. However, extra resources are required on the database server to compress and decompress the data while it is exchanged with the application. The following figure shows a model for a compression system that performs transparent compression.






The main objectives of data compression are to reduce the amount of data storage space required and to reduce the length of data transmission time over the network.


Two types of compression exist in the real world: lossless compression and lossy compression. Lossless and lossy are terms that describe whether or not all of the original data can be recovered when a compressed file is uncompressed.


With lossless compression, every single bit of data that was originally in the file remains after the file is uncompressed; all of the information is completely restored. This is generally the technique of choice for text or spreadsheet files, where losing words or financial data could pose a problem. The Graphics Interchange Format (GIF) is an image format used on the Web that provides lossless compression. Lossless compression algorithms usually exploit statistical redundancy in such a way as to represent the sender's data more concisely without error. Lossless compression is possible because most real-world data has statistical redundancy.


Figure 1: A model for a compression system


Data compression means reducing the amount of data required to represent a source of information while preserving the original content as much as possible.
Lossless algorithms mean we can recover the original signal; no bits are discarded, and the result is numerically identical to the original content on a pixel-by-pixel basis. The Lempel–Ziv (LZ) [1] compression methods are among the most popular algorithms for lossless storage. DEFLATE is a variation on

IJSER © 2012 http://


LZ that is optimized for decompression speed and compression ratio, although compression can be slow. DEFLATE is used in PKZIP, gzip and PNG. LZW (Lempel–Ziv–Welch) is used in GIF images. Also noteworthy are the LZR (LZ–Renau) methods, which serve as the basis of the Zip method. LZ methods use a table-based compression model where table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded (e.g. SHRI, LZX). A current LZ-based coding scheme that performs well is LZX, used in Microsoft's CAB format.
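As a concrete illustration of the table-based LZ model described above, here is a minimal LZW sketch in Python (the function name and the sample string are ours, not from the paper): the table starts with single characters and grows dynamically from earlier input, and repeated strings are replaced by table entries.

```python
def lzw_compress(data: str) -> list[int]:
    """Compress a string into a list of dictionary codes (LZW)."""
    # Start with a table containing every single-character string.
    table = {chr(i): i for i in range(256)}
    next_code = 256
    w = ""
    out = []
    for c in data:
        wc = w + c
        if wc in table:
            # Extend the current match while it is still in the table.
            w = wc
        else:
            # Emit the code for the longest match, then grow the table.
            out.append(table[w])
            table[wc] = next_code  # table is built dynamically from earlier input
            next_code += 1
            w = c
    if w:
        out.append(table[w])
    return out

codes = lzw_compress("TOBEORNOTTOBEORTOBEORNOT")
print(len("TOBEORNOTTOBEORTOBEORNOT"), len(codes))  # repeated strings shrink the output
```

Because "TOBEOR" repeats, the 24-character input compresses to fewer than 24 codes; a matching decompressor rebuilds the same table from the code stream.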
The very best modern lossless compressors use probabilistic models, such as prediction by partial matching. The Burrows–Wheeler transform can also be viewed as an indirect form of statistical modeling.
In a further refinement of these techniques, statistical predictions can be coupled to an algorithm called arithmetic coding [2]. Arithmetic coding, invented by Jorma Rissanen and turned into a practical method by Witten, Neal, and Cleary, achieves superior compression to the better-known Huffman algorithm, and lends itself especially well to adaptive data compression tasks where the predictions are strongly context-dependent. Arithmetic coding is used in the bi-level image-compression standard JBIG and the document-compression standard DjVu. The text entry system Dasher is an inverse arithmetic coder. Applications of lossless algorithms include medical imaging. Techniques of this kind include bit-plane coding and lossless predictive coding; motion compensation, however, is not used in lossless compression.


Another type of compression, called lossy data compression or perceptual coding, is possible if some loss of fidelity is acceptable. Lossy compression reduces a file by permanently eliminating certain information, especially redundant information. When the file is uncompressed, only a part of the original information remains (although the user may not notice it). Lossy compression is generally used for video and sound, where a certain amount of information loss will not be detected by most users. Generally, lossy data compression is guided by research on how people perceive the data in question [3]. For example, the human eye is more sensitive to subtle variations in luminance than to differences in color, and JPEG image compression works in part by "rounding off" some of this less-important information. Lossy data compression provides a way to obtain the best fidelity for a given amount of compression.
A lossy algorithm is one in which the data cannot be returned to the original bits: some bits are discarded, though in practice the loss is rarely noticeable. Lossy image compression is used in digital cameras to increase storage capacity with minimal degradation of picture quality. Similarly, DVDs use the lossy MPEG-2 video codec for video compression [5]. In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the signal. Compression of human speech is often performed with even more specialized techniques, so that "speech compression" or "voice coding" is sometimes distinguished as a separate discipline from "audio compression". Different audio and speech compression standards are listed under audio codecs. Voice compression is used in Internet telephony, for example, while audio compression is used for CD ripping and is decoded by audio players. The techniques include transform coding (MPEG-X), vector quantization (VQ), subband coding (wavelets), fractals, and model-based coding.


The proposed technique shows that a data sequence can be represented by half as many bits: 128 bits can be represented by 64 bits, 64 bits by 32 bits, and 32 bits by 16 bits. Likewise, 1 GB of data can be represented by 512 MB. The technique is based on a logical truth table; the combinations of two bits are given below.
Table 1:


Truth table of the proposed technique
In Table 1:
The resultant output is 0 when the two input bits are 0 and 0.
The resultant output is 1 when the two input bits are 1 and 1.
The resultant output is a third level when the two input bits are 0 and 1.
The resultant output is a fourth level when the two input bits are 1 and 0.
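The truth table can be sketched as a lookup from bit pairs to four-level symbols. This is a minimal illustration, not the paper's implementation: the values 2 and 3 below are placeholders for the two intermediate levels that Table 1 assigns to the mixed pairs 01 and 10, whose exact symbols are not legible in the source.

```python
# Map each 2-bit pair to one of four symbols (signal levels).
# Values 2 and 3 are placeholders for the paper's two intermediate levels.
PAIR_TO_LEVEL = {
    (0, 0): 0,  # output 0 for input 00
    (1, 1): 1,  # output 1 for input 11
    (0, 1): 2,  # a third level for input 01 (placeholder value)
    (1, 0): 3,  # a fourth level for input 10 (placeholder value)
}

def encode(bits):
    """Collect bits in pairs and emit one four-level symbol per pair."""
    assert len(bits) % 2 == 0, "pad to an even length first"
    return [PAIR_TO_LEVEL[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

def decode(levels):
    """Invert the table: each symbol expands back to its bit pair."""
    level_to_pair = {v: k for k, v in PAIR_TO_LEVEL.items()}
    return [b for lv in levels for b in level_to_pair[lv]]
```

Because the mapping is a bijection between the four bit pairs and the four levels, `decode(encode(bits))` always returns the original sequence, and the output has exactly half as many symbols as the input has bits.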


In the figure above, we see the representation of the input bits as output bits at the output terminal, where each output bit stands for two input bits.

Conventionally we use two levels to represent a digital signal, but here data can be represented in four levels, as shown below.

The flow chart proceeds as: input data → check whether the length is even or odd → if even, process directly → if odd, add a bit (0 or 1) → feed into the proposed technique.

Fig 4: Graphical representation


Let it be considered that a plain text has been given: x = {1,0,1,1,0,1,0,1,0,0,1,1,0,0,1,1,0,0,1,0}, where the length of x is 20.
Fig 2: The proposed data compression flow chart
The input data sequence is checked for whether its length is even or odd. If even, it is processed directly; if odd, a bit (either 0 or 1) is added, depending on the last bit: if the last bit of the input sequence is 0, a 0 is appended, and if the last bit is 1, a 1 is appended. The sequence is then fed to the proposed technique, where two bits are collected at a time and converted into one bit using Table 1, yielding the compressed data.
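The even/odd check, the padding rule, and the pairing step can be sketched as follows (a minimal illustration; the function names are ours, not from the paper):

```python
def pad_to_even(bits):
    # If the bit count is odd, append a bit equal to the last bit:
    # a trailing 0 gets a 0 appended, a trailing 1 gets a 1.
    return bits + [bits[-1]] if len(bits) % 2 else list(bits)

def to_pairs(bits):
    # Collect the (now even-length) sequence two bits at a time; each
    # pair is then looked up in Table 1 to produce one output symbol.
    bits = pad_to_even(bits)
    return [(bits[i], bits[i + 1]) for i in range(0, len(bits), 2)]

print(to_pairs([1, 0, 1, 1, 0]))  # odd length, trailing 0: a 0 is appended first
```

For the odd-length input [1, 0, 1, 1, 0], a 0 is appended and the pairs (1,0), (1,1), (0,0) are produced, halving the length.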

Fig 3: The proposed technique

A pseudo-random generator generates a dynamic key sequence:
arr = array(0,1,0,1,0,0,1,1,0,1,1,0,1,0,1,0,1,0,1,0)
for each i from 1 up to the size of x, incrementing i by 1:
    rand = rand(0, 19); y[i] = arr[rand]
y = {0,1,0,1,0,0,1,1,0,1,1,0,1,0,1,0,1,0,1,0}, where the length of y is 20.
XORing x and y encrypts the plain text:
z = {1,1,1,0,0,1,1,0,0,1,0,1,1,0,0,1,1,0,0,0}
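The XOR step above can be reproduced directly (a quick sketch). Note that XORing z with the same key sequence y recovers x, which is how a decoder would undo the encryption:

```python
# Plain text x and key sequence y from the example above.
x = [1,0,1,1,0,1,0,1,0,0,1,1,0,0,1,1,0,0,1,0]
y = [0,1,0,1,0,0,1,1,0,1,1,0,1,0,1,0,1,0,1,0]

# Bitwise XOR of the two sequences yields the encrypted sequence z.
z = [a ^ b for a, b in zip(x, y)]
print(z)  # [1,1,1,0,0,1,1,0,0,1,0,1,1,0,0,1,1,0,0,0]

# XOR is its own inverse: applying the key again restores x.
assert [a ^ b for a, b in zip(z, y)] == x
```

This matches the z given in the text, confirming the worked example.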


Applying the proposed technique to z, each pair of bits becomes one bit, so z can be represented by 10 bits; the data has thus been compressed.


The following parameters are used for measuring the performance of the proposed method.


To calculate the efficiency, we have to measure the compression ratio:

Compression ratio = (size of compressed data) / (size of original data) = 10/20 = 0.5

So the compression ratio is theoretically 50%, which is the strength of the proposed compression technique.


Compression factor is the inverse of the compression ratio. A compression factor greater than 1 implies compression, while a value less than 1 implies expansion. It can be defined as:

Compression factor = (size of original data) / (size of compressed data) = 20/10 = 2
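Both measures can be computed directly for the 20-bit worked example above (a quick sketch; the variable names are ours):

```python
original_bits = 20    # length of the input sequence before pairing
compressed_bits = 10  # one output symbol per two input bits

ratio = compressed_bits / original_bits   # 0.5, i.e. 50% compression ratio
factor = original_bits / compressed_bits  # 2.0; a value > 1 means compression

print(ratio, factor)
```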
Data compression is among the most important considerations in the modern world: we have to send huge amounts of data over limited bandwidth, which is why data has to be compressed. The proposed compression technique can be used to send a lot of data over wired and wireless networks, and the decoder restores the original data. In a data storage application, although the degree of compression is the primary concern, it is nonetheless necessary that the algorithm be efficient in order for data compression to be practical.


[1] J. Ziv and A. Lempel, "Compression of individual sequences via variable-rate coding," IEEE Transactions on Information Theory, vol. 24, no. 5, pp. 530–536, Sep. 1978.
[2] D. Salomon, Data Compression: The Complete Reference, 2nd ed., 2004. [Online]. Available: dxs/DC3advertis/Dcomp3Ad.html
[3] M. Burrows and D. J. Wheeler, "A block-sorting lossless data compression algorithm," Digital SRC Research Report, Tech. Rep., 1994.
[4] G. E. Blelloch, Introduction to Data Compression, Carnegie Mellon University.
[5] J. Watkinson, The MPEG Handbook, p. 4.

Salauddin Mahmud completed his graduation in Electronics & Telecommunication Engineering from Daffodil International University, Bangladesh, in 2011. Phone: +8801672463556. (e-mail: raju_681@ya)
