An audio watermarking algorithm based on the combination of DCT and DWT

With the development of multimedia technology and the widespread use of network technology, information security issues have become increasingly prominent. At the same time, as awareness of intellectual property rights has gradually strengthened, the problems of tracking piracy and protecting copyright have prompted a search for newer and more efficient information security technologies. Traditional information security technology is essentially based on cryptography [1]: whether it uses a traditional key system or a public key system, it protects content by controlling access to files. However, with the rapid improvement of computer processing power, raising the security level of a system by continuously increasing the key length has become increasingly unreliable. How to protect copyright in an open environment of information exchange is therefore an urgent problem.

Digital watermarking is an information security technology developed in recent years in response to this problem. It embeds the copyright owner's mark or ID number as watermark data into the owner's work. The work here can be production materials or any kind of consumer goods, generally products from which pirates can profit (such as audiovisual products). Because such products sell in large volumes, and because networks and digitized information are now widespread, illegal copying is very easy. In recent years, therefore, image watermarking [2] and video watermarking have developed relatively rapidly. They mainly use the human visual system (HVS) model to embed watermarks where people cannot perceive them. When an infringement dispute occurs, watermark detection and watermark extraction can be used as evidence for prosecution [3].

Audio watermarking technology has also gradually developed. It uses the human auditory system (HAS) model to embed the watermark where the human ear cannot perceive it, thereby hiding the watermark data. According to where the watermark is embedded, watermarking algorithms can be divided into time-domain and transform-domain algorithms. Early watermarking algorithms embedded the watermark by modifying the least significant bits of the original audio signal. Literature [4] proposed pre-classifying and defining various embedding modes and adaptively selecting the optimal one to embed the watermark in the echo of the original audio signal; literature [5] proposed a watermarking technique based on the wavelet transform; literature [6] proposed a watermarking algorithm based on the discrete cosine transform. The literature above shows that time-domain watermarking algorithms are not sufficiently robust, so transform-domain watermarking algorithms have developed rapidly in recent years. Common transform-domain watermarking algorithms are based on the Fourier transform, the discrete cosine transform and the discrete wavelet transform, each of which has its own advantages and disadvantages.

Based on an analysis of the above watermarking algorithms, this paper proposes a method that combines the discrete wavelet transform and the discrete cosine transform to embed and extract watermarks. It makes full use of the multi-resolution property of the wavelet transform and the energy compaction property of the discrete cosine transform and, taking an intuitive binary image as the watermark, gives a new audio watermarking algorithm. Experiments demonstrate the robustness and imperceptibility of the algorithm.

Using the fast wavelet transform, a chosen wavelet function decomposes the input signal at a given scale into a high-frequency part and a low-frequency part. At each scale, the high-frequency and low-frequency parts together contain all the information needed to completely restore the signal at the previous scale. Repeating this decomposition on the low-frequency part gives a multi-scale decomposition of the signal and its multi-layer wavelet coefficients, that is, the low-frequency coefficients of the signal and a series of high-frequency coefficients. The wavelet decomposition tree is shown in Figure 1.

Figure 1 Wavelet decomposition tree


For most signals, the low-frequency part carries the main characteristics of the signal and is often the most important, while the high-frequency part is associated with noise and disturbance. If the high-frequency part is removed, the basic characteristics of the signal are still retained, so signal processing is generally carried out on the low-frequency part. For this reason, signal analysis often refers to the approximation and the details of a signal: the approximation is mainly the global, low-frequency part of the signal, while the details are often its local, high-frequency components.
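The split into approximation and detail coefficients can be illustrated with a minimal sketch. It assumes the orthonormal Haar wavelet (the same wavelet as 'db1') and a signal whose length is divisible by 2 at every level; the function names haar_dwt and wavedec are illustrative and are not taken from the original paper.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar ('db1') transform of an even-length list."""
    s = 1.0 / math.sqrt(2.0)
    # Low-frequency (approximation) part: scaled sums of neighbouring samples.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    # High-frequency (detail) part: scaled differences of neighbouring samples.
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def wavedec(signal, levels):
    """Repeat the one-level split on the approximation part.

    Returns [cA_n, cD_n, ..., cD_1]: one set of low-frequency coefficients
    followed by the high-frequency coefficients of every level.
    """
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        coeffs.insert(0, detail)
    coeffs.insert(0, approx)
    return coeffs

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
cA3, cD3, cD2, cD1 = wavedec(x, 3)
print(len(cA3), len(cD3), len(cD2), len(cD1))  # 1 1 2 4
```

Because the transform is orthonormal, the total energy of the coefficients equals the energy of the input signal.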

Decomposing the signal into a linear combination of mutually orthogonal wavelet functions can show the important characteristics of the signal, but this is not the whole of wavelet analysis. Another important aspect of wavelet analysis is to analyze, compare, and process (such as removing high-frequency signals, encryption, etc.) wavelet coefficients, and then reconstruct the signal according to the newly obtained coefficients. This process is called inverse discrete wavelet transform (IDWT), or wavelet reconstruction, synthesis, etc. The basic process of signal reconstruction is shown in Figure 2.
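As a rough illustration of processing the coefficients and then reconstructing, the sketch below again assumes the orthonormal Haar ('db1') wavelet. It restores a signal exactly from its coefficients and then rebuilds it with the high-frequency coefficients set to zero. The names haar_dwt and haar_idwt are illustrative only.

```python
import math

S = 1.0 / math.sqrt(2.0)

def haar_dwt(signal):
    """One-level Haar analysis: returns (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) * S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * S for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """One-level Haar synthesis: rebuilds the signal from both coefficient sets."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * S)
        out.append((a - d) * S)
    return out

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
cA, cD = haar_dwt(x)

# Unmodified coefficients give back the original signal.
restored = haar_idwt(cA, cD)

# Dropping the high-frequency coefficients keeps only the approximation:
# every pair of samples is replaced by its mean.
smoothed = haar_idwt(cA, [0.0] * len(cD))
```

For this input, smoothed is approximately [5, 5, 11, 11, 7, 7, 5, 5], which keeps the overall shape of the signal while removing the local variation.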

The algorithm flow chart is shown in Figure 3.

Figure 3 Algorithm flow chart


In the experiment, the watermark is a 64×64 binary image, as shown in Figure 4(a). The watermark image is first reduced in dimensionality and then scrambled; the scrambled image is shown in Figure 4(b). The original audio signal is then segmented, and the segments used for embedding the watermark undergo a three-level wavelet decomposition. This paper selects the 'db1' wavelet basis. The discrete cosine transform is then applied to the approximation components of the three-level wavelet decomposition, the discrete cosine transform coefficients are sorted, and finally the watermark is embedded in the audio signal according to formula (8). The embedding parameter is set to 0.2 during the embedding process. Figure 5(a) is the original audio signal, which is mono with a 22.05 kHz sampling rate, 8-bit quantization and a duration of 8 s; Figure 5(b) is the audio signal after the watermark is embedded, and the two audio signals show almost no difference.

      

 

Figure 4 Watermark image

         

(a) Original audio signal; (b) Watermarked audio signal

Figure 5 Audio signal
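The embedding steps described in the experiment (three-level 'db1' decomposition, DCT of the approximation coefficients, sorting, embedding with a parameter of 0.2) can be sketched roughly as follows. Formula (8) is not reproduced in this text, so the multiplicative rule c' = c(1 ± alpha) used here is only an assumed stand-in. Choosing the largest-magnitude coefficients, extracting by comparison with the original audio, and all function names are likewise assumptions made for illustration.

```python
import math
import random

S = 1.0 / math.sqrt(2.0)

def haar_dwt(x):
    """One-level Haar ('db1') analysis: returns (approximation, detail)."""
    return ([(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)],
            [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)])

def haar_idwt(approx, detail):
    """One-level Haar synthesis."""
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) * S, (a - d) * S]
    return out

def dct(x):
    """Orthonormal DCT-II, computed directly from its definition."""
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
            for k in range(n)]

def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for k in range(n))
            for i in range(n)]

def approx_dct(segment, levels):
    """DCT of the level-`levels` approximation coefficients of a segment."""
    approx = list(segment)
    for _ in range(levels):
        approx, _detail = haar_dwt(approx)
    return dct(approx)

def embed(segment, bits, alpha=0.2, levels=3):
    """Embed bits into one audio segment (assumed rule, not formula (8))."""
    details, approx = [], list(segment)
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    coeffs = dct(approx)
    # Sort coefficient indices by magnitude and use the largest ones (assumption).
    slots = sorted(range(len(coeffs)), key=lambda k: -abs(coeffs[k]))[:len(bits)]
    for k, bit in zip(slots, bits):
        coeffs[k] *= 1.0 + alpha * (1 if bit else -1)
    approx = idct(coeffs)
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx, slots

def extract(marked, original, slots, levels=3):
    """Non-blind extraction: a coefficient that grew in magnitude encodes a 1."""
    cm, co = approx_dct(marked, levels), approx_dct(original, levels)
    return [1 if abs(cm[k]) > abs(co[k]) else 0 for k in slots]

rng = random.Random(0)
segment = [rng.uniform(-1.0, 1.0) for _ in range(64)]  # stand-in for one audio segment
bits = [1, 0, 1, 1]                                    # stand-in for scrambled watermark bits
marked, slots = embed(segment, bits)
recovered = extract(marked, segment, slots)
```

Only the approximation coefficients are changed, so the detail coefficients of the watermarked segment are identical to those of the original. A larger alpha makes the bits easier to recover after processing but changes the audio more.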

To test the robustness of the algorithm, the following processing was applied to the watermarked audio signal: ① Noise addition: Gaussian white noise is added with an SNR of 30, and the extracted watermark is shown in Figure 4(c); ② Resampling: the signal is decimated once and interpolated once, with a decimation and interpolation factor of 2, and the extracted watermark is shown in Figure 4(d); ③ Low-pass filtering: the signal is passed through a Chebyshev low-pass filter with a cut-off frequency of 4 kHz, and the extracted watermark is shown in Figure 4(e); ④ Compression: the signal is compressed at a bit rate of 80 kb with a compression ratio of 8.8:1, and the extracted watermark is shown in Figure 4(f).
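Two of these attacks, noise addition and resampling by a factor of 2, can be approximated with the sketch below. It assumes the SNR is expressed in decibels, uses plain sample dropping with linear interpolation rather than a filtered resampler, and uses a 440 Hz test tone as a stand-in for the watermarked audio. None of these details is given in the original text.

```python
import math
import random

def add_white_noise(signal, snr_db, seed=0):
    """Add Gaussian white noise so that the signal-to-noise ratio is about snr_db."""
    rng = random.Random(seed)
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]

def resample_by_two(signal):
    """Keep every second sample, then restore the length by linear interpolation."""
    kept = signal[::2]
    out = []
    for i, v in enumerate(kept):
        nxt = kept[i + 1] if i + 1 < len(kept) else v
        out.append(v)
        out.append(0.5 * (v + nxt))
    return out[:len(signal)]

def measured_snr_db(clean, processed):
    """SNR in dB of a processed signal relative to the clean one."""
    signal_energy = sum(v * v for v in clean)
    error_energy = sum((a - b) ** 2 for a, b in zip(clean, processed))
    return 10.0 * math.log10(signal_energy / error_energy)

# 440 Hz tone sampled at 22.05 kHz, standing in for the watermarked audio.
tone = [math.sin(2.0 * math.pi * 440.0 * n / 22050.0) for n in range(8000)]
noisy = add_white_noise(tone, snr_db=30.0)
resampled = resample_by_two(tone)
print(round(measured_snr_db(tone, noisy), 1))
```

A production resampler would low-pass filter before decimation to avoid aliasing, which this sketch omits for brevity.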

The corresponding normalized correlation coefficients are 0.8235, 0.6012, 0.6826 and 0.5961. These experiments show that the algorithm is robust to common signal processing operations.
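The text does not state how these coefficients are computed. A common choice in the watermarking literature is the normalized correlation between the original and extracted watermark bits, sketched here as an assumption; the function name and the sample bit sequences are illustrative.

```python
import math

def normalized_correlation(original, extracted):
    """Normalized correlation of two equal-length watermark sequences.

    Returns 1.0 for identical non-zero sequences; lower values mean the
    extracted watermark differs more from the original.
    """
    num = sum(a * b for a, b in zip(original, extracted))
    den = math.sqrt(sum(a * a for a in original) * sum(b * b for b in extracted))
    return num / den if den else 0.0

w = [1, 0, 1, 1, 0, 1, 0, 0]            # original watermark bits
w_extracted = [1, 0, 1, 0, 0, 1, 1, 0]  # bits recovered after an attack

print(normalized_correlation(w, w))            # 1.0
print(normalized_correlation(w, w_extracted))  # 0.75
```

For a 64×64 binary image, the two arguments would be the 4096 pixel values of the original and extracted watermark images, flattened in the same order.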


Origin blog.csdn.net/ccsss22/article/details/108741499