Video Watermarking, Collusion or Convolution Attack
There are several ways to protect audiovisual content, and watermarking is one of them. It is arguably the best solution against unauthorized content distribution via streaming, simply because it allows one to identify the source of the media theft.
Watermarking, which was originally created for image protection, has been intensively researched in the past decade and can now be applied not only to static videos but also to live streams. It can be done at the hardware or software level, and the mark can be inserted in frames, key frames, bits, video samples, and in many other ways. It is an amazing technology! Specialized companies offer it as the ultimate protection against piracy... for a lot of money, of course.
This type of forensic measure has certain desirable characteristics that make it useful against piracy, and I would like to discuss them first, before getting to the real purpose of this article.
Can you imagine a soccer transmission where you see a giant logo or number on the screen? That would not be the best way to put a mark on the content. Yet the mark needs to be there somewhere. Nobody cares if the content owner has to insert something into it, as long as it doesn’t impact the end-user experience. There should be no degradation in video quality either. The best way to do that is to make the mark invisible to the human eye or, if not invisible, imperceptible.
Robustness means that it should be difficult (if not impossible) to remove the watermark from the media. What about making it not only invisible but moving? Or random? Uhh... what about injecting it at different intervals? Or mixing sound and video marks? The essence of robustness, for this type of technology, is resistance to actions such as resizing, cropping, compression, rotation, added noise, and the many other attacks that may be applied in an effort to remove the mark.
This one is the easiest! Pairwise independence refers to the fact that there shouldn’t be two equal marks in the same media. Although the same media can carry multiple different marks (say, from different distribution paths), they should not be equal.
OK. Now that I have covered what a good watermarking algorithm should have, I want to discuss a little bit what can be done to break it. Recent watermarking solutions are resistant to the common attacks: resizing, cropping, noise, compression, and image overlay. There is one attack, however, that remains a challenge for most companies, and it is called the collusion attack. The attack consists of merging two sources of the same video to form a third one. That new product would then have no watermark or, in some cases, two marks, which makes it difficult to identify the source.
Colluders collect several watermarked documents and combine them to produce digital content without underlying watermarks.
There are two basic types of collusion attack:
Type 1 – In this type of collusion attack, the attacker obtains several copies of the same work, each with a different watermark. The attacker looks for video frames that are similar in nature; frames belonging to the same scene have a high degree of correlation, so the attacker first separates the video into scenes. A statistical average of the neighboring frames then mixes the different marks together and yields a new, unmarked frame. Because the attack relies on averaging, it can only succeed if successive frames are similar enough for their content to survive the average while the differing marks cancel out.
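To make the averaging idea more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the "frame" is just a flat list of pixel values, the watermark is a simple additive pseudo-random +1/-1 pattern, the detector compares against the original frame, and the helper names (`make_mark`, `embed`, `correlate`, `average`) are made up. Real products use far more elaborate schemes, and the sketch averages identical frames rather than merely similar ones.

```python
import random

def make_mark(length, seed):
    """Hypothetical watermark: a pseudo-random +1/-1 pattern derived from a seed."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(frame, mark, strength=2.0):
    """Additive embedding (an assumption): marked pixel = pixel + strength * mark."""
    return [p + strength * m for p, m in zip(frame, mark)]

def correlate(frame, original, mark):
    """Toy detector: average of (frame - original) * mark.
    Close to `strength` when the mark is fully present, close to 0 when absent."""
    return sum((f - o) * m for f, o, m in zip(frame, original, mark)) / len(mark)

def average(frames):
    """Pixel-wise average of several frames of equal length."""
    n = len(frames)
    return [sum(pixels) / n for pixels in zip(*frames)]

# One "frame" of content, distributed as ten copies with ten different marks.
content_rng = random.Random(0)
original = [content_rng.uniform(0, 255) for _ in range(4096)]
marks = [make_mark(len(original), seed) for seed in range(1, 11)]
copies = [embed(original, mark) for mark in marks]

# The colluders average their copies into a single frame.
colluded = average(copies)

score_single = correlate(copies[0], original, marks[0])
score_colluded = correlate(colluded, original, marks[0])
print("detector score on one copy:    ", round(score_single, 3))
print("detector score after collusion:", round(score_colluded, 3))
```

In this toy setup each mark is diluted by roughly the number of colluders, so the score after collusion should be only a small fraction of the score on a single copy. Whether that is enough to defeat a real detector depends on the scheme.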
Type 2 – In this type of attack, the attacker obtains several different copies (different content, that is) that contain the same watermark and studies them to learn about the algorithm. The attacker then averages them. If the same reference pattern was added to all of them, the content tends to average out and the result is something close to the pattern. That averaged pattern can then be subtracted from the copies to generate an unmarked video.
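The estimate-and-subtract idea can be sketched in a few lines of Python. Again, this is only an illustration under loud assumptions: the frames are modeled as zero-mean random noise (real video is not, so an attacker would need some pre-processing first), the same additive +1/-1 pattern is added to every frame, and all names and constants (`pattern`, `STRENGTH`, `score`, and so on) are hypothetical.

```python
import random

rng = random.Random(42)
N_PIXELS = 4096   # size of each toy frame
N_FRAMES = 50     # number of different frames carrying the same mark
STRENGTH = 3.0    # embedding strength (an assumption)

# The single reference pattern that is added to every frame.
pattern = [rng.choice((-1.0, 1.0)) for _ in range(N_PIXELS)]

def random_frame():
    """Toy content: zero-mean noise standing in for different video frames."""
    return [rng.gauss(0.0, 10.0) for _ in range(N_PIXELS)]

def embed(frame):
    """Additive embedding of the shared pattern."""
    return [p + STRENGTH * w for p, w in zip(frame, pattern)]

def score(frame):
    """Toy detector: average of frame * pattern, near STRENGTH if marked."""
    return sum(f * w for f, w in zip(frame, pattern)) / N_PIXELS

marked = [embed(random_frame()) for _ in range(N_FRAMES)]

# Attacker side: the content averages out, the shared pattern does not.
estimate = [sum(pixels) / N_FRAMES for pixels in zip(*marked)]

# Subtract the estimated pattern from one of the marked frames.
cleaned = [f - e for f, e in zip(marked[0], estimate)]

before = score(marked[0])
after = score(cleaned)
print("detector score before subtraction:", round(before, 3))
print("detector score after subtraction: ", round(after, 3))
```

With these toy numbers the score before subtraction should sit near the embedding strength and the score afterwards near zero. The more frames the attacker averages, the better the estimate of the pattern gets.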
It seems complicated, but there are several encoders out there that can perform the collusion attack without you having to study all this stuff.
Collusion or Convolution?
I was caught in a curious discussion with a friend when the term collusion was first presented to me. Although the technique made sense and sounded reasonable, I had never heard of it before. He, on the other hand, didn’t know about convolution either. So which term is the correct one when referring to merging two sources to produce a third? In the literature, the term convolution describes a mathematical operation on two functions (f and g) that produces a third function expressing how the shape of one is modified by the other. The term refers both to the resulting function and to the process of computing it. It is defined as the integral of the product of the two functions after one is reversed and shifted. Collusion, on the other hand, is about people getting together to defraud a system. Both terms are correct, in my humble opinion, and context helps to employ them properly. If you are talking about people getting together to remove a watermark, that would be collusion (it could be a single guy, btw). If you are talking about the math process of merging two different signals to produce a third, then it is convolution.
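To make the definition of convolution a bit more tangible, here is a minimal sketch of its discrete form in Python, where a sum replaces the integral: out[n] is the sum over k of f[k] * g[n - k]. The function name `convolve` and the sample values are made up for illustration.

```python
def convolve(f, g):
    """Full discrete convolution of two finite sequences.
    out[n] = sum over k of f[k] * g[n - k]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, f_i in enumerate(f):
        for j, g_j in enumerate(g):
            # f[i] * g[j] contributes to position n = i + j,
            # which is the "reverse and shift" of the definition.
            out[i + j] += f_i * g_j
    return out

signal = [1.0, 2.0, 3.0]
kernel = [0.5, 0.5]   # a two-tap averaging kernel
smoothed = convolve(signal, kernel)
print(smoothed)  # [0.5, 1.5, 2.5, 1.5]
```

Here the second sequence acts as a small smoothing filter on the first, which is the "shape of one modified by the other" idea in action.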