Video Watermarking, Collusion or Convolution Attack

There are several ways to protect audiovisual content, and watermarking is one of them. It is arguably the best solution against unauthorized content distribution via streaming, simply because it allows one to identify the source of the media theft.

Watermarking, which was originally created for image protection, has been intensively researched in the past decade and can now be applied not only to static videos but also to live streams. It can be done at the hardware or software level, and the mark can be inserted in frames, key frames, bits, video samples and in many other ways. It is an amazing technology! Specialized companies offer it as the ultimate protection against piracy... for a lot of money, of course.

There are certain desirable characteristics that make this type of forensic measure useful against piracy, and I would like to discuss them first, before getting to the real purpose of this article.

Imperceptibility

Can you imagine a soccer broadcast with a giant logo or number on the screen? That would not be the best way to mark the content. Yet the mark needs to be there somewhere. Nobody cares if the content owner inserts something into it, as long as it doesn’t impact the end-user experience. There should be no degradation in video quality either. The best way to do it is to make the mark invisible to the human eye. Or, if not invisible, imperceptible.

Robustness

Robustness means that it should be difficult (if not impossible) to remove the watermark from the media. What about making it not only invisible but moving? Or random? Uhh... what about injecting it at different intervals? Or mixing audio and video marks? The essence of robustness, for this type of technology, is resistance to operations such as resizing, cropping, compression, rotation, added noise and the many other attacks that may be applied in an effort to remove the mark.

Pairwise Independence

This one is the easiest! Pairwise independence refers to the fact that there shouldn’t be two equal marks in the same media. Although the same media can carry multiple different marks (say, from different distribution paths), they should not be equal.

 

Collusion Attack

OK. Now that I have covered what a watermarking algorithm needs in order to be good, I want to discuss a little bit what can be done to break it. Recent watermarking solutions are resistant to the common attacks: resizing, cropping, noise, compression and image overlay. There is one attack, however, that remains a challenge for most companies: the collusion attack. It consists of merging two sources of the same video to form a third one. The new product would then carry no watermark or, in some cases, two marks, which makes source identification difficult.

Colluders collect several watermarked documents and combine them to produce digital content without underlying watermarks.

There are two basic types of collusion attack:

Type 1 – In this type of collusion attack, the attacker obtains several copies of the same work, each with a different watermark. The attacker looks for video frames that are similar to each other; frames belonging to the same scene are highly correlated, so the attacker first separates the video into scenes. A statistical average of the neighboring frames then mixes the different marks together and yields a new, unmarked frame. A Type 1 collusion attack can only succeed if the marks carried by those frames are different enough; averaging frames that share the same mark leaves the mark in place.
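To make the averaging idea concrete, here is a minimal sketch of my own, not taken from any real watermarking product. It assumes a toy model where a "frame" is just a list of pixel values and each copy carries its own random additive mark of amplitude 2. The names (content, marks, colluded) and the numbers are made up for illustration, and the original content is used only to measure what is left of the marks; a real attacker would not have it.

```python
import random

def rms(values):
    """Root mean square, used here as a rough measure of mark energy."""
    return (sum(v * v for v in values) / len(values)) ** 0.5

rng = random.Random(1)
N_PIXELS, N_COPIES = 256, 16  # illustrative sizes, not from the article

# The same frame content, distributed as 16 copies with 16 different marks.
content = [rng.uniform(0, 255) for _ in range(N_PIXELS)]
marks = [[rng.choice((-2.0, 2.0)) for _ in range(N_PIXELS)]
         for _ in range(N_COPIES)]
copies = [[c + w for c, w in zip(content, mark)] for mark in marks]

# The collusion step: a pixel-by-pixel average of all the copies.
colluded = [sum(pixel_column) / N_COPIES for pixel_column in zip(*copies)]

# Evaluation only: how much mark is left in one copy vs. in the average.
residual_single = rms([p - c for p, c in zip(copies[0], content)])
residual_colluded = rms([p - c for p, c in zip(colluded, content)])
print(residual_single, residual_colluded)
```

In this toy model the independent marks partially cancel each other, so the averaged frame carries much less mark energy than any single copy. If all 16 marks were identical, the average would keep the mark at full strength.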

 

Type 2 – In this type of attack, the attacker obtains several different pieces of content that carry the same watermark and studies them to learn about the algorithm. The attacker then averages them. If all of them have the same reference pattern added, the varying content tends to average out and the result is something close to the pattern. The averaged pattern can then be subtracted from the copies to produce an unmarked video.
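The Type 2 idea can be sketched the same way. This is again a toy model under my own assumptions: frames are flat gray plus random content, the mark is a fixed additive pattern of amplitude 2, and detection is a plain correlation against the known pattern. The function names (make_frame, estimate_mark, detect) are hypothetical helpers for this example only.

```python
import random

def make_frame(rng, mark):
    # Hypothetical frame: flat gray + random content + the shared mark.
    return [128 + rng.uniform(-50, 50) + w for w in mark]

def estimate_mark(frames):
    """Average many marked frames; the content averages out, the mark stays."""
    avg = [sum(column) / len(frames) for column in zip(*frames)]
    mean = sum(avg) / len(avg)
    return [a - mean for a in avg]  # drop the flat gray level

def detect(frame, mark):
    """Correlation score against the zero-mean mark (high = mark present)."""
    m = sum(mark) / len(mark)
    return sum(p * (w - m) for p, w in zip(frame, mark)) / len(mark)

rng = random.Random(7)
mark = [rng.choice((-2.0, 2.0)) for _ in range(256)]

# Step 1: collect many different frames that all carry the same mark.
collected = [make_frame(rng, mark) for _ in range(500)]
estimate = estimate_mark(collected)

# Step 2: subtract the estimated pattern from other marked frames.
victims = [make_frame(rng, mark) for _ in range(200)]
cleaned = [[p - e for p, e in zip(frame, estimate)] for frame in victims]

score_before = sum(detect(f, mark) for f in victims) / len(victims)
score_after = sum(detect(f, mark) for f in cleaned) / len(cleaned)
print(score_before, score_after)
```

With these made-up numbers the average detector score should sit near the mark energy before the attack and near zero after it. Real schemes are far less naive than a fixed additive pattern, so take this only as an illustration of why reusing the same mark everywhere is risky.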

It seems complicated but there are several encoders out there that are able to perform the collusion attack without you having to study all this stuff.

Collusion or Convolution?

I got caught in a curious discussion with a friend when the term collusion was first presented to me. Although the technique made sense and sounded reasonable, I had never heard of it before. He, on the other hand, had never heard of convolution. So which term is the correct one when referring to merging two sources to produce a third? In the literature, convolution is a mathematical operation on two functions (f and g) that produces a third function expressing how the shape of one is modified by the other. The term refers both to the resulting function and to the process of computing it. It is defined as the integral of the product of the two functions after one is reversed and shifted. Collusion, on the other hand, is about people getting together to defraud a system. Both terms are correct, in my humble opinion, and context tells you which one to use. If you are talking about people getting together to remove a watermark, that is collusion (it could be a single guy, by the way). If you are talking about the math process of merging two different signals to produce a third, then it is convolution.
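For reference, the definition above is usually written as (f * g)(t) = ∫ f(τ) g(t − τ) dτ, and for sampled signals the integral becomes a sum: (f * g)[n] = Σ f[k] g[n − k]. A minimal sketch of the discrete version in plain Python (the function name convolve and the sample values are mine, just for illustration):

```python
def convolve(f, g):
    """Full discrete convolution of two finite sequences."""
    out = [0] * (len(f) + len(g) - 1)
    for i, f_value in enumerate(f):
        for j, g_value in enumerate(g):
            # f[i] * g[j] contributes to output index n = i + j,
            # which is the same as summing f[k] * g[n - k] over k.
            out[i + j] += f_value * g_value
    return out

print(convolve([1, 2, 3], [1, 1]))  # [1, 3, 5, 3]
```

Here the short sequence [1, 1] acts like a tiny smoothing filter: each output sample is the sum of two neighboring input samples.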

I just got my MASTERS!! Yeahhhh – And what have I learned with it?

This Feb 28th is the so-called “Thesis Defense” day. It is when me, myself and I, after submitting the thesis paper, put myself at the disposal of the thesis committee. In this case, “defend” does not imply that I will have to argue aggressively about my work (although I can see myself doing it).


Rather, the thesis defense is designed so that faculty members can ask questions and make sure that students actually understand their field and focus area. It serves as a formality because the paper will already have been evaluated (in a step called the Qualification Process). During a defense, the student is asked questions by members of the thesis committee. Questions are usually open-ended and require the student to think critically about his or her work. The event is supposed to last from one to three hours; I have heard it can take longer... geez!

The defense is the crowning event of at least two years of hard study, dedication and sacrifice. And I want to tell you what I have learned from it.

It is not as hard and mystic as it seems

At least here in Brazil, a master’s degree, or “Mestrado” as we call it, is not as common as it should be. There is some sort of mysticism around it, as if it were reserved for a certain “class” of student and society. People really tend to go for an MBA. The MBA in Brazil, although it stands for Master of Business Administration, has nothing to do with a stricto sensu master’s; it is a lato sensu course, a specialization. I’m not taking credit away from those who choose the MBA, but it is different. In this country the MBA has a more commercial, practical focus. It is also offered at night or on weekends, which helps a lot for those who have a job. I guess that is the main reason people go that way. On the other hand, a full-blown master’s course is considered too academic, or meant for those who want to pursue a career as a professor or researcher. This is not entirely true! You can enroll in a master’s course, continue to work in industry and solve a real-world problem.

This is part of the mysticism that surrounds a master’s course. It is not meant only for academics or for super-nerd researchers, and you can do it while you keep your current job! It is true that your job must give you a certain flexibility and freedom to cope with the crazy schedules that some schools push, but it is possible.


You can have a job that has nothing to do with universities and the academic world. You can work anywhere you want. In fact, big tech companies are the ones that employ master’s graduates the most. And there is a chance that your salary increases by up to 80% if you have a master’s degree.

The professors and the board of teachers are pretty much regular people with experience in some topics and areas of research, but they are NOT the owners of all knowledge. It is pretty common for the student to know more about a given topic than the professor. The professor is there to help you adjust your thought process and write your ideas within the accepted scientific methodology, but he is not an omniscient god. You can argue, you can defend a statement, heck, you can actually fight with your professor (although that is not a good idea) if you think X equals Y.

The other aspect of the mysticism around the master’s course is the idea that you are there to learn and attend classes. Myth! There is no way for the school to teach each student the specifics of his or her work. What you actually learn is how to organize your thought process, how to treat the numbers you collect in your research and how to use other people’s work to build the foundations of your own. That is it. Don’t expect to have classes on advanced math, signal processing or anything in depth about your field. That’s not going to happen. Instead you have several debate classes and several seminars where you present your ideas and have the other students confront, challenge and disagree with you. You have some writing classes, some basic statistics classes and a lot of seminars. That’s where the knowledge and ideas are born. That’s where you mature and learn how to “defend” your work and identify potential flaws in it, by discussing it with other people.

Since it depends on you to understand, collect, test, treat and present the work, it is not as hard as it seems. Give it a try. You might surprise yourself.

It is a victory in solitude


It is sad! I know. But it is the hard truth. There is a big chance that not a single soul around you will actually understand what you are doing. I mean friends, family, co-workers. None of them will, one, be interested in what you discovered or, two, be willing to discuss it in detail. Nobody cares! If you are married, your wife will want to know when you will finish so she can have you back in regular life. Or when you will stop spending the night reading and give more attention to your kids. Or, in my case, when you will be done so you can ask for a raise at your current job!

There won’t be any questions about the inner details of your research, and if they show interest at first, it is quickly lost when you start going on and on about it.

Upon completion, your friends will be excited to know that you finished it successfully. Remember the myth that it is super hard and reserved for some people? Sure, they will be thrilled with the news. But don’t think they are really interested in the bits and bytes of it. And it is not because they don’t like you or have no interest in your stuff. It is just because they don’t understand it, and it is really hard to relate to something you have no idea about. They are (like everybody else) afraid to look stupid.

Isn’t it sad? You research and successfully develop something that could be used by society and has the potential to put your name in the history of the field... and nobody cares?! You can’t share it or brag about it :)?! Come on!! So sad!

It is indeed a victory in solitude. Be glad you made it. The hours and the sacrifice are totally worth it!! Knowledge is one of the few things you can actually keep, and its value can’t be measured. It is, however, a satisfaction that only you will feel in its fullness. It is OK!

Are you curious to know what I studied? Probably not. Maybe?!

My thesis is called “Image Perceptual Hash applied for Video Copy identification”, and here is the abstract.

With the advent of the Internet, video and image files are widely shared and consumed by users all over the world. Thus, methods to identify these files have emerged as a way to preserve intellectual and commercial rights. Content-based identification, or perceptual hashing, is a technique capable of generating a numeric identifier from the characteristics of an image. With this identifier, it is possible to compare two images and decide whether they are equal, similar or different. The objective of this study is to discuss the application of image perceptual hashing to identify video copies. It proposes the use of known, public methods such as the Average Hash and Difference Hash, which are based on image statistics, as well as pHash and Wavelet Hash, which are based on image frequency.

An identification technique was applied using a Hamming distance similarity threshold and a combination of perceptual hash algorithms for video copy identification. The method was tested by applying several attacks to the candidate video, and the results are presented in detail. It is possible to use perceptual hash algorithms for video copy identification, and combining more than one of them brings benefits: it fills performance gaps and vulnerabilities and eliminates false positives.
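To give a feel for the ideas in the abstract, here is a rough sketch, not the code from the thesis. It assumes the frame has already been converted to a small grayscale grid (implementations commonly shrink to something like 8x8 first). The is_copy rule, which requires every algorithm to agree, and the threshold value are my assumptions for illustration. pHash and Wavelet Hash are left out because they need a frequency transform.

```python
def average_hash(gray):
    """Average Hash: one bit per pixel, set when the pixel is above the mean."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def difference_hash(gray):
    """Difference Hash: one bit per horizontal pair, set when brightness rises."""
    return [1 if row[x + 1] > row[x] else 0
            for row in gray for x in range(len(row) - 1)]

def hamming(a, b):
    """Number of bit positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def is_copy(reference, candidate, threshold=1):
    # Assumed combination rule: call it a copy only if EVERY algorithm
    # stays within the Hamming threshold (hypothetical value).
    for algorithm in (average_hash, difference_hash):
        if hamming(algorithm(reference), algorithm(candidate)) > threshold:
            return False
    return True

# Tiny 2x3 "frames" so the bits can be checked by hand.
reference = [[10, 20, 30], [90, 80, 70]]
brighter = [[p + 5 for p in row] for row in reference]  # brightness attack
unrelated = [[90, 10, 90], [10, 90, 10]]

print(average_hash(reference))        # [0, 0, 0, 1, 1, 1]
print(difference_hash(reference))     # [1, 1, 0, 0]
print(is_copy(reference, brighter))   # True
print(is_copy(reference, unrelated))  # False
```

A uniform brightness change moves every pixel and the mean together, so both hashes stay identical, while the unrelated frame differs in 4 of the 6 Average Hash bits and is rejected.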

Writing in English – Why?

I work in IT, with computers, with computing. You get the idea, right? To be more specific, I work in information security and in development for task automation. In practice, this means that the literature in my field is rarely in Portuguese.

So, since I read in English, why not write in it too? Because I am not a native speaker? Because the number of grammatical errors might be large? Maybe. However, a colder analysis of the whole thing makes me realize that even writing in Portuguese I would not be free of this risk. In the age of social networks, writing incorrectly has become the standard of communication, because people value “making themselves understood” more than actually writing correctly. In this paradigm, writing correctly actually increases the risk of the speaker not being understood… ( 🙂 ). So why not take the risk?

So the posts on this blog about technical and nerd stuff will be in English. The other posts, about running, biking and other adventures, will remain in Portuguese.