Content-Based Identification Algorithms Combination Framework Against Audiovisual Piracy
Authors: Torres, A.; Demanboro, A. C.
With the advent of the Internet, video and image files are widely shared and consumed by users all over the world. Recent studies indicate that one out of two internet users has engaged in activities classified as illicit. The unauthorized copying, distribution or publishing of digital content without the consent of the rights holder is what is commonly called piracy. Those who profit from digital piracy ignore the intellectual property rights and copyrights of the owners, programmers, distributors and many others who depend on the economic value of these assets. Methods to identify these files, such as content-based identification techniques, also known as perceptual hashing, have emerged to preserve intellectual and commercial rights. With these techniques a unique identifier is generated, making it possible to compare two images or videos and decide whether they are equal, similar or different. This article discusses the application of content-based identification technologies as a method to fight piracy and presents a framework in which perceptual hashing can be used to prevent the publishing and/or distribution of video content. The proposed methodology combines four types of perceptual hash (aHash, dHash, pHash, wHash) to identify illegal videos with more accuracy. The results are encouraging, considering the most common forms of attack.
Index Terms: Content-Based Identification, Piracy, Security, Intellectual Property
I. Introduction
Video consumption worldwide has reached 70% of internet users. The phenomenon is even more impressive when the time spent by the average user watching videos is considered. Using cellphones, smart TV sets, tablets or computers, a user can spend up to 3 hours consuming content in video form. The online video platform Netflix, for example, has consistently reported an annual increase of 35% in the number of hours watched by its users. A different study, published by Cisco, indicates that this year 80% of the total bandwidth used by the internet will be dedicated to the streaming of video content. With the advance of mobile 5G technology, the overall mobile consumption of video is expected to grow by 55% every year until 2022.
With such demand from end users, it is only natural to observe an equally growing interest from the multimedia industry. Hollywood studios and producers such as Disney, HBO and Fox, and even technology giants such as Google, Amazon and Netflix, invest billions of dollars in the production and acquisition of new titles. Although access to broadband technology and the relatively low cost of video streaming platforms make these services affordable to virtually anyone, even the financially less privileged, digital piracy remains a problem. P2P (Peer to Peer), Camcorder, Cyberlockers, Smartbox, Playlists and UGC (User Generated Content) are names of technologies and ways of sharing content. P2P, for example, has caused damage to the audiovisual industry since its inception in the early 2000s. It is estimated that Camcorder and UGC piracy together will cause losses of 52 billion dollars by 2022. This is what motivates the multimedia industry to invest in technologies and alternatives to protect its assets. Google, for example, has invested about 100 million dollars in content protection within the YouTube platform and claims to have paid more than 3 billion dollars to rights holders. Similarly, the Facebook social network, through its proprietary Rights Manager tool, helps holders of copyright and intellectual property to identify their works. Both the Google and Facebook systems use content-based verification tools to determine whether a work is similar or equal to another. The great advantage of perceptual hash systems over watermarking is that no information needs to be embedded in the reference file, which preserves its original characteristics. Consequently, perceptual hash algorithms can be applied in circumstances where files have already been distributed.
It is in this context that a content-based authentication framework is presented, combining features of the algorithms known as aHash, dHash, pHash and wHash to overcome their inherent deficiencies under content-preserving manipulations and achieve a complete identification scheme. The ImageHash library, written in Python, contains the implementations of aHash, pHash, dHash and wHash used in this study.
Given the importance of this topic, Section II defines the video copy identification framework and details the technique used. Section III discusses the results found after applying content-preserving attacks to the candidate video, and Section IV concludes the paper and gives some directions for future research on the topic.
II. Video Content-Based Identification Framework
The elements proposed in this work are similar to what was previously proposed; however, some elements that are useful in the identification of the objects had to be added. The steps of the proposed framework are Video Selection, Metadata Extraction, Frames Extraction, Hash Generation, Search, Comparison and Identification. Hash generation is explored further, and four perceptual hash algorithms are used. Using the Hamming distance, a similarity threshold (st) is defined based on the number of bits of the hash value. The closer the Hamming distance is to 0, the higher the chance that the two files being compared are equal.
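The Hamming distance check described above can be sketched as follows. This is a minimal illustration, assuming each hash value is held as an integer; the function names and the threshold value st = 10 are hypothetical choices made for the example and do not come from the study.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bits that differ between two hash values held as integers."""
    return bin(hash_a ^ hash_b).count("1")


def is_match(hash_a: int, hash_b: int, st: int) -> bool:
    """True when the Hamming distance is at or below the similarity threshold st."""
    return hamming_distance(hash_a, hash_b) <= st


# Two illustrative 64-bit hash values (made up for this example).
reference = 0xF0F0F0F0F0F0F0F0
candidate = 0xF0F0F0F0F0F0F0F3  # differs from the reference in the two lowest bits

print(hamming_distance(reference, candidate))  # 2
print(is_match(reference, candidate, st=10))   # True (st=10 is an assumed threshold)
```

For a 64-bit hash the distance ranges from 0 (identical hash values) to 64 (every bit differs), which is why the threshold is tied to the number of bits of the hash.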
Video Selection – It is presumed that the party interested in video identification possesses the reference file, and its selection should follow the criteria established by the owner. The reference can be a single video file or a video library from which the hashes will be generated for comparison. In this study, a random set of short videos with several resolutions and frame rates was selected. The candidate file, however, can be obtained from several different sources where it is presumed to be reproduced without authorization.
Metadata Extraction – Metadata is important to avoid performing comparisons of the same file. Although most of the metadata about the candidate file is optional, certain information such as source file, location, date, resolution, duration and number of frames may be needed as evidence in potential litigation.
Frames Extraction – All video frames are extracted for hash generation, so the frame rate of the video has a direct impact on the number of resulting hash values.
Hash Generation – For each frame, a set of four hash values is generated, one per algorithm (aHash, dHash, pHash and wHash) as implemented in the ImageHash library, and stored in a database for further processing.
Search, Comparison – There are two groups of hash values to be analyzed and compared: the first is the reference group, representing the video one wishes to locate, and the second represents the videos that are potential copies. This means that for each reference file, all the elements of the candidate base must be searched and compared. The Hamming distance that is checked against st is calculated directly at the database level, using the XOR operator to compare each reference hash value with the candidates.
Identification – Based on the similarity threshold, identification is achieved by counting the number of hash values whose Hamming distance is equal to or below the similarity threshold. To combine the four algorithms, a weighted average is calculated to generate what is here called the Similarity Coefficient (SC).
After comparing the candidate file against the reference database, the SC of each file is used to determine which one is a potential copy.
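The text does not give the exact formula for the SC, so the sketch below is only one possible reading of the Search, Comparison and Identification steps, not the authors' implementation. It assumes that, for each algorithm, the count of hash values within st is normalized to a fraction of the candidate frames, and that the SC is the weighted average of those four fractions. The function names, the equal weights, st = 1 and the toy 8-bit hash values are all hypothetical.

```python
from typing import Dict, List

ALGORITHMS = ("ahash", "dhash", "phash", "whash")
FrameHashes = Dict[str, List[int]]  # algorithm name -> one hash value per frame


def hamming_distance(a: int, b: int) -> int:
    return bin(a ^ b).count("1")


def match_ratio(reference: List[int], candidate: List[int], st: int) -> float:
    """Fraction of candidate hashes within st of any reference hash (brute force)."""
    if not candidate:
        return 0.0
    hits = sum(
        1 for c in candidate
        if any(hamming_distance(r, c) <= st for r in reference)
    )
    return hits / len(candidate)


def similarity_coefficient(reference: FrameHashes, candidate: FrameHashes,
                           weights: Dict[str, float], st: int) -> float:
    """Weighted average of the per-algorithm match ratios (assumed SC formula)."""
    total = sum(weights[name] for name in ALGORITHMS)
    weighted = sum(
        weights[name] * match_ratio(reference[name], candidate[name], st)
        for name in ALGORITHMS
    )
    return weighted / total


def identify(candidate: FrameHashes, reference_db: Dict[str, FrameHashes],
             weights: Dict[str, float], st: int) -> str:
    """Return the reference video with the highest SC for this candidate."""
    scores = {
        video_id: similarity_coefficient(hashes, candidate, weights, st)
        for video_id, hashes in reference_db.items()
    }
    return max(scores, key=scores.get)


# Toy data: two frames per video, 8-bit hash values instead of real 64-bit ones.
reference_db = {
    "video_a": {"ahash": [0x00, 0xFF], "dhash": [0x0F, 0xF0],
                "phash": [0x33, 0xCC], "whash": [0x55, 0xAA]},
    "video_b": {name: [0x3C, 0xC3] for name in ALGORITHMS},
}
candidate = {"ahash": [0x01, 0xFF], "dhash": [0x0F, 0x3C],
             "phash": [0x33, 0xCC], "whash": [0x55, 0xAB]}
weights = {name: 1.0 for name in ALGORITHMS}  # equal weights, an assumption

for video_id, hashes in reference_db.items():
    print(video_id, similarity_coefficient(hashes, candidate, weights, st=1))
# video_a 0.875
# video_b 0.125
print(identify(candidate, reference_db, weights, st=1))  # video_a
```

Raising the weight of an algorithm that tolerates a given attack well would increase its influence on the SC, which is how the weighting can compensate for an algorithm that fails on its own.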
III. Results and Discussion
To execute the test, an unaltered version of the candidate video file was searched against the reference database. The process was then repeated with the file subjected to content-preserving manipulations such as resolution change, cropping and multiple degrees of rotation.
The resolution was changed from 1280×544 to 640×480, and identification succeeded with a change of 0.41% in the number of records below the similarity threshold. Cropping reduced the area of the video by 30% and caused a dramatic change in the number of records found below the threshold, which was 75% lower, yet it was still possible to identify the correct video. In the cropping test, identification would not have been possible if dHash or pHash had been used alone.
The method was also effective when the candidate video was rotated 10 degrees clockwise. It was not possible to identify the correct video when the rotation was 90 or 180 degrees. The algorithms included in the ImageHash library use, among other things, each pixel and its position (x, y) to generate the hash value, which means that they are not resistant to strong spatial alterations. The results are shown in Table 1, where it is possible to observe that for the 90 and 180 degree rotations a different video has a higher FSC than the actual candidate file.
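The sensitivity to pixel position can be illustrated with a simplified average hash written in plain Python. This is a sketch in the spirit of aHash, not the ImageHash implementation: it assumes the frame has already been reduced to an 8×8 grayscale grid, and the synthetic frame (dark left half, bright right half) and all function names are made up for the example.

```python
from typing import List

Grid = List[List[int]]  # 8x8 grayscale values, 0-255


def average_hash(grid: Grid) -> int:
    """Simplified aHash: one bit per pixel, set when the pixel is above the mean."""
    pixels = [p for row in grid for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:  # the bit position depends on the pixel position
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def rotate_90_clockwise(grid: Grid) -> Grid:
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def hamming_distance(a: int, b: int) -> int:
    return bin(a ^ b).count("1")


# Synthetic frame: dark left half, bright right half.
frame = [[0] * 4 + [255] * 4 for _ in range(8)]
rotated_90 = rotate_90_clockwise(frame)
rotated_180 = rotate_90_clockwise(rotated_90)

original_hash = average_hash(frame)
print(hamming_distance(original_hash, average_hash(rotated_90)))   # 32
print(hamming_distance(original_hash, average_hash(rotated_180)))  # 64
```

On this synthetic frame a 90 degree rotation changes half of the 64 bits and a 180 degree rotation changes all of them, so any small similarity threshold is exceeded even though the content is the same.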
The size of the reference video database limits the verification of false positives and false negatives, since getting something close to reality is impractical without resorting to a larger database, which is precisely what the method seeks to avoid. In addition, the tests were performed with short videos (around 60 seconds), which drastically reduces the number of frames to be analyzed. The similarity threshold can be altered based on the weights given to each algorithm and its behavior in the face of the attacks to which it was submitted. This means that adjusting the weights produces different results. Considerations regarding the minimum number of frames required for a positive identification were outside the scope of this article; however, it is possible to observe that dark frames and long takes have a direct impact on the hash values and consequently on the similarity calculation. Linear or brute-force search and comparison generates a perhaps unnecessary number of values that could otherwise be discarded. Keyframes could be used instead, eliminating the frames that cause noise and potential false positives.
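One simple way to discard redundant values from long takes, offered only as an illustration and not as part of the study, is to keep a frame hash only when it differs enough from the last hash that was kept. The function name `drop_repeated_frames` and the `min_distance` value are hypothetical, and real keyframe detection would likely need more than this.

```python
from typing import List


def hamming_distance(a: int, b: int) -> int:
    return bin(a ^ b).count("1")


def drop_repeated_frames(frame_hashes: List[int], min_distance: int) -> List[int]:
    """Keep a hash only if it is at least min_distance bits away from the last kept one."""
    kept: List[int] = []
    for h in frame_hashes:
        if not kept or hamming_distance(kept[-1], h) >= min_distance:
            kept.append(h)
    return kept


# Toy 8-bit hashes: three near-identical frames, a cut, two near-identical frames, a cut.
hashes = [0x00, 0x01, 0x00, 0xFF, 0xFE, 0x0F]
reduced = drop_repeated_frames(hashes, min_distance=4)
print([hex(h) for h in reduced])  # ['0x0', '0xff', '0xf']
```

Fewer stored hashes mean fewer comparisons in the brute-force search, at the cost of choosing one more parameter.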
IV. Conclusion
This study demonstrated the possibility of video copy identification using image perceptual hash algorithms and presented the results obtained. Existing open-source tools such as the OpenCV library, MySQL, Python and ImageHash were used. An identification framework was proposed that uses a similarity threshold and a similarity coefficient, combining image perceptual hash algorithms to find the video with the highest probability of being a copy. It is concluded that image perceptual hash algorithms can be used for video copy identification and that combining more than one of them fills performance gaps and vulnerabilities. The method requires memory, processing power and advanced programming techniques to become feasible for everyday scenarios. The proposed technique and framework are therefore candidates for use as part of identification tools, for incorporation into image and video fraud detection systems, or for use in civil and legal cases. It must be stressed that fighting piracy plays a social role in preserving individual and collective rights, return on investment, the maintenance of creative and production value, and even the sustainability of jobs. Future work could include the use of machine learning or artificial intelligence to identify key and repeated frames. Parallel computing could be used to increase performance in the calculation of the results. One can also explore the addition of other attack-resistant perceptual hash algorithms to add robustness to the system. The number and characteristics of the attacks, as well as the weights assigned for the generation of the similarity coefficient, could be altered to ensure coverage of as many vulnerabilities as possible and greater efficiency.
References
 SRUOGINIS, K.; WARREN, J. Live Video Streaming-A Global Perspective. [S.l.], 2018.
 MCALONE, N. People became even more addicted to Netflix in 2015, according to Goldman Sachs. 2016.
 CISCO PUBLIC. Cisco Visual Networking Index: Forecast and Trends, 2017–2022. [S.l.], 2018.
 EJDLING, F. Ericsson Mobility Report – November 2018. [S.l.], 2018.
 G1.GLOBO.COM. Brasil perde R$ 130 bilhões por ano com pirataria, contrabando e comércio ilegal, aponta estudo. Brasilia: [s.n.], 2017.
 MITTAL, R. P2P Networks: Online Piracy of Music, Films and Computer Software. Journal of Intellectual Property Rights, v. 9, p. 440 – 461, Sep 2004.
 HAJJ-AHMAD, A. et al. Flicker Forensics for Camcorder Piracy. IEEE Transactions on Information Forensics and Security, IEEE, v. 12, n. 1, p. 89 – 100, Jan 2017. ISSN 1556-6013.
 MOENS, M.; LI, J. Mining User Generated Content and Its Applications. In: Mining User Generated Content. [s.n.], 2014. p. 3 – 17.
 MILOJICIC, D. S. et al. Peer-to-Peer Computing. HP Laboratories Palo Alto, Hewlett-Packard Company, Palo Alto, Mar 2002.
 DTVE. Piracy to cost TV and film industry US$52bn by 2022. 2017.
 GOOGLE. How Google Fights Piracy. 2018.
 FACEBOOK. Primeiros passos com o Rights Manager. 2018.
 HADMI, A. et al. Perceptual Image Hashing. Watermarking, Dr. Mithun Das Gupta (Ed.), v. 2, 2012. ISBN 978-953-51-0619-7.
 YANG, B.; GU, F.; NIU, X. Block mean value based image perceptual hashing. In: . [S.l.: s.n.], 2006.
 DRMIC, A. et al. Evaluating robustness of perceptual image hashing algorithms. In: . [S.l.: s.n.], 2017.
 ZAUNER, C. Implementation and Benchmarking of Perceptual Image Hash Functions. 2010.
 MONGA, V.; EVANS, B. L. Robust perceptual image hashing using feature points. In: . [S.l.: s.n.], 2004.
 HAN, S.; CHU, C. Content-based image authentication: current status, issues, and challenges. Int. J. Inf. Sec., v. 9, n. 1, p. 19 – 32, 2010.
 BUSCHNER, J. ImageHash. 2016. – https://pypi.org/project/ImageHash/
 WENG, L.; PRENEEL, B. Attacking Some Perceptual Image Hash Algorithms. 2007 IEEE International Conference on Multimedia and Expo, IEEE, Beijing, China, Aug 2007.
 WENG, L.; PRENEEL, B. A Secure Perceptual Hash Algorithm for Image Content Authentication. CMS'11 Proceedings of the 12th IFIP TC 6/TC 11 international conference on Communications and multimedia security, p. 108 – 121, October 2011.