11758387 (P.T.A.B. Mar. 30, 2017)

Ex Parte Rodgers

Patent Trial and Appeal Board, Mar. 30, 2017

UNITED STATES PATENT AND TRADEMARK OFFICE
UNITED STATES DEPARTMENT OF COMMERCE
Address: COMMISSIONER FOR PATENTS, P.O. Box 1450, Alexandria, Virginia 22313-1450, www.uspto.gov

APPLICATION NO.: 11/758,387
FILING DATE: 06/05/2007
FIRST NAMED INVENTOR: Stephane Rodgers
ATTORNEY DOCKET NO.: 3875.1480001
CONFIRMATION NO.: 4354

49579 7590 04/03/2017
STERNE, KESSLER, GOLDSTEIN & FOX P.L.L.C.
1100 NEW YORK AVENUE, N.W.
WASHINGTON, DC 20005

EXAMINER: BEHESHTI SHIRAZI, SAYED ARESH
ART UNIT: 2435
NOTIFICATION DATE: 04/03/2017
DELIVERY MODE: ELECTRONIC

Please find below and/or attached an Office communication concerning this application or proceeding. The time period for reply, if any, is set in the attached communication. Notice of the Office communication was sent electronically on above-indicated "Notification Date" to the following e-mail address(es): e-office@skgf.com

PTOL-90A (Rev. 04/07)

UNITED STATES PATENT AND TRADEMARK OFFICE

BEFORE THE PATENT TRIAL AND APPEAL BOARD

Ex parte STEPHANE RODGERS

Appeal 2015-005366
Application 11/758,387¹
Technology Center 2400

Before JASON V. MORGAN, MICHAEL J. STRAUSS, and SCOTT B. HOWARD, Administrative Patent Judges.

MORGAN, Administrative Patent Judge.

DECISION ON APPEAL

Introduction

This is an appeal under 35 U.S.C. § 134(a) from the Examiner’s Final Rejection of claims 1–34. We have jurisdiction under 35 U.S.C. § 6(b).

We REVERSE and enter NEW GROUNDS OF REJECTION.

Invention

Appellant discloses the use of unique version identifiers to detect when a prior version of code is copied over a subsequent version of code. Abstract.

¹ Appellant identifies Broadcom Corporation as the real party in interest. App. Br. 3.

Exemplary Claims

Claims 1, 3, and 4, reproduced below with key limitations emphasized, are exemplary:

1. A method for securing executable code in a system, the method comprising: in a reprogrammable security system which previously utilized a prior version of executable code: receiving a subsequent version of executable code, wherein said subsequent version of executable code comprises a corresponding unique code version identifier embedded therein; detecting in said reprogrammable security system, instances when said prior version of executable code is copied over at least a portion of said subsequent version of executable code, wherein said detecting is based on said corresponding unique code version identifier; and controlling operations of said reprogrammable security system based on said detection.

3. The method according to claim 2, wherein said corresponding unique code version identifier for each of said prior version of executable code and said subsequent version of executable code is embedded in a plurality of locations therein.

4. The method according to claim 3, wherein said corresponding unique code version identifier for each of said prior version of executable code and said subsequent version of executable code is encrypted within corresponding ones of said prior version of executable code and said subsequent version of executable code.

Rejections

The Examiner rejects claims² 1–3, 5–7, 18–20, and 22–24 under 35 U.S.C. § 103(a) as being unpatentable over Hars (US 2006/0005046 A1; publ. Jan. 5, 2006) and England (US 2004/0003244 A1; publ. Jan. 1, 2004). Final Act. 3–8.

The Examiner rejects claims 4, 8–17, 21, and 25–34 under 35 U.S.C. § 103(a) as being unpatentable over Hars, England, and Drehmel (US 2006/0015754 A1; publ. Jan. 19, 2006). Final Act. 8–12.

ANALYSIS

Issue: Did the Examiner err in finding the combination of Hars and England teaches or suggests a “subsequent version of executable code comprises a corresponding unique code version identifier embedded therein,” as recited in claim 1?
In rejecting claim 1, the Examiner finds that Hars, by providing firmware version information in a firmware update file header or auxiliary data, teaches or suggests a subsequent version of executable code comprises a corresponding unique code version identifier embedded therein. Final Act. 3–4 (citing Hars Figs. 1, 2, 8–20). The Examiner interprets the claimed executable code to encompass a file containing both code and auxiliary data (e.g., a header), such as the Hars firmware update file. Ans. 4.

Appellant contends the Examiner erred because, in contrast with the claimed invention, “Hars intentionally maintains the auxiliary data separate from the firmware code, i.e., not embedded within.” App. Br. 10. Appellant argues that the auxiliary data in Hars merely describes the firmware code contained in the firmware update file, but that this auxiliary data is not executable code itself. Reply Br. 3.

² Claims 7 and 23 are not listed in the statement of the rejection (Final Act. 3), but are addressed in the body of the rejection (id. at 6, 8).

We find Appellant’s argument persuasive. The Examiner’s interpretation of executable code as encompassing an entire file that has both code (which is executable) and auxiliary data (which describes the code, but has not been shown by the Examiner to be executable) is unreasonable because, inter alia, such an interpretation is inconsistent with Appellant’s Specification. Spec. ¶¶ 24–26. Thus, the Examiner’s basis for finding Hars teaches or suggests the claimed embedding of a unique code version identifier is erroneous.

The Examiner does not rely on England with respect to this recitation. Therefore, the Examiner’s findings do not show that the combination of Hars and England teaches or suggests subsequent version of executable code comprises a corresponding unique code version identifier embedded therein.

Accordingly, we do not sustain the Examiner’s 35 U.S.C. § 103(a) rejection of claim 1, and claims 2, 3, 5–7, 18–20, and 22–24, which contain similar recitations and are similarly rejected. The Examiner does not show that Drehmel cures the noted deficiency. Therefore, we also do not sustain the Examiner’s 35 U.S.C. § 103(a) rejection of claims 4, 8–17, 21, and 25–34.

NEW GROUNDS OF REJECTION

Claims 1, 2, 7, 18, 19, and 24

Although the Examiner’s findings do not show that Hars, England, and Drehmel teach or suggest a subsequent version of executable code comprises a corresponding unique code version identifier embedded therein, it was known in the art that identification information could be embedded in code. One technique known at the time of the invention is detailed in Smita Thaker, Software Watermarking via Assembly Code Transformations, Master’s Thesis, San Jose State Univ., Dept. of Comp. Sci., 1–69 (May 2004) (available at http://www.cs.sjsu.edu/faculty/stamp/students/cs298ReportSmita.pdf) (“Thaker”) (excerpts attached as an Appendix).

Thaker describes the use of “software watermarking, i.e., embedding some special pieces of code in software so as to uniquely identify software.” Thaker 8. Although Thaker considers embedding, e.g., a “customer-id in the form of a watermark into the software” being sold (id. at 9), Thaker makes clear that the “special pieces of code (transformations) serve as a watermark and can carry any special information” (id. at 8; emphasis added).

It would have been obvious to an artisan of ordinary skill to embed the unique code version taught or suggested by Hars’ firmware version into the firmware code of Hars (i.e., into executable code using the software watermarking technique of Thaker) because doing so would improve security by making tampering with the executable code more difficult. Id. at 10–11.
Except as detailed above, we adopt the Examiner’s findings with respect to Hars and England as they pertain to the rejection of claims 1, 2, 7, 18, 19, and 24 and incorporate our additional findings as detailed above. Accordingly, we newly reject these claims under 35 U.S.C. § 103(a) as being unpatentable over Hars, England, and Thaker.

Claims 3, 5, 6, 20, 22, and 23

With respect to the Examiner’s rejection of claim 3, Appellant argues the combination of Hars and England does not teach wherein said corresponding unique code version identifier for each of said prior version of executable code and said subsequent version of executable code is embedded in a plurality of locations therein because “England does not disclose appending more than one copy of a secure counter value to a data file.” App. Br. 12. However, Thaker explicitly teaches “inserting the watermark multiple times throughout the code.” Thaker 49; see also id. at 52. Thus, Thaker would have cured the alleged deficiency. Therefore, we disregard the Examiner’s disputed finding regarding England as it pertains to the rejection of claim 3, but note that Appellant’s argument with respect to claim 3 is moot because of the pertinent teachings and suggestions of Thaker.

Except as detailed above, we adopt the Examiner’s findings with respect to Hars and England as they pertain to the rejection of claims 3, 5, 6, 20, 22, and 23 and incorporate our additional findings as detailed above. Accordingly, we newly reject these claims under 35 U.S.C. § 103(a) as being unpatentable over Hars, England, and Thaker.
Claims 4 and 21

With respect to the Examiner’s rejection of claim 4, Appellant argues the combination of Hars, England, and Drehmel does not teach wherein said corresponding unique code version identifier for each of said prior version of executable code and said subsequent version of executable code is encrypted within corresponding ones of said prior version of executable code and said subsequent version of executable code because “Hars explicitly discloses that the ‘auxiliary data’ (which ‘can include ... the version number of the new firmware’ (Hars, ¶ 8)) i[s] not encrypted (id., ¶ 13).” App. Br. 13–14. In particular, Appellant contends that “the Examiner has provided no explanation as to why/how a skilled artisan would circumvent the express teachings of Hars and encrypt the version identifier.” Id. at 14.

Appellant’s argument is unpersuasive because, as Appellant acknowledges, “Hars explicitly states that the firmware code is encrypted.” App. Br. 10 (citing, e.g., Hars ¶ 8). Thus, in modifying Hars with the teachings and suggestions of Thaker (such that the encrypted executable code of Hars has unique code version identifier information embedded therein), the unique code version identifier information would also be encrypted (as being embedded within the executable code that is encrypted).

Appellant further argues that the Examiner erred in modifying the combination of Hars and England using the teachings and suggestions of Drehmel because the Examiner merely “provides a perfunctory cite to ¶ 38 of Drehmel as [the] alleged rationale, which is insufficient to demonstrate obviousness.” Reply Br. 6; see also App. Br. 14. However, Appellant’s argument is unresponsive to the Examiner’s conclusion, supported by Drehmel’s teachings, that it would have been obvious to modify the combination of Hars and England using the teachings and suggestions of Drehmel “in order to prevent unauthorized rollback of versions.” Final Act. 9 (citing Drehmel ¶ 38); Ans. 5. Therefore, Appellant’s argument is unpersuasive.

Except as detailed above, we adopt the Examiner’s findings and conclusions with respect to Hars, England, and Drehmel as they pertain to the rejection of claims 4 and 21 and incorporate our additional findings as detailed above. Accordingly, we newly reject these claims under 35 U.S.C. § 103(a) as being unpatentable over Hars, England, Drehmel, and Thaker.

Claims 8–17 and 25–34

Except as detailed above, we adopt the Examiner’s findings and conclusions with respect to Hars, England, and Drehmel as they pertain to the rejection of claims 8–17 and 25–34 and incorporate our additional findings as detailed above. Accordingly, we newly reject these claims under 35 U.S.C. § 103(a) as being unpatentable over Hars, England, Drehmel, and Thaker.

DECISION

We reverse the Examiner’s decision rejecting claims 1–34.

We newly reject claims 1–3, 5–7, 18–20, and 22–24 under 35 U.S.C. § 103(a) as being unpatentable over Hars, England, and Thaker.

We newly reject claims 4, 8–17, 21, and 25–34 under 35 U.S.C. § 103(a) as being unpatentable over Hars, England, Drehmel, and Thaker.

This Decision contains a new ground of rejection pursuant to 37 C.F.R. § 41.50(b). Section 41.50(b) provides “[a] new ground of rejection pursuant to this paragraph shall not be considered final for judicial review.” Section 41.50(b) also provides:

When the Board enters such a non-final decision, the appellant, within two months from the date of the decision, must exercise one of the following two options with respect to the new ground of rejection to avoid termination of the appeal as to the rejected claims:

(1) Reopen prosecution. Submit an appropriate amendment of the claims so rejected or new Evidence relating to the claims so rejected, or both, and have the matter reconsidered by the examiner, in which event the prosecution will be remanded to the examiner.
The new ground of rejection is binding upon the examiner unless an amendment or new Evidence not previously of Record is made which, in the opinion of the examiner, overcomes the new ground of rejection designated in the decision. Should the examiner reject the claims, appellant may again appeal to the Board pursuant to this subpart.

(2) Request rehearing. Request that the proceeding be reheard under § 41.52 by the Board upon the same Record. The request for rehearing must address any new ground of rejection and state with particularity the points believed to have been misapprehended or overlooked in entering the new ground of rejection and also state all other grounds upon which rehearing is sought.

Further guidance on responding to a new ground of rejection can be found in the Manual of Patent Examining Procedure § 1214.01 (9th Ed., Rev. 07.2015, Nov. 2015).

No time period for taking any subsequent action in connection with this appeal may be extended under 37 C.F.R. § 1.136(a). See 37 C.F.R. § 41.50(f).

REVERSED
37 C.F.R. § 41.50(b)

APPENDIX

Smita Thaker, Software Watermarking via Assembly Code Transformations, Master’s Thesis, San Jose State Univ., Dept. of Comp. Sci., 1–11, 49–54 (May 2004).

Notice of References Cited
Application/Control No.: 11/758,387
Appeal No.: 2015-005366
Art Unit: 2435
Page 1 of 1

U.S. PATENT DOCUMENTS: (none listed)
FOREIGN PATENT DOCUMENTS: (none listed)
NON-PATENT DOCUMENTS (Include as applicable: Author, Title Date, Publisher, Edition or Volume, Pertinent Pages):
U: Smita Thaker, Software Watermarking via Assembly Code Transformations, Master’s Thesis, San Jose State Univ., Dept. of Comp. Sci., 1–11, 49–54 (May 2004).

*A copy of this reference is not being furnished with this Office action. (See MPEP § 707.05(a).) Dates in MM-YYYY format are publication dates. Classifications may be US or foreign.

U.S. Patent and Trademark Office PTO-892 (Rev. 01-2001) Notice of References Cited Part of Paper No.

SOFTWARE WATERMARKING VIA ASSEMBLY CODE TRANSFORMATIONS

A Thesis Presented to The Faculty of the Department of Computer Science, San Jose State University

In Partial Fulfillment of the Requirements for the Degree Masters of Science

by Smita Thaker
May 2004

[Page 1 of 69]

©2004 Smita Thaker
ALL RIGHTS RESERVED

[Page 2 of 69]

APPROVED FOR THE DEPARTMENT OF COMPUTER SCIENCE
Dr. Mark Stamp
Dr. Chris Pollett
Dr. Dave Blockus
APPROVED FOR THE UNIVERSITY

[Page 3 of 69]

ABSTRACT

SOFTWARE WATERMARKING VIA ASSEMBLY CODE TRANSFORMATIONS
By Smita Thaker

By some accounts, piracy costs the software industry in excess of $10 billion annually [10]. One possible defense against piracy is to add a watermark to software. While this will not prevent all piracy, a robust watermark would make it possible to determine the source of pirated software, and hence would likely discourage a significant amount of piracy.

To date, it has been difficult to design digital watermarking schemes that can function well in a hostile environment. Most watermarking techniques are easily removed or distorted beyond recognition. For example, “stirmark” will effectively remove a watermark from a digital image; see [12] for a discussion.

This project presents a watermarking scheme based on modifications to software at the level of assembly code. These modifications result in multiple versions of a given piece of software, with each version having a different code sequence, but with all versions being functionally equivalent.

Our watermarking approach was inspired by metamorphic computer viruses. Metamorphic viruses are transformed into distinct code each time they replicate.
This makes detection far more difficult, since there is no constant signature for virus scanning software to detect. In the context of watermarking, metamorphism allows the watermark to be uniquely embedded in each instance of the software. While it is still possible to [Page 4 of 69] remove (or obscure) a watermark in a given instance of the software, the variability of the code makes it extremely difficult to automatically remove watermarks from a significant number of instances of the code. Consequently, watermark removal will remain a manual and labor-intensive process.

Metamorphic code transformations have also been discussed as a method for increasing software diversity. In an analogy to the genetic diversity of biological systems, it is argued that increased software diversity can limit the effects of malicious attacks on computing systems [12]. A side benefit of our approach to software watermarking is that it results in diverse (or metamorphic) software, and hence may increase the resistance of such software to many types of attacks.

[Page 5 of 69]

TABLE OF CONTENTS

1 Introduction  7
2 Requirements  9
3 Potential Applications:  11
4 Assembly language programming - Background information  13
4.1 Inside the CPU  13
4.1.1 General Purpose Registers  14
4.1.2 Segment Registers  15
4.1.3 Special Purpose Registers  16
4.2 8086 Instruction Set:  16
5 Design  18
5.1 Part A: Inserting the watermark into the code  18
5.2 Part B: Read Watermark  21
6 Implementation  23
6.1 Setting up the environment / Tools used  23
6.2 Implementation Details  23
6.2.1 Inserting the watermark in the code  25
6.2.2 Reading the watermark from the executable:  35
7 Code Transformations  39
7.1 Inserting Jump statements - JMPs  39
7.2 Adding redundant labels:  40
7.3 Using arithmetic and logical instructions for transformation  41
7.4 Inserting NOPs  43
7.5 Using PUSH and POP instructions  43
8 Challenges  45
9 Deployment  46
10 Summary  49
11 Directory Structure  50
12 Conclusion  52
13 References:  53
14 Appendices  54
14.1 runproj.cmd  54
14.2 properties.txt  55
14.3 keyTranx.txt  56
14.4 keyPattern.txt  57
14.5 CreateTranxO.java  58
14.6 ReadWM.java  59
14.7 WriteWM.java  63

[Page 6 of 69]

1 Introduction

Software security is constantly compromised by attacks from hackers.
Typically, hackers spend time and resources to tamper with the code to discover vulnerabilities then exploit these to carry out an attack. Such attacks, when conducted in a distributed fashion across a network, severely disrupt the functioning of corporations, online service providers, government agencies, etc.

One possible defense against hackers is to have multiple copies of software which are functionally equivalent but where each copy is distinct (Software Diversity). In this case, an attack is likely to be effective against only one instance (or a small fraction of instances) of the software. Some tools and techniques have been devised for this purpose. Moreover, these tools modify software at the source code level. We believe that transformations at the assembly code level are more robust and secure.

Another use of code transformations is for software watermarking. In this research project, we intend to focus primarily on this particular application of such transformations.

I propose to make the code transformations at the assembly level. This creates unique, but functionally equivalent instances of a given piece of software. The aim of this project is to effectively insert a watermark in assembly code software and retrieve the watermark from the executable. This watermark carries customer-specific information. We will be working on assembly language 80x86 as it is widely used by several of the current processors.

The project has two principal goals.

[Page 7 of 69]

1. To produce multiple copies of a given piece of software, making an attacker's job more difficult and thereby reducing the potential damage in the event of an attack.

2. To provide anti-piracy protection by embedding a customer's information in each instance of the software. If pirated versions of the software are found in the market, we can retrieve the watermark from its code and determine which legitimate customer originally received this copy of the software.
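The second goal above can be pictured with a minimal Python sketch. It is an illustration only and is not the thesis's actual scheme (the thesis's own tools, WriteWM.java and ReadWM.java, are not reproduced in this excerpt). The sketch assumes a hypothetical encoding in which each bit of a customer-id is written as one of two functionally neutral instruction sequences (a NOP, or a PUSH/POP pair on the same register), and it assumes the original program contains neither sequence. The names `BIT_SNIPPETS`, `embed`, and `extract` are invented for this example.

```python
# Hypothetical encoding (an assumption, not taken from the thesis):
# each bit becomes an instruction sequence that leaves registers unchanged.
BIT_SNIPPETS = {
    "0": ["nop"],
    "1": ["push %eax", "pop %eax"],
}


def embed(asm, customer_id, width=8):
    """Return a copy of asm with one neutral snippet per customer-id bit,
    placed in front of each of the first `width` instructions."""
    if customer_id < 0 or customer_id >= 2 ** width:
        raise ValueError("customer_id does not fit in the given width")
    if len(asm) < width:
        raise ValueError("program too short to carry the watermark")
    bits = format(customer_id, f"0{width}b")
    marked = []
    for index, line in enumerate(asm):
        if index < width:
            marked.extend(BIT_SNIPPETS[bits[index]])
        marked.append(line)
    return marked


def extract(asm, width=8):
    """Scan asm for the neutral snippets and rebuild the customer-id."""
    bits = []
    i = 0
    while i < len(asm) and len(bits) < width:
        if asm[i] == "nop":
            bits.append("0")
            i += 1
        elif asm[i:i + 2] == BIT_SNIPPETS["1"]:
            bits.append("1")
            i += 2
        else:
            i += 1
    if len(bits) != width:
        raise ValueError("watermark incomplete")
    return int("".join(bits), 2)


# Usage: mark a four-instruction program with customer-id 10 (binary 1010).
program = ["movl $1, %eax", "addl $2, %eax", "movl %eax, %ebx", "ret"]
marked = embed(program, 10, width=4)
print(extract(marked, width=4))  # prints 10
```

In this toy encoding the watermark positions are fixed and trivially visible; the thesis's stated requirements (robustness against an optimizing compiler, obscurity through dummy transformations) are exactly what such a naive layout would fail to meet.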
As a side benefit, the reverse-engineering problem is made more difficult and this in turn makes the software more secure. We must make it difficult for an attacker to identify the actual location of the watermark in the software. And even if an attacker does succeed in identifying the watermark, tampering with it should be a challenge, since he has to work at a very low level. In short, modifying the watermark while keeping the resulting code functioning correctly is a challenging task.

Definition

Watermarking is a technique of embedding some special mark in an object to use it as identification. This project addresses software watermarking, i.e., embedding some special pieces of code in software so as to uniquely identify software. These special pieces of code (transformations) serve as a watermark and can carry any special information we wish to embed.

[Page 8 of 69]

2 Requirements

The project has the following goals.

1. Watermark - To hide certain customer-specific information within each copy of software. To be able to successfully read the watermark back to find out at any point of time, who the actual owner of the software copy is. If we always embed, say, the customer-id in the form of a watermark into the software we are selling, we will be able to track down the owner of any software instance. The purpose of this feature is to discourage software pirates and reduce the industry losses.

2. Software Diversity - To produce multiple diverse copies of a given piece of software making it harder for an attacker to attack a particular software weakness and exploit it. Hackers generally study software looking for a bug or weakness and thereafter use this weakness to carry out malicious attacks. When searching for potential victims, attackers generally tend to look for software with a known weakness. Now if we can introduce software diversity, then although all our software instances will be functionally the same, they will differ in their lines of code.
So, the hackers’ pattern matching criteria will work only for one instance (or a very few limited instances). His line of attack will fail against the majority of software instances, providing a higher level of protection and a major reduction in damages.

[Page 9 of 69]

3. Robustness - To make the watermark robust enough to survive an optimizing compiler. An optimizing compiler tends to remove pieces of code it determines are unnecessary. Since the watermarked lines of code have been added into the actual software code, there is a possibility that the compiler will filter out the watermark on compilation. Our job is to make sure that the transformations we select to carry our watermark can survive this optimization.

4. Reliability - To ensure that the project application returns the correct watermark each time. In spite of the loss of watermarking information by the optimizer, we should always be able to obtain correct and accurate results from the project. If the results obtained are incorrect we may end up incriminating a totally innocent customer to whom the returned customer-id belongs. On the other hand, if the project returns partial customer-id results (say some bits of the customer-id are missing) we will be unable to decide if the optimizer is to blame or it’s some smart reverse engineering job. Even slight loss of reliability can jeopardize the utility of the software. Hence, the project application needs to be virtually 100% reliable.

5. Obscurity - To introduce obscurity in the resulting files so as to make the job of reverse-engineering more challenging. Reverse engineering is a process of breaking down software to understand how it works so as to copy it, tamper with it or maybe enhance it. In an application with software diversity, the reverse engineering process involves [Page 10 of 69] comparing multiple instances of the software and trying to determine where and how the watermark has been inserted.
The introduction of real and dummy watermark transformations into the code should be done so that even after comparing several different instances of a software, a hacker cannot extract any information regarding the watermark location or the watermark application techniques.

6. Security and Tamper Proofing - Tampering with the watermark should be a challenging job. Also, in the event of code tampering, the application should be able to detect it. Software tampering is a by-product of successful reverse-engineering. With respect to this project, the most obvious task a hacker might carry out on successful reverse-engineering is to tamper with our inserted watermark. If he can figure out the embedded lines of code in the software, he can determine the location of some of these watermark transformations. His next interest would be to tamper with this watermarking by changing it or removing it, although it will be a manual and labor-intensive process. Our aim is to make the application as secure and tamper-resistant as possible. Moreover, if in spite of all this protection someone still manages to tamper with the watermark, we should have some mechanism to be able to detect that the watermark has been tampered with.

3 Potential Applications:

Some of the potential applications of this application are listed below:

[Page 11 of 69]

10 Summary

We successfully carried out the project goals of inserting a watermark into the assembly code and reading the watermark from the executable. This is how we achieved our results.

1. Watermark - We embedded the customer-id into the software assembly via transformation-insertion schemes.

2. Software Diversity - By embedding a different watermark and a different set of dummy-transformations into appropriate random locations throughout the software code, we are able to produce several copies of software which are functionally equivalent but where each copy is unique.
An attacker who manages to break one particular instance of the software cannot use the same techniques to break the entire system [11]. It is thus able to provide partial protection against reverse-engineering.

3. Robustness - By trying out several different transformations we managed to find out the ones with a high survival rate after being processed by the optimizing compiler. We also figured out the locations where they tend to get lost. So using this knowledge and by inserting the watermark multiple times throughout the code we are able to make the watermark robust.

4. Reliability - The transformations selected for watermarking are highly robust, and are inserted multiple times. Thus, we are able to obtain accurate results every time.

5. Obscurity - A large number of dummy transformations have been inserted along with the watermark throughout the code to make every instance of the software [Page 49 of 69] significantly different from the other and to avoid any sort of pattern formations in the code.

6. Security and Tamper Proofing - In general, it is impossible to disassemble an executable, modify its assembly-code and re-assemble the code and obtain an executable, which can work. So, if someone tampers with the watermark at the assembly level he will have to put in significant amount of time and labor to generate a working executable out of it. Moreover, we also have a backup scheme to retrieve the watermark. So each time we read the watermark we obtain results from both scheme 1 and scheme 2 to see if any of the cust-id bits are lost or tampered with. Any discrepancy in results will immediately indicate tampering. Since both schemes work very differently, it’s nearly impossible to figure out how each one of them works and tamper with the code in such a way that results from both schemes match.

11 Directory Structure

■ runproj.cmd - the command file which comprises all instructions to read/write WM.
More details about runproj.cmd can be found in the appendix section.
■ properties.txt - a text file that stores the values of all crucial project parameters, such as file names and customer-ids. This allows us to change the parameters at runtime.
■ keytranx.txt - this file stores the list of all possible transformations we embed as our watermark.
■ keyPattern.txt - this file stores the patterns that the transformations turn into after passing through the read/write procedures. This file is for our reference only. These patterns have already been included in the read procedure for the pattern search.
■ dummyTranx - this file stores the list of all dummy transformations, which we embed along with our watermark transformations to introduce obscurity and thereby make reverse engineering much more challenging.
■ ReadWM.java - contains the code to read the watermark from the indicated executable file.
■ WriteWM.java - contains the code to write the watermark based on the given customer-id.

We have used the hello program as a sample to explain our implementation. It is very short, which makes it easy to understand. For an actual demonstration, however, we will use a bigger C project: a traffic simulation project whose source files are listed below. It gives us more flexibility and variation when experimenting with the customer-id, its size, and other parameters.

■ Source C files - this list depends on the software you pick to be watermarked. We have used a traffic intersection simulation project as our source.
o Tmain.c - the main file
o Init.o - object file of init.h. File init.h contains code to initialize the system before the start of the simulation.
o Rsrc.o - object file

Generated Files
Tmain.s - the assembly file generated from the source C file.
wmTransform.s - Tmain.s is transformed into wmTransform.s after the watermark is applied.
wmExe.exe - the executable generated for this particular software project.
wmExe.asm - the assembly file generated when the disassembler IDA Pro disassembles wmExe.exe. We read the watermark from this file.

12 Conclusion

The technique of software watermarking discussed in this paper appears to be a viable method for embedding a robust and invisible watermark into software. The technique has several potential benefits. First, the removal of the watermark is difficult, if not impossible, to automate. In general, it is impossible to disassemble an executable, modify its assembly code, and re-assemble the code to obtain an executable that works. Second, the watermark can be made highly robust by inserting the mark multiple times into the code. Third, the technique has a positive security side effect, since it results in diverse software.

Software watermarking is still a developing field, and a lot remains to be done in this area. At times, compiler optimizations remove inserted watermarks. Research is needed on how optimization works and what can be done to make the watermark more robust. Methods to make the watermark more stealthy need to be discovered. The project also needs to be made software-independent, i.e., we need to find out how applications written in languages other than C can use these watermarking schemes. Different assemblers and disassemblers use different assembly syntax, so this project code may not work if we use an assembler other than gcc or a disassembler other than IDA Pro. We need to find some common ground so that the project can be used with the most commonly used assemblers and disassemblers.

The steps involved in reading the watermark are complex and tedious. The most difficult part is to determine how the watermarking transformations change after they are written, the code is assembled, and the executable is disassembled.
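As a rough illustration only, the pattern search in the read step might look like the Java sketch below. The class name, the method name, and both instruction patterns are hypothetical and are not taken from ReadWM.java. The sketch also assumes that each watermark bit survives as one recognizable single-line pattern in the disassembly, which is a simplification.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: the names and patterns below are invented for illustration
// and do not come from the project's ReadWM.java.
class PatternReadSketch {

    // Hard-coded post-disassembly patterns, each mapped to the watermark bit
    // it is assumed to encode.
    static final Map<String, Character> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("mov edi, edi", '1');   // assumed pattern for bit 1
        PATTERNS.put("lea esi, [esi]", '0'); // assumed pattern for bit 0
    }

    // Scans the disassembly line by line and appends one bit for every line
    // that contains a known pattern. Lines without a pattern are skipped.
    static String readBits(List<String> disassemblyLines) {
        StringBuilder bits = new StringBuilder();
        for (String line : disassemblyLines) {
            // Normalize case and whitespace so spacing differences do not matter.
            String normalized = line.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
            for (Map.Entry<String, Character> entry : PATTERNS.entrySet()) {
                if (normalized.contains(entry.getKey())) {
                    bits.append(entry.getValue());
                    break;
                }
            }
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        List<String> asm = Arrays.asList(
            "mov eax, 1",
            "mov   edi, edi",
            "LEA esi, [esi]",
            "add eax, 2",
            "mov edi, edi");
        System.out.println(readBits(asm)); // prints 101
    }
}
```

Because the patterns live in a table inside the reader, any change to the transformations chosen for watermarking means editing and revalidating that table.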
Coming up with the corresponding patterns involves trial and error and is time-consuming. Moreover, these patterns need to be hard-coded into the read mechanism. Hence, it is not easy to change the transformation sequence selected for watermarking.

13 References:

[1] 8086 instruction set summary, Retrieved on September 24, 2003 from http://www.eie.polyu.edu.hk/~enyhchan/8086inst.pdf
[2] Brey, B., Assembly Language Programming: 8086-8088, 80286, 80386, 80486, Macmillan Publishing Company, 1997.
[3] Collberg, C., Thomborson, C., and Low, D., A taxonomy of obfuscating transformations. Technical Report 148, Department of Computer Science, University of Auckland, New Zealand, July 1997. Retrieved on September 3, 2003 from http://www.cs.auckland.ac.nz/~collberg/Research/Publications/CollbergThomborsonLo97a/index.html
[4] EMU8086: Tutorials, Retrieved on October 20, 2003 from http://www.emu8086.com/Help/tutorials.html
[5] Gavin (1995), Gavin's Guide to 80x86 Assembly, Retrieved on March 10, 2004 from http://burks.brighton.ac.uk/burks/language/asm/asmtut/asm2.htm.
[6] Isenberg, D., Digital Watermarks: New Tools for Copyright Owners and Webmasters, Retrieved on February 8, 2004 from http://www.webreference.com/content/watermarks/.
[7] Low, D., Protecting Java Code Via Code Obfuscation, 1998, Retrieved on Nov 10, 2003 from http://www.cs.arizona.edu/~collberg/Research/Students/DouglasLow/obfuscation.html.
[8] Mayer, J., Assembly Language Programming: 8086/8088, 8087, John Wiley & Sons, 1988.
[9] Mishra, P., A Taxonomy of Uniqueness Transformations, 2003.
[10] Report: Software Piracy Dips. Retrieved on January 24, 2004 from http://www.cnn.com/2003/TECH/biztech/06/03/software.piracy.reut/.
[11] Stamp, M., Digital Rights Management: The Technology Behind the Hype, Retrieved on October 15, 2003 from http://home.earthlink.net/~mstamp1/papers/DRMpaper.pdf.
[12] Stamp, M., Risks of Monoculture, to appear in Communications of the ACM.
[13] Szor, P., Ferrie, P., Hunting for Metamorphic, Virus Bulletin Conference, September 2001, Retrieved on August 15, 2003 from http://www.peterszor.com/metamorp.pdf.
[14] Zhang, Y., Shao, Q., Kwon, D., Krishnaswamy, S., Ma, D., Palsberg, J., Software Watermarking, Retrieved on October 30, 2003 from http://www.cs.purdue.edu/homes/madi/wm/#Intro

14 Appendices

Following is the code of the basic files used:

14.1 runproj.cmd

This file contains all the commands to automatically run the project.

echo off
echo ...................................................... Let's begin ........................................................................
echo 1. Creating an assembly TrafSim.s
gcc -S Traffic/TrafSim.c
pause
echo .................................................................................................................................................................
echo 2. Creating custom-tranx 0
javac CreateTranxO.java
java CreateTranxO
echo .................................................................................................................................................................
echo 2. Writing watermark to assembly generating new file wmTrafSim.s
javac WriteWM.java