UNITED STATES PATENT AND TRADEMARK OFFICE
____________

BEFORE THE BOARD OF PATENT APPEALS AND INTERFERENCES
____________

Ex parte HAZIM SHAFI
____________

Appeal 2008-005107
Application 11/034,552[1]
Technology Center 2100
____________

Decided: August 26, 2009
____________

Before LEE E. BARRETT, LANCE LEONARD BARRY, and HOWARD B. BLANKENSHIP, Administrative Patent Judges.

BARRETT, Administrative Patent Judge.

DECISION ON APPEAL

This is a decision on appeal under 35 U.S.C. § 134(a) from the final rejection of claims 1-16, 21, and 22. We have jurisdiction pursuant to 35 U.S.C. § 6(b).

We affirm.

[1] Filed January 13, 2005, titled "System and Method to Improve Hardware Pre-Fetching Using Translation Hints." The real party in interest is International Business Machines Corp. Br. 2.

STATEMENT OF THE CASE

The invention

The invention relates to a system and method for improving hardware-controlled prefetching[2] within a data processing system. The system includes a data prefetcher engine that prefetches data in advance of a demand for the data arising from execution of an instruction. There appears to be no dispute that data prefetchers were known in the art.

[2] The application and references use both the "prefetch" and "pre-fetch" base term spellings. For consistency, we use the "prefetch" version except when quoting.

Operating systems usually support virtual (or effective) memory allocated in pages. A virtual page is then mapped to a physical page that is allocated out of real physical memory devices in the system. The computer must "translate" virtual (or effective) addresses to physical addresses that identify a real memory storage location. Page frame tables (PFTs) in memory hold a collection of page table entries (PTEs), which are accessed to translate effective addresses (EAs) employed by software executing within a processing unit into physical addresses (PAs). Spec. ¶ [0020]. A translation lookaside buffer (TLB) is a table that stores copies of PTEs utilized to translate effective addresses into physical addresses. ¶ [0027]. The TLB is used so that the computer does not have to compute the physical address each time. A TLB can be filled "on demand," where a translation entry for translating an effective address to a physical address is computed when an instruction is executed for which there is no entry in the TLB.

Appellant describes a "translation pre-fetcher": "A TLB (or translation) pre-fetch engine speculatively retrieves page table entries utilized for effective-to-physical address translation from a page frame table and places the entries into a TLB (translation lookaside buffer)." Spec. ¶ [0050].

One consequence of the effective-to-physical address mapping is that large application data structures that are contiguous in virtual address space are often mapped to non-contiguous physical pages. Since hardware-controlled prefetching typically uses physical addresses to identify access patterns and perform prefetching, such prefetching is usually halted at physical page boundaries (e.g., at 4KB boundaries), requiring multiple pattern identification steps to prefetch multi-page data structures. Spec. ¶ [0009].
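The page-boundary behavior described in the Specification can be made concrete with a small, self-contained illustration. The sketch below is not taken from the application or any cited reference; the two-entry effective-to-physical mapping and the simple stride detector are hypothetical. It shows how cache-line accesses that are contiguous in effective address space stop looking contiguous in physical address space at a 4KB page boundary, which is why a prefetcher that tracks only physical addresses loses the access pattern there.

```c
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096u                         /* 4KB physical pages */
#define PAGE_MASK (~(uint64_t)(PAGE_SIZE - 1))

/* Hypothetical two-entry page table: contiguous effective pages 0x10000
 * and 0x11000 happen to map to non-contiguous physical pages.           */
static uint64_t translate(uint64_t ea)
{
    uint64_t epage = ea & PAGE_MASK;
    uint64_t ppage = (epage == 0x10000) ? 0x7A000 :  /* EA page 1 -> PA 0x7A000 */
                     (epage == 0x11000) ? 0x3C000 :  /* EA page 2 -> PA 0x3C000 */
                     0;                              /* unmapped (not exercised) */
    return ppage | (ea & (PAGE_SIZE - 1));
}

int main(void)
{
    uint64_t last_pa = 0;
    int      stride_hits = 0;

    /* Walk 64-byte cache lines across the effective page boundary. */
    for (uint64_t ea = 0x10F00; ea <= 0x11100; ea += 64) {
        uint64_t pa = translate(ea);

        /* A physical-address stride detector sees the +64 pattern break
         * at the page boundary, so prefetching halts there.            */
        if (last_pa != 0 && pa == last_pa + 64)
            stride_hits++;
        else if (last_pa != 0)
            printf("stride lost at EA 0x%llx (PA jumps 0x%llx -> 0x%llx)\n",
                   (unsigned long long)ea, (unsigned long long)last_pa,
                   (unsigned long long)pa);
        last_pa = pa;
    }
    printf("consecutive physical strides observed: %d\n", stride_hits);
    return 0;
}
```

Compiled with any C99 compiler, the loop reports a lost stride exactly at effective address 0x11000, mirroring the halt at 4KB boundaries described in ¶ [0009].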
The disclosed TLB engine examines requests for contiguous effective addresses residing in separate physical memory pages and sends the pair of physical addresses (the physical addresses of the first and second memory pages) to the hardware data prefetcher engine in the form of a "hint," where the "hint" offers the hardware prefetcher engine a suggestion of a physical page to which to transition after prefetching has completed on the present page. ¶ [0050].

The claims

Representative claim 1 is reproduced below:

1. A processor, comprising:

a data pre-fetcher that pre-fetches data in advance of a demand for said data by execution of an instruction; and

a translation pre-fetcher that pre-fetches a plurality of translation entries for translating effective to physical addresses in advance of a demand for said plurality entries to perform address translation, generates at least one hint of a memory region likely to be accessed and communicates said at least one hint to said data pre-fetcher,

wherein said data pre-fetcher utilizes said at least one hint to perform pre-fetching of said data.

The references

Sharma    US 6,412,046 B1    June 25, 2002
Ahmed     US 6,490,658 B1    Dec. 3, 2002

Gokul B. Kandiraju and Anand Sivasubramaniam, Going the Distance for TLB Prefetching: An Application-driven Study, International Symposium on Computer Architecture, Anchorage, Alaska, Session 5: Memory Systems, 2002, pages 195-206 ("Kandiraju").

The rejections

Claims 1-16 stand rejected under 35 U.S.C. § 103(a) as unpatentable over Ahmed and Kandiraju.

Claims 21 and 22 stand rejected under 35 U.S.C. § 103(a) as unpatentable over Ahmed and Kandiraju, further in view of Sharma.
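For orientation only, the interaction recited in claim 1 can be pictured with a brief sketch before turning to the parties' contentions. Nothing below comes from the record; every structure, field, and function name is hypothetical, and the contiguous-page policy is merely one possible behavior. The sketch illustrates the claimed division of labor: a translation prefetcher that has obtained translation entries ahead of demand derives a hint identifying the physical page likely to be accessed next and communicates that hint to a data prefetcher, which can use it to continue prefetching across a physical page boundary.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u

/* Hypothetical translation entry (effective page -> physical page). */
typedef struct { uint64_t ea_page, pa_page; } tlb_entry_t;

/* Hypothetical hint: after finishing the current physical page,
 * transition prefetching to next_pa_page.                          */
typedef struct { uint64_t cur_pa_page, next_pa_page; } hint_t;

/* Data prefetcher side: consume the hint (here we simply report it). */
static void data_prefetcher_accept_hint(hint_t h)
{
    printf("data prefetcher: after PA page 0x%llx, continue at PA page 0x%llx\n",
           (unsigned long long)h.cur_pa_page,
           (unsigned long long)h.next_pa_page);
}

/* Translation prefetcher side: given two prefetched translation entries
 * for contiguous effective pages, emit a hint pairing their physical pages. */
static void translation_prefetcher_emit_hint(tlb_entry_t a, tlb_entry_t b)
{
    if (b.ea_page == a.ea_page + PAGE_SIZE) {    /* contiguous in EA space   */
        hint_t h = { a.pa_page, b.pa_page };     /* may be non-contiguous PA */
        data_prefetcher_accept_hint(h);
    }
}

int main(void)
{
    /* Two translation entries assumed to have been prefetched ahead of demand. */
    tlb_entry_t first  = { 0x10000, 0x7A000 };
    tlb_entry_t second = { 0x11000, 0x3C000 };

    translation_prefetcher_emit_hint(first, second);
    return 0;
}
```

The essential point is the direction of the communication: the hint originates with the component holding the effective-to-physical mapping information and flows to the component that issues data prefetches.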
CONTENTIONS

The Examiner finds that prefetch logic block (PLB) 370 in Ahmed is a "data pre-fetcher," as claimed. Final Rej. 3. The Examiner finds that the Micro-TLB (translation lookaside buffer) in Ahmed is a translation fetcher for translating effective to physical addresses, but is not a "translation pre-fetcher that pre-fetches a plurality of translation entries for translating effective to physical addresses in advance of a demand for said plurality [of] entries to perform address translation," as recited in claim 1; i.e., the translation entries are fetched "on demand" rather than in advance of a demand. Id. at 4. The Examiner finds that Kandiraju teaches a "translation pre-fetcher" for the purpose of hiding TLB miss latency by prefetching translations. Id. The Examiner concludes that it would have been obvious to use a translation prefetcher as taught in Kandiraju in Ahmed for the stated advantage of hiding TLB misses. Id. at 4-5. The Examiner finds that a "Micro-TLB hit" in Ahmed is a "hint" which is used by the PLB data prefetcher 370 to perform prefetching of data. Id. at 4.

Appellant argues that the Examiner finds Micro-TLB Hit line 342 and prefetch logic block (PLB) 370 to be the "translation pre-fetcher" which "generates at least one hint of a memory region likely to be accessed and communicates said at least one hint to said data pre-fetcher." Br. 3-4. The Examiner responds that Ahmed was not relied on to teach a "translation pre-fetcher" and that Kandiraju was relied on for this feature. Ans. 13. The Examiner states that the "Micro-TLB Hit" line 342 in Ahmed is a signal indicative of a hint to the data PLB 370. Id.

Appellant argues that the Micro-TLB in Ahmed is not a "translation pre-fetcher," as claimed. Br. 4. The Examiner again responds that Ahmed was not relied on to teach a "translation pre-fetcher" and that Kandiraju was relied on for this feature. Ans. 14.

Appellant argues that the Examiner is unable to show any type of communication between a "translation pre-fetcher" and a "data pre-fetcher," as recited in claim 1. Br. 4. The Examiner responds that Appellant does not address the obviousness rationale in arguing that no single reference teaches communication between a "translation pre-fetcher" and a "data pre-fetcher." Br. 5. The Examiner repeats that it would have been obvious to combine the "translation prefetcher" in Kandiraju with the Micro-TLB in Ahmed to eliminate memory misses in TLB entries. Id. The Examiner cites the statement that "[t]he combination of familiar elements according to known methods is likely to be obvious when it does no more than yield predictable results." KSR Int'l Co. v. Teleflex Inc., 550 U.S. 398, 416 (2007).

Appellant argues that, even assuming the combination shows a "translation pre-fetcher" and a "data pre-fetcher," the claimed interaction between the two is still missing. It is argued that the combination shows, at most, that TLB prefetching exists and that a Micro-TLB (not any kind of prefetcher) sends indications of TLB hits on demand translations to a data prefetcher to assist in prefetching. Br. 5. It is argued that the combination of references does not teach a responsive relationship whereby a first-acting translation prefetcher serves a second-acting data prefetcher with hints to enable the second-acting prefetcher to prefetch data. Id.

The Examiner responds that the proposed combination of Ahmed with the translation prefetcher of Kandiraju produces the claimed interaction. Ans. 17.

Appellant argues that the substitution advocated by the Examiner (substituting the translation prefetcher of Kandiraju for the on demand translation fetcher in Ahmed) is not a combination of familiar elements according to known methods that yields predictable results according to the statement in KSR. Reply Br. 3, 5. Appellant argues that substituting the translation prefetcher of Kandiraju for the on demand translation fetcher in Ahmed would render the modified system inoperative. Reply Br. 3. It is argued that no predictive algorithm for prefetching virtual-to-physical address translations is perfect and that the substitution "would render the modified system inoperative since no mechanism for performing translation demand fetches would be present in the event a required translation is not found in the TLB." Id. at 3-4. Therefore, it is argued, the substitution is not a combination that yields predictable results. Id. at 4.

It is argued that combining the translation fetcher of Ahmed with the translation prefetcher of Kandiraju would also fail to render claim 1 obvious because the hint would go from the Micro-TLB, which is not a translation prefetcher, to the data prefetcher. Reply Br. 4.

It is also argued that utilizing both a translation fetcher as taught by Ahmed and a translation prefetcher as taught by Kandiraju would result in potentially conflicting hints specifying different paths of data addresses, which would result in thrashing of the cache memory. Reply Br. 4.

ISSUE

Appellant does not argue the claims separately, so claim 1 is selected as representative of claims 1-16. See 37 C.F.R. § 41.37(c)(1)(vii).
Appellant does not argue the separate patentability of dependent claims 21 and 22, so arguments as to those claims are waived and claims 21 and 22 will stand or fall together with claim 1.

The issue is: Has Appellant shown that the Examiner erred in concluding that one of ordinary skill in the art would have been motivated to substitute a "translation pre-fetcher" as taught by Kandiraju for the "on demand" translation fetcher in Ahmed for the advantages taught by Kandiraju, and that such a combination would meet the limitations of claim 1?

FACTS

Ahmed

Ahmed describes a data prefetch technique for prefetching data into a cache memory. Abstract. Data prefetching itself is stated to be a known technique for reducing memory access delays. Col. 1. Ahmed describes a cache memory system 300 in connection with Figure 3. Two cache memories are provided, a data cache (D-cache) 310 and a prefetch cache (P-cache) 320, which are both typically L1 caches. Col. 4, ll. 20-30.

Specialized logic in prefetch logic block (PLB) 370 speculatively loads P-cache 320 with data in anticipation of the data being later referenced by memory instructions. Col. 4, ll. 42-55. A translation lookaside buffer, Micro-TLB 340, contains virtual-to-physical address translations and is addressable by virtual address. Col. 5, ll. 32-44. The addresses in Micro-TLB 340 are predicted by instructions in the AX-pipeline 328. Col. 5, ll. 45-57.

PLB 370 determines when and what type of data to prefetch. Col. 7, ll. 54-55. PLB 370 uses data, including data from the Micro-TLB 340, to make decisions on prefetching data for P-cache 320. Col. 7, ll. 58-62. In particular, PLB 370 uses information on whether there is a Micro-TLB hit to prefetch data. Col. 8, ll. 27-39.

Kandiraju

Kandiraju describes TLB prefetching, in which translation entries in a TLB are prefetched ahead of a demand for the entries. Abstract. Kandiraju describes several different mechanisms for prefetching translation entries. Pages 196-199. Kandiraju states that data prefetchers were well known in the computer art, citing several articles on data prefetch mechanisms (references [29, 8, 16, 12, 17, 20]), and indicates that these techniques are investigated for TLB prefetching. Page 196, left col.

ANALYSIS

We agree with the Examiner's statement of the rejection in the Final Rejection and with the Examiner's response to Appellant's arguments in the Appeal Brief in the Response to Argument section of the Examiner's Answer (Ans. 12-18), as summarized above in the Contentions.

In particular, Appellant's argument that the Examiner is unable to show any type of communication between a "translation pre-fetcher" and a "data pre-fetcher," as recited in claim 1 (Br. 4), fails to address that this is an obviousness rejection based on using the translation prefetcher of Kandiraju in Ahmed. Data prefetchers were well known in the art as taught by Ahmed and also Kandiraju. Structures to fetch TLB translation entries "on demand" were also well known in the art as taught by Ahmed and also Kandiraju.

We agree with the Examiner's finding that a "hit" to the Micro-TLB in Ahmed can be considered a "hint" by the data prefetcher PLB 370, i.e., the data prefetcher PLB 370 prefetches data for the caches based on some algorithm if there is a hit signal on Micro-TLB Hit line 342. Appellant does not dispute this finding. The "hint" is not defined in the claims.
Thus, Ahmed discloses structure that "generates at least one hint of a memory region likely to be accessed and communicates said at least one hint to said data pre-fetcher, wherein said data pre-fetcher utilizes said at least one hint to perform pre-fetching of said data."

Kandiraju teaches "a translation pre-fetcher that pre-fetches a plurality of translation entries for translating effective to physical addresses in advance of a demand for said plurality entries to perform address translation." We agree with the Examiner that it would have been obvious to use the translation prefetcher as taught by Kandiraju in Ahmed to achieve the described advantage of hiding all or some of the TLB miss costs. The use of the translation prefetcher does not alter the way the data prefetching is done in response to Micro-TLB hits. That is, the way the translation entries in the Micro-TLB are filled, either on demand or by prefetching, is independent of the mechanism for data prefetching.

Appellant argues that no predictive algorithm for prefetching virtual-to-physical address translations is perfect and that the substitution of a translation prefetcher for the "on demand" translation fetcher in Ahmed "would render the modified system inoperative since no mechanism for performing translation demand fetches would be present in the event a required translation is not found in the TLB." Reply Br. 3-4. This is a new argument in the Reply Brief and is not in response to the Examiner's citation of KSR in the Answer.

Appellant apparently contends that a translation prefetcher can only operate by predicting which translation entries to prefetch and is unable to fetch translation entries if there is a miss, i.e., if the entry is not found in the TLB. This underestimates the knowledge and level of ordinary skill in the computer art. A system employing a translation prefetcher must necessarily provide for fetching translation entries when a translation is not found in the TLB. Thus, substituting the translation prefetcher of Kandiraju into Ahmed would still require that the system be able to fetch translation entries when they are not found in the TLB. This is discussed, for example, in Kandiraju: "On a TLB miss, if the translation also misses in the prefetch buffer, it is demand fetched and a prefetch is initiated for the next virtual page translation (stride = 1) from the page table." Page 197, right col. (emphasis added). Note that this describes demand fetching and also prefetching.

Appellant argues that combining the translation fetcher of Ahmed with the translation prefetcher of Kandiraju would also fail to render claim 1 obvious because the hint would go from the Micro-TLB, which is not a translation prefetcher, to the data prefetcher. Reply Br. 4. This argument seems to misapprehend how the system works. The Micro-TLB is a table that stores translation entries for effective-to-physical address translation. Appellant's system likewise uses a TLB to store the translation entries, although a TLB is not recited in claim 1. The TLB is filled by a mechanism that either fetches translation entries "on demand" as taught in Ahmed or prefetches entries as taught in Kandiraju. If there is a "hit" in the Micro-TLB, it is output on the Micro-TLB Hit line 342 to the prefetch logic block (PLB) 370 in Ahmed.
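The independence described in the preceding paragraphs, between how the TLB is filled and how a hit is signaled, can be pictured with a small, self-contained sketch. It is illustrative only and is not a model of Ahmed's circuitry or Kandiraju's prefetcher; all names are hypothetical. Whichever routine installs an entry in the table, a demand fetch or a speculative prefetch, the lookup that produces the hit indication consumed by the data-prefetch logic is the same.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TLB_ENTRIES 8

/* Hypothetical small TLB keyed by effective page number. */
typedef struct { bool valid; uint64_t ea_page, pa_page; } tlb_slot_t;
static tlb_slot_t tlb[TLB_ENTRIES];

/* Either fill path writes the same table; the lookup below cannot tell
 * whether an entry arrived on demand or speculatively.                 */
static void fill_on_demand(uint64_t ea_page, uint64_t pa_page)
{
    tlb[ea_page % TLB_ENTRIES] = (tlb_slot_t){ true, ea_page, pa_page };
}
static void fill_by_prefetch(uint64_t ea_page, uint64_t pa_page)
{
    tlb[ea_page % TLB_ENTRIES] = (tlb_slot_t){ true, ea_page, pa_page };
}

/* Lookup: the hit/miss indication is what the data-prefetch logic consumes. */
static bool tlb_lookup(uint64_t ea_page, uint64_t *pa_page)
{
    tlb_slot_t s = tlb[ea_page % TLB_ENTRIES];
    if (s.valid && s.ea_page == ea_page) { *pa_page = s.pa_page; return true; }
    return false;
}

int main(void)
{
    uint64_t pa;
    fill_on_demand(0x10, 0x7A);      /* entry installed on a demand miss   */
    fill_by_prefetch(0x11, 0x3C);    /* entry installed speculatively      */

    /* The consumer of the hit signal sees identical behavior either way. */
    printf("EA page 0x10: %s\n", tlb_lookup(0x10, &pa) ? "hit" : "miss");
    printf("EA page 0x11: %s\n", tlb_lookup(0x11, &pa) ? "hit" : "miss");
    return 0;
}
```

Either fill path writes the same table, so the consumer of the hit indication cannot distinguish how an entry got there; that is the sense in which the fill mechanism and the hit-signaling mechanism are independent.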
This hit indication is produced regardless of whether the entries in the Micro-TLB are filled by an "on demand" fetcher or by a translation prefetcher, i.e., the mechanism for providing translation entries to the TLB is independent of the mechanism for indicating a "hit" in the TLB, which is used by the PLB 370. Thus, employing the translation prefetcher mechanism as taught by Kandiraju would not affect the working of the Micro-TLB and would result in the claimed subject matter.

Appellant also argues that utilizing both a translation fetcher as taught by Ahmed and a translation prefetcher as taught by Kandiraju would result in potentially conflicting hints specifying different paths of data addresses, which would result in thrashing of the cache memory. Reply Br. 4. As discussed in the preceding paragraph, the Micro-TLB holds translation entries, and the "hint" is based on whether there is a "hit" in the Micro-TLB, regardless of whether the entries are filled by an "on demand" fetcher or by a translation prefetcher. There would not be two separate and independent hints as argued, because the hints are based on hits to the single TLB. Moreover, Appellant's own system must use both an "on demand" fetcher, in case of a miss, and the translation prefetcher, so if there is a problem with the combination of Ahmed and Kandiraju, there is also a problem with Appellant's system.

Therefore, we are not persuaded of error in the Examiner's rejection.

CONCLUSION

Appellant has not shown that the Examiner erred in concluding that one of ordinary skill in the art would have been motivated to substitute a "translation pre-fetcher" as taught by Kandiraju for the "on demand" translation fetcher in Ahmed for the advantages taught by Kandiraju, and that such a combination would meet all limitations of claim 1.

Accordingly, the rejection of claims 1-16, 21, and 22 under 35 U.S.C. § 103(a) is affirmed.

Requests for extensions of time are governed by 37 C.F.R. § 1.136(b). See 37 C.F.R. § 41.50(f).

AFFIRMED

rwk

DILLON & YUDELL LLP
8911 N. CAPITAL OF TEXAS HWY., SUITE 2110
AUSTIN, TX 78759