Eckert, Yasuko et al., 2019000257 (P.T.A.B. May 20, 2020)

UNITED STATES PATENT AND TRADEMARK OFFICE

UNITED STATES DEPARTMENT OF COMMERCE
United States Patent and Trademark Office
Address: COMMISSIONER FOR PATENTS
P.O. Box 1450
Alexandria, Virginia 22313-1450
www.uspto.gov

APPLICATION NO.: 15/079,543
FILING DATE: 03/24/2016
FIRST NAMED INVENTOR: Yasuko Eckert
ATTORNEY DOCKET NO.: 1458-140364
CONFIRMATION NO.: 8974

109712 7590 05/20/2020
Advanced Micro Devices, Inc.
c/o Davidson Sheehan LLP
6836 Austin Center Blvd.
Suite 320
Austin, TX 78731

EXAMINER: GE, JIN
ART UNIT: 2616
PAPER NUMBER: (none listed)
NOTIFICATION DATE: 05/20/2020
DELIVERY MODE: ELECTRONIC

Please find below and/or attached an Office communication concerning this application or proceeding.

The time period for reply, if any, is set in the attached communication.

Notice of the Office communication was sent electronically on above-indicated "Notification Date" to the following e-mail address(es):
AMD@DS-patent.com
docketing@ds-patent.com

PTOL-90A (Rev. 04/07)

UNITED STATES PATENT AND TRADEMARK OFFICE
____________
BEFORE THE PATENT TRIAL AND APPEAL BOARD
____________
Ex parte YASUKO ECKERT and NUWAN JAYASENA
____________
Appeal 2019-000257
Application 15/079,543
Technology Center 2600
____________

Before CARL L. SILVERMAN, JAMES W. DEJMEK, and STEPHEN E. BELISLE, Administrative Patent Judges.

BELISLE, Administrative Patent Judge.

DECISION ON APPEAL

Appellant¹ appeals under 35 U.S.C. § 134(a) from a Non-Final Rejection of claims 1–20. Appeal Br. 4–10. We have jurisdiction under 35 U.S.C. § 6(b).

We REVERSE.

¹ Throughout this Decision, we use the word “Appellant” to refer to “applicant” as defined in 37 C.F.R. § 1.42 (2017). Appellant identifies the real party in interest as Advanced Micro Devices, Inc. Appeal Br. 1.
STATEMENT OF THE CASE

The Claimed Invention

Appellant’s invention generally relates to “techniques for employing a hierarchical register file in a graphics processing unit (GPU) of a processor in order to increase the number of in-flight wavefronts that can be processed by the GPU.” Spec. ¶ 10. “[W]avefronts” are “sets of [computer] threads,” and “in-flight” wavefronts are “wavefronts that are executing, or ready to be executed,” at GPU compute units at a given point of time. Spec. ¶ 2. According to the Specification:

[A] top level of the hierarchical register file is stored at a local memory of the GPU (e.g., a memory on the same integrated circuit die as the GPU). Lower levels of the hierarchical register file are stored at a different, larger memory, such as a remote memory located on a different die than the GPU. A register file control module monitors the status of in-flight wavefronts at the GPU, and in particular whether each in-flight wavefront is [a] active, [b] predicted to be become active (predicted-active), or [c] inactive (e.g., stalled waiting for results of a load instruction). The register file control module places execution data for active and predicted-active wavefronts in the top level of the hierarchical register file and places execution data for inactive wavefronts at lower levels of the hierarchical register file. The GPU thereby supports efficient execution of active wavefronts and rapid resumption of execution when inactive wavefronts return to active status. The hierarchical register file therefore enables a higher number of in-flight wavefronts in the GPU without consuming an undesirably large amount of circuit area and power.

Spec. ¶ 10 (emphasis added).

Claim 1, reproduced below, is illustrative of the subject matter on appeal:

1.
A processing system comprising:

a processor to couple to a first memory implementing a first register file and to couple to a second memory implementing a second register file, the processor comprising:

a graphics processing unit (GPU) to execute a plurality of wavefronts, the GPU comprising an active wavefront predictor coupled to an inactive wavefront detector, GPU configured, in response to identifying a wavefront activity status for at least one of the wavefronts, to perform at least one of:

when the wavefront activity status is identified as an active wavefront ready for execution at the GPU, storing execution data for the at least one wavefront at the first register file;

when the wavefront activity status is identified, by the active wavefront predictor, as predicted to transition from an inactive wavefront to an active wavefront within a threshold number of clock cycles, transferring execution data for the at least one wavefront from the second register file to the first register file; and

when the wavefront activity status is identified as an inactive wavefront awaiting execution based on the inactive wavefront detector detecting a high-latency instruction at an instruction pipeline of the processor, storing execution data for the at least one wavefront at the second register file.

Appeal Br. 12 (Claims App.).

The Applied References

The Examiner relies on the following references as evidence of unpatentability of the claims on appeal:

Xu et al. (“Xu”), WO 2014/183287 A1, Nov. 20, 2014.

Gebhart et al., A Hierarchical Thread Scheduler and Register File for Energy-Efficient Throughput Processors, ACM Transactions on Computer Systems (TOCS), vol. 30, pp. 8:1–8:38 (pp. A:1–A:38) (2012) (“Gebhart”).

The Examiner’s Rejections

The Examiner made the following rejections of the claims on appeal:

Claims 1–8, 10–18, and 20 stand rejected under 35 U.S.C. § 102(a)(1) as being anticipated by Gebhart. Non-Final Act. 5–17.
Claims 9 and 19 stand rejected under 35 U.S.C. § 103 as being unpatentable over Gebhart and Xu. Non-Final Act. 18–20.

ANALYSIS²

Appellant disputes, inter alia, the Examiner’s findings that Gebhart anticipates the pending independent claims, namely claims 1, 10, and 20. Appeal Br. 4–8; Reply Br. 1–5.

Gebhart relates generally to “two complementary techniques to improve datapath energy efficiency [in graphics processing units (GPUs)]: multi-level scheduling and hierarchical register files.” Gebhart A:2.

² Throughout this Decision, we have considered Appellant’s Appeal Brief filed May 17, 2018 (“Appeal Br.”); Appellant’s Reply Brief filed October 10, 2018 (“Reply Br.”); the Examiner’s Answer mailed August 10, 2018 (“Ans.”); the Non-Final Office Action mailed January 5, 2018 (“Non-Final Act.”); and Appellant’s Specification filed March 28, 2016 (“Spec.”).

Gebhart discloses “[m]ulti-level scheduling partitions threads into two classes: (1) active threads that are issuing instructions or waiting on relatively short latency operations, and (2) pending threads that are waiting on long memory latencies.” Id. Gebhart also discloses “[h]ierarchical [r]egister [f]iles replace the monolithic register file with a multi-level register file hierarchy,” and “[e]ach level of the hierarchy has increasing capacity and corresponding increasing access energy.” Id. Gebhart further discloses a two-level “warp” or wavefront scheduler responsible for the multi-level scheduling, and which “partitions warps into an active set eligible for execution and an inactive pending set.” Id. at A:7; see id. at A:6–A:8. According to Gebhart, “[w]hen the active set has a free entry, the scheduler moves the next ready warp from the pending set to the active set,” using “round robin scheduling among warps in the pending set when choosing which warp to activate.” Id. at A:8.
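The scheduling behavior quoted from Gebhart above can be pictured with a minimal Python sketch. It is offered only as an illustration of the quoted passages, not as Gebhart's hardware design: the names `Warp`, `TwoLevelScheduler`, `suspend`, and `fill_active_set`, the boolean `ready` flag, and the fixed `active_capacity` are all assumptions introduced here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Warp:
    warp_id: int
    ready: bool = True  # assumed flag: False while waiting on a long-latency operation


class TwoLevelScheduler:
    """Hypothetical model of an active set plus a round-robin pending set."""

    def __init__(self, warps, active_capacity):
        self.active_capacity = active_capacity
        self.active = []
        self.pending = deque(warps)
        self.fill_active_set()

    def suspend(self, warp):
        """Move an active warp that is waiting on a long latency to the pending set."""
        warp.ready = False
        self.active.remove(warp)
        self.pending.append(warp)

    def fill_active_set(self):
        """While the active set has a free entry, activate the next ready pending warp."""
        for _ in range(len(self.pending)):
            if len(self.active) >= self.active_capacity:
                break
            warp = self.pending.popleft()
            if warp.ready:
                self.active.append(warp)
            else:
                self.pending.append(warp)  # not ready yet: rotate to the back of the queue
```

In this sketch a pending warp is moved to the active set only after its `ready` flag is already set; the flag itself is a simplification chosen for the illustration.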
To serve as an anticipatory reference, “the reference must disclose each and every element of the claimed invention, whether it does so explicitly or inherently.” In re Gleave, 560 F.3d 1331, 1334 (Fed. Cir. 2009).

The Examiner finds Gebhart anticipates claim 1, and, as relevant here, the limitation of “when the wavefront activity status is identified, by the active wavefront predictor, as predicted to transition from an inactive wavefront to an active wavefront within a threshold number of clock cycles, transferring execution data for the at least one wavefront from the second register file to the first register file.” Ans. 6–8. In doing so, the Examiner cites broadly to substantial portions of Gebhart, namely Sections 1, 3, and 4, Abstract, and Figures 4 and 8 (Ans. 7), and states:

Gebhart et al. disclose a two-level warp scheduler in the GPU is used to identify warp (wavefront)’s status such as active warp, pending warps based on latency operations of the thread then assign the identified warp (ready or will be ready soon) to active warp in upper level register file (local shared memory) or assign the identified warp (when they consume a value produced by a long-latency operation) to pending warps in lower register file (main memory), or move a suspended warp from the active set to the pending set (main memory), or move the next ready warp from the pending set to the active set (local memory) each cycle using a rotating priority, further disclose the two-level warp scheduler performs these (predict and detect) functions to move the pending warp to active warp each cycle using a rotating priority after identify the pending warp will be active (predict function) or move the active warp to pending warp after identify the active warp will be inactive (detect function).

Ans. 7.
Appellant argues “[n]owhere does Gebhart refer to or discuss a ‘predicted to transition’ wavefront or warp where such a wavefront is ‘predicted to transition from’ inactive to active ‘within a threshold number of clock cycles,’ as in claim 1.” Reply Br. 3; see Appeal Br. 5–7. Appellant also argues “the Examiner provides no evidence that [the] Examiner’s interpretation of ‘predict,’ ‘predicting,’ and ‘predictor’ as in claims 1–20 is consistent with the ordinary meaning of the term,” namely (according to Appellant) “to take some form of available or historical information and take an action to accommodate or adapt to a future event or condition” or “to foretell on the basis of an observation or scientific reasoning.” Reply Br. 4–5 (citing Spec. ¶ 21); see id. at 5 (“There is no teaching or suggestion of use of historical data, scientific reasoning, or mathematical relationship in identifying warps in Gebhart with respect to ‘scheduling’ or ‘ready’ or ‘ready soon.’”).

We find Appellant’s argument persuasive, and agree that the Examiner has not provided sufficient evidence or technical reasoning to show how Gebhart explicitly or inherently discloses the limitation at issue (recited above). For example, the Examiner neither construes “predicted to transition” nor sufficiently explains how or why Gebhart’s “round robin scheduling” explicitly or inherently discloses “predict[ing]” wavefront transition from inactive to active within a threshold number of clock cycles. See Ans. 2–9; Non-Final Act. 5–8; Gebhart A:8.
At best, the Examiner leaves us to speculate as to how or why Gebhart explicitly or inherently discloses the limitation of “when the wavefront activity status is identified, by the active wavefront predictor, as predicted to transition from an inactive wavefront to an active wavefront within a threshold number of clock cycles, transferring execution data for the at least one wavefront from the second register file to the first register file,” as recited in claim 1; and to how or why the skilled artisan would at once envisage Appellant’s claimed arrangement or combination from Gebhart’s disclosure. See Blue Calypso, LLC v. Groupon, Inc., 815 F.3d 1331, 1341 (Fed. Cir. 2016). We will not resort to speculation or assumptions to cure the deficiencies in the Examiner’s fact finding and reasoning. See In re Warner, 379 F.2d 1011, 1017 (CCPA 1967); Ex parte Braeken, 54 USPQ2d 1110, 1112 (BPAI 1999) (unpublished) (“The review authorized by 35 U.S.C. [§] 134 is not a process whereby the [E]xaminer . . . invite[s] the [B]oard to examine the application and resolve patentability in the first instance.”).

Because we find this issue dispositive here, we do not address Appellant’s other arguments.

Accordingly, constrained by the present record, we do not sustain the Examiner’s rejection under 35 U.S.C. § 102(a)(1) of independent claim 1. For similar reasons, we do not sustain the Examiner’s rejection under 35 U.S.C. § 102(a)(1) of independent claims 10 and 20, which recite commensurate limitations. We also do not sustain the Examiner’s rejection under 35 U.S.C. § 102(a)(1) of claims 2–8 and 11–18, which depend therefrom.

In addition, because the Examiner has not persuasively shown how the other cited art, namely Xu, remedies the deficiency in Gebhart (see Ans. 12–13), we do not sustain the Examiner’s rejection under 35 U.S.C. § 103 of claims 9 and 19, which depend from independent claims 1 and 10, respectively.
DECISION SUMMARY

In summary:

Claims Rejected    35 U.S.C. §    Reference(s)/Basis    Affirmed    Reversed
1–8, 10–18, 20     102(a)(1)      Gebhart                           1–8, 10–18, 20
9, 19              103            Gebhart, Xu                       9, 19
Overall Outcome                                                     1–20

REVERSED