Autonomous Solutions, Inc., Appeal No. 2020-005399 (P.T.A.B. Feb. 10, 2021)

UNITED STATES PATENT AND TRADEMARK OFFICE
UNITED STATES DEPARTMENT OF COMMERCE
Address: COMMISSIONER FOR PATENTS, P.O. Box 1450, Alexandria, Virginia 22313-1450, www.uspto.gov

APPLICATION NO.: 15/458,524
FILING DATE: 03/14/2017
FIRST NAMED INVENTOR: Thomas M. Petroff
ATTORNEY DOCKET NO.: ASI.10019US03
CONFIRMATION NO.: 8149
CORRESPONDENT: Sanders IP Law, 240 N. East Promontory, Suite 200, Farmington, UT 84025
EXAMINER: HEFLIN, BRIAN ADAMS
ART UNIT: 3628
NOTIFICATION DATE: 02/10/2021
DELIVERY MODE: ELECTRONIC

UNITED STATES PATENT AND TRADEMARK OFFICE
BEFORE THE PATENT TRIAL AND APPEAL BOARD

Ex parte THOMAS M. PETROFF

Appeal 2020-005399
Application 15/458,524
Technology Center 3600

Before JAMES P. CALVE, CYNTHIA L. MURPHY, and BRADLEY B. BAYAT, Administrative Patent Judges.

CALVE, Administrative Patent Judge.

DECISION ON APPEAL

STATEMENT OF THE CASE

Pursuant to 35 U.S.C. § 134(a), Appellant[1] appeals from the decision of the Examiner to reject claims 1–20, which are all of the pending claims. We have jurisdiction under 35 U.S.C. § 6(b).

We REVERSE.

[1] "Appellant" refers to "applicant" as defined in 37 C.F.R. § 1.42. Appellant identifies Autonomous Solutions Inc. as the real party in interest. Appeal Br. 3.

CLAIMED SUBJECT MATTER

The claimed vehicle dispatching method uses linear programming to generate an optimum schedule for multiple vehicles traveling to multiple locations and a reinforcement learning algorithm to set policies for various environmental situations that may arise. See Spec. ¶¶ 2, 16, 26, 27. Claims 1 and 14 are independent. Claim 1 recites:
1. A method for dispatching a plurality of autonomous vehicles operating in a work area among a plurality of destination locations and a plurality of source locations comprising:

    in one or more processors, implementing linear programming that takes in an optimization function and constraints to generate an optimum schedule for optimum production, the optimum schedule defining the number of trips taken along paths between the destination locations and the source locations to achieve the optimum production;

    in the one or more processors, utilizing a reinforcement learning algorithm that takes in the optimum schedule as input and performs an offline simulation of possible environmental states that could occur within the optimum schedule, before at least one of the possible environmental states has occurred in the real world, by choosing one possible action for each possible environmental state and by assigning a reward value obtained by taking the action at each possible environmental state during the offline simulation;

    in the one or more processors, developing a policy for each possible environmental state based on at least the reward value and time, the policy being associated with a preferred action;

    in the one or more processors, associating a state in the work area with one of the possible environmental states and accessing the preferred action associated with the policy for the associated possible environmental state; and

    in a transmitter, providing instructions to the autonomous vehicles to follow the preferred action.

REJECTIONS

Claims 1–3, 11–16, and 20 are rejected under 35 U.S.C. § 103(a) as being unpatentable over Cohen (US 2003/0069680 A1, pub. Apr. 10, 2003), Tesauro (US 2007/0203871 A1, pub. Aug. 30, 2007), Benda (US 2005/0197876 A1, pub. Sept. 8, 2005), Morika (US 2006/0155664 A1, pub. July 13, 2006), and Baker (US 6,351,697 B1, iss. Feb. 26, 2002).

Claims 4, 5, and 17 are rejected under 35 U.S.C. § 103(a) as being unpatentable over Cohen, Tesauro, Benda, Morika, Baker, and Burns (US 6,393,362 B1, iss. May 21, 2002).

Claims 6–8 and 18 are rejected under 35 U.S.C. § 103(a) as being unpatentable over Cohen, Tesauro, Benda, Morika, Baker, and Hiroshi (JP 2006/320997, pub. Nov. 30, 2006).

Claims 9, 10, and 19 are rejected under 35 U.S.C. § 103(a) as being unpatentable over Cohen, Tesauro, Benda, Morika, Baker, and Andreasson (US 2004/0073764 A1, pub. Apr. 15, 2004).

ANALYSIS

Claims 1–3, 11–16, and 20 Rejected over Cohen, Tesauro, Benda, Morika, and Baker

The Examiner rejects independent claims 1 and 14 based on teachings of Cohen, Tesauro, Benda, Morika, and Baker. Final Act. 3–12, 15–16. The Examiner relies on Cohen to teach an optimization program that generates an optimum schedule for dispatching vehicles in a work area for optimum production and also anticipates possible environmental states. Final Act. 3–12, 15–16. The Examiner cites Tesauro to teach a reinforcement learning algorithm and Benda to teach linear programming to optimize a schedule. Id. at 7–10 (citing Tesauro ¶¶ 24, 25, 41 and Benda ¶¶ 63, 66, 74).
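For technical context, the linear-programming step that the rejection maps to Benda (an optimization function and constraints yielding an optimum schedule of trips between source and destination locations) can be sketched in a few lines. The sketch below is a minimal editorial illustration, not the application's or Benda's actual implementation; the solver choice (scipy.optimize.linprog), the path layout, and all numbers are invented assumptions.

```python
# Illustrative sketch (editorial, not from the record): choose the number of
# trips x[p] along each source-to-destination path p to maximize delivered
# tonnage, subject to fleet-time and source-capacity constraints.
import numpy as np
from scipy.optimize import linprog

# Paths ordered (src0->dst0, src0->dst1, src1->dst0, src1->dst1).
tons_per_trip = np.array([100.0, 100.0, 80.0, 80.0])  # payload hauled per trip
trip_hours = np.array([0.5, 0.7, 0.6, 0.4])           # round-trip time per path
fleet_hours = 40.0                                    # truck-hours this shift
source_cap = np.array([4000.0, 3000.0])               # tons each source can load

# linprog minimizes, so negate the tonnage objective to maximize it.
c = -tons_per_trip

# Row 1: total travel time across all paths fits the fleet's hours.
# Rows 2-3: tons hauled from each source cannot exceed its capacity.
A_ub = np.vstack([
    trip_hours,                  # fleet time budget
    [100.0, 100.0, 0.0, 0.0],    # tons drawn from source 0
    [0.0, 0.0, 80.0, 80.0],      # tons drawn from source 1
])
b_ub = np.concatenate([[fleet_hours], source_cap])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
schedule = res.x  # trips per path: the "optimum schedule" fed to the RL step
print("trips per path:", np.round(schedule, 1), "| tons delivered:", -res.fun)
```

A pure LP can return fractional trip counts, so a production dispatcher would round or solve an integer program instead; under the claim, the resulting schedule is then handed to the reinforcement learning step as input.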
The Examiner determines that it would have been obvious to a skilled artisan to incorporate Tesauro's teachings of using a reinforcement learning algorithm to perform offline simulations of possible environmental states to choose an action for an environmental state and assign a reward value "in order to help avoid poor performance issues typically associated with live on-line training." Final Act. 8 (citing Tesauro ¶ 41). The Examiner determines that it would have been obvious to a skilled artisan to incorporate Benda's teachings of using a linear program to define an optimum schedule for the trips taken along paths between destination and source locations to achieve optimum production "in order to improve capacity utilization when carrying loads/items to and from a destination." Id. at 9–10 (citing Benda ¶ 15).

Appellant argues that paragraph 24 of Tesauro, cited in the Office Action, applies a reward-based reinforcement learning algorithm to training data, but that Tesauro does not apply a reinforcement learning algorithm to an optimum schedule, as claimed. Appeal Br. 8.[2] Appellant also argues that Tesauro uses a reinforcement learning algorithm to generate "behavior rules or mappings of computing system states to management actions," but does not apply a reinforcement learning algorithm to an optimum schedule generated by linear programming as recited in claims 1 and 14. Id. at 7–8, 13. Appellant asserts that a skilled artisan would not have combined Benda's linear programming and Tesauro's reinforcement learning algorithm with Cohen to produce the subject matter recited in claims 1 and 14, or known how to do so with a reasonable expectation of success. See id. at 9–10, 13.

[2] Refers to the pages of the Appeal Brief as filed on Feb. 18, 2020, which lacks page numbers.

A claim composed of several elements is not proved obvious merely by demonstrating that each element was known in the prior art. KSR Int'l Co. v. Teleflex Inc., 550 U.S. 398, 418 (2007). It is important to identify a reason that would have motivated a skilled artisan to combine the prior art elements in the claimed way. Id. "[R]ejections on obviousness grounds cannot be sustained by mere conclusory statements; instead, there must be some articulated reasoning with some rational underpinning to support the legal conclusion of obviousness." Id. (quoting In re Kahn, 441 F.3d 977, 988 (Fed. Cir. 2006)). An obviousness inquiry must show that a skilled artisan would have been motivated to combine the prior art to achieve the claimed invention and would have had a reasonable expectation of success in doing so. In re Warsaw Orthopedic, Inc., 832 F.3d 1327, 1333 (Fed. Cir. 2016).

Here, the stated reason to combine the teachings of Tesauro with Cohen, namely "to help avoid poor performance issues typically associated with live on-line training" (Final Act. 8), is not supported by a rational underpinning because Cohen does not perform live on-line training or analyze such data sources. Tesauro relates to automated management of hardware and software components of data processing systems to perform self-management actions in response to rapidly changing situations by developing policies to allocate computing resources dynamically. Tesauro ¶¶ 1, 2, 22, Abstract. Tesauro applies a reinforcement learning algorithm to training data to learn a value function that maps a system state into a selected action to allocate computing resources in a data center. Id. ¶¶ 14, 19–37.
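To make that description concrete, the following is a minimal editorial sketch of off-line, reward-based learning of the kind attributed to Tesauro: a value function Q is fit to logged (state, action, reward, next state) transitions, with no live on-line training, and a policy mapping each state to a preferred action is read off at the end. The states, actions, rewards, and hyperparameters are invented assumptions, not Tesauro's system or the application's method.

```python
# Illustrative sketch (editorial, not from the record): batch Q-learning over
# a fixed log of transitions, i.e., off-line training on logged data only.
from collections import defaultdict

# Hypothetical logged (state, action, reward, next_state) transitions,
# e.g., produced by an off-line simulator run.
log = [
    ("road_open",   "dispatch_truck", 10.0, "road_open"),
    ("road_open",   "hold_truck",      0.0, "road_open"),
    ("road_closed", "reroute",         6.0, "road_open"),
    ("road_closed", "dispatch_truck", -5.0, "road_closed"),
]
actions = ["dispatch_truck", "hold_truck", "reroute"]

alpha, gamma = 0.1, 0.9    # learning rate, discount factor (assumed values)
Q = defaultdict(float)     # Q[(state, action)] -> estimated long-run value

# Replay the fixed log repeatedly; nothing is executed against a live system.
for _ in range(500):
    for state, action, reward, next_state in log:
        best_next = max(Q[(next_state, a)] for a in actions)
        target = reward + gamma * best_next
        Q[(state, action)] += alpha * (target - Q[(state, action)])

# The learned policy maps each environmental state to its preferred action,
# paralleling the claim's "policy ... associated with a preferred action".
policy = {s: max(actions, key=lambda a: Q[(s, a)])
          for s in {t[0] for t in log}}
print(policy)  # e.g., {'road_open': 'dispatch_truck', 'road_closed': 'reroute'}
```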
Tesauro teaches that the off-line application of a reward-based algorithm to such log data can learn high-quality management policies without the poor performance issues of live on-line training. Id. ¶ 41.

Tesauro's off-line application of the reward-based algorithm to log data to avoid poor performance issues associated with live on-line training (Tesauro ¶ 41) does not provide a reason to combine Tesauro with Cohen. Cohen dispatches vehicles for strip mining. Cohen ¶¶ 35–37. Cohen does not use training data, live training, or offline simulations to allocate data center computer resources.

In short, the Examiner has not explained why a skilled artisan seeking to improve Cohen's vehicle dispatching system for strip mining to handle environmental issues such as road closures or vehicle or shovel interruptions (Cohen ¶¶ 2–4, 9, 35–37) would have considered Tesauro's application of a reinforcement learning algorithm to training data to develop high-quality management policies for computer systems (Tesauro ¶¶ 1–5, 22–25, 41). Tesauro's off-line action-reward simulations of training data to avoid poor performance issues associated with live on-line training do not provide a reason to use such a reinforcement learning algorithm to resolve environmental issues in Cohen's strip mining system for optimum scheduling results as claimed and as Cohen desires. Cohen ¶ 36; Ans. 4–5.

Accordingly, we do not sustain the rejection of independent claims 1 and 14 or their respective dependent claims 2–13 and 15–20.

Dependent Claims 4–10 and 17–19

The Examiner's reliance on Burns, Hiroshi, and Andreasson to teach limitations of claims 4–10 and 17–19 does not cure the deficiencies noted above as to claims 1 and 14, from which these claims depend. See Final Act. 17–24. Thus, we do not sustain the rejection of claims 4–10 and 17–19.

CONCLUSION

In summary:

Claims Rejected    35 U.S.C. §   Reference(s)/Basis                                 Affirmed   Reversed
1–3, 11–16, 20     103(a)        Cohen, Tesauro, Benda, Morika, Baker                          1–3, 11–16, 20
4, 5, 17           103(a)        Cohen, Tesauro, Benda, Morika, Baker, Burns                   4, 5, 17
6–8, 18            103(a)        Cohen, Tesauro, Benda, Morika, Baker, Hiroshi                 6–8, 18
9, 10, 19          103(a)        Cohen, Tesauro, Benda, Morika, Baker, Andreasson              9, 10, 19
Overall Outcome                                                                               1–20

REVERSED