Patent Application 18304508 - AUTOMATED DEFECT CLASSIFICATION AND DETECTION - Rejection

Application Information

  • Invention Title: AUTOMATED DEFECT CLASSIFICATION AND DETECTION
  • Application Number: 18304508
  • Submission Date: 2025-05-19
  • Effective Filing Date: 2023-04-21
  • Filing Date: 2023-04-21
  • National Class: 382
  • National Sub-Class: 155000
  • Examiner Employee Number: 91433
  • Art Unit: 2667
  • Tech Center: 2600

Rejection Summary

  • 102 Rejections: 0
  • 103 Rejections: 4

Cited Patents

The following patents were cited in the rejection:

  • US 2018/0204111 A1 (Zadeh et al.)
  • US 2019/0147127 A1 (Su et al.)

Office Action Text



DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Drawings
The drawings are objected to because Fig. 1 comprises an element (e.g., 110) whose arrow connects to nothing. It is not clear to what element 110 is connected or moves. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that use the word “means” or “step” but are nonetheless not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) recite(s) sufficient structure, materials, or acts to entirely perform the recited function. Such claim limitation(s) is/are: “a feature extractor module adapted to generate,” “a region proposal module adapted to identify,” “a detection module adapted to detect,” and “a segmentation module adapted to predict” in claims 1-20. Note that a means-plus-function or step-plus-function limitation without sufficient structure recited in a method claim invokes 35 U.S.C. 112(f). See Media Rights v. Capital One. However, this application is directed to a computer-implemented method using ensemble learning, which requires a computer having a processor, memory, etc., with a machine learning architecture. Therefore, the claims are not being interpreted under 35 U.S.C. 112(f).
Because this/these claim limitation(s) is/are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are not being interpreted to cover only the corresponding structure, material, or acts described in the specification as performing the claimed function, and equivalents thereof.
If applicant intends to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to remove the structure, materials, or acts that perform the claimed function; or (2) present a sufficient showing that the claim limitation(s) does/do not recite sufficient structure, materials, or acts to perform the claimed function.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 15 and 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 15 recites the limitation “improve.” The limitation renders the claim indefinite because the term “improve” is relative and/or subjective. The term “improve” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. It is not clear how and to what extent the ground truth labels are improved.
A claim that requires the exercise of subjective judgment without restriction renders the claim indefinite. In re Musgrave, 431 F.2d 882, 893, 167 USPQ 280, 289 (CCPA 1970). Claim scope cannot depend solely on the unrestrained, subjective opinion of a particular individual purported to be practicing the invention. Datamize LLC v. Plumtree Software, Inc., 417 F.3d 1342, 1350, 75 USPQ2d 1801, 1807 (Fed. Cir. 2005); see also Interval Licensing LLC v. AOL, Inc., 766 F.3d 1364, 1373, 112 USPQ2d 1188 (Fed. Cir. 2014).
For the purpose of further examination, the limitation has been interpreted as “modifying.”

Claim 18 recites the limitation “preferably.” The phrase “preferably” renders the claim indefinite because it is unclear whether the limitations following the phrase are part of the claimed invention. See MPEP § 2173.05(d).
For the purpose of further examination, the limitation has been interpreted as “comprising.”

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

35 U.S.C. 101 requires that a claimed invention must fall within one of the four eligible categories of invention (i.e., process, machine, manufacture, or composition of matter) and must not be directed to subject matter encompassing a judicially recognized exception as interpreted by the courts. MPEP 2106. The four eligible categories of invention include: (1) a process, which is an act, or a series of acts or steps; (2) a machine, which is a concrete thing, consisting of parts, or of certain devices and combination of devices; (3) a manufacture, which is an article produced from raw or prepared materials by giving to these materials new forms, qualities, properties, or combinations, whether by hand labor or by machinery; and (4) a composition of matter, which is all compositions of two or more substances and all composite articles, whether they be the results of chemical union, or of mechanical mixture, or whether they be gases, fluids, powders or solids. MPEP 2106(I).
Claim 20 is rejected under 35 U.S.C. 101 as not falling within one of the four statutory categories of invention because the broadest reasonable interpretation of the instant claims in light of the specification encompasses transitory signals. Transitory signals are not within one of the four statutory categories (i.e., non-statutory subject matter). See MPEP 2106(I). Claims directed toward a non-transitory computer readable medium may qualify as a manufacture and make the claim patent-eligible subject matter. MPEP 2106(I). Therefore, amending the claims to recite a “non-transitory computer-readable medium” would resolve this issue.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1, 4, 5, 12-17, 19, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (“A Data-Flow Oriented Deep Ensemble Learning Method for Real-Time Surface Defect Inspection,” IEEE Transactions on Instrumentation and Measurement, Volume: 69, Issue: 7, July 2020, published 09 December 2019), in view of Zadeh et al. (US 2018/0204111 A1), hereinafter referred to as Liu and Zadeh, respectively.
Regarding claim 1, Liu teaches a computer-implemented training method for defect detection, classification and segmentation in image data (Liu Title & Abstract: “Real-time surface defect inspection”), the method comprising:
providing an ensemble of learning structures, each learning structure comprising a feature extractor module adapted to generate a feature map from an input image, a region proposal module adapted to identify regions of interest in the input image based on the generated feature map, a detection module adapted to detect defects in each one of the identified regions of interest in the input image and to predict a defect class and defect location associated with each one of the detected defects, and a segmentation module adapted to predict an instance segmentation mask for each detected and classified defect in each one of the identified regions of interest in the input image, wherein each feature extractor module comprises a convolutional neural network (Liu Abstract: “this article proposes a new deep ensemble learning method”; Liu pg. 4685 left column: “This localization and classification vector, prediction of relative offset and object class based on a default bounding box, is produced from 3, 4, 5, and 7 convolutional blocks with a feature map in size of 165 x 110, 83 x 55, 42 x 28, and 11 x 7”; Liu Fig. 4: shows the learning model architecture; Liu pg. 4685 right column: “The area enclosed by the prediction box rectangle is defined as (5), where b(xk) denotes the prediction bounding box coordinate value”; Liu pg. 4688 left column: “The performance of defect localization can be quantitatively evaluated by the precision-recall (PR) curve and the receiver operating characteristic (ROC) curve”; Liu Fig. 13: shows the defect segmentations and labels); 
individually training each learning structure of said ensemble with a set of training images from an image dataset, wherein images of the image dataset comprise ground truth class labels and ground truth locations in respect of defects contained therein, and at least a subset of the training images comprises ground truth instance segmentation labels in respect of defects contained therein (Liu pg. 4683 right column: “The data set samples are collected from our designed test system, which comprises 9010 images of 41 336 SMT material samples … 6000 images are set for training and 3000 images for testing … A specific defect type in this data set is checked and labeled manually beforehand”; Liu Table 1: shows the class labels; Liu pg. 4685 left column: “A batch of k-means++ is done to find the best clustering, as illustrated in (1), where GT_Boxi is the ground truth angular point for the bounding box in the pixel coordinate”; Liu pg. 4687 left column: “The training strategy of the deep learning-based submodels is set as follows. The input image size is 300 x 300 pixels … The images and annotation data are transformed into the hdf5 file before training”; Liu pg. 4688 left column: “Compared to ground truth annotations, predictions can be classified as true positive, true negative, false positive, and false negative, which is based on the total defect types and overlaps with the ground truth boxes”; Liu Fig. 7: shows the training process);
validating each learning structure of said ensemble with a set of validation images from the image dataset to obtain a prediction score for each learning structure (Liu Abstract: “In order to validate the proposed method, an inspection bench test system, as a part of a real industrial surface mount technology production line, is designed and fabricated … the method is validated on the data collected from the manufacturer”; Liu eq. (7) & pg. 4685 right column: “Classk, Confk are the prediction class and its confidence score”; Liu pg. 4687 left column: “Experiment indicates that the training and validation loss decline with little test error reduction after the minimal LR setting”); and
combining predictions from the selected learning structures of the ensemble of learning structures, using a parametrized ensemble voting structure, wherein parameters of the ensemble voting structure are optimized on the set of validation images (Liu pg. 4685 left column: “A. Weight Adjustment. With respect to the streaming data scene characteristics, a dynamic distribution discrepancy identifier is designed to quantitatively assess the data set characteristic using the weight control factor … The clustering result is computed to assess the data set distribution changes for the dynamic changing imbalanced data”; Liu pg. 4685 right column: “The voting weight value of each submodel is controlled by the distribution discrepancy identifier. The predicted value is finally obtained through a weighted voting mechanism” – dynamic weighting optimizes the parameter).
Additionally, Liu teaches that the classification results are compared with a threshold score and the results not meeting the threshold requirement are reassigned (Liu eq. (7) & pg. 4685 right column: “Classk, Confk are the prediction class and its confidence score … Equation (7) represents the reassignment of the confidence score in the soft nonmaximum suppression (S-NMS [37]) compute, in which Si denotes the raw predicted confidence score, Nt represents the IOU threshold for doing confidence reassignment”).
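The confidence reassignment referenced in Liu's equation (7) follows the published Soft-NMS technique. A minimal sketch of the linear-decay variant follows; the box layout, threshold default, and function names are assumptions of this sketch, not taken from Liu:

```python
import numpy as np

def iou_one_to_many(box, boxes):
    # IoU of one (x1, y1, x2, y2) box against an (N, 4) array of boxes.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    areas_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + areas_b - inter + 1e-9)

def soft_nms(boxes, scores, nt=0.5):
    """Linear Soft-NMS: instead of discarding boxes that overlap the current
    top-scoring box by more than the IoU threshold nt, decay their confidence
    scores, so heavily overlapped predictions survive with reduced confidence."""
    scores = scores.astype(float)
    remaining = np.arange(len(scores))
    kept = []
    while remaining.size:
        top = remaining[np.argmax(scores[remaining])]
        kept.append(top)
        remaining = remaining[remaining != top]
        overlaps = iou_one_to_many(boxes[top], boxes[remaining])
        # Confidence reassignment: s_i <- s_i * (1 - IoU) when IoU > nt.
        scores[remaining] *= np.where(overlaps > nt, 1.0 - overlaps, 1.0)
    return kept, scores
```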
However, Liu does not appear to explicitly teach selecting the learning structures of said ensemble of learning structures whose prediction score exceeds a predetermined threshold score.
Pertaining to the same field of endeavor, Zadeh teaches selecting the learning structures of said ensemble of learning structures whose prediction score exceeds a predetermined threshold score (Zadeh ¶¶2112: “we minimize the number of fuzzy rules, for efficiency, e.g. using rule pruning, rule combination, or rule elimination … we eliminate the rules with low number of training samples or low reliability” – pruning is a selective method used in boosting; Zadeh ¶¶2129: “For the aggregation method (also called ensemble learning, or boosting, or mixture of experts), we have a learning which tries to replicate the function independently (not jointly), and then combine and put them together later, e.g. combining different solutions, e.g. detecting eye and detecting nose, so that in combination, we can reliably detect the face later … we have the Boosting method, where we enforce the decorrelation (not by chance), e.g. by building one hypothesis at a time, for a good mixture, sequentially”).
Liu and Zadeh are considered to be analogous art because they are directed to machine learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the deep ensemble learning method (as taught by Liu) to select learning structures (as taught by Zadeh) because the combination is more efficient and reliable (Zadeh ¶¶2112).
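To make the combined teaching concrete, the following minimal Python sketch shows threshold-based selection of ensemble members followed by a weighted vote. The callables `validate` and `predict_proba` and the threshold value are hypothetical stand-ins, not APIs from Liu, Zadeh, or the claims:

```python
import numpy as np

def select_and_vote(models, validate, predict_proba, image, threshold=0.8):
    """Keep only models whose validation prediction score exceeds the
    predetermined threshold, then combine their class predictions with a
    weighted vote, using each model's score as its voting weight."""
    scored = [(m, validate(m)) for m in models]
    selected = [(m, s) for m, s in scored if s > threshold]
    if not selected:
        raise ValueError("no learning structure exceeded the threshold score")
    weights = np.array([s for _, s in selected])
    probs = np.stack([predict_proba(m, image) for m, _ in selected])  # (K, C)
    combined = weights @ probs / weights.sum()                        # (C,)
    return int(np.argmax(combined))
```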

Regarding claim 4, Liu, in view of Zadeh, teaches the method of claim 1, wherein the ensemble voting structure is configured to perform a weighted average voting with respect to predictions about the defect classes (Zadeh ¶¶2129: “For the aggregation method (also called ensemble learning, or boosting, or mixture of experts) … we take an average or weighted average, and for classification or binary cases, we take a vote or weighted vote. For the aggregation method, we have 2 types: (a) After-the-fact situation (where we already have the solutions, and then we combine them)”).

Regarding claim 5, Liu, in view of Zadeh, teaches the method of claim 4, further comprising: 
determining weight parameters for the weighted average voting by a search algorithm or a boosting algorithm (Zadeh ¶¶2112 & ¶¶2129 discussed above).
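As one illustrative instance of determining the weights by a search algorithm, a simple random search over voting weights evaluated on a validation set; the array shapes and the accuracy objective are assumptions of this sketch:

```python
import numpy as np

def search_voting_weights(member_probs, labels, n_trials=1000, seed=0):
    """member_probs: (M, N, C) class probabilities from M ensemble members
    for N validation samples; labels: (N,) ground truth class indices."""
    rng = np.random.default_rng(seed)
    best_w, best_acc = None, -1.0
    for _ in range(n_trials):
        w = rng.random(member_probs.shape[0])
        w /= w.sum()  # normalized voting weights
        preds = np.argmax(np.tensordot(w, member_probs, axes=1), axis=-1)
        acc = (preds == labels).mean()
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc
```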

Regarding claim 12, Liu, in view of Zadeh, teaches the method of claim 1, further comprising: 
denoising the images of the image dataset (Zadeh ¶¶1802: “once we define the ‘noise’, as what the noise is in that context or environment, then we can define the filter that reduces that noise, which sets the goals or tasks for our optimization”). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the deep ensemble learning method (as taught by Liu) to select learning structures (as taught by Zadeh) because the combination is more reliable by removing unwanted data (Zadeh ¶¶1802).
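A minimal example of such a denoising step, assuming a median filter from SciPy; the specific filter is an illustrative choice, not one prescribed by Liu or Zadeh:

```python
from scipy.ndimage import median_filter

def denoise(image, size=3):
    # Median filtering suppresses impulse noise while largely preserving
    # the sharp edges that defect detection and segmentation depend on.
    return median_filter(image, size=size)
```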

Regarding claim 13, Liu, in view of Zadeh, teaches a computer-implemented method for detecting and classifying defects in image data, comprising the steps of:
providing a machine learning model comprising an ensemble voting structure, optimized according to the method of claim 1 (Liu Abstract & pg. 4685 right column discussed above; refer to the rejection of claim 1 above), and 
an ensemble of learning structures, trained and selected according to the method of claim 1 (Liu Abstract & pg. 4683 right column discussed above; refer to the rejection of claim 1 above); and 
processing at least one test image with the provided machine learning model to obtain predictions about defect localizations, defect classes and defect instance segmentation masks in said at least one test image (Liu Abstract, pg. 4685 left column, Fig. 4, pg. 4688 left column, & Fig. 13 discussed above; refer to the rejection of claim 1 above). 

Regarding claim 14, Liu, in view of Zadeh, teaches the method of claim 13, further comprising denoising the at least one test image prior to processing it with the provided machine learning model (Zadeh ¶¶1802 discussed above).

Regarding claim 15, Liu, in view of Zadeh, teaches the method of claim 13, further comprising at least one of the following steps (Note that only one of the alternative limitations is required by the claim language):
notifying a user if none of the learning structures of the ensemble of learning structures has been selected; recommending a user to provide a larger set of training images and/or improve at least one of the ground truth class labels, the ground truth locations, and the ground truth instance segmentation labels in respect of defects contained in the images of the image dataset, provided that the prediction score is above the predetermined threshold score and below a predetermined target score; and modifying the feature extractor module of at least one learning structure of the ensemble of learning structures, provided the prediction score corresponding to the at least one learning structure is smaller than the predetermined threshold score, and retraining the at least one learning structure with the modified feature extractor module with the set of training images (Liu eq. (7) & pg. 4685 right column: “Classk, Confk are the prediction class and its confidence score … Equation (7) represents the reassignment of the confidence score in the soft nonmaximum suppression (S-NMS [37]) compute, in which Si denotes the raw predicted confidence score, Nt represents the IOU threshold for doing confidence reassignment” – the prediction score is compared to a threshold and values not meeting the threshold requirement are reassigned).
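The three alternatives of claim 15 amount to threshold-based control flow over the per-model prediction scores. A hypothetical sketch, with all callback names and threshold values as placeholders rather than anything from the claims or the references:

```python
def post_validation_actions(members, scores, threshold, target,
                            notify_user, recommend_better_data, modify_extractor):
    """Mirror the three alternatives of claim 15: notify when nothing is
    selected; recommend more/better training data for scores between the
    threshold and the target; modify and retrain weak feature extractors."""
    selected = [m for m, s in zip(members, scores) if s > threshold]
    if not selected:
        notify_user("No learning structure exceeded the threshold score.")
    for member, score in zip(members, scores):
        if threshold < score < target:
            recommend_better_data(member)   # larger set and/or improved labels
        elif score < threshold:
            modify_extractor(member)        # then retrain on the training set
    return selected
```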

Regarding claim 16, Liu, in view of Zadeh, teaches the method of claim 13, wherein processing at least one test image comprises uploading the at least one test image from a local client unit to a central server unit, applying the provided machine learning model, stored on the server unit, to the at least one uploaded test image, and sending at least predictions about defect localizations, defect classes, and defect instance segmentation masks in said at least one test image from the server unit back to the local client unit (Zadeh ¶¶1986: “the reader/renderer sends information to QStore or a server, when for example, the user enters annotation on a resource such as a portion of the image … a local service or process running on the user's device provide a local QStore or Z-web on the user's device, e.g., giving local access to the user's auto-annotated photo albums, using other database (e.g., email or contact) to automatically build the relationship links … the local QStore or Z-web may be synchronized with those on the network (or Cloud)”; Zadeh ¶¶2782: “a user's computing device sends or uploads an image to a server (e.g., a merchant server or website). In one embodiment, the user captures the image via built-in camera on the computing device (e.g., a mobile device) or from an album repository on the device. In one embodiment, the user via the computing device provides a URI for the image (e.g., residing in a cloud or network) to the server and the image is uploaded to the server based on the URI. In one embodiment, the image includes meta tags (e.g., GPS information, time/date, camera information, and/or annotations) or such information is uploaded/pulled/pushed separately to the server. In one embodiment, the server transmits the image (and/or meta data) to an analyzer and search platform (server) to determine the features of the image and find a match to the image based on those features (and/or the meta data and/or other criteria) with catalog items in the same or other merchants' catalogs”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the deep ensemble learning method (as taught by Liu) to transmit data (as taught by Zadeh) because the combination enables mobile applications (Zadeh ¶¶2783).
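To make the client-server division of claim 16 concrete, a hedged client-side sketch using the `requests` library; the endpoint URL and the response fields are hypothetical:

```python
import requests

def inspect_remote(image_path, server_url="https://example.com/api/inspect"):
    # Upload the test image to the central server, which applies the stored
    # machine learning model and returns its predictions.
    with open(image_path, "rb") as f:
        resp = requests.post(server_url, files={"image": f}, timeout=60)
    resp.raise_for_status()
    result = resp.json()  # assumed keys; a real API would define its own
    return result["localizations"], result["classes"], result["masks"]
```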

Regarding claim 17, Liu, in view of Zadeh, teaches the method of claim 14, wherein processing at least one test image comprises uploading the at least one test image from a local client unit to a central server unit, applying the provided machine learning model, stored on the server unit, to the at least one uploaded test image, and sending at least predictions about defect localizations, defect classes, and defect instance segmentation masks in said at least one test image from the server unit back to the local client unit (Zadeh ¶¶1986 & ¶¶2782 discussed above).

Regarding claim 19, Liu, in view of Zadeh, teaches a data processing device comprising a processor configured to perform the method of claim 1 (Liu pg. 4683 right column: “The experimental environment for training and validation of the proposed model is as follows: deep learning framework: Caffe, Keras+Tensorflow, Windows10, Intel Xeon CPU Silver 4114 clocked at 2.20 GHZ, 64-GB RAM, and TITANXP GPU with 12-GB memory”; refer to the rejection of claim 1 discussed above). 

Regarding claim 20, Liu, in view of Zadeh, teaches a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the steps of claim 1 (Liu pg. 4683 right column discussed above; refer to the rejection of claim 1 discussed above).

Claim(s) 6 and 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (IEEE Transactions on Instrumentation and Measurement, Volume: 69, Issue: 7, July 2020, published 09 December 2019), in view of Zadeh et al. (US 2018/0204111 A1), and further in view of Solovyev et al. (“Weighted boxes fusion: Ensembling boxes from different object detection models,” arXiv:1910.13302v3 [cs.CV], 6 Feb 2021), hereinafter referred to as Liu, Zadeh, and Solovyev, respectively.
Regarding claim 6, Liu, in view of Zadeh, teaches the method of claim 5, wherein the defect location corresponds to a bounding box for the defect. However, Liu, in view of Zadeh, does not appear to explicitly teach that the ensemble voting structure is configured to perform weighted box fusion (WBF) with respect to predictions about the defect bounding boxes.
Pertaining to the same field of endeavor, Solovyev teaches performing weighted box fusion (WBF) with respect to predictions about the defect bounding boxes (Solovyev Abstract: “In this work, we present a novel method for combining predictions of object detection models: weighted boxes fusion. Our algorithm utilizes confidence scores of all proposed bounding boxes to construct the averaged boxes”; Solovyev pg. 1 right column: “we propose a novel Weighted Boxes Fusion (WBF) method for combining predictions of object detection models … uses confidence scores of all proposed bounding boxes to construct the average boxes”).
Liu, in view of Zadeh, and Solovyev are considered to be analogous art because they are directed to object detection. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the selective deep ensemble learning method (as taught by Liu, in view of Zadeh) to perform WBF (as taught by Solovyev) because the combination improves the quality of combined predicted rectangles (Solovyev pg. 1 right column).
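A simplified, illustrative rendering of the weighted-boxes-fusion idea: overlapping boxes from different models are clustered and each cluster is replaced by a confidence-weighted average box. The published algorithm of Solovyev et al. additionally rescales fused scores by the number of contributing models, which this sketch omits:

```python
import numpy as np

def box_iou(a, b):
    # IoU of two (x1, y1, x2, y2) boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def weighted_boxes_fusion(boxes, scores, iou_threshold=0.55):
    """Cluster overlapping predictions and replace each cluster with a single
    box whose coordinates are the confidence-weighted average of its members."""
    clusters = []  # each cluster: list of (box, score) pairs
    for i in np.argsort(scores)[::-1]:
        box, score = boxes[i], scores[i]
        for cluster in clusters:
            fused = np.average([b for b, _ in cluster], axis=0,
                               weights=[s for _, s in cluster])
            if box_iou(fused, box) > iou_threshold:
                cluster.append((box, score))
                break
        else:
            clusters.append([(box, score)])
    fused_boxes = np.array([np.average([b for b, _ in c], axis=0,
                                       weights=[s for _, s in c])
                            for c in clusters])
    fused_scores = np.array([np.mean([s for _, s in c]) for c in clusters])
    return fused_boxes, fused_scores
```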

Regarding claim 7, Liu, in view of Zadeh, teaches the method of claim 1, wherein the defect location corresponds to a bounding box for the defect (Liu pg. 4685 left column: “The localization and classification vector, prediction of relative offset and object class based on a default bounding box … GT_Boxi is the ground truth angular point for the bounding box in the pixel coordinate”).
However, Liu, in view of Zadeh, does not appear to explicitly teach that the ensemble voting structure is configured to perform weighted box fusion (WBF) with respect to predictions about the defect bounding boxes. 
Pertaining to the same field of endeavor, Solovyev teaches that the defect location corresponds to a bounding box for the defect and the ensemble voting structure is configured to perform weighted box fusion (WBF) with respect to predictions about the defect bounding boxes (Solovyev Abstract & pg. 1 right column discussed above).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the selective deep ensemble learning method (as taught by Liu, in view of Zadeh) to perform WBF (as taught by Solovyev) because the combination improves the quality of combined predicted rectangles (Solovyev pg. 1 right column).

Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (IEEE Transactions on Instrumentation and Measurement, Volume: 69, Issue: 7, July 2020, published 09 December 2019), in view of Zadeh et al. (US 2018/0204111 A1), Solovyev et al. (arXiv:1910.13302v3 [cs.CV], 6 Feb 2021), and further in view of Su et al. (US 2019/0147127 A1), hereinafter referred to as Liu, Zadeh, Solovyev, and Su, respectively.
Regarding claim 8, Liu, in view of Zadeh and Solovyev, teaches the method of claim 7, but does not appear to explicitly teach that the defects are lithography defects of a resist mask and the image data comprises scanning electron microscopy images of said resist mask. 
Pertaining to the same field of endeavor, Su teaches that the defects are lithography defects of a resist mask and the image data comprises scanning electron microscopy images of said resist mask (Su ¶¶0034: “The measuring device may comprise an optical measurement device configured to measure a physical parameter of the substrate, such as a scatterometer, a scanning electron microscope, etc.”; Su ¶¶0035: “A defect can be in a resist image or an etch image (i.e., a pattern transferred to a layer of the substrate by etching using the resist thereon as a mask) … a portion or a characteristic of the image is calculated, and one or more defects or hot spots are identified based on the portion or the characteristic”; Su ¶¶0046: “Exemplary models of supervised learning include decision trees, ensembles (bagging, boosting, random forest), k-NN, linear regression, naive Bayes, neural networks, logistic regression, perceptron, support vector machine (SVM), relevance vector machine (RVM), and/or deep learning”; Su ¶¶0124: “proximity effects may arise from diffusion and other chemical effects during post-exposure bake (PEB), resist development, and etching that generally follow lithography”).
Liu, in view of Zadeh and Solovyev, and Su are considered to be analogous art because they are directed to defect detection. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the selective deep ensemble learning method using WBF (as taught by Liu, in view of Zadeh and Solovyev) to use SEM to image and detect defects in a lithography resist mask (as taught by Su) because SEM imaging can detect nanostructures (Su Fig. 10).

Claim(s) 9, 10, 11, and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (IEEE Transactions on Instrumentation and Measurement, Volume: 69, Issue: 7, July 2020, published 09 December 2019), in view of Zadeh et al. (US 2018/0204111 A1), and further in view of Su et al. (US 2019/0147127 A1), hereinafter referred to as Liu, Zadeh, and Su, respectively.
Regarding claim 9, Liu, in view of Zadeh, teaches the method of claim 1, but does not appear to explicitly teach that the defects are lithography defects of a resist mask and the image data comprises scanning electron microscopy images of said resist mask.
Pertaining to the same field of endeavor, Su teaches that the defects are lithography defects of a resist mask and the image data comprises scanning electron microscopy images of said resist mask (Su ¶¶0034, ¶¶0035, ¶¶0046, & ¶¶0124 discussed above).
Liu, in view of Zadeh, and Su are considered to be analogous art because they are directed to defect detection. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the selective deep ensemble learning method (as taught by Liu, in view of Zadeh) to use SEM to image and detect defects in a lithography resist mask (as taught by Su) because SEM imaging can detect nanostructures (Su Fig. 10).

Regarding claim 10, Liu, in view of Zadeh and Su, teaches the method of claim 9, wherein defects include at least one of: line collapse, single line bridge, thin line bridge, or multi-line bridge (Su ¶¶0035: “a method of predicting defects or hot spots in a device manufacturing process. A defect can be a systematic defect such as necking, line pull back, line thinning, out of specification CD, overlapping and/or bridging”). 

Regarding claim 11, Liu, in view of Zadeh and Su, teaches the method of claim 9, further comprising: 
denoising the images of the image dataset (Zadeh ¶¶1802: “once we define the ‘noise’, as what the noise is in that context or environment, then we can define the filter that reduces that noise, which sets the goals or tasks for our optimization”). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the deep ensemble learning method (as taught by Liu) to denoise the images (as taught by Zadeh) because removing unwanted noise improves detection results (Zadeh ¶¶1802).

Regarding claim 18, Liu, in view of Zadeh, teaches an inspection system for detecting and classifying defects, the inspection system comprising an imaging apparatus, and a processing unit, the processing unit being configured to receive image data relating to the object under test from the imaging apparatus, wherein the processing unit is programmed to execute the method of claim 1 (Liu pg. 4683 right column discussed above; refer to the rejection of claim 1 above).
However, Liu, in view of Zadeh, does not appear to teach classifying lithography defects in resist masks of a semiconductor device under test and using a scanning electron microscope.
Pertaining to the same field of endeavor, Su teaches classifying lithography defects in resist masks of a semiconductor device under test and using a scanning electron microscope (Su ¶¶0034, ¶¶0035, ¶¶0046, & ¶¶0124 discussed above).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the selective deep ensemble learning method (as taught by Liu, in view of Zadeh) to use SEM to image and detect defects in a lithography resist mask (as taught by Su) because SEM imaging can detect nanostructures (Su Fig. 10).

Allowable Subject Matter
Claims 2-3 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter: 
Regarding claim 2, the prior art of record teaches that it was known at the time the application was filed to use the method of claim 1.
However, the prior art, alone or in combination, does not appear to teach or suggest augmenting images of the set of training images with soft-pixel segmentation labels in respect of defects that are devoid of ground truth instance segmentation labels, wherein the soft-pixel segmentation labels correspond to the instance segmentation masks predicted by the ensemble of learning structures. 

Claim 3 is objected to for the same reason as claim 2 discussed above due to dependency.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SOO J SHIN whose telephone number is (571)272-9753. The examiner can normally be reached M-F, 10-6.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella can be reached at (571)272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Soo Shin/Primary Examiner, Art Unit 2667