Text Recognition

The SAPML framework includes drop-in UI components for capturing text and displaying a preview of the text observations. FUITextRecognitionView is a subclass of UIView, which can be embedded anywhere in the view hierarchy, providing flexibility to control the frame of the view. FUITextRecognitionViewController is a convenience controller that embeds the FUITextRecognitionView in its entirety. Use this controller if it is not necessary to customize the frame of the FUITextRecognitionView.

Initialization and Configuration

Instantiate FUITextRecognitionViewController using its default initializer. The FUITextRecognitionView embedded in the controller can be accessed through the controller's recognitionView property, as shown below. Finally, present the controller.

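// Create the controller and access its embedded recognition view.
let textRecController = FUITextRecognitionViewController()
let recognitionView = textRecController.recognitionView
...
// Dismiss the controller when the user closes it.
textRecController.onClose = { [weak self] in
    self?.dismiss(animated: true)
}
present(UINavigationController(rootViewController: textRecController), animated: true)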

Customize the appearance of the FUITextRecognitionView by changing its captureMaskPath or the properties of its overlayView.

recognitionView.captureMaskPath = {
    let path = UIBezierPath(roundedRect: CGRect(x: 0, y: 0, width: 350, height: 220), cornerRadius: 4)
    path.lineWidth = 1
    return path
}()

recognitionView.overlayView.strokeColor = UIColor.white.cgColor
recognitionView.overlayView.backgroundColor = UIColor.black.withAlphaComponent(0.20).cgColor

Implementing observationHandler

FUITextRecognitionView is a general-purpose text recognizer that returns observations by calling observationHandler. The observations contain all of the text recognized within captureMaskPath, but the desired text is usually only a subset of them. Write the filtering code in observationHandler. The following example uses NSDataDetector to detect phone numbers in the observations. NSRegularExpression can also be used to run a custom regular expression.

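// Detector that finds phone numbers in the recognized text.
let detector = try! NSDataDetector(types: NSTextCheckingResult.CheckingType.phoneNumber.rawValue)
// Weak captures avoid a retain cycle between the view and its handler.
recognitionView.observationHandler = { [weak self, weak recognitionView] observations in

    // Run the detector on the observations.
    let matches = detector.matches(in: observations)

    // Show the recognized text over the video as feedback to the user.
    recognitionView?.showTexts(for: observations, with: matches)

    for match in matches where match.resultType == .phoneNumber {
        guard let phoneNumber = match.phoneNumber else { continue }
        // Brief delay before dismissing, so the displayed text stays visible for a moment.
        DispatchQueue.main.asyncAfter(deadline: .now() + 0.8) {
            self?.textField.value = phoneNumber
            self?.dismiss(animated: true)
        }
        // A phone number was found; report success.
        return true
    }

    return false
}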
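
For the NSRegularExpression route mentioned above, a minimal sketch might look like the following. It uses only Foundation; the helper name firstEmail(in:), the deliberately simple email pattern, and the sample strings are illustrative assumptions, not part of the SAPML API. Plain strings stand in for the text of the recognized observations.

```swift
import Foundation

// Hypothetical helper: returns the first substring of `text` that looks like an email address.
func firstEmail(in text: String) -> String? {
    // Deliberately simple pattern for illustration; it does not cover every valid address.
    let pattern = "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}"
    guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
        return nil
    }
    // NSRegularExpression works with NSRange, so convert from the Swift string range.
    let range = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: range),
          let matchRange = Range(match.range, in: text) else {
        return nil
    }
    return String(text[matchRange])
}

// Sample strings standing in for the text of recognized observations.
let recognizedTexts = ["Jane Doe", "Sales Manager", "jane.doe@example.com", "+1 555 0100"]
let email = recognizedTexts.compactMap { firstEmail(in: $0) }.first
print(email ?? "no match") // prints "jane.doe@example.com"
```

Inside observationHandler, the same helper could be applied to the text of each observation, returning true once a match is found.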

Alternatively, SAPMLTextObservationTopology can be used to get adjacent observations. Refer to Observation Topology for more information.

Using a Custom MLModel

A custom VNCoreMLRequest can be performed on the captured video frames by assigning it to the requests property. In that case observationHandler is not called. Handle the observations in the completionHandler passed to VNCoreMLRequest instead.

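let mlmodel = <#custom model#>
// Bail out if the model cannot be wrapped for Vision, rather than force-unwrapping.
guard let vnmodel = try? VNCoreMLModel(for: mlmodel) else { return }
let request = VNCoreMLRequest(model: vnmodel, completionHandler: <#observation handler#>)

// The view performs these requests on the captured video frames.
recognitionView.requests = [request]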
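
As a rough illustration of such a completion handler, the sketch below assumes the custom model is a classifier, so that Vision delivers VNClassificationObservation results; other model types produce other observation classes. The function name handleModelResults and the 0.8 confidence threshold are hypothetical choices, not part of the SAPML API.

```swift
import Vision

// Hypothetical handler; its signature matches VNRequestCompletionHandler.
func handleModelResults(request: VNRequest, error: Error?) {
    if let error = error {
        print("Vision request failed: \(error.localizedDescription)")
        return
    }
    // Assumption: the model is a classifier. Other models yield other VNObservation subclasses.
    guard let results = request.results as? [VNClassificationObservation] else { return }
    // Keep only reasonably confident labels (0.8 is an arbitrary threshold).
    let confident = results.filter { $0.confidence >= 0.8 }
    // Vision may call the handler off the main thread, so hop to main before touching UI.
    DispatchQueue.main.async {
        print(confident.map { "\($0.identifier): \($0.confidence)" })
    }
}
```

The function could then be passed as the completionHandler argument when creating the VNCoreMLRequest.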
  • FUITextRecognitionView performs text recognition using the device camera. Video captured from the camera is shown across the view's entire bounds. Use captureMaskPath to define the sub-frame of the view in which text is recognized. If captureMaskPath is set to nil, text is recognized across the full screen. The rest of the frame is dimmed by overlayView; this default behavior can be changed by setting the view and layer properties of overlayView. All observations, that is, the recognized text blocks and their bounding boxes, are passed to observationHandler. To show the user which text was recognized, call showTexts to display the observations on top of the captured video.

    Call startSession to start capturing video. This is normally done when the view appears. Because text recognition models are memory- and CPU-intensive, recognition does not run until the user has stabilized the device camera. After that, observationHandler is called with the observations for every video frame. Use observationHandler to validate and filter the observations. For example, to capture a credit card number, accept the observations only if they contain 16 numeric digits in total and discard any others. Once the filtered observations meet the desired criteria, call stopSession to stop capturing video. Also make sure to call stopSession when the view disappears.

    When initializing FUITextRecognitionView, you can specify a style. The default is the singleField style, which provides a default-sized capture box for capturing the information of a single field (for example, a phone number or email address). The other is the multi-field style, which provides a large capture box for capturing the information of multiple fields (for example, a business card).

    Example Initialization and Configuration

    
    let recognitionView = FUITextRecognitionView()
    
    // The weak capture avoids a retain cycle between the view and its handler.
    recognitionView.observationHandler = { [weak recognitionView] observations in
    
        let filteredObservations = <#filter out unwanted text#>
    
        // Show the text over the captured video so the user can see what was scanned.
        recognitionView?.showTexts(for: filteredObservations)
    
        let areObservationsValid = <#filteredObservations meets desired criteria#>
    
        if areObservationsValid {
            DispatchQueue.main.async {
                // Place the captured text in the text field.
                textField.text = filteredObservations.map { $0.value }.joined(separator: " ")
            }
            // Returning true stops the session automatically; observationHandler is no longer called.
            return true
        }
        return false
    }
    
    // Start the session to capture frames from the back camera.
    recognitionView.startSession()
    
    
    Declaration

    Swift

    open class FUITextRecognitionView : UIView
  • FUITextRecognitionViewController is a convenience controller that embeds a FUITextRecognitionView, which captures video from the device's back camera and performs text recognition. For more examples of how to configure text recognition, see the FUITextRecognitionView documentation.

    Example Initialization and Configuration

    
    let textRecController = FUITextRecognitionViewController()
    // Phone number detector used to filter the observations.
    let detector = try! NSDataDetector(types: NSTextCheckingResult.CheckingType.phoneNumber.rawValue)
    
    var frameCount = 10 // capture the observations after 10 frames
    // Weak captures avoid a retain cycle between the controller and its handler.
    textRecController.recognitionView.observationHandler = { [weak self, weak textRecController] observations in
        // Use the NSDataDetector extension to run matches on the observations.
        let matches = detector.matches(in: observations)
        textRecController?.recognitionView.showTexts(for: observations, with: matches)
        frameCount -= 1
        if frameCount <= 0 {
            DispatchQueue.main.async {
                // Keep the last phone number found, if any.
                var phoneNumber: String?
                for match in matches where match.resultType == .phoneNumber {
                    phoneNumber = match.phoneNumber
                }
                self?.textView.text = phoneNumber ?? ""
                self?.dismiss(animated: true)
            }
            return true
        }
        return false
    }
    
    textRecController.onClose = { [weak self] in
        self?.dismiss(animated: true)
    }
    
    present(UINavigationController(rootViewController: textRecController), animated: true)
    
    Declaration

    Swift

    open class FUITextRecognitionViewController : UIViewController
  • Information about a detected region of text and its corresponding string value.

    Declaration

    Swift

    public class SAPMLTextObservation : VNRectangleObservation
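
The validation step described for observationHandler above, accepting a credit card number only when the observations contain 16 numeric digits in total, could be sketched as follows. The helper cardNumber(from:) is hypothetical and works on plain strings standing in for the text of the observations; it checks only the digit count and does not verify that the number is a valid card number.

```swift
import Foundation

// Hypothetical helper: joins the recognized strings and returns the digits
// if there are exactly 16 of them in total, otherwise nil.
func cardNumber(from texts: [String]) -> String? {
    // Keep ASCII digits only; spaces, dashes, and letters are dropped.
    let digits = texts.joined().filter { $0.isASCII && $0.isNumber }
    return digits.count == 16 ? digits : nil
}

// Two text blocks that together hold 16 digits.
print(cardNumber(from: ["4111 1111", "1111 1111"]) ?? "not a card number") // prints "4111111111111111"

// Only 4 digits in total, so the observations are rejected.
print(cardNumber(from: ["VALID THRU 12/27"]) ?? "not a card number") // prints "not a card number"
```

In an observationHandler, a non-nil result would be the point at which to return true, or to call stopSession, as described above.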