This article has been archived and is no longer being updated. It may not work with the most recent OS versions.

Scanner Tutorial for macOS

Jul 26 2016

Use NSScanner to analyze strings from natural form to computer languages. In this NSScanner tutorial, you’ll learn how to extract information from emails. By Hai Nguyen.

Update 9/25/16: This tutorial has been updated for Xcode 8 and Swift 3.

Update note: This tutorial has been updated to Swift by Hai Nguyen. The original tutorial was written by Vincent Ngo.



In these days of big data, data is stored in a multitude of formats, which poses a challenge to anyone trying to consolidate and make sense of it. If you’re lucky, the data will be in an organized, hierarchical format such as JSON, XML, or CSV. Otherwise, you might have to struggle with endless if/else cases. Either way, manually extracting data is no fun.

Thankfully, Apple provides a set of tools, such as NSRegularExpression, NSDataDetector and Scanner, that you can use to analyze string data in any form, from natural to computer languages. Each has its own advantages, but Scanner is by far the easiest to use while remaining powerful and flexible. In this tutorial, you’ll learn how to extract information from email messages with its methods in order to build a macOS application that works like Apple Mail’s interface, as shown.

Completed-Final-Screen

Although you’ll be building an app for Mac, Scanner is also available on iOS. By the end of this tutorial, you will be ready to parse text on either platform.

Before getting started, let’s first see what Scanner is capable of!

Scanner Overview

Scanner’s main job is to retrieve and interpret substrings and numeric values from a string.

For example, Scanner can analyze a phone number and break it down into components like this:

import Foundation

// 1.
let hyphen = CharacterSet(charactersIn: "-")

// 2.
let scanner = Scanner(string: "123-456-7890")
scanner.charactersToBeSkipped = hyphen

// 3.
var areaCode, firstThreeDigits, lastFourDigits: NSString?

scanner.scanUpToCharacters(from: hyphen, into: &areaCode)          // A
scanner.scanUpToCharacters(from: hyphen, into: &firstThreeDigits)  // B
scanner.scanUpToCharacters(from: hyphen, into: &lastFourDigits)    // C

print(areaCode!, firstThreeDigits!, lastFourDigits!)
// 123 - area code
// 456 - first three digits
// 7890 - last four digits

Here’s what this code does:

  1. Creates an instance of CharacterSet named hyphen. This will be used as the separator between string components.
  2. Initializes a Scanner object and changes its charactersToBeSkipped default value (whitespace and newline characters) to hyphen, so the returned strings will NOT include any hyphens.
  3. areaCode, firstThreeDigits and lastFourDigits will store the parsed values that you get back from the scanner. Since you cannot pass a native Swift String directly to an AutoreleasingUnsafeMutablePointer parameter, you have to declare these variables as optional NSString objects in order to pass them into the scanner’s methods.
    A. Scans up to the first - character and assigns the characters in front of it to areaCode.
    B. Continues scanning up to the second - and puts the next three digits into firstThreeDigits. Before you invoke scanUpToCharacters(from:into:), the scanner’s reading cursor is at the position of the first -. With the hyphen skipped, you get the phone number’s second component.
    C. Skips the second -. With no hyphen left, the scanner reads the rest of the string, puts it into lastFourDigits and returns a successful status.
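The same steps generalize to any number of hyphen-separated components. The following is a minimal sketch, not code from the tutorial project: hyphenSeparatedComponents(of:) is a hypothetical helper that repeats the scanUpToCharacters(from:into:) call until the scanner reports isAtEnd, and checks the Bool result of each scan instead of force-unwrapping.

```swift
import Foundation

// Hypothetical helper (not part of the starter project): splits a string on
// hyphens by repeating the scan shown in the phone number example.
func hyphenSeparatedComponents(of string: String) -> [String] {
  let hyphen = CharacterSet(charactersIn: "-")
  let scanner = Scanner(string: string)
  scanner.charactersToBeSkipped = hyphen   // hyphens are skipped before each scan

  var components: [String] = []
  while !scanner.isAtEnd {
    var piece: NSString?
    // Each call returns false if nothing was scanned; stop instead of looping forever.
    guard scanner.scanUpToCharacters(from: hyphen, into: &piece),
          let scanned = piece else { break }
    components.append(scanned as String)
  }
  return components
}

print(hyphenSeparatedComponents(of: "123-456-7890"))
// ["123", "456", "7890"]
```

Because the skipped characters are dropped before every scan, repeated or trailing hyphens do not produce empty components in this sketch.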

That’s all Scanner does. It’s that easy! Now, it’s time to get your application started!

Getting Started

Download the starter project and extract the contents of the ZIP file. Open EmailParser.xcodeproj in Xcode.

You’ll find the following:

  • DataSource.swift contains a pre-made structure that sets up the data source/delegate to populate a table view.
  • PostCell.swift contains all the properties that you need to display each individual data item.
  • Support/Main.storyboard contains a TableView with a custom cell on the left-hand side and a TextView on the other.

You’ll be parsing the data of 49 sample files in the comp.sys.mac.hardware folder. Take a minute to browse through them to see how they’re structured. You’ll be collecting items like Name, Email, and so on into a table so that they are easy to see at a glance.

Note: The starter project uses table views to present the data, so if you’re unfamiliar with table views, check out our macOS NSTableView Tutorial.

Build and run the project to see it in action.

Starter-Initial-Screen

The table view currently displays placeholder labels with a [Field]Value prefix. By the end of the tutorial, those will be replaced with parsed data.

Understanding the Structure of Raw Samples

Before diving straight into parsing, it’s important to understand what you’re trying to achieve. Below is one of the sample files, with the data items you’ll be retrieving highlighted.

Data-Structure-Illustration

In summary, these data items are:

  • From field: this consists of the sender’s name and email. Parsing it can be tricky since the name may come before the email or vice versa; it might even contain one piece but not the other.
  • Subject, Date, Organization and Lines fields: each has a name and a value separated by a colon.
  • Message segment: this can contain cost information and some of the following keywords: apple, macs, software, keyboard, printer, video, monitor, laser, scanner, disks, cost, price, floppy, card, and phone.
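To make the colon-separated fields concrete, here is a minimal sketch of how one such line could be split with Scanner. It is an illustration under stated assumptions, not the tutorial’s implementation: parseField(_:) is a hypothetical helper, and the sample line is made up rather than taken from the sample files.

```swift
import Foundation

// Hypothetical helper: splits one "Name: value" header line at the FIRST colon,
// so values that contain colons themselves (such as "Re: ...") stay intact.
func parseField(_ line: String) -> (name: String, value: String)? {
  let scanner = Scanner(string: line)
  var name: NSString?
  var value: NSString?

  guard scanner.scanUpTo(":", into: &name),                         // text before the first colon
        scanner.scanString(":", into: nil),                         // consume the colon itself
        scanner.scanUpToCharacters(from: .newlines, into: &value),  // rest of the line
        let fieldName = name, let fieldValue = value
  else { return nil }

  return (fieldName as String, fieldValue as String)
}

// Made-up sample line for illustration.
if let field = parseField("Subject: Re: Monitor question") {
  print(field.name)   // Subject
  print(field.value)  // Re: Monitor question
}
```

The scanner’s default charactersToBeSkipped (whitespace and newlines) removes the space after the colon, so the value starts at its first visible character. A line without a colon makes the second scan fail, and the sketch returns nil.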

Scanner is awesome; however, working with it can feel a bit cumbersome and far less “Swifty”, so you’ll wrap built-in methods like the one in the phone number example above in ones that return optionals.

Navigate to File\New\File… (or simply press Command+N). Select macOS > Source > Swift File and click Next. Set the file’s name to Scanner+.swift, then click Create.

Open Scanner+.swift and add the following extension:

extension Scanner {

  func scanUpToCharactersFrom(_ set: CharacterSet) -> String? {
    var result: NSString?                                                           // 1.
    return scanUpToCharacters(from: set, into: &result) ? (result as? String) : nil // 2.
  }

  func scanUpTo(_ string: String) -> String? {
    var result: NSString?
    return self.scanUpTo(string, into: &result) ? (result as? String) : nil
  }

  func scanDouble() -> Double? {
    var double: Double = 0
    return scanDouble(&double) ? double : nil
  }
}

These helper methods encapsulate some of the Scanner methods you’ll use in this tutorial so that they return an optional value (a String for the first two, a Double for the last) instead of a Bool. These three methods share the same structure:

  1. Defines a result variable to hold the value returned by the scanner.
  2. Uses a ternary operator to check whether the scan is successful. If it is, converts result to the return type (String here) and returns it; otherwise simply returns nil.
Note: You can wrap other Scanner methods the same way and add them to your arsenal:
  • scanDecimal(_:)
  • scanFloat(_:)
  • scanHexDouble(_:)
  • scanHexFloat(_:)
  • scanHexInt32(_:)
  • scanHexInt64(_:)
  • scanInt(_:)
  • scanInt32(_:)
  • scanInt64(_:)
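As one illustration of that note, a wrapper around scanInt(_:) could follow the same pattern as scanDouble() above. This is only a sketch: the name scanIntValue() is hypothetical, chosen so it cannot be confused with any scanInt method Foundation itself provides, and the sample strings are made up.

```swift
import Foundation

extension Scanner {
  // Hypothetical wrapper following the same structure as scanDouble() above:
  // declare a result variable, then return it only if the scan succeeds.
  func scanIntValue() -> Int? {
    var value = 0
    return scanInt(&value) ? value : nil
  }
}

// Made-up sample strings for illustration.
let withNumber = Scanner(string: "22 lines")
let withoutNumber = Scanner(string: "lines")

print(withNumber.scanIntValue() as Any)     // Optional(22)
print(withoutNumber.scanIntValue() as Any)  // nil
```

Returning nil on failure lets callers use if let or guard let instead of checking a Bool and reading a separate variable.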

Simple, right? Now go back to the main project and start parsing!

This content was released on Jul 26 2016. The official support period is 6-months from this date.
