4. Python coremltools to convert Keras to Core ML

Apple’s coremltools converts existing Keras models to Core ML. It worked well, with one exception: arbitrarily sized (i.e. not fixed-size) image inputs.

The top variable-sized layer had to be replaced with a fixed-size layer to work better with Core ML. Variable-sized input has since come to Core ML, but support and documentation still have a ways to go.
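As an illustration, here is a minimal Swift sketch of how that fixed-size constraint shows up on the app side once the converted model is loaded; the model file name and the "image" input name are assumptions, not the project's actual identifiers:

import CoreML

// Load the compiled Core ML model from the app bundle (names are hypothetical).
let modelURL = Bundle.main.url(forResource: "CrowdClassifier", withExtension: "mlmodelc")!
let model = try! MLModel(contentsOf: modelURL)

// The Keras model accepted arbitrary image sizes; the converted model is pinned to one.
if let constraint = model.modelDescription.inputDescriptionsByName["image"]?.imageConstraint {
    print("Expected input: \(constraint.pixelsWide) x \(constraint.pixelsHigh)")
}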


Now that we have our Core ML pipeline, a crowd classifier feeding into a crowd predictor, let’s see it in action.

5. Why an Xcode Playground for Core ML became a macOS App (Playgrounds are too brittle)

I was under the impression that the fastest way to a functional prototype was an Xcode Playground. Inspired by Create ML, an Xcode playground promised a quick and easy way to prototype the iOS application.

When it works, it’s great.

A significant downside to playgrounds, however, is that a playground cannot be an Xcode target, which means it cannot have linker and build dependencies. If you use CocoaPods or Carthage for third-party library management, this will break you, and you'll end up with spurious errors like the following:

error: Couldn't lookup symbols:
  type metadata for CrowdCountApiMac.FriendlyClassification
  ...
  __swift_FORCE_LOAD_$_swiftCoreMedia
  __swift_FORCE_LOAD_$_swiftCoreAudio
  CrowdCountApiMac.FriendlyPredictor.DensityMapWidth.unsafeMutableAddressor : Swift.Int
  ...

The good news is that porting over to a macOS application is relatively straightforward and will fix all of this. A macOS app is an Xcode target and can therefore link frameworks and binaries, allowing reliable compilation and linking.

6. Swift Architecture: RxSwift and MVVM

RxSwift drove out an MVVM (Model-View-ViewModel) architecture, where Apple’s ViewControllers act as the View, so it's really M(VC)VM. These ViewControllers stay lean, merely driving the View and gluing it to the ViewModel.
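As a rough sketch of what that looks like (the type names and bindings here are illustrative, not the project's actual API):

import Cocoa
import RxSwift

// ViewModel: owns the logic and exposes an observable stream for the View to bind to.
final class CrowdViewModel {
    let frames = PublishSubject<NSImage>()
    let predictionText: Observable<String>

    init(predictor: @escaping (NSImage) -> String) {
        predictionText = frames.map(predictor)
    }
}

// ViewController: lean glue code that binds the ViewModel's output to the UI.
final class CrowdViewController: NSViewController {
    @IBOutlet private var predictionLabel: NSTextField!
    private let disposeBag = DisposeBag()
    var viewModel: CrowdViewModel!

    override func viewDidLoad() {
        super.viewDidLoad()
        viewModel.predictionText
            .observeOn(MainScheduler.instance)
            .subscribe(onNext: { [weak self] text in
                self?.predictionLabel.stringValue = text
            })
            .disposed(by: disposeBag)
    }
}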

There are many articles on this topic. Feel free to google for more information.

7. Performance: Better than expected

While extracting frames from the camera in real time, classification took ~100ms and crowd counting took ~3 seconds. Good to see the iPhone X's GPU being put to good use.

Word of advice: do not roll your own image and MLMultiArray manipulation. Use Apple’s Vision API, such as VNImageRequestHandler, which makes better use of the hardware.
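For example, here is a sketch of classifying a camera frame through Vision instead of hand-rolling the preprocessing; the model file name is an assumption:

import CoreML
import CoreVideo
import Vision

// Wrap the compiled Core ML model for Vision (model name is hypothetical).
let modelURL = Bundle.main.url(forResource: "CrowdClassifier", withExtension: "mlmodelc")!
let visionModel = try! VNCoreMLModel(for: MLModel(contentsOf: modelURL))

// Vision scales and crops the pixel buffer to the model's fixed input size for you.
let request = VNCoreMLRequest(model: visionModel) { request, _ in
    guard let top = (request.results as? [VNClassificationObservation])?.first else { return }
    print("\(top.identifier): \(top.confidence)")
}
request.imageCropAndScaleOption = .scaleFill

// Called with each frame from the camera capture callback.
func classify(_ pixelBuffer: CVPixelBuffer) {
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}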

8. Promising Future for Edge AI

The ability to run sophisticated neural networks, in this case a multi-column CNN, on your iPhone rather than in the cloud, and in real time, is a watershed moment.

Sure, this already existed in applications like Prisma and Google Translate, but the ease of development will open up many more opportunities.

With millions of iPhones in use, it'll be easier than ever to crowdsource data from your willing users to improve your model, creating a virtuous cycle that improves the product:

usage -> more data -> improved usage -> more data -> improved usage -> ...

This proof of concept shows that it’s doable, and that, in and of itself, is a milestone. Add in the performance and the iPhone’s ubiquity, and we have a promising future.

Discuss this post on Hacker News.