Machine Learning Inference Performance

AIMark 3

AIMark relies on various vendor SDKs to implement its benchmarks. This means the end results aren’t a proper apples-to-apples comparison; however, they do represent an approach that some vendors will actually use in their in-house applications, or that might even appear in the rare third-party app.

[Charts: 鲁大师 / Master Lu - AIMark 3 - InceptionV3, ResNet34, MobileNet-SSD, DeepLabV3]

Unfortunately for the Black Shark 2, the device lacked the proper drivers to run AIMark, and the app repeatedly crashed upon starting the benchmarks. We had the same issue on the OnePlus 7 Pro, which points to some software incompatibility.

AIBenchmark 3

AIBenchmark takes a different approach to benchmarking. Here the test uses the hardware-agnostic NNAPI to accelerate inferencing, meaning it doesn’t use any proprietary aspects of a given hardware platform except for the drivers that enable the abstraction between software and hardware. This approach is more apples-to-apples, but it also means we can’t do cross-platform comparisons, such as testing iPhones.
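To illustrate what a hardware-agnostic layer means in practice, here is a toy sketch of an abstraction that hands each operation to a vendor driver when one supports it and otherwise falls back to a generic CPU path. This is not NNAPI's actual API: `run_graph`, `CPU_REFERENCE`, the `dsp` driver and the two operations are all hypothetical names invented for the illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an "op" maps a list of floats to a list of floats,
# and a "driver" is simply the set of ops a backend knows how to run.
Op = Callable[[List[float]], List[float]]
Driver = Dict[str, Op]


def relu(xs: List[float]) -> List[float]:
    return [max(0.0, x) for x in xs]


def double(xs: List[float]) -> List[float]:
    return [2.0 * x for x in xs]


# Generic CPU path that supports every op (assumed for this sketch).
CPU_REFERENCE: Driver = {"relu": relu, "double": double}


def run_graph(
    ops: List[str],
    inputs: List[float],
    accelerator_drivers: Dict[str, Driver],
) -> Tuple[List[float], List[Tuple[str, str]]]:
    """Run ops in order, preferring any accelerator driver that supports them."""
    values = inputs
    placement = []
    for op in ops:
        for name, driver in accelerator_drivers.items():
            if op in driver:
                values = driver[op](values)
                placement.append((op, name))
                break
        else:
            # No accelerator driver supports this op: use the CPU path.
            values = CPU_REFERENCE[op](values)
            placement.append((op, "cpu"))
    return values, placement


# A made-up "dsp" driver that only supports relu.
result, placement = run_graph(["relu", "double"], [-1.0, 2.0], {"dsp": {"relu": relu}})
print(result)
print(placement)
```

The application code stays identical across devices; only the driver table differs, which is why driver quality ends up deciding where each workload runs.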

We’re publishing one-shot inference times. These differ from sustained-performance inference times in that they include more timing overhead from the software stack, covering everything from initialising the test to actually executing the computation.
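The distinction between one-shot and sustained timing can be sketched with a stand-in workload. This is only an illustration under assumed numbers, not how AIBenchmark measures anything: `make_model`, `init_cost_s` and `run_cost_s` are hypothetical, and `time.sleep` stands in for real initialisation and compute.

```python
import time


def make_model(init_cost_s=0.05, run_cost_s=0.002):
    """Return a stand-in 'model' whose first call pays a one-time setup cost."""
    state = {"initialised": False}

    def infer():
        if not state["initialised"]:
            time.sleep(init_cost_s)  # hypothetical: loading, compiling, driver warm-up
            state["initialised"] = True
        time.sleep(run_cost_s)  # hypothetical: the actual computation

    return infer


def time_one_shot(model_factory):
    """Time a single cold inference, setup overhead included."""
    infer = model_factory()
    start = time.perf_counter()
    infer()
    return time.perf_counter() - start


def time_sustained(model_factory, warmup=1, runs=10):
    """Time the average of repeated inferences after untimed warm-up calls."""
    infer = model_factory()
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    return (time.perf_counter() - start) / runs


one_shot = time_one_shot(make_model)
sustained = time_sustained(make_model)
print(f"one-shot:  {one_shot * 1e3:.1f} ms")
print(f"sustained: {sustained * 1e3:.1f} ms per run")
```

Because the one-shot figure folds the setup cost into a single measurement, it is more sensitive to the state of the software stack than the per-run sustained average.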

AIBenchmark 3 - NNAPI CPU

We’re separating the AIBenchmark scores by execution block, starting with the regular CPU workloads, which simply use TensorFlow libraries and do not attempt to run on specialized hardware blocks.

[Charts: AIBenchmark 3 - 1 - The Life - CPU/FP; 2 - Zoo - CPU/FP; 3 - Pioneers - CPU/INT; 4 - Let's Play - CPU/FP; 7 - Ms. Universe - CPU/FP; 7 - Ms. Universe - CPU/INT; 8 - Blur iT! - CPU/FP]

In AIBenchmark’s CPU workloads, the Black Shark 2 ends up with a bit of an odd spread of scores. In the shorter-running benchmarks the phone gets relatively average inference times, while in the longer-running tests the BS2 for some reason falls behind other S855 devices. In fact, it looks like the BS2 lands among the worst S855 devices in the later tests.

AIBenchmark 3 - NNAPI INT8

[Charts: AIBenchmark 3 - 1 - The Life - INT8; 2 - Zoo - INT8; 3 - Pioneers - INT8; 5 - Masterpiece - INT8; 6 - Cartoons - INT8]

AIBenchmark 3 - NNAPI FP16

[Charts: AIBenchmark 3 - 1 - The Life - FP16; 2 - Zoo - FP16; 3 - Pioneers - FP16; 5 - Masterpiece - FP16; 6 - Cartoons - FP16; 9 - Berlin Driving - FP16; 10 - WESPE-dn - FP16]

AIBenchmark 3 - NNAPI FP32

[Chart: AIBenchmark 3 - 10 - WESPE-dn - FP32]

In the INT8, FP16 and FP32 accelerated tests, which make use of acceleration blocks such as the Hexagon DSP and the GPU, the Black Shark 2 performs very well and in line with other Snapdragon 855 devices.

Overall, the Black Shark 2 is a good performer in the machine learning inferencing benchmarks, but like other devices, it’s not quite the very best in every regard. This suggests the vendor could have improved performance by keeping the software stack more up to date with what Qualcomm is offering, a widespread issue that I expect to persist over the next few years as the ecosystem quickly evolves.

63 Comments

  • Andrei Frumusanu - Thursday, September 26, 2019

    You don't seem to understand, the gaming mode on the BS2 makes absolutely zero difference. It still throttles to the performance levels published here.

    As for the OPPO, that phone didn't throttle in either mode so I don't understand what you're on about. The performance mode it has is not like that of other phones and it's not a natural operational mode of the phone.
  • s.yu - Thursday, September 26, 2019

    If we could achieve a new consensus on testing battery life in default mode and performance in performance mode, then I suppose something like that could be reasonable, but many other devices would have to be retested.
  • Andrei Frumusanu - Thursday, September 26, 2019

    I'm testing all devices in their performance modes both for performance and battery - the only exception to this rule until now has been the Reno as its performance mode just blatantly ignores normal DVFS operation.
  • s.yu - Thursday, September 26, 2019

    Right, that also works.
  • melgross - Thursday, September 26, 2019

    The point is. That it can’t maintain that speed without burning something out.
  • melgross - Thursday, September 26, 2019

    Anandtech is the only site that does it correctly. The others don’t do real reviews.
  • brucethemoose - Wednesday, September 25, 2019

    Or, more importantly, heating up your (non replacable) battery and degrading its useful life.
  • s.yu - Wednesday, September 25, 2019

    It should all be run in whatever default mode available out of fairness, and comparability of results.
  • Total Meltdowner - Wednesday, September 25, 2019

    The specs on these phones are all so boring.

    Give me an 8000mAH battery and a screen that isn't a downgrade from an S6 Active.

    How about some ports or SOMETHING that separates these devices from one another? 0 innovation. It's just a bunch of chinese garbage.
  • plsbugmenot - Thursday, September 26, 2019

    I think you'll enjoy the ROG Phone 2 review, then.