XAPP1206 v1.1 June 12, 2014 www.xilinx.com 1
© Copyright 2014 Xilinx, Inc. Xilinx, the Xilinx logo, Artix, ISE, Kintex, Spartan, Virtex, Vivado, Zynq, and other designated brands included herein are trademarks of Xilinx in the
United States and other countries. AMBA, AMBA Designer, ARM, ARM1176JZ-S, CoreSight, Cortex, and PrimeCell are trademarks of ARM in the EU and other countries. All
other trademarks are the property of their respective owners.
Summary
Xilinx® Zynq®-7000 All Programmable SoC is an architecture that integrates a dual-core ARM®
Cortex™-A9 processor, which is widely used in embedded products. Both ARM Cortex-A9
cores have an advanced single instruction, multiple data (SIMD) engine, also known as NEON.
It is specialized for parallel data computation on large data sets. This document explains how to
use NEON to improve software performance and cache efficiency, thus improving NEON
performance generally.
Introduction
Generally speaking, a CPU executes instructions and processes data one by one. Typically,
high performance is achieved using high clock frequencies, but semiconductor technology
imposes limits on this. Parallel computation is the next strategy typically employed to improve
CPU data processing capability. The SIMD technique allows multiple data elements to be
processed in one or just a few CPU cycles. NEON is the SIMD implementation in ARMv7-A
processors. Effective use of NEON can produce significant software performance improvements.
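The difference between one-by-one processing and SIMD processing can be sketched in C. The following is a minimal illustration, not code from this application note: the function names add_scalar and add_simd are hypothetical, and the NEON path is compiled only when the compiler defines __ARM_NEON (or __ARM_NEON__), so the same file also builds on a non-ARM host using the scalar fallback.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Scalar reference: one 32-bit add per loop iteration. */
static void add_scalar(const int32_t *a, const int32_t *b, int32_t *out, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* SIMD version: four 32-bit adds per NEON instruction where NEON is available. */
static void add_simd(const int32_t *a, const int32_t *b, int32_t *out, int n)
{
    int i = 0;
    assert(n >= 0);
#if HAVE_NEON
    for (; i + 4 <= n; i += 4) {
        int32x4_t va = vld1q_s32(a + i);        /* load 4 lanes from a */
        int32x4_t vb = vld1q_s32(b + i);        /* load 4 lanes from b */
        vst1q_s32(out + i, vaddq_s32(va, vb));  /* add lane-wise and store */
    }
#endif
    /* Leftover elements, or every element when NEON is not available. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Both functions produce the same result for any n; the SIMD version simply handles four elements per iteration in its main loop and finishes the remainder with scalar code.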
Document Content Overview
Technical information provided in this document includes:

Before You Begin: Important Concepts
This document provides the following information you need to be more effective when
optimizing your code:
  • Software Optimization Basics
  • NEON Basics

Software Performance Optimization Methods
This document describes four ways to optimize software performance with NEON:
  • Using NEON Optimized Libraries
    As Cortex-A9 prevails in embedded designs, many software libraries have been optimized
    for NEON and show performance improvements. This document lists the libraries that are
    frequently used by the community.
  • Using Compiler Automatic Vectorization
    GCC, the popular open source compiler, can generate NEON instructions with proper
    compilation options. However, the C language does not excel at expressing parallel
    computations. You might need to modify your C code to add compiler hints. Lab 1 provides
    a hands-on example.
  • Using NEON Intrinsics
    Usually, the compiler handles simple optimizations well (optimizations such as register
    allocation, instruction scheduling, etc.). However, you might need to use NEON intrinsics
    when the compiler fails to analyze and optimize more complex algorithms. Moreover, some
    NEON instructions have no equivalent C expressions, and intrinsics or assembly are the
    only options.
Application Note: Zynq-7000 AP SoC
XAPP1206 v1.1 June 12, 2014
Boost Software Performance on Zynq-7000 AP SoC with NEON
Author: Haoliang Qin
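The compiler-hint approach mentioned under Using Compiler Automatic Vectorization can be illustrated with a small loop. This is a sketch under stated assumptions rather than the code used in Lab 1: the function name scale_add, the toolchain prefix, and the file name in the build line are hypothetical. The restrict qualifier is one possible hint; it tells the compiler that dst and src do not overlap, which removes an aliasing concern that can otherwise prevent vectorization. Integer data is used because GCC generally does not auto-vectorize floating-point code for ARMv7 NEON unless relaxed floating-point options are also given.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed build line (not taken from the application note):
 *   arm-linux-gnueabihf-gcc -O3 -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=hard -c scale_add.c
 * -O3 turns on -ftree-vectorize in GCC; -mfpu=neon makes NEON instructions available.
 */
void scale_add(int32_t *restrict dst, const int32_t *restrict src, int32_t k, int n)
{
    assert(n >= 0);
    /* Plain C loop: no NEON-specific code, so it stays portable. */
    for (int i = 0; i < n; i++)
        dst[i] += k * src[i];
}
```

Because the source remains standard C, the same function builds and gives the same results on any platform; whether NEON instructions are actually emitted depends on the compiler version and options, and is best confirmed by inspecting the generated assembly.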