Migrating to QNX Neutrino for ARMv6-Processor-Based Boards

Overview

This technote describes how to set up QNX Neutrino on boards that use an ARMv6 processor.

You must use procnto-v6, the QNX Neutrino microkernel for ARMv6 processors.

The support for ARMv6 architecture processors (ARM11, OMAP2) is now provided by:

the libstartup.a library, which initializes the ARMv6 MMU features
the QNX Neutrino microkernel procnto-v6, which makes use of the ARMv6 MMU

The procnto-v6 microkernel takes advantage of the ARMv6 MMU's physically-tagged cache to remove the 32 MB address space restriction imposed by the previous ARM MMU architecture. Note the following:

The per-process virtual address space is now 2 GB.
There is no special global memory region, so the shm_ctl() function no longer has any special ARM-specific behavior when using procnto-v6.

The default configuration for qcc still causes the compiler to generate ARMv4 instructions. This ensures that all code compiled with this configuration can run on any supported ARM processor.

Additional compiler flags are required to instruct the compiler to generate ARMv6 instructions. Using such code on non-ARMv6 processors may cause undefined instruction exceptions (generating a SIGILL signal).

BSP configuration

The libstartup CPU detection and configuration process includes the following changes since the 6.3.0 release:

armv_chip - for information about this structure, see the `armv_chip` section below.
armv_cache - for information about this structure, see the `armv_cache` section below.
armv_pte - for information about this structure, see the `armv_pte` section below.
setup() - for information about this function, see the setup() section below.

`armv_chip`

Purpose: The armv_chip structure describes the configuration for a particular CPU.

cpuid

Contains bits 15:0 of the CP15 main ID register.

The armv_list[] array defined in armv_list.c contains a list of all supported CPUs, and the arm_chip_detect() function iterates through this array to match bits 15:0 of the ID register.

A BSP can override the library's armv_list.c to provide a customized list of supported CPUs, for example to specify armv_chip structures that aren't implemented in libstartup, or to restrict the list to the processor(s) implemented by the target board.
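
The lookup described above can be sketched as follows. This is a minimal illustration, not the libstartup source: `struct chip_entry`, `chip_list`, and `chip_detect()` are hypothetical stand-ins for armv_chip, armv_list[], and arm_chip_detect(), and the cpuid values are placeholders rather than real part numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for armv_chip: only the two fields
 * needed to show the lookup. The real structure has more members. */
struct chip_entry {
    uint32_t    cpuid;  /* bits 15:0 of the CP15 main ID register */
    const char *name;   /* textual name of the processor */
};

/* Illustrative table; the cpuid values are placeholders. */
static const struct chip_entry chip_list[] = {
    { 0x1111u, "example-cpu-a" },
    { 0x2222u, "example-cpu-b" },
};

/* Returns the entry whose cpuid matches bits 15:0 of main_id,
 * or NULL if the CPU is not in the list. */
static const struct chip_entry *chip_detect(uint32_t main_id)
{
    uint32_t id = main_id & 0xffffu;

    for (size_t i = 0; i < sizeof chip_list / sizeof chip_list[0]; i++) {
        if (chip_list[i].cpuid == id) {
            return &chip_list[i];
        }
    }
    return NULL;
}
```

A BSP that restricts the list to its own processor would, in this sketch, simply supply a `chip_list` with a single entry.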

name

The textual name of the processor.

mmu_cr_set

Specifies which bits to set in the CP15 MMU control register when the MMU is enabled in vstart().

mmu_cr_clr

Specifies which bits to clear in the CP15 MMU control register when the MMU is enabled in vstart().
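
As a rough illustration of how the two masks combine, the helper below computes a new control register value from a current one. The function name `apply_mmu_cr()` is hypothetical, and the order (clear first, then set) is an assumption; this technote doesn't say how vstart() applies the masks.

```c
#include <stdint.h>

/* Hypothetical helper: derive the new CP15 control register value.
 * Assumption: bits in cr_clr are cleared first, then bits in cr_set
 * are set, so a bit present in both masks ends up set. */
static uint32_t apply_mmu_cr(uint32_t current, uint32_t cr_set, uint32_t cr_clr)
{
    return (current & ~cr_clr) | cr_set;
}
```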

cycles

The number of CPU cycles taken by the arm_cpuspeed.c calibration loop.

cache

A pointer to an armv_cache structure describing the cache configuration.

power

A pointer to the CPU-specific power callout.

If no power callout is specified, the kernel's idle loop simply busy-loops, and the sysmgr_cpumode() call fails with ENOSYS.

flush and deferred

Pointers to the CPU-specific callouts used by procnto to handle unmapping pages.

The flush callout is used to flush the cache and TLB when unmapping a page. This is called for each page in a region being unmapped.

The deferred callout is used after all pages in a region have been unmapped, and can be used to perform any actions that were not performed by the flush callout.

For example, if the MMU doesn't support flushing the instruction cache by virtual address, the deferred callout can be used to flush the instruction cache after all pages have been unmapped to reduce the cost of flushing.
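
The division of work between the two callouts can be sketched like this. The callout signatures, the `unmap_region()` driver, the counting callbacks, and the 4 KB page size are all assumptions made for illustration; the real procnto callouts are CPU-specific and are not defined in this technote.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed page size for this illustration */

/* Hypothetical callout signatures. */
typedef void (*flush_fn)(uintptr_t vaddr, void *ctx);
typedef void (*deferred_fn)(void *ctx);

/* Calls flush once per page, then deferred once for the whole region. */
static void unmap_region(uintptr_t start, size_t npages,
                         flush_fn flush, deferred_fn deferred, void *ctx)
{
    for (size_t i = 0; i < npages; i++) {
        if (flush != NULL) {
            flush(start + i * PAGE_SIZE_BYTES, ctx);
        }
    }
    if (deferred != NULL) {
        deferred(ctx);  /* e.g. one full instruction cache flush */
    }
}

/* Demo callbacks that only count how often they run. */
struct unmap_stats { size_t flushes; size_t deferred_calls; };

static void count_flush(uintptr_t vaddr, void *ctx)
{
    (void)vaddr;
    ((struct unmap_stats *)ctx)->flushes++;
}

static void count_deferred(void *ctx)
{
    ((struct unmap_stats *)ctx)->deferred_calls++;
}
```

Unmapping eight pages with these demo callbacks runs the flush step eight times and the deferred step once, which is the cost saving the deferred callout is meant to provide.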

pte

A pointer to the default page table configuration.

pte_wa

A pointer to the page table configuration for write-allocate cache behavior.

If you specify the -wa option, the pte_wa configuration is used. If the CPU does not support write-allocate caching, set this to 0, and the default pte values will be used instead.

pte_wb

A pointer to the page table configuration for write-back cache behavior.

If you specify the -wb option, the pte_wb configuration is used. If the CPU doesn't support write-back caching, set this to 0, and the default pte values will be used instead.

pte_wt

A pointer to the page table configuration for write-through cache behavior.

If you specify the -wt option, the pte_wt configuration is used. If the CPU doesn't support write-through caching, set this to 0, and the default pte values will be used instead.
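
The fallback rule shared by pte_wa, pte_wb, and pte_wt can be sketched as below. The types `struct pte_cfg` and `struct chip_ptes`, the `enum cache_opt` values, and `select_pte()` are hypothetical names used only to show the selection logic; they are not the libstartup definitions.

```c
#include <stddef.h>

struct pte_cfg { int id; };   /* hypothetical placeholder for armv_pte */

struct chip_ptes {            /* hypothetical subset of armv_chip */
    const struct pte_cfg *pte;     /* default configuration */
    const struct pte_cfg *pte_wa;  /* 0 if write-allocate is unsupported */
    const struct pte_cfg *pte_wb;  /* 0 if write-back is unsupported */
    const struct pte_cfg *pte_wt;  /* 0 if write-through is unsupported */
};

/* Which of -wa, -wb, -wt (if any) was given; names are illustrative. */
enum cache_opt { OPT_NONE, OPT_WA, OPT_WB, OPT_WT };

/* Picks the requested configuration, falling back to the default
 * when the CPU leaves the corresponding pointer set to 0. */
static const struct pte_cfg *select_pte(const struct chip_ptes *chip,
                                        enum cache_opt opt)
{
    const struct pte_cfg *chosen = NULL;

    switch (opt) {
    case OPT_WA: chosen = chip->pte_wa; break;
    case OPT_WB: chosen = chip->pte_wb; break;
    case OPT_WT: chosen = chip->pte_wt; break;
    case OPT_NONE: break;
    }
    return (chosen != NULL) ? chosen : chip->pte;
}
```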

setup

A pointer to a function that performs additional CPU-specific initialization.

`armv_cache`

Purpose: The armv_cache structure describes the CPU caches.

dcache_config: Describes the data cache. This is required only if the CPU doesn't implement the CP15 cache-type register.
If the CPU does implement the CP15 cache-type register, set this to 0 so that the startup library uses arm_add_cache() to determine the cache configuration from the CP15 cache-type register.
dcache_rtn: The callout used to manage the data cache.
icache_config: Describes the instruction cache. This is required only if the CPU doesn't implement the CP15 cache-type register.
If the CPU does implement the CP15 cache-type register, set this to 0 so that the startup library uses arm_add_cache() to determine the cache configuration from the CP15 cache-type register.
icache_rtn: The callout used to manage the instruction cache.

`armv_pte`

Purpose: The armv_pte structure describes the MMU page table encodings.

upte_ro: User mode read-only pages.
upte_rw: User mode read-write pages.
kpte_ro: Kernel mode read-only pages.
kpte_rw: Kernel mode read-write pages.
mask_nc: Non-cacheable mappings.
l1_pgtable: L2 page table pointer with L1 descriptor.
kscn_ro: Kernel mode L1 read-only section mapping.
kscn_rw: Kernel mode L1 read-write section mapping.
kscn_cb: Cacheable section mapping.

setup()

Purpose: The setup() function performs any CPU-specific initialization.

For ARMv6, the generic armv_setup_v6() function performs the initialization common to all ARMv6 CPUs:

checks for VFP (vector floating point) functionality (see the Supporting Vector Floating Point Functionality for ARM Processors technote) and sets CPU_FLAG_FPU, if necessary
sets up the MMU for procnto-v6

The armv_setup_v6() function must be called by any CPU-specific setup function for an ARMv6 CPU after it has performed its CPU-specific actions.
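
The required call order can be sketched as follows. Because this technote doesn't give the signature of armv_setup_v6(), the sketch uses a stub, `armv_setup_v6_stub()`, in its place; `my_cpu_quirks()` and `my_cpu_setup()` are likewise hypothetical names. The stubs only record the order in which they run.

```c
#include <stddef.h>

/* Records call order so the sequence can be inspected. */
enum step { STEP_CPU_SPECIFIC = 1, STEP_GENERIC_V6 = 2 };

static enum step call_log[4];
static size_t call_count;

static void log_call(enum step s)
{
    if (call_count < sizeof call_log / sizeof call_log[0]) {
        call_log[call_count++] = s;
    }
}

/* Stand-in for libstartup's armv_setup_v6(); real signature assumed. */
static void armv_setup_v6_stub(void)
{
    log_call(STEP_GENERIC_V6);
}

/* Hypothetical CPU-specific work, e.g. enabling a vendor feature. */
static void my_cpu_quirks(void)
{
    log_call(STEP_CPU_SPECIFIC);
}

/* Hypothetical function that a BSP would assign to the setup member. */
static void my_cpu_setup(void)
{
    my_cpu_quirks();        /* CPU-specific actions come first */
    armv_setup_v6_stub();   /* generic ARMv6 initialization comes last */
}
```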

Behavior of `procnto-v6` shm_ctl()

The ARMv6 procnto-v6 removes the 32 MB process address space limit:

The per-process address space is now 2 GB.
There is a limit of 256 address spaces imposed by the MMU ASID register.

The procnto-v6 microkernel doesn't implement the ARM-specific global memory region implemented by the non-ARMv6 procnto. This means that shm_ctl() no longer has any ARM-specific special behavior. With procnto-v6, shm_ctl() behaves as follows:

Objects created with shm_ctl() are always mapped into the per-process address space. These mappings are inherited across fork().
In contrast, the non-ARMv6 behavior causes some mappings to be placed in the global memory region, and these global mappings aren't inherited across fork().
The SHMCTL_PHYS flag maps the specified physical address range.
This differs from the non-ARMv6 procnto behavior, where the mapping is placed in the global address space to allow mappings that would not otherwise fit into the 32 MB per-process address space.
The SHMCTL_ANON flag maps anonymous memory. This is identical to the non-ARMv6 procnto behavior.
The SHMCTL_ANON|SHMCTL_PHYS flag combination maps physically contiguous anonymous memory. This is identical to the non-ARMv6 procnto behavior.
The SHMCTL_GLOBAL flag is ignored, since all mappings are placed in the per-process address space.
To achieve the same effect as the non-ARMv6 SHMCTL_GLOBAL, you must assign a known address to the mapping, then use MAP_FIXED to mmap() the shared memory object at the same address in each process. You can do this in either of two ways:
- The first process to map the object uses a regular mmap(), and the system assigns the virtual address. In this case, the address must be communicated in some way to the other processes that wish to map the object. These subsequent mappings must use MAP_FIXED.
- The mapping is assigned a predefined virtual address, which every process passes to mmap() with MAP_FIXED.
The SHMCTL_PRIV and SHMCTL_LOWERPROT flags are ignored.
All mappings are created in the per-process address space with user mode access protections.
For the non-ARMv6 procnto, these flags are used to optimize performance by indicating that the mapping doesn't require updating the global memory page tables on context switches to implement per-process protection.
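
The first MAP_FIXED approach described above can be sketched with two small helpers. This is a sketch under stated assumptions, not a complete program: `map_first()` and `map_at()` are hypothetical names, the descriptor `fd` is assumed to refer to an already sized shared memory object (for example one obtained from shm_open()), and how the address reaches the other processes is left out.

```c
#include <stddef.h>
#include <sys/mman.h>

/* First mapper: let the system choose the virtual address.
 * Returns MAP_FAILED on error. */
static void *map_first(int fd, size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/* Later mappers: map the same object at the address chosen above.
 * Caution: MAP_FIXED replaces anything already mapped in
 * [addr, addr + len), so the range must be known to be free
 * in the calling process. Returns MAP_FAILED on error. */
static void *map_at(int fd, void *addr, size_t len)
{
    return mmap(addr, len, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_FIXED, fd, 0);
}
```

The first process would call `map_first()` and publish the returned address, for instance in a message or in a well-known location; every other process would then pass that address to `map_at()`. With the second approach, all processes call `map_at()` with the predefined address.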

If code must run on both ARMv6 and non-ARMv6 processors, you must check the __cpu_flags value at runtime to select the correct implementation. For example:

if (__cpu_flags & ARM_CPU_FLAG_V6) {
    /*
     * Code for ARMv6 processor only
     */
} else {
    /*
     * Code for non-ARMv6 processor only
     */
}

Using ARMv6 instructions

By default, qcc generates only ARMv4 instructions. This ensures that all compiled code will run on any supported ARM processor.

The ARMv6 processor introduces a number of new instructions that may provide performance benefits for certain code. For example, DSP algorithms can take advantage of the new media instructions.

Generating these instructions requires gcc and binutils versions that support ARMv6:

gcc 4.2
binutils 2.18

There are two ways to take advantage of ARMv6 instructions:

Implement the ARMv6-specific operations in assembler files.
Modify the compiler flags so that the compiler can use ARMv6 instructions where appropriate.
If you're using the QNX recursive Makefile structure:
- To compile everything for ARMv6, add the following to the lowest-level Makefile:
```
   CCFLAGS += -march=armv6
```
- To compile only specific object files for ARMv6, add the following to the file common.mk:
```
   CCFLAGS_<name>_arm = -march=armv6
   CCFLAGS += $(CCFLAGS_$(basename $@)_$(CPU))
```
Here, <name> is the name of the target being built without its suffix (for example, foo for foo.o), which is what $(basename $@) yields.

Object files, libraries, and binaries that are compiled to use ARMv6 instructions can run only on a target with an ARMv6 CPU. Running them on a non-ARMv6 CPU causes an undefined instruction exception (delivered as a SIGILL signal) or may result in unpredictable behavior.
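
One way to make it visible in the source whether a file was built with ARMv6 code generation is to test the architecture macros that gcc predefines. The sketch below is an illustration only: `built_for_armv6()` is a hypothetical helper, and the set of macros checked is an assumption that covers common -march=armv6 variants rather than an exhaustive list.

```c
/* Returns 1 if this translation unit was compiled for an ARMv6
 * architecture variant, 0 otherwise. The macro list is an assumption
 * and may need extending for other -march values. */
static int built_for_armv6(void)
{
#if defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || \
    defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) || \
    defined(__ARM_ARCH_6ZK__)
    return 1;
#else
    return 0;
#endif
}
```

A library built this way could, for example, refuse to initialize when this helper returns 1 but the runtime __cpu_flags check shown earlier reports a non-ARMv6 processor.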
