This technote describes how to set up QNX Neutrino on boards that use an ARMv6 processor.
You must use procnto-v6, the QNX Neutrino microkernel for ARMv6 processors.
Support for ARMv6 architecture processors (ARM11, OMAP2) is now provided by:
The procnto-v6 microkernel takes advantage of the ARMv6 MMU's physically-tagged cache to remove the 32 MB address space restriction imposed by the previous ARM MMU architecture. Note the following:
The default configuration for qcc still causes the compiler to generate ARMv4 instructions. This ensures that all code compiled with this configuration can run on any supported ARM processor.
Additional compiler flags are required to instruct the compiler to generate ARMv6 instructions. Using such code on non-ARMv6 processors may cause undefined instruction exceptions (generating a SIGILL signal).
The libstartup CPU detection and configuration process includes the following changes since the 6.3.0 release:
Purpose: The armv_chip structure describes the configuration for a particular CPU.
The armv_list[] array defined in armv_list.c contains a list of all supported CPUs, and the arm_chip_detect() function iterates through this array to match bits 15:0 of the ID register.
A BSP can override the library's armv_list.c to provide a customized list of supported CPUs, for example to specify armv_chip structures that aren't implemented in libstartup, or to restrict the list to the processor(s) implemented by the target board.
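The lookup described above can be sketched as a simple table scan. This is a minimal, host-runnable illustration and not the libstartup source: `struct chip_entry`, `chip_list`, `chip_detect()`, and the ID values in the table are hypothetical stand-ins for `armv_chip`, `armv_list[]`, and `arm_chip_detect()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for libstartup's armv_chip structure. */
struct chip_entry {
    uint32_t    cpuid;  /* compared against bits 15:0 of the ID register */
    const char *name;
};

/* Illustrative table only; these ID values are assumptions. */
const struct chip_entry chip_list[] = {
    { 0xb362, "example ARM1136 entry" },
    { 0xb760, "example ARM1176 entry" },
};

/* Return the first entry matching bits 15:0 of id_reg, or NULL if none. */
const struct chip_entry *chip_detect(uint32_t id_reg)
{
    uint32_t low = id_reg & 0xffffu;
    size_t   count = sizeof chip_list / sizeof chip_list[0];

    for (size_t i = 0; i < count; i++) {
        if (chip_list[i].cpuid == low) {
            return &chip_list[i];
        }
    }
    return NULL;  /* unsupported CPU */
}
```

A BSP that restricts the list to the processor on its board would, in this sketch, simply ship a shorter `chip_list`, so that any other ID value yields `NULL`.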
If no power callout is specified, the kernel's idle loop simply busy-loops, and the sysmgr_cpumode() call fails with ENOSYS.
The flush callout is used to flush the cache and TLB when unmapping a page. This is called for each page in a region being unmapped.
The deferred callout is used after all pages in a region have been unmapped, and can be used to perform any actions that were not performed by the flush callout.
For example, if the MMU doesn't support flushing the instruction cache by virtual address, the deferred callout can be used to flush the instruction cache after all pages have been unmapped to reduce the cost of flushing.
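The division of work between the two callouts can be illustrated with a small host-side simulation. All names here (`struct unmap_callouts`, `unmap_region()`, the counters) are hypothetical; real callouts are position-independent startup routines, so this only shows the calling pattern, assuming one flush per page and one deferred call per region.

```c
#include <stddef.h>

/* Hypothetical callout pair modelled on the description above. */
struct unmap_callouts {
    void (*flush)(unsigned long vaddr);  /* called once per unmapped page */
    void (*deferred)(void);              /* called once per region; may be NULL */
};

/* Counters that make the calling pattern observable on a host machine. */
unsigned page_flushes;
unsigned icache_flushes;

/* Per-page work: stands in for a data cache and TLB flush by address. */
void demo_flush(unsigned long vaddr)
{
    (void)vaddr;
    page_flushes++;
}

/* Deferred work: stands in for a single whole instruction cache flush. */
void demo_deferred(void)
{
    icache_flushes++;
}

/* Unmap npages pages starting at start, invoking the callouts as described. */
void unmap_region(const struct unmap_callouts *c, unsigned long start,
                  size_t npages, size_t page_size)
{
    for (size_t i = 0; i < npages; i++) {
        /* A real implementation would clear the page table entry here. */
        c->flush(start + i * page_size);
    }
    if (c->deferred != NULL) {
        c->deferred();
    }
}
```

Unmapping an 8-page region this way costs eight per-page flushes but only one instruction cache flush, which is the saving the deferred callout is meant to provide.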
If you specify the -wa option, the pte_wa configuration is used. If the CPU does not support write-allocate caching, set this to 0, and the default pte values will be used instead.
If you specify the -wb option, the pte_wb configuration is used. If the CPU doesn't support write-back caching, set this to 0, and the default pte values will be used instead.
If you specify the -wt option, the pte_wt configuration is used. If the CPU doesn't support write-through caching, set this to 0, and the default pte values will be used instead.
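The fallback rule shared by the three options can be sketched as follows. The structure layout, the `pte_default` member, and the `select_pte()` helper are assumptions made for illustration; only the `pte_wa`, `pte_wb`, and `pte_wt` names and the "set to 0 to fall back" rule come from the text above.

```c
#include <stddef.h>

/* Hypothetical placeholder for a set of page table encodings. */
struct pte_encoding {
    unsigned long bits;
};

/* Cache policy requested on the startup command line. */
enum cache_policy { POLICY_NONE, POLICY_WA, POLICY_WB, POLICY_WT };

/* Hypothetical subset of a chip description; a 0 (NULL) pointer means
 * the CPU does not support that cache policy. */
struct chip_pte_config {
    const struct pte_encoding *pte_default;  /* assumed name */
    const struct pte_encoding *pte_wa;
    const struct pte_encoding *pte_wb;
    const struct pte_encoding *pte_wt;
};

/* Pick the encodings for the requested policy, falling back to the
 * defaults when the chip leaves that policy's entry set to 0. */
const struct pte_encoding *select_pte(const struct chip_pte_config *cfg,
                                      enum cache_policy policy)
{
    const struct pte_encoding *chosen = NULL;

    switch (policy) {
    case POLICY_WA: chosen = cfg->pte_wa; break;
    case POLICY_WB: chosen = cfg->pte_wb; break;
    case POLICY_WT: chosen = cfg->pte_wt; break;
    case POLICY_NONE: break;
    }
    return chosen != NULL ? chosen : cfg->pte_default;
}
```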
Purpose: The armv_cache structure describes the CPU caches.
If the CPU does implement the CP15 cache-type register, set this to 0, so that the startup library will use arm_add_cache() to determine the cache register configuration based on the CP15 cache-type register.
Purpose: The armv_pte structure describes the MMU page table encodings.
Purpose: The setup() function performs any CPU-specific initialization.
For ARMv6, there is a generic function, armv_setup_v6(), that performs generic ARMv6 initialization:
The armv_setup_v6() function must be called by any CPU-specific setup function for an ARMv6 CPU after it has performed its CPU-specific actions.
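The required ordering can be shown with a host-side sketch. The function names and the step log below are hypothetical; in particular `generic_v6_setup_stub()` merely stands in for `armv_setup_v6()`, whose real signature isn't shown in this technote.

```c
#include <stddef.h>

enum setup_step { STEP_NONE = 0, STEP_CPU_SPECIFIC, STEP_GENERIC_V6 };

#define MAX_STEPS 4

/* Records the order in which the setup steps ran. */
enum setup_step setup_log[MAX_STEPS];
size_t setup_steps;

void record_step(enum setup_step step)
{
    if (setup_steps < MAX_STEPS) {
        setup_log[setup_steps++] = step;
    }
}

/* Stand-in for the generic ARMv6 initialization. */
void generic_v6_setup_stub(void)
{
    record_step(STEP_GENERIC_V6);
}

/* Hypothetical CPU-specific setup function for an ARMv6 part. */
void example_cpu_setup(void)
{
    /* 1. CPU-specific actions come first. */
    record_step(STEP_CPU_SPECIFIC);

    /* 2. Generic ARMv6 initialization runs afterwards. */
    generic_v6_setup_stub();
}
```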
The procnto-v6 microkernel removes the 32 MB process address space limit:
The procnto-v6 microkernel doesn't implement the ARM-specific global memory region implemented by the non-ARMv6 procnto. This means that shm_ctl() no longer has any ARM-specific special behavior. The shm_ctl() function exhibits the following:
In contrast, the non-ARMv6 behavior causes some mappings to be placed in the global memory region, and these global mappings aren't inherited across a fork().
This differs from the non-ARMv6 procnto behavior, where the mapping is placed in the global address space to allow mappings that would not otherwise fit into the 32 MB per-process address space.
In order to achieve the same effect as the non-ARMv6 SHMCTL_GLOBAL, it is necessary to assign a known address to the mapping, then use MAP_FIXED to mmap() the shared memory object at the same address in each process. You can do this in a number of ways:
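Whichever way the address is chosen, the mapping step itself might look like the following sketch, which uses only standard POSIX calls. The helper name `map_shared_at()` is hypothetical, and the sketch assumes that every cooperating process already knows the agreed address and keeps that range free, because MAP_FIXED silently replaces any existing mapping there.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map the named shared memory object at exactly addr (hypothetical helper).
 * addr must be page-aligned. Returns the mapped address, or MAP_FAILED on
 * error. The object is created and sized to len bytes if needed.
 */
void *map_shared_at(const char *name, void *addr, size_t len)
{
    int fd = shm_open(name, O_RDWR | O_CREAT, 0600);
    if (fd == -1) {
        return MAP_FAILED;
    }
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return MAP_FAILED;
    }

    /* MAP_FIXED forces the mapping to be placed at addr. */
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);

    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p;
}
```

Each process would call the helper with the same object name and the same address, so that pointers stored inside the shared region remain valid in all of them.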
All mappings are created in the per-process address space with user mode access protections.
For the non-ARMv6 procnto, these flags are used to optimize performance by indicating that the mapping doesn't require updating the global memory page tables on context switches to implement per-process protection.
If code must run on both ARMv6 and non-ARMv6 processors, you must check the __cpu_flags value at runtime to select the correct implementation. For example:
if (__cpu_flags & ARM_CPU_FLAG_V6) {
    /*
     * Code for ARMv6 processor only
     */
} else {
    /*
     * Code for non-ARMv6 processor only
     */
}
By default, qcc generates only ARMv4 instructions. This ensures that all compiled code will run on any supported ARM processor.
The ARMv6 processor introduces a number of new instructions that may provide performance benefits for certain code. For example, DSP algorithms can take advantage of the new media instructions.
This requires gcc and binutils versions that support the ARMv6 instruction set:
There are a number of ways you can optimize ARMv6 operations:
If you're using the QNX recursive Makefile structure:
CCFLAGS += -march=armv6
CCFLAGS_<name>_arm = -march=armv6
CCFLAGS += $(CCFLAGS_$(basename $@)_$(CPU))
The object files, libraries, and binaries that are compiled to use ARMv6 instructions can only run on a target with an ARMv6 CPU. On a non-ARMv6 CPU, this causes an undefined instruction exception (SIGILL signal) or may result in unpredictable behavior.