Debugging ARM Cortex-M0+ Hard Faults

If you're hitting hard faults with your ARM Cortex-M0+, it could be a problem with unaligned flash blocks. Eliminate your variables and narrow your scope when debugging.

By 
Erich Styger
·
Jan. 16, 17 · Tutorial

To me, one of the most frustrating things about working with ARM Cortex-M cores is the hard fault exception. I have lost several hours this week debugging and tracking down an instance of a hard fault on an ARM Cortex-M0+ device.

For background, I’m porting a project (NXP Kinetis KW40Z160) to Eclipse and the GNU tool chain for ARM. The application seems to run fine after downloading it with the debugger, but it crashes with a hard fault if I either do a ‘restart’ with the debugger or reset the microcontroller with a SYSRESETREQ (see How to Reset an ARM Cortex-M with Software).

Interestingly, it crashed during startup in the ANSI library, in the _init() function:

The next assembly step will cause a hard fault

Pushing the registers in _init() will cause the hard fault exception:

Triggered hard fault

Interestingly, that same code with the same registers/stack/etc. works fine the first time, but it breaks down later, after the application is running.

The next step was to check the usual suspects:

  • Checking the output of the hard fault handler? It didn't show anything usable.
  • Stack pointer not aligned? No, that’s fine, and it works with the same values the first time.
  • ARM/Thumb mode? Nope, all in Thumb mode.
  • Stack readable/writable? Yes, and no special protection or other kinds of things were used.
  • Interrupt priorities? Nope, it happens with interrupts disabled.
  • Is it a problem with the debugger? Nope, tried both Segger and P&E, same for both, and it happens with and without a debugger attached or used.
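For the first check in the list, it helps if the handler exposes the registers that the core stacked on exception entry. Here is a minimal sketch, assuming the standard Cortex-M exception stack frame. `FaultFrame`, `DecodeFaultFrame`, and `FrameIsOnPSP` are hypothetical names and not the handler used in this project, and the small assembly stub that picks MSP or PSP and passes it in is left out:

```c
#include <stdint.h>

/* The eight words a Cortex-M core pushes on exception entry,
   in ascending address order starting at the stack pointer. */
typedef struct {
  uint32_t r0, r1, r2, r3;
  uint32_t r12;
  uint32_t lr;   /* link register of the interrupted code */
  uint32_t pc;   /* stacked return address */
  uint32_t xpsr; /* program status register */
} FaultFrame;

/* Hypothetical helper: copy the stacked words into named fields so they
   are easy to inspect in the debugger. 'sp' is the stack pointer value
   that was active when the exception was taken. */
FaultFrame DecodeFaultFrame(const uint32_t *sp) {
  FaultFrame f;
  f.r0 = sp[0];
  f.r1 = sp[1];
  f.r2 = sp[2];
  f.r3 = sp[3];
  f.r12 = sp[4];
  f.lr = sp[5];
  f.pc = sp[6];
  f.xpsr = sp[7];
  return f;
}

/* Bit 2 of the EXC_RETURN value (found in LR inside the handler) tells
   which stack holds the frame: 0 = main stack (MSP), 1 = process stack (PSP). */
int FrameIsOnPSP(uint32_t exc_return) {
  return (exc_return & 0x4u) != 0u;
}
```

In a case like this one, where the fault is a delayed side effect, the stacked `pc` may point at perfectly innocent code, which is consistent with the handler output not being usable here.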

Well, that’s the point where I ran out of ideas. What remains is the desperate attempt to ask colleagues, which returned a similar list of points as above. So no progress.

Internet Help?

Next: search the Internet. Lots of other people have issues with hard faults, too. At least this revealed an interesting article about imprecise hard faults. Not only was it very interesting reading, but it helped me start digging deeper. Unfortunately, it only applies to Cortex-M3/M4 and not to the M0+. Anyway, I have added an option to the hard fault handler so I can deal with this better in other projects:

Hard Fault Handler with the option to disable the write buffer
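As a rough sketch of what such an option can look like on a Cortex-M3/M4: those cores have a DISDEFWBUF bit (bit 1) in the Auxiliary Control Register (ACTLR) at 0xE000E008, and setting it disables the default write buffer so that bus faults become precise. The function name `DisableWriteBuffer` is an assumption for illustration and not necessarily what the handler above uses. The register is passed in as a pointer so the logic can also be exercised off-target:

```c
#include <stdint.h>

#define ACTLR_ADDRESS     0xE000E008u /* Auxiliary Control Register (Cortex-M3/M4) */
#define ACTLR_DISDEFWBUF  (1u << 1)   /* disable default write buffer */

/* Hypothetical helper: set DISDEFWBUF in the given register.
   On a Cortex-M3/M4 target, call it as
     DisableWriteBuffer((volatile uint32_t *)ACTLR_ADDRESS);
   The Cortex-M0+ does not have this bit, so this does not help there. */
void DisableWriteBuffer(volatile uint32_t *actlr) {
  *actlr |= ACTLR_DISDEFWBUF;
}
```

Disabling the write buffer costs performance, so it is something to enable only while hunting a fault.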

I still had no clue. What I ended up trying was to do a ‘binary’ search: disabling portions of the application to find out if there is something causing the problem. Yes, it's time-consuming, but that was my only option. So I turned on/off parts of the application to find out what part of the application was causing the problem.
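The search itself was done by hand, but the idea can be sketched in a few lines. This is only an illustration under two assumptions: the fault is reproducible, and once the offending part is enabled, enabling more parts does not hide it. `FindCulprit` and `ExampleFaultsWith` are hypothetical names; in practice each call to the callback stands for a rebuild, download, and reset cycle.

```c
#include <stddef.h>

/* Hypothetical callback: run the application with only the first 'enabled'
   parts switched on and return nonzero if the hard fault shows up. */
typedef int (*FaultsWithFn)(size_t enabled);

/* Returns the index of the first part whose inclusion triggers the fault,
   or 'count' if the fault never shows up. */
size_t FindCulprit(size_t count, FaultsWithFn faults_with) {
  size_t lo = 0;     /* with 'lo' parts enabled, no fault */
  size_t hi = count; /* with 'hi' parts enabled, fault */
  if (count == 0 || !faults_with(count)) {
    return count;
  }
  while (hi - lo > 1) {
    size_t mid = lo + (hi - lo) / 2;
    if (faults_with(mid)) {
      hi = mid;
    } else {
      lo = mid;
    }
  }
  return hi - 1;
}

/* Stand-in for a real test cycle: pretend the part at index 3 is the culprit. */
static int ExampleFaultsWith(size_t enabled) {
  return enabled > 3;
}
```

With eight parts, this needs four test cycles (one to confirm the fault, three to narrow it down) instead of up to eight.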

And indeed, after hours of searching, I narrowed it down to the flash programming: The application is storing data internally by reprogramming a 1 KByte flash page, and for this, it erases a 1K block in flash. In the linker file, it was allocated like this:

  .nvm :
  {
    . = ALIGN(8);
    NV_STORAGE_START_ADDRESS = .;
    . += 1024;
    NV_STORAGE_END_ADDRESS = .;
  } > m_text


At first, erasing it seemed to work fine, but then the microcontroller would crash after a restart.

Looking closer at the linker file, I finally spotted the problem: It should have been aligned to the 1k Flash block size:

  .nvm :
  {
    . = ALIGN(1024); /* must be 1k aligned! */
    NV_STORAGE_START_ADDRESS = .;
    . += 1024;
    NV_STORAGE_END_ADDRESS = .;
  } > m_text


With this change, the problem went away.
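To see what the alignment change does to the addresses, here is a small sketch of the arithmetic. The start address in the usage note is made up for illustration, and the helper names are hypothetical. What the flash controller actually does with an unaligned erase address is device-specific and not something this sketch tries to model.

```c
#include <stdint.h>

#define FLASH_SECTOR_SIZE 1024u /* 1 KByte erase block, as used in this project */

/* Round 'addr' up to the next multiple of 'align' (a power of two),
   the same way ALIGN() moves the location counter in the linker file. */
uint32_t AlignUp(uint32_t addr, uint32_t align) {
  return (addr + align - 1u) & ~(align - 1u);
}

/* Nonzero if 'addr' is the start of an erase block. */
int IsSectorAligned(uint32_t addr) {
  return (addr & (FLASH_SECTOR_SIZE - 1u)) == 0u;
}

/* Number of erase blocks overlapped by the range [start, start + size). */
uint32_t SectorsTouched(uint32_t start, uint32_t size) {
  uint32_t first = start / FLASH_SECTOR_SIZE;
  uint32_t last = (start + size - 1u) / FLASH_SECTOR_SIZE;
  return last - first + 1u;
}
```

Assuming the location counter stood at 0x0001A2C4, `AlignUp(0x0001A2C4, 8)` gives 0x0001A2C8, which is not the start of an erase block, and a 1024-byte region starting there overlaps two blocks. `AlignUp(0x0001A2C4, 1024)` gives 0x0001A400, and the region then covers exactly one block. A check like `IsSectorAligned()` on `NV_STORAGE_START_ADDRESS` before erasing would have flagged the problem early.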

Summary

Dealing with hard faults on ARM is not easy. This particular one was caused by erasing a flash memory block that was not aligned. It seems to me that somehow the internals of the processor got screwed up when that happened. The challenge was that it then crashed in a way not very closely related to the cause of the problem. These kinds of problems are not easy to find and solve.

What usually works best, and what worked in this case, is trying to reduce the problem: Have a ‘working’ version of the code and a ‘failing’ version. Try to eliminate all variables (board, debugger, power supply, host machine) and then narrow the problem down as much as possible. And yes, some luck helps accelerate that process. In any case, I hope this post can help others.

Happy debugging!

Links

  • Resetting ARM with software: https://mcuoneclipse.com/2015/07/01/how-to-reset-an-arm-cortex-m-with-software/
  • Debugging Hardfaults: https://mcuoneclipse.com/2012/11/24/debugging-hard-faults-on-arm-cortex-m/
  • Hard Fault Debugging Helper: https://mcuoneclipse.com/2012/12/28/a-processor-expert-component-to-help-with-hard-faults/
  • A way to detect imprecise hard faults: https://community.nxp.com/docs/DOC-103810

Published at DZone with permission of Erich Styger, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
