“Please help me do this on an FPGA”

The question you shouldn’t ask!

A common refrain on many of the internet’s finest help forums and newsgroups is “I’m trying to do x using an FPGA, help!” And very often “x” is a task which would be better performed (by many different measures!) in another way. But there is a common assumption that if a task is “intensive” then FPGAs are the answer. One recent example was a poster asking how to implement face-detection in an FPGA. It quickly became apparent that the poster didn’t actually know how to perform face-detection at all, so adding FPGAs to the equation was not a great help!

For a quick answer to the question “why use an FPGA?”, I’ll reproduce this list that I used in a lecture to a class of undergraduates:

Use an FPGA if one or more of these apply:

you have hard real-time deadlines (measured in μs or ns)
you need more than one DSP (lots of arithmetic and parallelisable)
a suitable processor (or DSP) costs too much (money, weight, size, power)

And for students, there’s one more:

Because your assignment tells you to :) Although ideally it’ll be something that is at least representative of a reasonable FPGA task (not a traffic light controller or vending machine!)

What to do instead?

Software’s easy

I’d hazard a guess that even amongst embedded software engineers (those who work at the really low level, not writing code for PCs), many don’t really know what their target processor looks like under the hood. They just click compile, wait maybe 10 seconds, and test. And that’s great: it makes for a very productive development environment. When you are sufficiently abstracted from the architecture, there’s enough performance from the tools and chips that you don’t (often) have to think hard about how to implement things. You can just get on with the interesting bit, creating your application.

FPGAs hurt

In comparison, FPGAs are painful to use – don’t get me wrong, the software tools and the silicon architectures have improved massively over the last few years – but compared to writing software, it’s a completely different realm. You have to be much more aware of the architecture of your device, know much more about how the tools operate, wait ages for them to run, and think fundamentally differently about algorithms and implementations. FPGA code takes tens of minutes to compile, and it’s much easier to push up against the performance limits, at which point you have to mess around with your code to make it more recognisable to the tools you are using.

Choosing an implementation

My advice is always “Avoid using an FPGA unless you have to”. And I say this as a great advocate of FPGAs!

If you can do it in Octave or Matlab on a PC, do so. In fact, even if you end up somewhere else, start from there so you can understand the problem properly.

If you don’t have enough processing power, make use of a GPU.

If that solution costs too much (in money, power, size or weight terms) then you’ll have to get cleverer. Start thinking about microcontrollers. They’re well-tooled and very powerful. You can have an 80 MHz 32-bit ARM for a few pounds (or Euros, or Dollars) these days, and you can do an awful lot with that.

If you’re still struggling for processing power, think about a DSP. But be careful – analyse what you are trying to do very carefully. Figure out which bits will suit a DSP (lots of multiplying and adding in parallel with memory moving) – suddenly you have to know your architecture, just to decide if it’s feasible. Be careful about memory bandwidth: caches are not magic, and if your code requires data reads or writes that are randomly scattered about, expect to lose some performance.
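To put a rough number on the “caches are not magic” point, here is a minimal sketch that counts misses in a toy direct-mapped cache for sequential and for scattered reads. The cache geometry (64 lines of 32 bytes), the access counts and the function name `count_misses` are all assumptions chosen for illustration; a real DSP’s cache will differ, so treat the output as a trend rather than a measurement.

```python
import random

def count_misses(addresses, num_lines=64, line_bytes=32):
    """Count misses in a toy direct-mapped cache (illustrative model only)."""
    tags = [None] * num_lines            # block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes       # memory block containing this byte
        index = block % num_lines        # cache line that block maps to
        if tags[index] != block:         # different block (or empty line): a miss
            tags[index] = block
            misses += 1
    return misses

N = 4096                                 # number of 4-byte reads in each pattern
sequential = [4 * i for i in range(N)]   # walk straight through memory

rng = random.Random(0)                   # fixed seed so the run is repeatable
scattered = [4 * rng.randrange(1 << 20) for _ in range(N)]  # spread over 4 MiB

seq_misses = count_misses(sequential)
scat_misses = count_misses(scattered)
print(f"sequential: {seq_misses} misses, scattered: {scat_misses} misses")
```

In this model the sequential walk misses once per 32-byte line, while the scattered pattern misses on nearly every read, and every miss is a trip out to slower memory.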

The next stage on might be multiple DSPs… and once you start considering multiple DSPs, it might finally be time to think about an FPGA. The downside is you are responsible for so much more of the architecture. Floating-point maths is becoming a sensible option, but you’ll still want to look at the trade-off between the development time saved by using floating point and the device-size, cost and power savings that come from using a fixed-point implementation. You can take advantage of your knowledge of data access patterns to tune the memory controller – in fact you’ll probably have to – yet more grovelling around in the details. Add to this the fact that it’s a lot harder to hire good FPGA people than DSP people (and they are harder to get than microcontroller people), and help on the internet can be harder to come by. Your development time will lengthen as you build simulation models of the hardware you are talking to and have to debug them. And the hour-long build times will try your patience.
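As a small illustration of what the fixed-point side of that trade-off involves, here is a sketch of a 16-bit multiply in Q15 format (one sign bit, 15 fractional bits). The choice of Q15, the rounding and saturation behaviour, and the helper names are assumptions for this example, not something a particular tool or device prescribes.

```python
Q = 15                      # fractional bits (Q15 format, assumed for this sketch)
SCALE = 1 << Q              # 32768, the integer that represents 1.0

def saturate(v):
    """Clamp to the signed 16-bit range."""
    return max(-SCALE, min(SCALE - 1, v))

def to_q15(x):
    """Quantise a float in [-1, 1) to a signed 16-bit integer."""
    return saturate(int(round(x * SCALE)))

def q15_mul(a, b):
    """Multiply two Q15 values: round the wide product, then shift back to Q15."""
    prod = a * b + (1 << (Q - 1))    # add half an LSB so the shift rounds
    return saturate(prod >> Q)

def from_q15(v):
    """Convert a Q15 integer back to a float for inspection."""
    return v / SCALE

exact = from_q15(q15_mul(to_q15(0.5), to_q15(-0.25)))     # representable exactly
approx = from_q15(q15_mul(to_q15(0.3), to_q15(0.7)))      # picks up quantisation error
print(exact, approx, abs(approx - 0.3 * 0.7))
```

The arithmetic itself is just an integer multiply and a shift, which is why it is cheap in logic. The cost moves to you: you have to choose the scaling, track the error, and decide what happens on overflow.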

But if you have good reason to, go for it!

FPGAs are really well suited to

many image and radar processing tasks especially when cost, power and space constrained. (Disclosure: I wrote the first article)
financial analysis (when time constrained)
seismic analysis (lots of money at stake, the faster you process, the more processing you can do, and the less risk to your drilling)
hard-real-time, low-latency deadlines – single-digit microsecond response times to stimulus. See the second page of this flyer – I’ve worked on this project too.

14 comments

I saw on StackOverflow that you accomplished a user-space access of a PCI device. I am trying the same thing. Would you have time for some guidance?

Thanks,
Bruce

Martin says:

2013-12-21 at 11:12

Email reply sent :)

Hello,

i found your site while searching about image processing. I work with VHDL and my Diplom Thesis was a MPPT for Solar Systems, but i have no idea about Image Processing.

Can you please give out sources about tutorials and books?

I will be very helpfull.

Best regards,
Mohsen Abbasi

hi martin!
i think you’ve got a good point on why avoid fpga’s.
however, here in our school, we are tasked to create a standalone device that we’d use fpga as our processor. see, our device is all about hand gesture to speech and speech to text device. our adviser provided us to use fpga since it could support stand alone projects, especially on its memory part.
my concern here is that, we are trying to use simulink because it has some convert to hdl method on it… however, do you think that this project is feasible?

Martin says:

2014-08-10 at 14:29

There are other ways to create standalone devices than FPGAs. Converting a hand gesture to speech is possibly within the capability of an Arduino or BeagleBone, depending on how you are detecting the hand gesture…

Simulink to HDL is a usable flow, but it’s not magic, you will need to completely understand what you are trying to do and how you are going to do it before you can complete the project… just like with any other tools!

khin says:

2015-03-14 at 21:31

I used sparten 3e starter kit.i want to DISPLAY the PWM frequency output on Oscilloscope?which outputpin i have to used ?I used J1 and J2 .the signal is distorted .why? How ?
  
hello, im making a face detection project in vhdl. im accessing an array three times in one process, it takes too long to synthesize. do you have any suggestions?

Martin says:

2015-04-21 at 16:42

Accessing an array multiple times shouldn’t be a problem, unless it’s big and therefore can’t fit in a RAM block. If you need big arrays, you need to use RAM blocks and then be careful to only address them as many times as the physical hardware allows in each clock cycle. If you only want multiple reads, you could potentially parallelise many blocks with the same data in, but that is usually not the best way to do it. You need to look at your algorithm very carefully and make sure it fits well with the target architecture.

Writing efficient code for FPGAs (much more than for processors, except in extremely constrained environments) requires a great understanding of the low-level architecture.
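A minimal sketch of that port limit, assuming a RAM block where each port can make one access per clock cycle. The function name and the default of two ports are assumptions for illustration; check the data sheet for what your device’s RAM blocks actually provide.

```python
import math

def cycles_for_accesses(accesses, ports=2):
    """Minimum clock cycles needed when each RAM port allows one access per cycle."""
    if ports < 1:
        raise ValueError("ports must be at least 1")
    return math.ceil(accesses / ports)

# Three or four accesses to the same array through two ports (assumed)
# cannot all happen in one clock cycle.
print(cycles_for_accesses(3))            # 2
print(cycles_for_accesses(4))            # 2
print(cycles_for_accesses(4, ports=1))   # 4
```

If the code asks for more accesses in a single cycle than the ports allow, the tools cannot map the array onto one RAM block as written, so the accesses need to be spread over several cycles or the algorithm restructured.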

hello, im making a face detection project in vhdl. i am accessing the same array 4 times in one process, it is taking ages to synthesize. what should i do?

I enjoyed the post but the final link to second page of this flyer has rotted.

Martin says:

2015-05-21 at 18:52

Thanks for pointing that out – my employer has had a re-arrange of everything… I shall have to get them to add some redirects!

Hi Martin,
Interesting thoughts on FPGA. I cannot argue with your thoughts on when (or when not) to use them.

The thing that I think is an interesting point is that if you have a function to deliver
1. the software engineer decides when to start the function and step through it and get the result.
2. the hardware engineer just lets the function run continually and looks for the function result whenever the result is needed.

Subtle difference but fundamental.

Martin says:

2022-02-02 at 18:33

Indeed… this is an often overlooked point. It becomes second nature to hardware-types very quickly and then they cannot understand the questions of software-types, which come from the other context :)
