So onwards and upwards (and back down the other side)... Time to have a go at making the LED strips that will provide the display surface for the game. I'd
already decided I wanted each game strip to be made up of three 5 metre, 150 LED strings mounted side by side and I thought that it might
be nice to stagger the centre strip, placing each of its LEDs midway
between the pairs of LEDs on either side. I thought that this might
help give an illusion of denser pixels and also make it easier to
animate the chevron style shapes I've been thinking about for the
game.
I did some eBay surfing
looking for suitable construction materials and found an adhesive
backed roll of 5mm thick, 75mm wide, solid neoprene rubber tape.
This seemed perfect for mounting the 3 LED strips side by side on the
sticky side, and my first plan was to cover the remaining exposed adhesive and LED strips with a
layer of clear sticky tape. After a bit more searching I decided to
try some signmaker's masking tape (100mm wide low-tack adhesive paper
tape) instead since I thought this would give a nice diffusion of the
LED colours.
Sticking down the LED
strips went really well. I didn't remove their backing tape since
there wasn't much point: they stuck down fine with the backing still in place,
and it kept a few options open in case it all went wrong. Must confess I
did get in a bit of an angry mess with the paper tape, which was
frustratingly difficult to lay down on top of the adhesive in one go and was all too easy to crease or tear. I did end up with some joins,
which I wasn't too happy about, but after applying a layer of clear
sticky tape to the whole strip it didn't look so bad after all, and
once I fired up one of the strips the effect was actually pretty
good!
I applied the multiplexing approach I described before, using a modified version of
the Adafruit Neopixel Library running on an Arduino Uno. I had
replaced my original Toshiba 4514 multiplexer chip with a higher
speed Texas Instruments CD74HC4514EN and it seemed to work fine
driving the 3 strips together. I had fun trying some nice particle
system sketches before having a go at some graphics for the Hammer
Pong game.
I tried animating a “puck”
shooting along the strip, which all seemed to work fine.... except I
was a little bit disappointed with the maximum speed I was getting. I
could not see any problems in the code, so I got out the calculator:
150 pixels per strip
x 24 bits per pixel
x 3 strips
x ~1.25us per bit
= ~13.5ms to refresh all three strips
= ~74 frames per second
Hmm....
OK, for video, 74fps
would be pretty good! However moving the puck one pixel at a time
means the fastest it can run the length of the 150 pixel strip is
about 2 seconds. It gets even worse if we add in the other game strip: the maximum frame update
rate would be halved, and the distance doubled. That would mean 8 seconds for the
puck to reach the opposing player if it moved 1 pixel distance per
frame. A bit slow. Hmmmmmm...
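For anyone who wants to check the sums, here is a rough sketch of the same arithmetic in C++. The constant and function names are made up for illustration, and it assumes the strings are loaded one after another at the nominal 1.25us per bit, ignoring the reset gap between frames.

```cpp
#include <cstdint>

// Nominal WS2812 timing: 800kbps means 1.25us (1250ns) per bit.
constexpr uint32_t kNsPerBit = 1250;
constexpr uint32_t kBitsPerPixel = 24;

// Time in microseconds to refresh `strips` strings of `pixels` LEDs
// when they are loaded one after another.
constexpr uint32_t refreshMicros(uint32_t strips, uint32_t pixels) {
    return strips * pixels * kBitsPerPixel * kNsPerBit / 1000;
}

// Time in milliseconds for a puck moving one pixel per frame to
// cover `pixels` positions, given the frame time in microseconds.
constexpr uint32_t traverseMillis(uint32_t pixels, uint32_t frameMicros) {
    return pixels * frameMicros / 1000;
}

// One game strip: three 150 LED strings, 13.5ms per frame.
static_assert(refreshMicros(3, 150) == 13500, "13.5ms per frame");
// 150 frames at 13.5ms each: about 2 seconds end to end.
static_assert(traverseMillis(150, refreshMicros(3, 150)) == 2025, "about 2s");
// Both game strips: six strings per frame, 300 pixels to cross.
static_assert(traverseMillis(300, refreshMicros(6, 150)) == 8100, "about 8s");
```

The checks run at compile time, so if the numbers were wrong the code would simply refuse to build.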
Yeah of course this is
easily solved by making the puck image move more than one pixel between
frames, which would be the usual way of doing things. The problem is
that the LEDs are very bright - just like I want them to be - and
persistence of vision effects make position jumps between the frames really obvious - you see the image frozen in several locations
along the strip, spoiling the sense of fluid movement. I really want
single pixel per frame motion to make it look smooth, dang!
So what to do about
it? Well the WS2812 protocol sets some base restrictions: The data
rate is fixed at 800kbps, so we cannot update faster than 1.25us per bit. Also
we have to refresh the entire strip at once due to the serial nature
of the load operation (we can't just load pixels that have changed
and leave the others). So updating a 150 LED strip will always take a minimum of 4.5ms and
there is nothing we can do about that (other than cutting the strip
into smaller lengths and addressing them separately maybe... but I don't
want to go there!).
But, we do have the
possibility of loading the data to all the strips in parallel - so
there is no specific reason why we can't load all 6 strips in the
same 4.5ms cycle. So, what could stop us doing this?
Well firstly we will
need to render all the data into a memory buffer before the strip
update (at this data rate we will not have the time to render images
on the fly). Yikes.. that's going to be a lot of memory (by
microcontroller standards): 6 strips x 150 pixels x 3 bytes per pixel
= 2700 bytes... already more than the 2k RAM on the ATmega328
microcontroller (sad face).
We could reduce this
using a lookup table (“palette”) of colour values and storing
the palette index for each pixel instead of the 24 bit RGB colour.
Let's say we have an 8 bit palette index (up to 256 colours) with
perhaps 64 colours actually defined in that 8 bit colour-space.
6 strips x 150 pixels x 1 byte per pixel = 900 bytes
plus palette: 64 colours x 3 bytes per colour = 192 bytes
= 1092 bytes total
This is much more
doable - and we can save more memory by avoiding storing the palette
in RAM, e.g. by using PROGMEM data stored in the much more spacious
32k FLASH. However the next problem is processing speed...
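As a rough illustration of the palette idea, here is a minimal sketch in plain C++. The type and function names are invented for this example, the four colours are arbitrary, and the table is an ordinary const array; on an AVR it could be declared with PROGMEM so it sits in flash instead of RAM.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kStrips = 6;
constexpr size_t kPixelsPerStrip = 150;
constexpr size_t kPaletteSize = 64;

// A full 24 bit colour, as the LEDs eventually need it.
struct Rgb {
    uint8_t r, g, b;
};

// One palette index per pixel instead of a full 24 bit colour.
struct FrameBuffer {
    uint8_t index[kStrips][kPixelsPerStrip];
};

// Palette of 64 colours. Only the first few entries are filled in
// here; the rest default to black.
constexpr Rgb kPalette[kPaletteSize] = {
    {0, 0, 0},      // 0: off
    {255, 0, 0},    // 1: red
    {0, 255, 0},    // 2: green
    {0, 0, 255},    // 3: blue
};

static_assert(sizeof(FrameBuffer) == 900, "index buffer is 900 bytes");
static_assert(sizeof(kPalette) == 192, "palette is 192 bytes");

// Look up the full colour of one pixel. Indices beyond the defined
// palette wrap around; that policy is an arbitrary choice for this sketch.
inline Rgb pixelColour(const FrameBuffer& fb, size_t strip, size_t pixel) {
    return kPalette[fb.index[strip][pixel] % kPaletteSize];
}
```

The trade-off is an extra table lookup for every pixel at output time, which is exactly where the processing speed question comes in.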
To be honest I have
never had to be so concerned about performance at this level before,
ever. But when we are talking about bit-banging at 800kHz every CPU
cycle counts. For an ATmega328 running at 16MHz each clock cycle is
62.5ns. Now that *is* pretty fast, but we have to bang these bits
pretty fast too: each 1.25us bit lasts just 20 clock cycles. Reading the assembly language code for the Neopixel
library really shows how careful the timing of this stuff needs to be.
However, if we can
update a single strip by writing LOW and HIGH byte values to an 8 bit port
register at this data rate, there is really no reason we cannot
update 6 strips (or even 8 - one for each port bit) at the same time without breaking
a sweat.. same number of bits to load, right? The extra overhead will
be preparing the next port byte value, where we'll need to load data from 6 different memory addresses (one per strip) instead of just one - and might need a palette lookup
for each one too.
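To picture the "preparing the next port byte" step: for every bit position, take that bit from the current data byte of each strip and pack the results into one byte, one port pin per strip. The sketch below does this in plain C++ purely for illustration. It is not the Neopixel library's code, the names are invented, and it assumes strip n is wired to port bit n and that data goes out most significant bit first. It only builds the values; the timing-critical port writes are not shown.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kStrips = 6;

// Given the current data byte for each strip, build the 8 port values
// to output, most significant bit first. Bit n of each port value
// carries the data bit for strip n (assumed wired to port pin n).
inline void transposeForPort(const uint8_t stripByte[kStrips],
                             uint8_t portBits[8]) {
    for (int bit = 0; bit < 8; ++bit) {
        const uint8_t mask = static_cast<uint8_t>(0x80u >> bit);
        uint8_t out = 0;
        for (size_t s = 0; s < kStrips; ++s) {
            if (stripByte[s] & mask) {
                out |= static_cast<uint8_t>(1u << s);
            }
        }
        portBits[bit] = out;
    }
}
```

In a real driver I would expect each of these values to be written between an "all pins high" write and an "all pins low" write, so that every strip sees its own long or short pulse within the same 1.25us slot. The inner loop above is the extra per-bit work that has to fit into the spare cycles.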
Reading from this
really useful blog post (http://cpldcpu.wordpress.com/2014/01/14/light_ws2812-library-v2-0-part-i-understanding-the-ws2812/) we do have up to 9us of idle time to play
with between bits. This might make it all possible
on an ATmega328 with some tight assembly language code, but you know
what, maybe it's time to look at something with a bit more grunt. Like
an ARM board....
I've ordered an Arduino
Due as a start. I may yet try to get it working on the Uno, but the comparatively huge amount of memory and much faster CPU speed of the Due
does cure a few headaches. Just need to wait for it to arrive now -
watch this space.....
The BeagleBone Black's Cortex processor has a built-in feature for these kinds of problems. Inside the MCU, there are two 32-bit Programmable Real-time Units (PRUs) that run a small assembly instruction set designed for interfacing with hardware. Because they run alongside the main processor, you don't have to worry about them eating up clock cycles or missing a beat.
There are already a few libraries out there for driving WS281x strips with these PRUs, with example projects driving over 500 meters of LED strips. See http://trmm.net/Category:LEDscape and http://www.nycresistor.com/2013/07/27/ledscape/
Interesting to know.. Thanks! I am not familiar with the Beaglebone but had looked at the PSOC4 for similar reasons. On this project I have it driving 6 strips in parallel over 1 update cycle using a C language routine on an Arduino Due, although interrupts are disabled during updates which caused some head scratching with the intended MIDI/Serial comms
Yeah, trying to bit-bang multiple timing sensitive protocols on a single processor can get really tough.
Because each of the processes has its own necessary running and idle time, you can usually fit in your own code during the breaks. When you have to deal with multiple protocols, though? Unless you can get them to synchronize, they end up overlapping and failing.
Have you tried offloading the device communications to a separate device while keeping the Arduino for the big picture processing?