Optimizing Arduino Program Storage Space

0x00 Abstract

When developing Arduino programs, we usually write code to meet functional requirements. Once testing shows that the requirements are met, development stops, and the code changes only when requirements change or bugs are found. In real projects, besides correct and bug-free behavior, we also want the program to run ever more efficiently. Specifically:

(1) The program and data should occupy as little device storage as possible, in both ROM and RAM;

(2) The program should execute as fast as possible, provided that accuracy and stability are preserved;

(3) The system's overall power consumption should be as low as possible.

In short, we want the program to use as little storage and power as possible, which makes our products simpler, cheaper, and more reliable. In this tutorial we use the low-level register interface of the ATmega2560 to achieve the same functionality as before while occupying less space and running faster. This works because the commonly used Arduino functions are convenience wrappers around these low-level operations.

After code optimization, the program’s space usage can be reduced by about 50%. This tutorial will also help us better understand the underlying implementation mechanism of Arduino programs.

0x01 View Blink Program

The first program developed on Arduino is usually the blink program that controls the LED on pin D13 to flash. The source code is as follows:

// the setup function runs once when you press
// reset or power the board
void setup() {
  // initialize digital pin LED_BUILTIN as an output.
  pinMode(LED_BUILTIN, OUTPUT);
}

// the loop function runs over and over again forever
void loop() {
  // turn the LED on (HIGH is the voltage level)
  digitalWrite(LED_BUILTIN, HIGH);
  delay(1000);
  // turn the LED off by making the voltage LOW
  digitalWrite(LED_BUILTIN, LOW);
  delay(1000);
}

This source code shows the entire blinking process. First, in the setup() function, we configure pin 13 as an output. Then, in the loop() function, we repeatedly change the output voltage: setting it high lights the LED, and setting it low turns it off. The delay() calls let us see each state change. Without them the pin would switch at full speed, far too fast for the eye to follow, and the LED would simply appear to be on.


Next, let's take a closer look at the compilation log of this source code. It shows that the compiled program occupies 1462 bytes, while the maximum Flash space available on the Arduino Mega2560 is 253952 bytes (253952 / 1024 = 248 KB). For reference, the ATmega2560 on this board has 256 KB of Flash, of which 8 KB is reserved for the bootloader (hence the 248 KB above), along with 8 KB of SRAM and 4 KB of EEPROM.

Both Flash memory and EEPROM are long-life non-volatile memories: they retain stored data without power. The difference lies in how data is erased. Flash is erased in fixed-size blocks rather than byte by byte (NOR Flash can still be read and programmed at byte granularity). Block sizes vary widely by device, and the on-chip Flash of a microcontroller uses far smaller erase units than large standalone chips. Flash is a variant of electrically erasable programmable read-only memory (EEPROM). EEPROM proper can be erased and rewritten one byte at a time, while most Flash chips must erase a whole block.

0x02 Optimize pinMode() Function

According to the compilation log, the compiled blink program is 1462 bytes. If merely blinking an LED takes this much storage, a program with more complex functionality could eventually exceed the Flash limit. We therefore want to minimize the size, leaving room for larger programs and more complex functionality.

Let's first check how much space the pinMode() function occupies. Commenting it out and recompiling gives a binary of 1382 bytes, which is 80 bytes smaller than the original 1462 bytes.

Let’s take a look at how the pinMode() function is implemented. The implementation source code is in the Arduino IDE software directory:

~/Software/arduino-1.8.4/hardware/arduino/avr/cores/arduino/wiring_digital.c:

void pinMode(uint8_t pin, uint8_t mode)
{
  uint8_t bit = digitalPinToBitMask(pin);
  uint8_t port = digitalPinToPort(pin);
  volatile uint8_t *reg, *out;

  if (port == NOT_A_PIN) return;

  // JWS: can I let the optimizer do this?
  reg = portModeRegister(port);
  out = portOutputRegister(port);

  if (mode == INPUT) {
    uint8_t oldSREG = SREG;
    cli();
    *reg &= ~bit;
    *out &= ~bit;
    SREG = oldSREG;
  } else if (mode == INPUT_PULLUP) {
    uint8_t oldSREG = SREG;
    cli();
    *reg &= ~bit;
    *out |= bit;
    SREG = oldSREG;
  } else {
    uint8_t oldSREG = SREG;
    cli();
    *reg |= bit;
    SREG = oldSREG;
  }
}

From this function we can see that, for output mode, pinMode simply sets one bit of the port's mode register to 1. Next, we need to determine which port and which bit D13 is connected to. Following the code, we eventually find the mapping in ~/Software/arduino-1.8.4/hardware/arduino/avr/variants/mega/pins_arduino.h, where the entry for PWM13 shows its port:

const uint8_t PROGMEM digital_pin_to_port_PGM[] = {
  // PORTLIST
  // -------------------------------------------
  PE , // PE 0 ** 0 ** USART0_RX
  PE , // PE 1 ** 1 ** USART0_TX
  PE , // PE 4 ** 2 ** PWM2
  PE , // PE 5 ** 3 ** PWM3
  PG , // PG 5 ** 4 ** PWM4
  PE , // PE 3 ** 5 ** PWM5
  PH , // PH 3 ** 6 ** PWM6
  PH , // PH 4 ** 7 ** PWM7
  PH , // PH 5 ** 8 ** PWM8
  PH , // PH 6 ** 9 ** PWM9
  PB , // PB 4 ** 10 ** PWM10
  PB , // PB 5 ** 11 ** PWM11
  PB , // PB 6 ** 12 ** PWM12
  PB , // PB 7 ** 13 ** PWM13
  PJ , // PJ 1 ** 14 ** USART3_TX
  PJ , // PJ 0 ** 15 ** USART3_RX
  PH , // PH 1 ** 16 ** USART2_TX
  PH , // PH 0 ** 17 ** USART2_RX
  PD , // PD 3 ** 18 ** USART1_TX
  PD , // PD 2 ** 19 ** USART1_RX
  PD , // PD 1 ** 20 ** I2C_SDA
  PD , // PD 0 ** 21 ** I2C_SCL
  PA , // PA 0 ** 22 ** D22
  PA , // PA 1 ** 23 ** D23
  PA , // PA 2 ** 24 ** D24
  PA , // PA 3 ** 25 ** D25
  PA , // PA 4 ** 26 ** D26
  PA , // PA 5 ** 27 ** D27
  PA , // PA 6 ** 28 ** D28
  PA , // PA 7 ** 29 ** D29
  PC , // PC 7 ** 30 ** D30
  PC , // PC 6 ** 31 ** D31
  PC , // PC 5 ** 32 ** D32
  PC , // PC 4 ** 33 ** D33
  PC , // PC 3 ** 34 ** D34
  PC , // PC 2 ** 35 ** D35
  PC , // PC 1 ** 36 ** D36
  PC , // PC 0 ** 37 ** D37
  PD , // PD 7 ** 38 ** D38
  PG , // PG 2 ** 39 ** D39
  PG , // PG 1 ** 40 ** D40
  PG , // PG 0 ** 41 ** D41
  PL , // PL 7 ** 42 ** D42
  PL , // PL 6 ** 43 ** D43
  PL , // PL 5 ** 44 ** D44
  PL , // PL 4 ** 45 ** D45
  PL , // PL 3 ** 46 ** D46
  PL , // PL 2 ** 47 ** D47
  PL , // PL 1 ** 48 ** D48
  PL , // PL 0 ** 49 ** D49
  PB , // PB 3 ** 50 ** SPI_MISO
  PB , // PB 2 ** 51 ** SPI_MOSI
  PB , // PB 1 ** 52 ** SPI_SCK
  PB , // PB 0 ** 53 ** SPI_SS
  PF , // PF 0 ** 54 ** A0
  PF , // PF 1 ** 55 ** A1
  PF , // PF 2 ** 56 ** A2
  PF , // PF 3 ** 57 ** A3
  PF , // PF 4 ** 58 ** A4
  PF , // PF 5 ** 59 ** A5
  PF , // PF 6 ** 60 ** A6
  PF , // PF 7 ** 61 ** A7
  PK , // PK 0 ** 62 ** A8
  PK , // PK 1 ** 63 ** A9
  PK , // PK 2 ** 64 ** A10
  PK , // PK 3 ** 65 ** A11
  PK , // PK 4 ** 66 ** A12
  PK , // PK 5 ** 67 ** A13
  PK , // PK 6 ** 68 ** A14
  PK , // PK 7 ** 69 ** A15
};

Following the code can be tedious. The simplest way is to consult the Arduino Mega2560 pin mapping (reference [4]), which shows that D13 (also called PWM13) is connected to bit 7 of Port B.

Setting the direction of an I/O pin on an Atmel AVR is quite simple. Each pin belongs to a port, and each bit in an I/O port can be either an input or an output. The direction of each individual pin is determined by its bit in the associated data direction register (DDRx): a 1 makes the pin an output and a 0 makes it an input.

Thus, we can set this register bit directly to configure D13 as an output, using the macro bitSet(value, bit), which is defined in ~/Software/arduino-1.8.4/hardware/arduino/avr/cores/arduino/Arduino.h.

By replacing pinMode with bitSet, the binary shrinks by 78 bytes, and the bitSet itself occupies only 2 bytes. The saving does not multiply with the number of calls, because the body of pinMode is linked into the program only once and each additional call costs just a few bytes. Every bitSet, however, stays a single 2-byte instruction. When this program is uploaded to the Arduino Mega2560 board, it behaves exactly as it did with pinMode while using 78 bytes less Flash.
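The replacement described above can be sketched as follows. This is a minimal sketch rather than the author's exact listing: the bitSet definition mirrors the form of the macro in Arduino.h, and the non-AVR branch with a stand-in DDRB variable is an assumption added only so that the snippet also builds with a desktop compiler.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>                 // real DDRB register on the ATmega2560
#else
static volatile uint8_t DDRB = 0;   // host stand-in (assumption), not a real register
#endif

#ifndef bitSet
// Same form as the macro in Arduino.h; the guard avoids redefining it inside a sketch.
#define bitSet(value, bit) ((value) |= (1UL << (bit)))
#endif

void setup() {
  // D13 is Port B bit 7; a 1 in DDRB makes that pin an output.
  bitSet(DDRB, 7);
}
```

In a real sketch the loop() function from the blink program is still needed alongside this setup().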

0x03 Optimize Output Pin Code

In the blink program’s loop(), we light up the LED by setting D13 to high, wait for 1 second for visibility, then set D13 to low to turn off the LED, and wait for another second to observe the off state. This cycle creates the blinking effect. We can see that this part of the code is clear and easy to read, but the implementation is somewhat clumsy.

The AVR chip was designed with pin toggling in mind. Writing 1 to a bit of the input pins address register (PINx) toggles the corresponding output bit in PORTx. This means that if PORTB bit 7 is currently high, writing 1 to PINB bit 7 sets PORTB bit 7 low, and writing 1 again sets it back to high. Repeatedly writing 1 therefore keeps toggling the pin's output state.

We can replace both digitalWrite(LED_BUILTIN, HIGH) and digitalWrite(LED_BUILTIN, LOW) with bitSet(PINB, 7), which manipulates the register directly. This reduces the binary to 808 bytes, a reduction of 576 bytes, which shows that the implementation of digitalWrite() is very storage-intensive. As with pinMode(), most of that cost is the function body and its lookup tables, which are linked only once, so the saving comes mainly from dropping digitalWrite() from the program altogether rather than growing with every call removed.
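To make the toggle rule concrete, here is a small model of it that can be checked on any compiler. The helper name portAfterPinWrite and the constant LED_MASK are hypothetical and exist only for illustration; on the board the hardware applies this rule itself whenever code writes to PINB.

```cpp
#include <stdint.h>

// Hypothetical helper modelling the hardware rule: each 1 written to PINx
// flips the matching bit of PORTx, and each 0 leaves that bit alone.
constexpr uint8_t portAfterPinWrite(uint8_t port, uint8_t written) {
  return static_cast<uint8_t>(port ^ written);
}

constexpr uint8_t LED_MASK = 1u << 7;  // D13 = Port B bit 7

// Two consecutive writes of the same mask: off -> on -> off.
constexpr uint8_t afterFirst  = portAfterPinWrite(0x00, LED_MASK);
constexpr uint8_t afterSecond = portAfterPinWrite(afterFirst, LED_MASK);
```

On the target, bitSet(PINB, 7) expands to a read-modify-write (PINB |= ...), which toggles only bit 7 as long as the compiler turns it into a single bit-set instruction. A plain assignment such as PINB = _BV(7) states the single write explicitly and does not depend on that translation.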

0x04 Optimize Delay Function

(1) If a strict 1-second interval is not required, a simple way to add a delay is an empty for loop that just counts, for example to 30000.

The program now occupies 660 bytes, which is 148 bytes less than with the delay() function. However, when this code is uploaded to the Arduino Mega2560 board, the LED does not visibly blink, which means the delay loop has no effect. Even raising the loop count from 30000 to 300000 makes no difference. So, what is the reason?

The compiler recognizes that the for loop is empty and has no observable effect, treats it as unnecessary code, and removes it from the final program. The compiler does not merely translate your source code. It also optimizes it, removing whatever it considers redundant.

If you really want to use such an empty loop as a delay, you must tell the compiler explicitly not to optimize it away. This is done by declaring the loop counter with the keyword volatile, which tells the compiler to make no assumptions about the variable and to perform every read and write of it.
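A minimal sketch of such a volatile busy-wait is shown below. The function name busyWait and its return value are assumptions made for illustration, not the author's listing, and the resulting delay is uncalibrated because it depends on the clock speed and on the code the compiler generates.

```cpp
#include <stdint.h>

// Crude busy-wait: the volatile counter forces a real load and store on every
// pass, so the compiler cannot delete the loop. Returns the number of passes.
uint32_t busyWait(uint32_t passes) {
  volatile uint32_t i = 0;   // 32-bit: a 16-bit AVR int cannot count to 300000
  while (i < passes) {
    i = i + 1;
  }
  return i;
}
```

In loop(), a call such as busyWait(300000UL) after toggling the pin would stand in for delay(1000). The 32-bit counter matters on this board: an int on the AVR is 16 bits wide and tops out at 32767, so it could never reach a limit of 300000.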

(2) For an accurate 1-second delay we can use Timer0, which the Arduino core starts running at power-up. Its overflow interrupt handler, TIMER0_OVF_vect, maintains an unsigned long variable named timer0_millis. On a 16 MHz board the overflow fires about every 1.024 ms, and the handler compensates for the fractional part so that timer0_millis advances by 1 per elapsed millisecond on average, which lets us track how many milliseconds have passed. If this variable is never reset, it eventually overflows. On the Arduino Mega2560 an unsigned long is 32 bits wide, so it ranges from 0 to 4294967295.

The maximum value for timer0_millis is 4294967295. If it increments by 1 every 1ms, it will take:

Total seconds = 4294967295/1000 = 4294967.295

Total hours = 4294967.295/3600 = 1193.046470833

Total days = 1193.046470833/24 = 49.710269618

This means that if the Arduino Mega2560 is powered on now and timer0_millis starts incrementing by 1 every 1ms, it will take about 50 days for this counter to overflow from 4294967295 back to 0, then continue counting.

Therefore, if your program uses timer0_millis, directly or through millis(), be aware that the value wraps around about every 50 days. millis() returns the time the program has been running since power-up, which is essentially the value of timer0_millis. Code that compares absolute timestamps can misbehave at the wrap, whereas code that computes elapsed time by unsigned subtraction keeps working.
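The wrap-around can be handled with unsigned subtraction, as in the sketch below. The helper names elapsedMs and intervalPassed are hypothetical. The technique relies only on the fact that 32-bit unsigned arithmetic wraps modulo 2^32, so the difference between two readings stays correct across a single rollover.

```cpp
#include <stdint.h>

// Elapsed milliseconds between two counter readings. Unsigned subtraction
// wraps modulo 2^32, so the result is still right if the counter rolled over once.
constexpr uint32_t elapsedMs(uint32_t now, uint32_t start) {
  return now - start;
}

// Hypothetical helper: has at least `interval` ms passed since `start`?
constexpr bool intervalPassed(uint32_t now, uint32_t start, uint32_t interval) {
  return elapsedMs(now, start) >= interval;
}
```

For example, with a start reading of 4294967291 (five counts before the wrap) and a current reading of 5, elapsedMs gives 10. On the board the first argument would typically be the value returned by millis().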

Next, let's modify the blink program to use timer0_millis for timing instead of delay(). Since timer0_millis is already defined in ~/Software/arduino-1.8.4/hardware/arduino/avr/cores/arduino/wiring.c, we only need to declare it as an external variable (extern volatile unsigned long timer0_millis;) in our own code.
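One way this could look is sketched below. It is an illustration under stated assumptions, not the author's listing: the 1000 ms threshold and the reset of the counter follow the description in the text, and the non-AVR branch defines stand-in variables only so that the snippet also builds with a desktop compiler.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
extern volatile unsigned long timer0_millis;   // defined in the core's wiring.c
#else
// Host stand-ins (assumptions) so the snippet builds off-target.
volatile unsigned long timer0_millis = 0;
static volatile uint8_t PINB = 0;
#endif

void loop() {
  while (timer0_millis < 1000) {
    // wait until the Timer0 interrupt has counted 1000 ms
  }
  timer0_millis = 0;   // restart the count; this is what disturbs millis()
  PINB = 1 << 7;       // writing 1 to PINB bit 7 toggles D13
}
```

Because timer0_millis is four bytes wide and the AVR accesses it one byte at a time, the interrupt can fire in the middle of a read or write. A stricter version would briefly disable interrupts around each access, as millis() itself does.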

Note that modifying timer0_millis makes millis(), and anything that relies on it, report incorrect values, but this does not affect our current experiment. To summarize, optimization has reduced the storage used from the original 1462 bytes to only 702 bytes:

Space reduction relative to the original = (1462-702)/1462 = 52%
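The sizes reported in this tutorial can be tallied step by step, as in the following worked calculation. The constant names are assumptions chosen for illustration, and the byte counts are the ones quoted in the text.

```cpp
// Binary sizes in bytes, as reported in the text.
constexpr int kOriginal    = 1462;  // stock blink program
constexpr int kAfterBitSet = 1384;  // pinMode() replaced by bitSet(DDRB, 7)
constexpr int kAfterToggle = 808;   // digitalWrite() replaced by bitSet(PINB, 7)
constexpr int kFinal       = 702;   // delay() replaced by timer0_millis polling

constexpr int kSavedPinMode = kOriginal - kAfterBitSet;     // 78
constexpr int kSavedWrite   = kAfterBitSet - kAfterToggle;  // 576
constexpr int kSavedDelay   = kAfterToggle - kFinal;        // 106
constexpr int kSavedTotal   = kOriginal - kFinal;           // 760

// Relative saving in percent: 760 / 1462 is roughly 52%.
constexpr double kSavedPercent = 100.0 * kSavedTotal / kOriginal;
```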

0x05 Reference

[1]. Arduino Mega2560[OL].

https://www.arduino.cn/thread-17938-1-1.html

[2]. Dale Wheat, translated by Weng Kai. Arduino Technology Insider[M]. Beijing: People’s Posts and Telecommunications Publishing House. 2015. 83-112.

[3]. Flash Memory Concept[OL].

https://baike.baidu.com/item/%E9%97%AA%E5%AD%98/108500?fromtitle=flash%20memory&fromid=3740729&fr=aladdin

[4]. Arduino Mega2560 Pin Mapping[OL].

https://www.arduino.cc/en/Hacking/PinMapping2560

[5]. Basic Data Types in Arduino[OL].

https://www.cnblogs.com/lulipro/p/7672954.html

0x06 Feedback

If you have any questions while following the tutorial, please follow the official WeChat account of ROS Classroom and send me a message to provide feedback. I will handle messages daily! Of course, if you happen to want to tip ROS Classroom, I would greatly appreciate it. A tip of 30 yuan will also invite you into the ROS Classroom WeChat group to learn and communicate with more like-minded partners!
